AutoHotkey Homepage AutoHotkey Community
Let's help each other out
 

How can I "peek" into a video file's source code?
Benny-D



Joined: 29 Feb 2008
Posts: 865

Posted: Sun Aug 09, 2009 9:09 pm    Post subject: How can I "peek" into a video file's source code?

This question will probably sound silly to you, but my curiosity seems to be overcoming the fear of looking silly, so here I go:

Let’s say I have a video file (.wmv for example), how can I "peek" into its source code? I mean in computers everything boils down to 1's and 0's, right? Or at least to some apparently meaningless string of some characters that (in my example) can be turned into a video, right? So how can I see that "meaningless" source code of a video file?
tidbit



Joined: 09 Mar 2008
Posts: 1807
Location: Minnesota, USA

Posted: Sun Aug 09, 2009 10:34 pm

open it in a text editor.
anything larger than 100KB you might want to open in something other than notepad, such as notepad++.

that's the best I know.

programmers code (c, c++, python...etc) boils down to machine code (not readable by humans). machine code boils down to binary code.
What you see in a text editor MIGHT be machine code. I don't know. I doubt it. but it could be.

_________________
rawr. be very afraid
*poke*
Note: My name is all lowercase for a reason.
Even monkeys fall from trees. - Japanese proverb
Z_Gecko
Guest





Posted: Mon Aug 10, 2009 12:26 am

i would suggest a HEX-Editor,
e.g. Tiny Hexer
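
For readers who want to see roughly what a hex editor displays, here is a minimal Python sketch. It is only an illustration, not how Tiny Hexer itself works; the function name `hexdump` and the sample bytes are made up for the example.

```python
def hexdump(data: bytes, width: int = 16) -> str:
    """Format raw bytes as offset, hex values, and printable text."""
    pad = width * 3  # room for "xx " per byte
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Printable ASCII is shown as-is; every other byte becomes a dot.
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{pad}} {text_part}")
    return "\n".join(lines)


# Hypothetical sample bytes. For a real file you could read the first bytes
# with open("video.wmv", "rb").read(64), where the file name is a placeholder.
sample = b"Hi!\x00\xff"
print(hexdump(sample))
```

The important detail is the "rb" (read binary) mode mentioned in the comment: it returns the bytes exactly as stored, without interpreting them as text.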
Benny-D



Joined: 29 Feb 2008
Posts: 865

Posted: Mon Aug 10, 2009 11:19 am

WOW!!! tidbit and Z_Gecko, I've tried both notepad++ and Tiny Hexer and it works!!! Thank YOU!!!

Just 2 small follow-up questions:

1) Whenever I open the code either in notepad++ or in Tiny Hexer, I find that the code contains a lot of Chinese characters. Is it because I am right now in Taiwan and am using a Chinese platform or it would be like this with English platform, too?

2) Also, is it possible somehow to convert that code (with all those Chinese characters and some other special characters) into some text that I could easily transport without any losses into "Word" or notepad? It seems that Tiny Hexer is doing this job by converting all the code characters into Hex numbers, but I don't know how to get that Hex part out of Tiny Hexer - copying and pasting doesn't help.
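
On question 1, a likely explanation is that the editor interprets pairs of raw bytes using the system's code page, so the same bytes would show up as different symbols on an English system. The Python sketch below illustrates the idea. It assumes Big5 as the Traditional Chinese encoding and Latin-1 as the Western one; the code page actually used on a given machine may differ.

```python
# The two bytes that the Big5 encoding uses for one Chinese character.
raw = "中".encode("big5")

# The same two bytes, read under two different assumptions about the encoding.
as_big5 = raw.decode("big5")        # one Chinese character
as_latin1 = raw.decode("latin-1")   # two Western symbols

print(raw.hex(), repr(as_big5), repr(as_latin1))
```

The bytes never change; only the editor's guess about what they mean does. A video file contains no text at these positions at all, so either rendering is equally arbitrary.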
tidbit



Joined: 09 Mar 2008
Posts: 1807
Location: Minnesota, USA

Posted: Mon Aug 10, 2009 2:25 pm

study what a hex editor is/does.

no, you cannot "convert" the words. it's the file's data (note: you can open any file in a text editor or hex editor). it most likely looks like it does because it is compressed and/or encrypted. even if it wasn't, it still wouldn't be readable by humans.
Benny-D



Joined: 29 Feb 2008
Posts: 865

Posted: Mon Aug 10, 2009 4:19 pm

tidbit wrote:
even if it wasn't, it still wouldn't be readable by humans.
But I don't need it to be readable by humans - all I want is to find out whether it is possible to take that code (the one that I see in notepad++ or in Tiny Hexer) and to somehow transform it into another code WITHOUT ANY LOSSES, perhaps by simply replacing each of that code’s characters with some unique combination of letters and numbers: for example, each $ would be replaced by 1a, each & by 2e4, etc. Each Chinese character that I see there would also be replaced by some combination of digits and letters. Thus, each character in that code would be replaced by a unique combination of alphanumeric characters, and a string of alphanumeric characters can easily be saved in Notepad or in a "Word" document without any loss. So, let's say, by simply using something like AHK’s Replace function, is that sort of idea executable?
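
The replacement scheme described above is essentially what hexadecimal encoding already does: each byte becomes exactly two alphanumeric characters, and the step can be reversed without loss. A minimal Python sketch follows, shown in Python rather than AHK. The helper names are made up for the illustration, and it works on raw bytes rather than on the characters an editor displays.

```python
def encode_to_hex(data: bytes) -> str:
    """Replace every byte with a fixed two-character code (00..ff)."""
    return data.hex()


def decode_from_hex(text: str) -> bytes:
    """Turn the two-character codes back into the original bytes."""
    return bytes.fromhex(text)


# Every possible byte value once, standing in for real file contents.
original = bytes(range(256))
encoded = encode_to_hex(original)
restored = decode_from_hex(encoded)

print(encoded[:16], restored == original)
```

Because every code has the same length, decoding is unambiguous. The resulting text is twice the size of the file, which is why a video pasted into a document runs to thousands of pages.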

tidbit wrote:
study what a hex editor is/does
For sure I will. I am sorry if I am being too naive and ignorant. But let's forget about the hex editor for a moment. What about just notepad++ ? As far as I can see from the description in Wiki, it is still an editor (a "source code editor", as Wiki says). Thus, according to my logic, if it's an editor, it is used for editing something, and that "something" is a piece of some source code! Thus, such a simple task like replacing each of that code’s elements (characters) by some combination of numbers and letters should be more than possible. Well, such is my logic here.

In fact, I don't quite understand how it is so that the source code that we can see in the notepad++ or in Tiny Hexer is, as you say, not readable by humans, and at the same time some people can still edit it! And also, what's the point of displaying a code, if it's totally not readable and cannot be edited? Just to make sure that some code exists?!...

Well, I am sorry, I am here probably too blind and too ignorant.

Anyway, if the answer is still "impossible", then how can I get a hold of the very basics of my video file - the binary code - that simple string of 1's and 0's, which is, as far as I have been taught, the very foundation of any computer program or file?
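
Getting the literal string of 1's and 0's is straightforward once the file is read as raw bytes: each byte is simply eight bits. A small Python sketch, with `to_bits` and `from_bits` as hypothetical helper names:

```python
def to_bits(data: bytes) -> str:
    """Write each byte as eight characters, each '0' or '1'."""
    return "".join(f"{byte:08b}" for byte in data)


def from_bits(bits: str) -> bytes:
    """Rebuild the bytes from a string of '0' and '1' characters."""
    if len(bits) % 8 != 0:
        raise ValueError("bit string length must be a multiple of 8")
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


sample = b"A$"  # stand-in for bytes read from a file opened in "rb" mode
bits = to_bits(sample)
print(bits, from_bits(bits) == sample)
```

The bit string is eight times as long as the file, so for a video it is only practical to look at a small slice.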

Thank You very much for Your time, as well as thanks in advance to all those who will take some time to respond here.
tidbit



Joined: 09 Mar 2008
Posts: 1807
Location: Minnesota, USA

Posted: Mon Aug 10, 2009 5:27 pm

if you replaced all & with 2e4, it would most likely break. same with all characters.
unless you then decode it. even then it may still not work again.

only way to find out is to try.

to get hold of the binary code would probably be next to impossible.
unless you are very very skilled at decompiling and/or hacking.

maybe there are apps that do it? I doubt it. but go look on google :)
hd0202



Joined: 13 Aug 2006
Posts: 265
Location: Germany

Posted: Tue Aug 11, 2009 7:16 am    Post subject: Re: How can I "peek" into a video file's source co

search for "encoding" in Scripts & Functions

Hubert
engunneer



Joined: 30 Aug 2005
Posts: 8255
Location: Maywood, IL

Posted: Tue Aug 11, 2009 12:47 pm

I'm curious - why do you want to move the data into a word file? (btw, word files are horribly complex internally compared to just a text file)
_________________

(Common Answers)
Benny-D



Joined: 29 Feb 2008
Posts: 865

Posted: Tue Aug 11, 2009 9:29 pm

tidbit wrote:
if you replaced all & with 2e4, it would most likely break. same with all characters.
unless you then decode it.
Sure, I meant both coding and decoding. I am sorry I didn’t mention decoding in my posts – that’s a big mistake of mine.

tidbit wrote:
…unless you then decode it. even then it may still not work again.
Yes, I understand that, and that’s the thing I am afraid of the most.


tidbit wrote:
only way to find out is to try
I agree. I am very thankful to you for letting me know about Notepad++.

Just now I made this kind of try: I opened my video file (.wmv) in Notepad++, looked through that code, saved it as a .txt file, renamed it, made a copy of it, renamed the copy, and tried to open that copy with “Windows Media Player”… Can you just imagine, I did see the video!!!! I am so excited about it!

Well, truth be told, I did notice that the code that is in .txt DOES differ from what I see in Notepad++. I noticed that Notepad++ has a very interesting feature – if you highlight some of the symbols in the code (especially those little squares that are so common there), they turn into some other symbols (as if those symbols were “hiding” behind the squares). In the .txt file, however, it doesn’t happen – if I highlight squares, they still remain squares. So, here I have a question: Does every symbol in a .txt file stand for itself? I mean, does what we see in a .txt file depict exactly what is contained in it, or might we sometimes see a little square, for example, when in fact it is a Greek omega in one place and a dollar sign in another?
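
One possible reason for the squares: a viewer that cannot turn a byte into a character under its chosen encoding draws a placeholder, so different bytes can end up looking identical. The Python sketch below shows the general effect. It is an assumption about how such editors behave, not a description of Notepad++ internals, and the sample bytes are made up.

```python
# 'A', then two bytes that are not valid UTF-8 on their own, then 'B'.
raw = bytes([0x41, 0x81, 0x8D, 0x42])

# Lossy view: every undecodable byte is shown as the same placeholder (U+FFFD).
lossy_text = raw.decode("utf-8", errors="replace")
lossy_bytes = lossy_text.encode("utf-8")

# Lossless view: Latin-1 gives each of the 256 byte values its own character.
safe_text = raw.decode("latin-1")
safe_bytes = safe_text.encode("latin-1")

print(lossy_bytes == raw, safe_bytes == raw)
```

In the lossy view the two middle bytes look the same on screen, and saving that text would not give back the original file. In the lossless view every symbol really does stand for exactly one byte value.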

tidbit wrote:
to get hold of the binary code would probably be next to impossible.
unless you are very skilled at decompiling and/or hacking.
The situation seems to be quite ironical and paradoxical – people now can use computers and even write some pieces of software while the very basics of the programming is almost impossible to get hold of, even if it’s just a matter of having a look!

I heard that now it’s MSDOS’ turn to experience the same kind of fate – going into abeyance. Fewer and fewer people know how to use MSDOS (I am one of those who don’t).

It is reminiscent of that time when the semiconductor devices were becoming popular. I quickly learned then how to use transistors and diodes, but I was constantly bothered by this question in me: “How come I know how to use semiconductors, while I know nothing about the electronic tubes? Am I missing out on something very important here?”

Later, however, when the microcircuits appeared, I found that there were many people who were quite skillful at using microcircuits, while they knew nothing about transistors and diodes.


Last edited by Benny-D on Tue Aug 11, 2009 9:50 pm; edited 4 times in total
Benny-D



Joined: 29 Feb 2008
Posts: 865

Posted: Tue Aug 11, 2009 9:34 pm

hd0202 wrote:
search for "encoding" in Scripts & Functions
Hi Hubert! Thanks for this piece of advice. I did some search on “encoding” and among other threads found this one: http://www.autohotkey.com/forum/topic43958.html&highlight=encoding

Here are two quotes from there:
ManaUser wrote:
It's worth a try, but by my understanding, AutoHotkey will parse it as ANSI anyway. Meaning you just get junk results if you try using non-ascii characters.
Roman wrote:
I think you're right, this sample illustrates that unfortunately neither ANSI nor UTF-8 allow many extended characters, even though you can see them fine in your notepad: e.g. German ß ü ö ä
This is probably where my whole excitement is going to be stopped. The thing is I have a lot of Chinese characters in my code and many of them are not supported by either ANSI or UTF-8. As far as I know, they are only supported by Unicode, but AHK doesn’t support Unicode, which means that I will not be able to replace them with the help of AHK – I would have to learn some other programming language instead. (Here is a question in point: Will AHK one day switch to Unicode?) I have already found this “dead end” once before when I wanted to display some Russian and Japanese text in GUI: http://www.autohotkey.com/forum/viewtopic.php?t=31307&highlight=
Benny-D



Joined: 29 Feb 2008
Posts: 865

Posted: Tue Aug 11, 2009 9:35 pm

engunneer wrote:
I'm curious - why do you want to move the data into a word file?

engunneer wrote:
word files are horribly complex internally compared to just a text file

Hello engunneer!!!

You are right: while trying to store that code in a “Word” document, I discovered that the number-of-pages indicator was showing me something around 25 000 pages (!), so I just decided to cancel that task :D

But I don’t insist on the “Word”, though – simple .txt files are okay with me, too.

The whole idea was to first display the code in "what-I-see-is-what-I-really-have-in-there" mode and then to be able to replace each element of the code with its unique equivalent (i.e. to code it) in such a way that I would later be able to perform the inverse task without any losses (i.e. decode it and still be able to see the video).


Last edited by Benny-D on Wed Aug 12, 2009 6:50 am; edited 2 times in total
corrupt



Joined: 29 Dec 2004
Posts: 2485

Posted: Wed Aug 12, 2009 1:15 am    Post subject: Re: How can I "peek" into a video file's source co

Benny-D wrote:
This question will probably sound silly to you, but my curiosity seems to be overcoming the fear of looking silly, so here I go:

Let’s say I have a video file (.wmv for example), how can I "peek" into its source code? I mean in computers everything boils down to 1's and 0's, right? Or at least to some apparently meaningless string of some characters that (in my example) can be turned into a video, right? So how can I see that "meaningless" source code of a video file?
I'm curious... what's the main point of the exercise? What are you trying to accomplish when you change the "code" then later revert it to the original?

I'm not sure that "source code" is an accurate term in this case... A video file is not typically an application that has executable code but typically a data file containing data in a specific format. For more information on the data format of a video file you would need to research the video format used. Here's a bit of information on .wmv http://www.digitalpreservation.gov/formats/fdd/fdd000091.shtml

Opening a data file that isn't intended to be edited in a text editor may have side effects when editing, saving, etc... depending on how the data in the file is interpreted by the editor (non-printable characters, how end of line is determined, etc...). Changing the file name, extension may not prevent the file from playing back if the data is still intact if the player software attempts to verify/determine the file format when/if the file format doesn't match the expected for the extension provided.
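
The point about the player determining the format from the data can be illustrated with a small Python sketch. It assumes that a .wmv file uses the ASF container, which starts with a fixed 16-byte header identifier. The function name is hypothetical, and how any particular player actually performs its check is not known here.

```python
# The 16-byte identifier at the start of an ASF container (assumed here to be
# what a .wmv file uses), in the byte order it is stored on disk.
ASF_HEADER_GUID = bytes.fromhex("3026b2758e66cf11a6d900aa0062ce6c")


def looks_like_asf(first_bytes: bytes) -> bool:
    """Guess the format from the data itself, ignoring the file name."""
    return first_bytes[:16] == ASF_HEADER_GUID


# Made-up data: a fake "video" start and a plain text start.
fake_video = ASF_HEADER_GUID + b"\x00" * 8
fake_text = b"just some notes"

print(looks_like_asf(fake_video), looks_like_asf(fake_text))
```

A check like this gives the same answer whether the file is called video.wmv or video.txt, which would explain why a renamed copy can still play as long as its bytes are unchanged.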
Benny-D



Joined: 29 Feb 2008
Posts: 865

Posted: Wed Aug 12, 2009 3:27 pm

corrupt wrote:
I’m not sure that "source code" is an accurate term in this case... A video file is not typically an application that has executable code but typically a data file containing data in a specific format.

Hello, corrupt !!!

Thank you for your input here. Honestly, I didn’t even know that the term “source code” was only related to applications and not to data files. My wrong usage of this term is caused by my poor understanding of it and by a very limited knowledge of this whole area.

So, what term should it be instead? What word should I use when I refer to that “code” that I see in the Notepad++ every time I open a .wmv video file with it?
corrupt wrote:
I'm curious... what's the main point of the exercise? What are you trying to accomplish when you change the "code" then later revert it to the original?

Well, the main point here is exactly in what you have just described – in, firstly, trying to change that “code” (I am sorry I don’t know what word I should use instead of “code”) by replacing each of its elements with their unique equivalents (chosen by me) and then, secondly, in reverting it back to the original so that there would be no losses, i.e. I still want to be able to see the video when I open it with Windows Media Player.

I am just curious here how much I’ll be able to “touch the ground” here – to see whether that “code” that I see in the .txt file (after I save a Notepad++ as a .txt file) is basic and “freely convertible”.

I don’t know if I am making myself clear enough here. Let me use this example. Let’s say the “code” is this short string consisting of only 5 elements: q £ ¥ 3 ¥ (in reality, of course, the string is thousands of times bigger). It is quite possible that the third and the fifth elements in the string are twins since they both are represented in the editor by the same symbol (¥). So, what I want to do here is replace q with, say, A, £ with B, each ¥ with C, and 3 with D. Then I want to perform the inverse operation – replace A with q, B with £, each C with ¥, and D with 3. Then I will try opening the resulting code with Windows Media Player and see if it still shows me some video. If the result is negative (I don’t see any video), then it means that the third and the fifth elements of the “code”, even though they were represented by the same symbol, are in fact two different values, thus, I didn’t “touch the ground”.
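
The experiment described above can be sketched in Python on raw bytes, where there is no doubt about "twins": two bytes are the same element only if they have the same value. The table-building helper `make_tables` and the random shuffle are assumptions for illustration, and any one-to-one mapping of the 256 byte values would do.

```python
import random


def make_tables(seed: int = 0):
    """Build a one-to-one substitution over all 256 byte values, plus its inverse."""
    values = list(range(256))
    shuffled = values[:]
    random.Random(seed).shuffle(shuffled)
    encode_table = bytes(shuffled)  # byte b is replaced by encode_table[b]
    decode_table = bytes.maketrans(encode_table, bytes(values))  # undoes the above
    return encode_table, decode_table


encode_table, decode_table = make_tables()

data = b"q\xa3\xa53\xa5"  # five made-up bytes; the third and fifth are equal
scrambled = data.translate(encode_table)
restored = scrambled.translate(decode_table)

print(scrambled[2] == scrambled[4], restored == data)
```

Equal bytes stay equal after substitution and the round trip restores the data exactly. With replacement codes of different lengths, such as 1a and 2e4, decoding could become ambiguous unless separators or fixed-length codes are used.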

These words of yours strongly suggest that the final result of my experiment will be negative, which is quite sad:
corrupt wrote:
Opening a data file that isn't intended to be edited in a text editor may have side effects when editing, saving, etc... depending on how the data in the file is interpreted by the editor (non-printable characters, how end of line is determined, etc...).


By the way, I, of course, may be missing something here, but as for non-printable characters and the ends of lines, it seems like all of them are made quite evident in Notepad++ by small pointers in the form of arrows.

corrupt wrote:
Changing the file name, extension may not prevent the file from playing back if the data is still intact if the player software attempts to verify/determine the file format when/if the file format doesn't match the expected for the extension provided.
I am sorry, but I didn’t quite understand this last sentence in your comment. Could you, please, paraphrase it for me.
SoLong&Thx4AllTheFish



Joined: 27 May 2007
Posts: 4999

Posted: Wed Aug 12, 2009 3:56 pm

Read this, it has some useful info:
http://en.wikipedia.org/wiki/Binary_file

AHK has limited capabilities with binary files (search the forum, or see the links in Murp|e post http://www.autohotkey.com/forum/viewtopic.php?t=47342).

In the end you will have little use for doing what you want: the video file does not contain useful information as such, apart from sections of plain text which may or may not include copyright, compression, codec, used software info etc.
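
To look for the plain-text sections mentioned above, one rough approach is to scan the bytes for runs of printable ASCII characters. The Python sketch below does that with a made-up helper name and sample data. It would miss text stored in a two-byte encoding, which some formats may use for their metadata.

```python
import re


def printable_runs(data: bytes, min_len: int = 4) -> list:
    """Return runs of at least min_len printable ASCII bytes found in data."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return pattern.findall(data)


# Made-up bytes standing in for a slice of a binary file.
sample = b"\x00\x01Windows Media\x00\xffab\x00codec info\x02"
print(printable_runs(sample))
```

Runs shorter than `min_len` are skipped because short printable sequences show up by chance in compressed data.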
_________________
AHK Wiki FAQ
TF : Text files & strings lib, TF Forum
All times are GMT
Page 1 of 2

 