| Author |
Message |
BoBo Guest
|
Posted: Tue Feb 28, 2006 12:26 pm Post subject: AHK Geek Challenge - Reverse Engineering Index.dat |
|
|
That's far beyond what I'm able to accomplish on my own, so I've decided to throw it into the ring of fire to be lazslodized, shimanoved, toralfoniqued, philhoed, ... aka tweaked to death.
[Reverse Engineering Index.dat]
[Index.dat]
Goal: convert one or more index.dat files to another format (txt [, csv, html, xml]) |
|
|
 |
PhiLho
Joined: 27 Dec 2005 Posts: 6721 Location: France (near Paris)
|
Posted: Tue Feb 28, 2006 1:27 pm Post subject: |
|
|
I am flattered to be in the list of hackers...
This is an interesting challenge, but being a Firefox user, I have little interest in this project... And I have waaaay too many projects already in my pipeline. So I decline the invitation.  _________________
vPhiLho := RegExReplace("Philippe Lhoste", "^(\w{3})\w*\s+\b(\w{3})\w*$", "$1$2") |
|
|
 |
BoBo Guest
|
Posted: Wed Mar 01, 2006 6:01 pm Post subject: |
|
|
Huh? Looks like I'm the only one who's interested in knowing what Bill Gates collects within IE's index.dat file ?? |
|
|
 |
kapege.de
Joined: 07 Feb 2005 Posts: 186 Location: Munich, Germany
|
Posted: Thu Mar 02, 2006 10:33 am Post subject: |
|
|
| BoBo wrote: | Huh? Looks like I'm the only one who's interested in knowing what Bill Gates collects within IE's index.dat file ?? |
No. But the answer is easy: everything possible!  _________________ Peter
Wisenheiming for beginners: KaPeGe (German only, sorry) |
|
|
 |
Titan
Joined: 11 Aug 2004 Posts: 5068 Location: imaginationland
|
Posted: Thu Mar 02, 2006 11:40 am Post subject: |
|
|
I recommend CCleaner: | CCleaner Introduction wrote: | CCleaner is a freeware system optimization and privacy tool. It removes unused files from your system - allowing Windows to run faster and freeing up valuable hard disk space. It also cleans traces of your online activities such as your Internet history [including the notorious Index.dat]. But the best part is that it's fast (normally taking less than a second to run) and contains NO Spyware or Adware!  |
As for a pure AHK script solution, I doubt that it's going to be very easy because not only is it hex/binary, but I haven't come across any other program that can deal with it. _________________
RegExReplace("irc.freenode.net/ahk", "^(?=(.(?=[\0-r\[]*((?<=\.).))))(?:[c-\x73]{2,8}(\S))+((2)|\b[^\2-]){2}\D++$", "$u3$1$3$4$2") |
|
|
 |
BoBo Guest
|
Posted: Thu Mar 02, 2006 12:55 pm Post subject: |
|
|
The point isn't to replace/delete it (yes, I've noticed the several apps that were mentioned on the wiki ). It's about what it's named after: INDEX. That file contains a huge collection of data (e.g. URLs that were pasted rather than typed) which could be of use somehow. Couldn't it? |
|
|
 |
BoBo Guest
|
Posted: Fri Mar 03, 2006 7:44 am Post subject: |
|
|
| Quote: | What is in Index.dat files?
As already mentioned, index.dat files are binary files. Their content can be seen only with a binary (hex) editor. We will examine an index.dat file from the Internet cache (Temporary Internet Files). First, let's take a look at the index.dat file header:
Actually the index.dat header is much larger but this is the most important part of it. The first thing is the version of the index.dat file (Client UrlCache MMF Ver 4.7) - this particular file is from Internet Explorer version 4 but the index.dat file format is very similar in Internet Explorer 5.x and 6.
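To make the header description concrete, here is a minimal Python sketch that pulls the version number out of the signature string quoted above. It assumes the signature sits at the very start of the file as plain ASCII; `read_cache_version` is a hypothetical helper name, and the sample bytes are synthetic, not taken from a real index.dat.

```python
import re
from typing import Optional

def read_cache_version(data: bytes) -> Optional[str]:
    """Return the version text from an index.dat header, or None if absent."""
    # Assumption: the signature is ASCII and begins at offset 0 of the file.
    match = re.match(rb"Client UrlCache MMF Ver (\d+\.\d+)", data)
    return match.group(1).decode("ascii") if match else None

# Synthetic stand-in for the first bytes of a cache index.dat.
sample = b"Client UrlCache MMF Ver 4.7\x00" + bytes(36)
print(read_cache_version(sample))
```

For a real file, something like `open(path, "rb").read(64)` should supply enough bytes for this check.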
The next important items in the header are the names of the four subfolders in which the cached files from the Internet are located (they are not in the header when the index.dat file is for cookies and history, but UserData index.dat files also have such subfolders). These subfolders are located in the same folder as the index.dat file and in this case their names are 49EDE5UVC, GHIZ8LMVB, EBWNUZWLB and G48NSH4S. On your PC there can be more than four of these folders (depending on the size of the index.dat file) and their names will be different.
The real content of the index.dat files usually starts at byte offset 4000h or 5000h from the beginning of the file. An index.dat file is composed of many records of four different types: HASH, URL, LEAK and REDR.
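A rough way to see the four record types is to walk the file from the start offset and count the tags. The sketch below is only that: it assumes records begin on 128-byte boundaries and that the URL tag is stored as four bytes with a trailing space, neither of which is stated in the quoted text. `count_record_tags` is a hypothetical name and the demo buffer is synthetic. Because it does not read record lengths, it could also miscount if record data happens to contain a tag at an aligned offset.

```python
from collections import Counter

# Assumption: tags are 4 ASCII bytes; "URL" is padded with a space.
RECORD_TAGS = (b"HASH", b"URL ", b"LEAK", b"REDR")
BLOCK_SIZE = 0x80  # Assumption: records start on 128-byte boundaries.

def count_record_tags(data: bytes, start: int = 0x4000) -> Counter:
    """Count record tags seen at block-aligned offsets from `start` onward."""
    counts = Counter()
    for offset in range(start, len(data) - 3, BLOCK_SIZE):
        tag = data[offset:offset + 4]
        if tag in RECORD_TAGS:
            counts[tag.decode("ascii").strip()] += 1
    return counts

# Synthetic buffer: empty header area followed by three tagged blocks.
demo = bytearray(0x4000 + 3 * BLOCK_SIZE)
demo[0x4000:0x4004] = b"HASH"
demo[0x4080:0x4084] = b"URL "
demo[0x4100:0x4104] = b"URL "
print(count_record_tags(bytes(demo)))
```

If a file yields nothing with the default offset, passing `start=0x5000` covers the other starting point mentioned above.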
HASH records are the largest but they don't contain any privacy-sensitive information. They are just hash indexes of the contents of the index.dat file. If the file is larger there can be many such records.
The vast majority of the index.dat records are of types URL, LEAK and REDR. They have a fairly similar layout. Look at this sample URL record.
As you can see, there is a lot of information here. First, there is the encoded date and time at which this picture (icon_hardware.gif) was loaded from the Internet. The date and time are encoded in binary format in the second row of the dump. Next, there is http://www.aceshardware.com/site/images/icon_hardware.gif, which is the full URL of the loaded file. The name of the local copy of the file (which is in one of the four subfolders of the index.dat folder) is icon_hardware.gif. Next comes the full HTTP header of the Web server's response:
HTTP/1.0 200 OK
ETag: "AAAAOl01l7Q"
Content-Type: image/gif
Content-Length: 1234
X-Cache: MISS from proxy.office.devolti.com
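Once the header text above has been extracted from a record, turning it into something exportable (txt, csv, ...) is straightforward. This is a small Python sketch; `parse_http_header` is a hypothetical helper, and it assumes the header has already been decoded to text with one field per line.

```python
def parse_http_header(raw):
    """Split a stored response header into (status_line, {field: value})."""
    lines = [line.strip() for line in raw.splitlines() if line.strip()]
    if not lines:
        return "", {}
    status_line, fields = lines[0], {}
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if sep:  # ignore anything that is not "Name: value"
            fields[name.strip()] = value.strip()
    return status_line, fields

raw = """HTTP/1.0 200 OK
ETag: "AAAAOl01l7Q"
Content-Type: image/gif
Content-Length: 1234
X-Cache: MISS from proxy.office.devolti.com"""

status, fields = parse_http_header(raw)
print(status)
print(fields["Content-Type"], fields["Content-Length"])
```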
Last but not least, the record holds the name of the user account: Administrator. Obviously all this information is potentially dangerous because it tells us who accessed a given Internet page, when, and what the Web server's response was. If you clean the Internet cache (Temporary Internet Files), the cached files are deleted but most of the index.dat records are left almost untouched. The same is true for the history and cookies.
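The record description mentions a binary-encoded date and time without naming the encoding. A plausible reading, stated here as an assumption rather than something the quoted text confirms, is the Windows FILETIME format: a 64-bit little-endian count of 100-nanosecond intervals since 1601-01-01 UTC. Under that assumption, a decoder could look like this (`filetime_to_datetime` is a hypothetical name):

```python
import struct
from datetime import datetime, timedelta, timezone

FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)

def filetime_to_datetime(raw8: bytes) -> datetime:
    """Decode 8 bytes as a FILETIME (assumed encoding) into a UTC datetime."""
    (ticks,) = struct.unpack("<Q", raw8)  # 100 ns ticks since 1601-01-01
    return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)

# 116444736000000000 ticks is the well-known FILETIME value of the Unix epoch.
unix_epoch_bytes = struct.pack("<Q", 116444736000000000)
print(filetime_to_datetime(unix_epoch_bytes).isoformat())
```

Where exactly the timestamp sits inside a record is not given in the text, so the offset has to be determined from a hex dump first.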
The empty space in index.dat files is filled with junk (most often zeros, but it can also be various meaningless sequences) or, in some areas, with the "magic" sequence 0BADF00Dh (BAD FOOD). Obviously Microsoft developers are not without a sense of humor. The BAD FOOD parts of the file are deleted records of other kinds and they aren't a privacy threat. |
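To get a feel for how much of a file is BAD FOOD filler, one could count the pattern directly. The sketch below assumes the value 0BADF00Dh is stored as a little-endian 32-bit word (bytes 0D F0 AD 0B) on 4-byte boundaries; that byte order is an assumption based on the x86 platform, not something the quoted text states. `count_filler_words` is a hypothetical name and the demo bytes are synthetic.

```python
import struct

# Assumption: little-endian storage, i.e. b"\x0d\xf0\xad\x0b" on disk.
FILLER = struct.pack("<I", 0x0BADF00D)

def count_filler_words(data: bytes) -> int:
    """Count 4-byte-aligned occurrences of the filler pattern."""
    return sum(
        1 for i in range(0, len(data) - 3, 4) if data[i:i + 4] == FILLER
    )

# Three filler words, eight zero bytes, then one more filler word.
demo_filler = FILLER * 3 + bytes(8) + FILLER
print(count_filler_words(demo_filler))
```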
Kindly provided by [mil in corporated] |
|
|
 |
|