A frequent question when dealing with browser forensics is ‘Does the Hit Count value mean that the user visited site ‘x’, on ‘y’ occasions?’ Most browsers record a ‘Hit Count’ value in one or more of the files they use to track browser activity, and it is important that an analyst understands any potential pitfalls associated with the accuracy, or otherwise, of this value.

We recently received a support request from an analyst who was analysing Internet Explorer data. They had found a record relating to a Bing Images search, which showed a hit count of 911. The particular search string was significant, and very damning had it actually been used 911 times. The analyst wanted to know if the hit count value could be relied upon.

The following experiment was carried out in order to establish how this surprisingly high hit count value could have been generated. In order to obtain a data set which contained as little extraneous data as possible, a brand new VMWare virtual machine was created. The machine was setup from the Microsoft Windows XP SP3 installation disc, which installed Internet Explorer v 6.0.2900.5512.xpsp.080413-2111 by default. Two user accounts were created on the machine – one to be used as an Admin account, for installing software etc; and the other to be used as the ‘browsing’ account. This separation of the accounts further assisted with minimising the possibility of any unwanted data being present within the ‘browsing’ account. Using the Admin account, the version of Internet Explorer in use on the virtual machine was upgraded to IE v 8.0.6001.18702. The ‘browsing’ account was then used for the first time. Starting Internet Explorer immediately directed the user to the MSN homepage. The address ‘’ was typed into the address bar, which led to the Bing search engine homepage. The ‘Images’ tab was clicked. This Auto Suggested a search criterion of ‘Beautiful Britain’, as can be seen in the figure below:



Figure 1

The term ‘aston martin’ was then typed into the search box, as shown below:


Figure 2

None of the images were clicked or zoomed, nor was the result screen scrolled. Internet Explorer was closed, and the browsing account logged off. The Admin account was used to extract the browser data for processing in NetAnalysis. The below image shows some of the results. Both of these entries are from Master History INDEX.DAT files:


Figure 3


As can be seen, both entries show a hit count of 5. Both of these pages were visited only once, so it is immediately apparent that the hit count value maintained by Internet Explorer may not be an accurate count of how many times a particular page has been visited. However, this still did not explain how Internet Explorer had produced a hit count of 911.

The virtual machine was started again, and the browsing account logged on. The previous steps were repeated; typing ‘’ into the URL bar; visiting the Bing homepage; and clicking on the ‘Images’ tab. Once again, Bing Auto Suggested the search criterion of ‘Beautiful Britain’, and displayed the same thumbnail results page. The search criterion ‘aston martin’ was again typed into the search box and the same thumbnail results page was produced. None of the images were clicked or zoomed. The results page was scrolled using the side scroll bar, which generated more thumbnails as it went. Internet Explorer was closed, and the browsing account logged off. The Admin account was used to extract the browser data for processing in NetAnalysis. The below image shows some of the results. Both of these entries are again from Master History INDEX.DAT files:



Figure 4

As can be seen, the ‘Beautiful Britain’ search now has a hit count of 13 – it is not at all clear how Internet Explorer determined this figure. Moreover, the ‘aston martin’ search now shows a hit count of 511. This page was not visited 511 times, nor were 511 of the thumbnail images clicked. The contents of the INDEX.DAT for the local cache folders (Content.IE5) were checked to see how many records were held relating to thumbnails that had been cached. The results were as follows:


Figure 5

So it does not even appear that there are 511 thumbnails held in the local cache. The result page was scrolled quickly, so the user did not see a large proportion of the thumbnail images.

In conclusion, it is apparent that the ‘Hit Count’ maintained by Internet Explorer cannot be relied upon. Although this experiment involved a quite specific process relating solely to image searches carried out on one particular search engine, the disparity between results and reality makes it clear that unquestioning acceptance of what Internet Explorer is recording as a ‘Hit Count’ could lead to significant errors if presented in evidence.

To complete the experiment, two further identical Virtual Machines were created. On one, the Google Chrome browser (v 15.0.874.106 m) was installed and used. On the other, the Mozilla Firefox browser (v 8.0) was installed and used. The same steps were repeated: typing ‘’ into the URL bar; visiting the Bing homepage; and clicking on the ‘Images’ tab. The results from these processes are shown below:



Figure 6




Figure 7

It is apparent that both of these browsers seem to maintain a more accurate ‘Hit Count’.

Internet Explorer Data

As forensic examiners will be aware, Microsoft Internet Explorer stores cached data within randomly assigned folders. This behaviour was designed to prevent Internet data being stored in predictable locations on the local system in order to foil a number of attack types. Prior to the release of Internet Explorer v9.0.2, cookies were an exception to this behaviour and their location was insufficiently random in many cases.

Cookie Files

Generally, for Vista and Windows 7, cookie files are stored in the location shown below:


The cookie filename format was the user’s login name, the @ symbol and then a partial hostname for the domain of the cookie.

Cookie Files with Standard Name

With sufficient information about a user’s environment, an attacker might have been able to establish the location of any given cookie and use this information in an attack.

Random Cookie Filenames

To mitigate the threat, Internet Explorer 9.0.2 now names the cookie files using a randomly-generated alphanumeric string. Older cookies are not renamed during the upgrade, but are instead renamed as soon as any update to the cookie data occurs. The image below shows an updated cookie folder containing the new files.

Random Cookie Names

This change will have no impact on dealing with the examination of cookie data; however. it will no longer be possible to identify which domain a cookie belongs to from just the file name.

NetAnalysis showing Random Cookie Names

NetAnalysis showing Random Cookie Names

Introduction to Userdata

Internet Explorer 8+ user data persistence is a function which allows online forms to save a small file to the system with information about values entered in a particular form.  This allows the user to retrieve a half filled web based form when they revisit.

Persistence creates new opportunities for website authors.  Information that persists beyond a single page without support from the server, or within the finite scope of cookies, can increase the speed of navigation and content authoring.

The folder structure where the data is actually stored is very much like the standard Internet Explorer cache folder structure. Inside the cache folder you will find the files containing data attributed to the associated website.

{user}\AppData\Roaming\Microsoft\Internet Explorer\UserData\

To demonstrate how this works, we have created a page which allows you to save a string to the local drive.

Once you have saved some string data using the above page, open the UserData INDEX.DAT file in NetAnalysis and review the entries.  Selecting F8 will bring up the search/filter dialogue.  Change the field name to ‘Type’ and enter ‘userdata’ in the filter text box.  When this is executed, you should find an entry as shown below:

If you navigate to the corresponding folder, you will find an XML file which contains the string you entered into the website.  This is shown below.


The Internet Explorer disk cache is a storage folder for temporary Internet files that are written to the hard disk when a user views page from the Internet.Internet Explorer uses a persistent cache and therefore has to download all of the content of a page (such as graphics, sound files or video) before it can be rendered and displayed to the user.Even when the cache is set to zero percent, Internet Explorer requires a persistent cache for the current session. The persistent cache requires 4 MB or 1 percent of the logical drive size, whichever is greater.

Disk Cache Storage Location

The disk cache location varies across operating systems and can usually be found in the following default locations:

Digital Detective NetAnalysis Cache Location Windows 98

Figure 1

Digital Detective NetAnalysis Cache Location Windows 2K and XP

Figure 2

Digital Detective NetAnalysis Cache Location Windows Vista and 7

Figure 3

To identify the correct location of the cache for each user, the registry hive for the particular user must be examined. You cannot rely upon the live cache folder being in the default location. Figure 4 shows the registry key containing the cache folder location. Users have been known to move their cache location in an attempt to hide browsing activity.

Digital Detective NetAnalysis Cache Examination Shell Folder Location

Figure 4

The “User Shell Folders” subkey stores the paths to Windows Explorer folders for each user of the computer. The entries in this subkey can appear in both the “Shell Folders” subkey and the “User Shell Folders” and in both HKEY_LOCAL_MACHINE and HKEY_CURRENT_USER. The entries that appear in user “User Shell Folders” take precedence over those in “Shell Folders”. The entries that appear in HKEY_CURRENT_USER take precedence over those in HKEY_LOCAL_MACHINE.

Temporary Internet Files Folder Structure

Within the Temporary Internet Files folder, you will find a Content.IE5 folder.  This folder is the main disk cache for Internet Explorer.  Outlook Express also writes data to this location as is explained later in this article.Inside the Content.IE5 folder, you will find a minimum of 4 cache folders.  The file names for these cache folders comprise of 8 random characters.  When further cache space is required, Internet Explorer will add additional folders in multiples of 4.  Figure 5 shows the layout of a typical Internet Explorer disk cache.

Digital Detective Windows Forensic Analysis Microsoft Internet Explorer Cache Folders

Figure 5

Content.IE5 INDEX.DAT File

The cache INDEX.DAT file is a database of cache entries.  It holds information relating to individual cached items so that the browser can check whether the resource needs to be updated (eTag) and information relating to the location of the cached item.  It also stores the HTTP (Hypertext Transfer Protocol) response header for the resource.At the start of the INDEX.DAT file, you will find a zero based array holding the names of the cache folders.  Each individual URL record contains an index value which refers to a specific folder in the array.  This allows Internet Explorer to correctly identify the location of a cached item.  This can be seen in Figure 6 below.

Digital Detective NetAnalysis INDEX.DAT Forensic Analysis

Figure 6

As files are saved to the cache, Internet Explorer uses a specific naming convention as shown in Figure 7.  The only exception to this rule is when data is written to this location by Outlook Express.

Digital Detective Forensic Analysis of Internet Explorer Cache Naming Convention

Figure 7

As pages and resources are cached, they can easily end up in different folders. Internet Explorer attempts to keep the volume of data and number of cached items across each cache folder as level as possible. As Internet Explorer writes files to the cache folders, it checks to see if a file with the same name already exists.  This is frequently the case when web developers do not use imaginative or descriptive names for their files.  If the file already exists within the folder, Internet Explorer will increment the counter.  If no file exists, the counter portion of the file name is set to 1. Files with the same naming structure within a cache folder do not necessarily belong to the same web site.  Also, multiple visits to the same web site can easily result in files with similar naming conventions being spread across all the cache folders.  Cached resources can also become orphaned when the INDEX.DAT entry is not written to disk (such as when Internet Explorer crashes before the entries are written).  Figure 8 shows a typical cache folder.  There are two files within this folder which would have had the original name “images.jpg”.  Internet Explorer has renamed them “images[1].jpg” and “images[2].jpg”.

Digital Detective Internet Explorer Forensic Cache Analysis Folder Structure

Figure 8

Outlook Express Email Client

Microsoft Outlook Express also uses the Content.IE5 folder as a temporary cache.  When a user selects an email message within the client, Outlook Express reads the data from the corresponding DBX file and caches the components of the email so that it can be rendered on screen for the user as an email message. The structure of the DBX files is such that the email message is broken down into blocks and has to be rebuilt before it can be rendered.  If the message contains any attachments, they are stored in Base64 format and stored within the file.  The attachments also have to be extracted, decoded and cached prior to rendering on screen.As the message is rebuilt, Outlook Express saves the different elements of the message to the disk cache as a temporary file.  The naming convention is different to Internet Explorer.  In the past, some forensic examiners have not been aware of this and have incorrectly attributed data in the cache to a visit to a web page when it in fact was there as the result of viewing an email message. The file structure is shown in Figure 9.  The asterisk values represent hexadecimal digits.  Figure 9 also contains some example file names.

Digital Detective Forensic Analysis of Internet Explorer Cache Outlook Express Temporary File Naming Convention

Figure 9


Microsoft Internet Explorer maintains a number of INDEX.DAT files to record visits to web sites as well as to maintain cache and cookie data.  In this article, we will look at the Daily and Weekly files.

Daily INDEX.DAT Entries

The Daily INDEX.DAT file maintains a Daily record of visited pages.  This INDEX.DAT file has an unusual HOST record entry which helps the investigator analyse the pattern of visits to a particular web site.

The HOST record entry is used by Internet Explorer to display the hierarchical history structure when showing the user which web sites have been visited.  Each record contains a number of timestamps with the important data being stored in a FILETIME structure.  This timestamp structure contains a 64-bit value representing the number of 100-nanosecond intervals since 1st January 1601 (UTC).  The Digital Detective DCode utility can be used to convert these and other timestamp formats.

On the first daily visit to a particular web site, Internet Explorer creates a HOST entry in the INDEX.DAT record.  In effect, this entry represents the first visit to a particular HOST on specific day.  With further visits to the same web site, the HOST entry remains unchanged.  Examining the entries for the Daily INDEX.DAT will show when a web site was first and last visited during the period.  Figure 1 below shows an example of this when using the HOST filter view in NetAnalysis® v1 to look for visits to the Digital Detective web site.


Figure 1

Daily INDEX.DAT Timestamps

The Last Visited timestamp information is stored as two 64-bit FILETIMES located at offset 0x08 and 0x10 (Decimal 8,16).  They are stored as UTC and Local time values.  As there is no requirement to alter these timestamps, they are presented in an unaltered state in NetAnalysis® v1 as the “Last Visited [UTC]” and “Last Visited [Local]” columns.  Figure 2 and 3 summarise these timestamp values.


Figure 2


Figure 3

Establishing the Time Zone ActiveBias

As the URL records contain UTC and Local timestamps, it is possible to establish the Time Zone ActiveBias by establishing the time difference between both timestamps.  We discussed in a previous article on manually establish the system Time Zone settings.  The calculated ActiveBias information is represented in NetAnalysis® v1 by the ActiveBias column as shown in Figure 4.


Figure 4

NetAnalysis further uses this information to confirm the selected Time Zone is correct.  If the Time Zone ActiveBias is in conflict with the Time Zone setting in NetAnalysis®, the resulting timestamps may not be represented accurately.  The calculated ActiveBias is logged to the Audit Log as shown in Figure 5.


Figure 5

If NetAnalysis® detects that the Time Zone settings for the current forensic investigation are not correct, a warning dialogue will be shown immediately after the data has been imported.  Figure 4 shows the warning dialogue.


Figure 4

Examination of the ActiveBias column will show which entries are in conflict with the Time Zone Settings.

Weekly INDEX.DAT Entries

At the commencement of a new browsing week, the content from the Daily INDEX.DAT files is archived into a single Weekly INDEX.DAT file.  The actual timestamp information within the binary file changes for this file type when compared to the other files.

When the Weekly INDEX.DAT file is generated, the file created timestamp is saved at offset 0x10 of every URL record.  This is different to the other INDEX.DAT records as this location usually represents the Last Visited UTC Timestamp.  Many applications (including some software which claim to be for forensic purposes) get this wrong and misrepresent this timestamp as the “Last Visited Date”.

This timestamp is in FILETIME format and is saved as a UTC value.  This timestamp is presented within NetAnalysis in the “Date Index Created [UTC]” column.

The last visited timestamp is saved at offset 0x08 within the record as a LOCAL timestamp.  This is unusual, as FILETIME timestamps are normally saved as UTC values and the other INDEX.DAT files all contain a Last Visited timestamp with a UTC value.  With this timestamp, NetAnalysis takes the unaltered LOCAL time and saves it to the “Last Visited [Local]” column.  Unfortunately, the Last Visited UTC FILETIME value which was present in the Daily INDEX.DAT is not saved within the record and therefore has to be converted from a Local timestamp.

To calculate the UTC timestamp for the “Last Visited [UTC]” column, NetAnalysis takes the LOCAL timestamp at record offset 0x08 and converts it to UTC.  This conversion is calculated using the Time Zone value set in NetAnalysis prior to importing any data.  In doing so, dynamic daylight settings are also taken into account (as well as any year on year differences).

If a Weekly record is imported with the “No Time Zone Date/Time Adjustment” setting activated, NetAnalysis will show the LOCAL Last Visited timestamp but will not attempt to calculate the UTC timestamp.  In this case, the “Last Visited [UTC]” column will remain empty.  The “Last Visited [Local]” timestamp for Weekly entries is not changed or affected by NetAnalysis Time Zone settings.  It is left in an unaltered state.

Weekly INDEX.DAT Timestamps

The timestamp representation in NetAnalysis is shown in Figure 5 and 6 below.


Figure 5


Figure 6

Useful Links

Internet Explorer introduced a number of new security and privacy features in version 8.  One of these new features was InPrivate Filtering.  InPrivate Filtering helps prevent the web sites a user visits from automatically sending details about the visit to other content providers.


PrivacIE entries are not from InPrivate browsing sessions.

When a user visits a web site, it automatically shares standard computer information with that web site.  If the web site contains content provided by a third-party web site (for example a map, advertisement, or web measurement tools such as a web bug or scripts) some information about the user may be automatically sent to the content provider.

This type of arrangement can have several benefits. It lets the user conveniently access third-party content.  The presence of advertising on a web site may let the web site provide access to premium content at no charge.  However, there can be an impact on user privacy as a result because it is possible for the content providers to track the user across multiple web sites.

When InPrivate Filtering is used, some web sites might be prevented from automatically sharing details about the visit with the providers whose content is displayed.  As a result, some content might be automatically blocked (such as weather information or advertisements).  The user can manually allow blocked content by adjusting the InPrivate Filtering settings.

PrivacIE Entries

When performing a forensic examination of entries from Internet Explorer, you may come across numerous PrivacIE Type entries.  These entries are URL records for third-party content providers.

If the user has InPrivate filtering on, the PrivacIE entries relate to third party content.  They are not InPrivate browsing records.  InPrivate Filtering helps prevent website content providers from collecting information about sites the user has visited.

For example, the Digital Detective web sites use Google Analytics to track information about visitors to our sites.  Google tracks the data and provides webmasters with analytical tools to review the results.

If you switch InPrivate filtering on and visit the Digital Detective Website, you will find two PrivacIE  entries which get generated from the visit.*/__utm.gif*/ga.js

The first entry is a web bug which takes the Google Analytic information as a command line parameter.  The second entry is a Google Analytic script.

Removing InPrivate Filter Entries

The user can remove the (PrivacIE) entries relating to InPrivate Filtering in the same way as clearing the History or Cache.  In the Delete Browsing History dialogue, there is an entry for InPrivate Filtering as shown below.
Internet Explorer can also be configured not to record this information, even if the functionality is active.


Redirects provide a way to transport the browser from one point on a web site to another.  Redirects are most commonly used to translate references to outdated web pages to new, updated ones.  They can be instigated either by the browser, in which case they are referred to as client-side redirect, or by the server, in which case they are referred to as server-side redirects.

Virtually all webmasters at some point will discover that not all links to their site end up where they were intended.  For example, if the author of another web site places a link to your site, but misspells the hyperlink, the result will be a 404 error every time that link is selected.

Another common occurrence is site reorganisations.  As you move files around, rename them or even delete them entirely, you will find visitors will receive 404 errors.  This is because other sites, search engines and directories often link to those pages.  It is an impossible task to be able to change all the hyperlinks to your site.

If the webmaster determines that someone has linked to a non-existent page, there are ways of redirecting the user to the correct page.

Browser / Client Side Redirects

If you are a webmaster and you wish to recapture the traffic from the lost/moved/deleted pages, there are several things that can be done.  One of the easiest ways to add a redirect is through a special Meta tag called “refresh” to direct their visitors to the new page.

To do this they create a blank page which contains the tag as shown below (it is placed in the header). The tag simply says “wait some time then go to a named page”.  In this example, the browser will wait one second.


If you wish to use ASP, then the following code inside an ASP page will perform the same function.


Server Side Redirects

There are many ways to implement server-side redirects depending on the web server being used.  One of the most common is to use the .htaccess file, which is supported by the Apache web server.  .htaccess supports a directive called redirect.  This directive transparently changes the URL to a new URL.   However, this is not the usual method for implementing Server-Side redirects as not all hosts support this.

In IIS7, creating server side redirect is a simple task.  Log on to the Internet Information Services (IIS) Manager and select the file or folder you wish to create a redirect for.  Clicking on the HTTP Redirect button will launch the screen shown below.  This allows the webmaster to select the URL to be redirected to, and also which type of redirect is required.


This information is then saved to a web.config file in the folder where the redirect is to be launched from.

Redirect Evidence in Microsoft Internet Explorer

Embedded within the CACHE INDEX.DAT file are numerous Internet records that have REDR as the record header.  This header is a REDIRECT entry and is evidence of a SERVER-SIDE redirect.  Client Side redirects are NOT recorded within the INDEX.DAT files as REDR records.  The REDR entry is a URL that has been visited and the server has responded with an HTTP 300 response which tells the browser the page is in a different location.  This entry reflects the URL which caused the redirection.  In NetAnalysis, the entries are marked as Type: Redirect as shown below.


In previous versions of NetAnalysis, we were not able to show where the user was redirected to.  Following research and testing, we identified a methodology for resolving redirect entries.  However, it is only possible to show the resulting redirected URL if the data still exists within the INDEX.DAT record.  There is a marker to indicate whether the entry is live or deleted.  It is also only possible to show resulting redirect entries from live INDEX.DAT files.  It is not possible to do this with the data recovered by HstEx.

This is a significant development in the analysis of Internet browser artefacts as previously, the invetigator did not have this information.  At the time of writing this Article, NetAnalysis v1.50+ is the only software available to extract this important evidence.

In addition to identifying the URL where the user was redirected to, it is also possible to identify the date/time this action occurred.  This also was not previously possible.  The status column informs the examiner whether the redirect entry is intact or not.  If it is intact and from a live INDEX.DAT file, there will be date/time information.  As mentioned previously, this data is not available for overwritten or deleted redirect entries.


The image above shows NetAnalysis and two Server Side Redirect entries.  In this case, item number one is the original URL which caused the server to respond with a server side redirect HTPP response.  This is the standard URL record which is shown in the URL column.  In addition, as this is a REDR record which is intact, NetAnalysis has the Redirect URL (item number two) in the corresponding column.  The Type column reflects the fact it is a redirect entry, as does the IE Type column (Internet Explorer Record Type).

As this Redirect entry is intact, the Last Visited time stamp can be extracted.