Items related to Microsoft Internet Explorer Browser

Introduction

A frequent question when dealing with browser forensics is ‘Does the Hit Count value mean that the user visited site ‘x’, on ‘y’ occasions?’ Most browsers record a ‘Hit Count’ value in one or more of the files they use to track browser activity, and it is important that an analyst understands any potential pitfalls associated with the accuracy, or otherwise, of this value.

We recently received a support request from an analyst who was analysing Internet Explorer data. They had found a record relating to a Bing Images search, which showed a hit count of 911. The particular search string was significant, and very damning had it actually been used 911 times. The analyst wanted to know if the hit count value could be relied upon.

The following experiment was carried out in order to establish how this surprisingly high hit count value could have been generated. In order to obtain a data set which contained as little extraneous data as possible, a brand new VMWare virtual machine was created. The machine was setup from the Microsoft Windows XP SP3 installation disc, which installed Internet Explorer v 6.0.2900.5512.xpsp.080413-2111 by default. Two user accounts were created on the machine – one to be used as an Admin account, for installing software etc; and the other to be used as the ‘browsing’ account. This separation of the accounts further assisted with minimising the possibility of any unwanted data being present within the ‘browsing’ account. Using the Admin account, the version of Internet Explorer in use on the virtual machine was upgraded to IE v 8.0.6001.18702. The ‘browsing’ account was then used for the first time. Starting Internet Explorer immediately directed the user to the MSN homepage. The address ‘www.bing.com’ was typed into the address bar, which led to the Bing search engine homepage. The ‘Images’ tab was clicked. This Auto Suggested a search criterion of ‘Beautiful Britain’, as can be seen in the figure below:

 

IE_2520Bing_2520Images_2520search_2520-_2520aston_2520martin_25202_thumb

Figure 1

The term ‘aston martin’ was then typed into the search box, as shown below:

 

Figure 2

None of the images were clicked or zoomed, nor was the result screen scrolled. Internet Explorer was closed, and the browsing account logged off. The Admin account was used to extract the browser data for processing in NetAnalysis. The below image shows some of the results. Both of these entries are from Master History INDEX.DAT files:

 

Figure 3

 

As can be seen, both entries show a hit count of 5. Both of these pages were visited only once, so it is immediately apparent that the hit count value maintained by Internet Explorer may not be an accurate count of how many times a particular page has been visited. However, this still did not explain how Internet Explorer had produced a hit count of 911.

The virtual machine was started again, and the browsing account logged on. The previous steps were repeated; typing ‘www.bing.com’ into the URL bar; visiting the Bing homepage; and clicking on the ‘Images’ tab. Once again, Bing Auto Suggested the search criterion of ‘Beautiful Britain’, and displayed the same thumbnail results page. The search criterion ‘aston martin’ was again typed into the search box and the same thumbnail results page was produced. None of the images were clicked or zoomed. The results page was scrolled using the side scroll bar, which generated more thumbnails as it went. Internet Explorer was closed, and the browsing account logged off. The Admin account was used to extract the browser data for processing in NetAnalysis. The below image shows some of the results. Both of these entries are again from Master History INDEX.DAT files:

 

Figure_204_20-_20NetAnalysis_20showing_20511_20hit_20count_thumb

Figure 4

As can be seen, the ‘Beautiful Britain’ search now has a hit count of 13 – it is not at all clear how Internet Explorer determined this figure. Moreover, the ‘aston martin’ search now shows a hit count of 511. This page was not visited 511 times, nor were 511 of the thumbnail images clicked. The contents of the INDEX.DAT for the local cache folders (Content.IE5) were checked to see how many records were held relating to thumbnails that had been cached. The results were as follows:

 

Figure 5

So it does not even appear that there are 511 thumbnails held in the local cache. The result page was scrolled quickly, so the user did not see a large proportion of the thumbnail images.

In conclusion, it is apparent that the ‘Hit Count’ maintained by Internet Explorer cannot be relied upon. Although this experiment involved a quite specific process relating solely to image searches carried out on one particular search engine, the disparity between results and reality makes it clear that unquestioning acceptance of what Internet Explorer is recording as a ‘Hit Count’ could lead to significant errors if presented in evidence.

To complete the experiment, two further identical Virtual Machines were created. On one, the Google Chrome browser (v 15.0.874.106 m) was installed and used. On the other, the Mozilla Firefox browser (v 8.0) was installed and used. The same steps were repeated: typing ‘www.bing.com’ into the URL bar; visiting the Bing homepage; and clicking on the ‘Images’ tab. The results from these processes are shown below:

Chrome:

Figure_206_20-_20NetAnalysis_20with_20Google_20Chrome_20Search_thumb

Figure 6

 

Firefox:

Figure_207_20-_20NetANalysis_20with_20Mozilla_20Firefox_20Search_thumb

Figure 7

It is apparent that both of these browsers seem to maintain a more accurate ‘Hit Count’.

Internet Explorer Data

As forensic examiners will be aware, Microsoft Internet Explorer stores cached data within randomly assigned folders. This behaviour was designed to prevent Internet data being stored in predictable locations on the local system in order to foil a number of attack types. Prior to the release of Internet Explorer v9.0.2, cookies were an exception to this behaviour and their location was insufficiently random in many cases.

Cookie Files

Generally, for Vista and Windows 7, cookie files are stored in the location shown below:

\AppData\Roaming\Microsoft\Windows\Cookies

The cookie filename format was the user’s login name, the @ symbol and then a partial hostname for the domain of the cookie.

Cookie Files with Standard Name

With sufficient information about a user’s environment, an attacker might have been able to establish the location of any given cookie and use this information in an attack.

Random Cookie Filenames

To mitigate the threat, Internet Explorer 9.0.2 now names the cookie files using a randomly-generated alphanumeric string. Older cookies are not renamed during the upgrade, but are instead renamed as soon as any update to the cookie data occurs. The image below shows an updated cookie folder containing the new files.

Random Cookie Names

This change will have no impact on dealing with the examination of cookie data; however. it will no longer be possible to identify which domain a cookie belongs to from just the file name.

NetAnalysis showing Random Cookie Names

NetAnalysis showing Random Cookie Names

Introduction to Userdata

Internet Explorer 8+ user data persistence is a function which allows online forms to save a small file to the system with information about values entered in a particular form.  This allows the user to retrieve a half filled web based form when they revisit.

Persistence creates new opportunities for website authors.  Information that persists beyond a single page without support from the server, or within the finite scope of cookies, can increase the speed of navigation and content authoring.

The folder structure where the data is actually stored is very much like the standard Internet Explorer cache folder structure. Inside the cache folder you will find the files containing data attributed to the associated website.

{user}\AppData\Roaming\Microsoft\Internet Explorer\UserData\

To demonstrate how this works, we have created a page which allows you to save a string to the local drive.

Once you have saved some string data using the above page, open the UserData INDEX.DAT file in NetAnalysis and review the entries.  Selecting F8 will bring up the search/filter dialogue.  Change the field name to ‘Type’ and enter ‘userdata’ in the filter text box.  When this is executed, you should find an entry as shown below:

If you navigate to the corresponding folder, you will find an XML file which contains the string you entered into the website.  This is shown below.

Introduction

The Internet Explorer disk cache is a storage folder for temporary Internet files that are written to the hard disk when a user views page from the Internet.Internet Explorer uses a persistent cache and therefore has to download all of the content of a page (such as graphics, sound files or video) before it can be rendered and displayed to the user.Even when the cache is set to zero percent, Internet Explorer requires a persistent cache for the current session. The persistent cache requires 4 MB or 1 percent of the logical drive size, whichever is greater.

Disk Cache Storage Location

The disk cache location varies across operating systems and can usually be found in the following default locations:

Digital Detective NetAnalysis Cache Location Windows 98

Figure 1

Digital Detective NetAnalysis Cache Location Windows 2K and XP

Figure 2

Digital Detective NetAnalysis Cache Location Windows Vista and 7

Figure 3

To identify the correct location of the cache for each user, the registry hive for the particular user must be examined. You cannot rely upon the live cache folder being in the default location. Figure 4 shows the registry key containing the cache folder location. Users have been known to move their cache location in an attempt to hide browsing activity.

Digital Detective NetAnalysis Cache Examination Shell Folder Location

Figure 4

The “User Shell Folders” subkey stores the paths to Windows Explorer folders for each user of the computer. The entries in this subkey can appear in both the “Shell Folders” subkey and the “User Shell Folders” and in both HKEY_LOCAL_MACHINE and HKEY_CURRENT_USER. The entries that appear in user “User Shell Folders” take precedence over those in “Shell Folders”. The entries that appear in HKEY_CURRENT_USER take precedence over those in HKEY_LOCAL_MACHINE.

Temporary Internet Files Folder Structure

Within the Temporary Internet Files folder, you will find a Content.IE5 folder.  This folder is the main disk cache for Internet Explorer.  Outlook Express also writes data to this location as is explained later in this article.Inside the Content.IE5 folder, you will find a minimum of 4 cache folders.  The file names for these cache folders comprise of 8 random characters.  When further cache space is required, Internet Explorer will add additional folders in multiples of 4.  Figure 5 shows the layout of a typical Internet Explorer disk cache.

Digital Detective Windows Forensic Analysis Microsoft Internet Explorer Cache Folders

Figure 5

Content.IE5 INDEX.DAT File

The cache INDEX.DAT file is a database of cache entries.  It holds information relating to individual cached items so that the browser can check whether the resource needs to be updated (eTag) and information relating to the location of the cached item.  It also stores the HTTP (Hypertext Transfer Protocol) response header for the resource.At the start of the INDEX.DAT file, you will find a zero based array holding the names of the cache folders.  Each individual URL record contains an index value which refers to a specific folder in the array.  This allows Internet Explorer to correctly identify the location of a cached item.  This can be seen in Figure 6 below.

Digital Detective NetAnalysis INDEX.DAT Forensic Analysis

Figure 6

As files are saved to the cache, Internet Explorer uses a specific naming convention as shown in Figure 7.  The only exception to this rule is when data is written to this location by Outlook Express.

Digital Detective Forensic Analysis of Internet Explorer Cache Naming Convention

Figure 7

As pages and resources are cached, they can easily end up in different folders. Internet Explorer attempts to keep the volume of data and number of cached items across each cache folder as level as possible. As Internet Explorer writes files to the cache folders, it checks to see if a file with the same name already exists.  This is frequently the case when web developers do not use imaginative or descriptive names for their files.  If the file already exists within the folder, Internet Explorer will increment the counter.  If no file exists, the counter portion of the file name is set to 1. Files with the same naming structure within a cache folder do not necessarily belong to the same web site.  Also, multiple visits to the same web site can easily result in files with similar naming conventions being spread across all the cache folders.  Cached resources can also become orphaned when the INDEX.DAT entry is not written to disk (such as when Internet Explorer crashes before the entries are written).  Figure 8 shows a typical cache folder.  There are two files within this folder which would have had the original name “images.jpg”.  Internet Explorer has renamed them “images[1].jpg” and “images[2].jpg”.

Digital Detective Internet Explorer Forensic Cache Analysis Folder Structure

Figure 8

Outlook Express Email Client

Microsoft Outlook Express also uses the Content.IE5 folder as a temporary cache.  When a user selects an email message within the client, Outlook Express reads the data from the corresponding DBX file and caches the components of the email so that it can be rendered on screen for the user as an email message. The structure of the DBX files is such that the email message is broken down into blocks and has to be rebuilt before it can be rendered.  If the message contains any attachments, they are stored in Base64 format and stored within the file.  The attachments also have to be extracted, decoded and cached prior to rendering on screen.As the message is rebuilt, Outlook Express saves the different elements of the message to the disk cache as a temporary file.  The naming convention is different to Internet Explorer.  In the past, some forensic examiners have not been aware of this and have incorrectly attributed data in the cache to a visit to a web page when it in fact was there as the result of viewing an email message. The file structure is shown in Figure 9.  The asterisk values represent hexadecimal digits.  Figure 9 also contains some example file names.

Digital Detective Forensic Analysis of Internet Explorer Cache Outlook Express Temporary File Naming Convention

Figure 9

Introduction

Microsoft Internet Explorer maintains a number of INDEX.DAT files to record visits to web sites as well as to maintain cache and cookie data.  In this article, we will look at the Daily and Weekly files.

Daily INDEX.DAT Entries

The Daily INDEX.DAT file maintains a Daily record of visited pages.  This INDEX.DAT file has an unusual HOST record entry which helps the investigator analyse the pattern of visits to a particular web site.

The HOST record entry is used by Internet Explorer to display the hierarchical history structure when showing the user which web sites have been visited.  Each record contains a number of timestamps with the important data being stored in a FILETIME structure.  This timestamp structure contains a 64-bit value representing the number of 100-nanosecond intervals since 1st January 1601 (UTC).  The Digital Detective DCode utility can be used to convert these and other timestamp formats.

On the first daily visit to a particular web site, Internet Explorer creates a HOST entry in the INDEX.DAT record.  In effect, this entry represents the first visit to a particular HOST on specific day.  With further visits to the same web site, the HOST entry remains unchanged.  Examining the entries for the Daily INDEX.DAT will show when a web site was first and last visited during the period.  Figure 1 below shows an example of this when using the HOST filter view in NetAnalysis® v1 to look for visits to the Digital Detective web site.

NetAnalysis_Daily_Index.dat_Entries

Figure 1

Daily INDEX.DAT Timestamps

The Last Visited timestamp information is stored as two 64-bit FILETIMES located at offset 0x08 and 0x10 (Decimal 8,16).  They are stored as UTC and Local time values.  As there is no requirement to alter these timestamps, they are presented in an unaltered state in NetAnalysis® v1 as the “Last Visited [UTC]” and “Last Visited [Local]” columns.  Figure 2 and 3 summarise these timestamp values.

Digital_Detective_NetAnalysis_Daily_Timestamp_1

Figure 2

Digital_Detective_NetAnalysis_Daily_Timestamp_2

Figure 3

Establishing the Time Zone ActiveBias

As the URL records contain UTC and Local timestamps, it is possible to establish the Time Zone ActiveBias by establishing the time difference between both timestamps.  We discussed in a previous article on manually establish the system Time Zone settings.  The calculated ActiveBias information is represented in NetAnalysis® v1 by the ActiveBias column as shown in Figure 4.

Digital_Detective_NetAnalysis_ActiveBias_Column

Figure 4

NetAnalysis further uses this information to confirm the selected Time Zone is correct.  If the Time Zone ActiveBias is in conflict with the Time Zone setting in NetAnalysis®, the resulting timestamps may not be represented accurately.  The calculated ActiveBias is logged to the Audit Log as shown in Figure 5.

Digital_Detective_NetAnalysis_Audit_Log

Figure 5

If NetAnalysis® detects that the Time Zone settings for the current forensic investigation are not correct, a warning dialogue will be shown immediately after the data has been imported.  Figure 4 shows the warning dialogue.

Digital_Detective_NetAnalysis_Time_Zone_Warning

Figure 4

Examination of the ActiveBias column will show which entries are in conflict with the Time Zone Settings.

Weekly INDEX.DAT Entries

At the commencement of a new browsing week, the content from the Daily INDEX.DAT files is archived into a single Weekly INDEX.DAT file.  The actual timestamp information within the binary file changes for this file type when compared to the other files.

When the Weekly INDEX.DAT file is generated, the file created timestamp is saved at offset 0x10 of every URL record.  This is different to the other INDEX.DAT records as this location usually represents the Last Visited UTC Timestamp.  Many applications (including some software which claim to be for forensic purposes) get this wrong and misrepresent this timestamp as the “Last Visited Date”.

This timestamp is in FILETIME format and is saved as a UTC value.  This timestamp is presented within NetAnalysis in the “Date Index Created [UTC]” column.

The last visited timestamp is saved at offset 0x08 within the record as a LOCAL timestamp.  This is unusual, as FILETIME timestamps are normally saved as UTC values and the other INDEX.DAT files all contain a Last Visited timestamp with a UTC value.  With this timestamp, NetAnalysis takes the unaltered LOCAL time and saves it to the “Last Visited [Local]” column.  Unfortunately, the Last Visited UTC FILETIME value which was present in the Daily INDEX.DAT is not saved within the record and therefore has to be converted from a Local timestamp.

To calculate the UTC timestamp for the “Last Visited [UTC]” column, NetAnalysis takes the LOCAL timestamp at record offset 0x08 and converts it to UTC.  This conversion is calculated using the Time Zone value set in NetAnalysis prior to importing any data.  In doing so, dynamic daylight settings are also taken into account (as well as any year on year differences).

If a Weekly record is imported with the “No Time Zone Date/Time Adjustment” setting activated, NetAnalysis will show the LOCAL Last Visited timestamp but will not attempt to calculate the UTC timestamp.  In this case, the “Last Visited [UTC]” column will remain empty.  The “Last Visited [Local]” timestamp for Weekly entries is not changed or affected by NetAnalysis Time Zone settings.  It is left in an unaltered state.

Weekly INDEX.DAT Timestamps

The timestamp representation in NetAnalysis is shown in Figure 5 and 6 below.

Digital_Detective_NetAnalysis_Weekly_Timestamp_1

Figure 5

Digital_Detective_NetAnalysis_Weekly_Timestamp_2

Figure 6

Useful Links