Over the past few weeks, there has been worldwide interest in the trial of Casey Anthony, which was held in Orlando, Florida. Anthony was indicted on charges of murder following the discovery of the body of her daughter, Caylee Marie Anthony, in 2008. On Tuesday 5th July 2011, the jury returned a not guilty verdict and she was cleared of murdering her child.
Those of you who have followed this case and listened to the expert testimony may have been intrigued and possibly confused as to some of the alleged facts as the case unfolded.
The digital forensic evidence in this case is of particular interest to me as it involved the recovery and analysis of a Mozilla Firefox history database. The Internet history records within this database turned out to be extremely important to the prosecution case, as the existence of Google searches relating to “chloroform” and other possibly relevant records prior to the child’s disappearance could have indicated premeditation. Had she been found guilty, this could have meant the difference between a conviction for murder in the first degree and one for manslaughter. The State of Florida also has the death penalty as a punishment option for capital crimes.
During a keyword search of Anthony’s computer, a hit was found for the word “chloroform”. The hit was identified in what appeared to be a Mork database belonging to Mozilla Firefox. The file was identified as residing in unallocated clusters, and rather surprisingly, is reported to have been intact. Furthermore, all of the blocks belonging to the file were said to be contiguous.
The Mork database structure used by Mozilla Firefox v1-2 is unusual to say the least. It was originally developed by Netscape for their browser (Netscape v6) and the format was later adopted by Mozilla to be used in Firefox. It is a plain text format which is not easily human readable and is not efficient in its storage structures. For example, a single Unicode character can take many bytes to store. The developers themselves complained it was extremely difficult to parse correctly and from Firefox v3, it was replaced by MozStorage which is based on an SQLite database.
It is a matter of record that our software NetAnalysis (v1.37) was used during the initial examination of this data, and then at a later stage another tool was used. This is, of course, good forensic practice and is often referred to as “dual tool verification”.
Within a Mork database, the timestamp information relating to visits is stored as a microsecond count from an epoch of 1st January 1970 at 00:00:00 hours UTC (Coordinated Universal Time). In NetAnalysis v1.37, the forensic examiner had an option to leave the timestamps as they were recorded in the original evidence or to apply a bias to the UTC value to translate it to a local “Standard Time”. In this older version, there was no option to present the timestamp as a local value adjusted for DST (Daylight Saving Time). This changed in NetAnalysis v1.50 when a further date column was introduced which presented the examiner with UTC and local times adjusted for DST.
According to video footage of the trial testimony, the forensic examiner wanted the output to reflect local time and not standard time, and so tried another tool. This second tool was initially unable to recover any records from the Mork file. The forensic examiner then approached the developer during a training course and discussed the issues he was having with the software. The developer of the second tool then reviewed the Mork database over a period of a few nights and corrected the problem. That software then managed to recover 8,557 records (320 fewer than NetAnalysis was able to recover at the time).
Discrepancies between Forensic Tools
During testimony, the defence picked up on the fact that there were some major differences between the results produced by the two tools. The defence assertion was that the initial results produced by NetAnalysis were in fact correct, and that the results from the second tool were flawed. This was discussed at some length in the video testimony on 1st July 2011 when the forensic examiner was questioned regarding the differences.
According to CNN, Jose Baez, the lead counsel for the defence said:
“the state’s computer forensic evidence involving chloroform research, a central element of their premeditation argument, was used to mislead the jury and that the flaws in that evidence infected their entire case like a cancer.”
He pointed out the discrepancy between the first analysis the sheriff’s office did that showed one visit to a website about chloroform and an analysis done later with a second program that appeared to show 84 visits. However, according to Baez, the first report showed a progression that made it clear that the 84 visits were actually to MySpace.
This was a major discrepancy with critical digital evidence presented in an extremely serious trial. As the software developer of NetAnalysis, I was extremely anxious to review the raw data and confirm the facts.
The first time I was made aware of this case (and the discrepancy between the two tools) was around 9th June 2011. To date, I have not been asked by any party representing the prosecution (or defence) to comment on the discrepancies between the tools. I have, however, since the conclusion of the trial, obtained a copy of the recovered “History.dat” Mork database file.
Mork Database File
Using this data, I will walk through the deconstruction of the critical elements of the file and verify the evidence presented during the trial. The file is 3,338,603 bytes in length and contains data from a Mork database.
The block in Figure 1 shows the definition of the database table holding the history data. The definition identifies the fields in each row as: “URL”, “Referrer”, “LastVisitDate”, “FirstVisitDate”, “VisitCount”, “Name”, “Hostname”, “Hidden”, “Typed”, “LastPageVisited”, and “ByteOrder”. Not all of these fields will be present in every history record. Each field is allocated an integer value for identification purposes. For example, the “URL” field has been allocated the value 82.
According to the Mozilla Developer Network, the model is described as:
“The basic Mork content model is a table (or synonymously, a sparse matrix) composed of rows containing cells, where each cell is a member of exactly one column (col). Each cell is one attribute in a row. The name of the attribute is a literal designating the column, and the content of the attribute is the value. The content value of a cell is either a literal (lit) or a reference (ref). Each ref points to a lit or row or table, so a cell can “contain” another shared object by reference.”
Deconstructing the Mork Database
To demonstrate how this works, and to validate the data, we will walk through a couple of examples. As we have no access to the SYSTEM registry hive from the suspect system, we must assume the computer was correctly set to Eastern Time in 2008 during these visits (time zone verification is always one of the first tasks for the forensic examiner prior to examining any time related evidence).
Figure 2 shows a screen shot of NetAnalysis with the data loaded and filtered showing some of the records identified in the testimony from the trial.
The first record (at the bottom of the screen) shows a visit to MySpace on 2008-03-21 15:16:13 (local time). The visit count shows the value as 84. The Mork record for this entry is shown in Figure 3.
The record is enclosed within square brackets and the individual fields for the record are enclosed within round brackets. The data stored within the brackets contain name/value pairs. Moving from left to right, the first block of data “-6E2F” identifies the Mork record ID (record ID values are not unique). The first name/value pair shows (^82^B1). If you refer back to the Mork header in Figure 1, we can see that field 82 refers to the “URL” (Uniform Resource Locator). The data for this field is stored in cell B1. The data cell is enclosed in brackets as shown in Figure 4 (line 47). The cell data shows (B1=http://www.myspace.com/).
Using the same methodology, we can see that field 84 refers to “LastVisitDate” and is stored in cell 27F42 as shown in Figure 5 (2008-03-21 19:16:13 UTC / 2008-03-21 15:16:13 Local Time). This integer represents the number of microseconds from 1st January 1970, 00:00:00 UTC.
Field 85 refers to “FirstVisitDate” and is stored in cell BAF8 as shown in Figure 6 (2007-12-26 20:25:56 UTC / 2007-12-26 15:25:56 Local Time).
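The timestamp conversion can be sketched in a few lines of Python. This is only an illustration and not how NetAnalysis or any other tool implements it. The function names are my own, the DST rule is the US rule in force since 2007, and the integer used is a whole-second value reconstructed from the time quoted for “LastVisitDate”, not the raw content of the cell. It shows why a fixed “Standard Time” bias and a DST-adjusted local time differ by one hour for a visit in late March.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
EST = timezone(timedelta(hours=-5), "EST")  # Eastern Standard Time
EDT = timezone(timedelta(hours=-4), "EDT")  # Eastern Daylight Time


def from_mork_timestamp(microseconds: int) -> datetime:
    """Convert a microsecond count since 1970-01-01 00:00:00 UTC to a datetime."""
    return EPOCH + timedelta(microseconds=microseconds)


def nth_sunday(year: int, month: int, n: int) -> datetime:
    """Midnight UTC on the n-th Sunday of the given month."""
    first = datetime(year, month, 1, tzinfo=timezone.utc)
    days_to_sunday = (6 - first.weekday()) % 7  # weekday(): Monday=0 ... Sunday=6
    return first + timedelta(days=days_to_sunday + 7 * (n - 1))


def eastern_dst_active(utc_dt: datetime) -> bool:
    """US rule since 2007: second Sunday in March to first Sunday in November."""
    start = nth_sunday(utc_dt.year, 3, 2) + timedelta(hours=7)  # 02:00 EST in UTC
    end = nth_sunday(utc_dt.year, 11, 1) + timedelta(hours=6)   # 02:00 EDT in UTC
    return start <= utc_dt < end


def to_eastern(utc_dt: datetime, adjust_for_dst: bool = True) -> datetime:
    """Apply either a fixed Standard Time bias or a DST-aware offset."""
    zone = EDT if adjust_for_dst and eastern_dst_active(utc_dt) else EST
    return utc_dt.astimezone(zone)


# Whole-second value reconstructed from the quoted time (an assumption,
# not the raw cell content from the evidence file).
last_visit = from_mork_timestamp(1_206_126_973_000_000)
print(last_visit)                                    # UTC
print(to_eastern(last_visit, adjust_for_dst=False))  # Standard Time bias only
print(to_eastern(last_visit))                        # local time adjusted for DST
```

With the Standard Time bias alone the visit appears one hour earlier than the DST-adjusted local time, which is the difference the examiner wanted to remove from the report.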
Field 88 refers to “Hostname” and is stored in cell 16F as shown in Figure 7.
Field 87 refers to “Name” and is stored in cell DA as shown in Figure 8.
Further examination of the Index in Figure 3 shows field 86. This refers to the “VisitCount” and has been assigned the value 84. This data is actually stored in the Index record and not a separate cell. If an Index record does not have a field 86, then the “VisitCount” is 1. Once the visit count is 2 or above, field 86 is assigned a value. The last field 8A refers to the “Typed” flag and has been assigned the value 1. This is a Boolean field 0 = False and 1 = True.
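The steps above can be sketched as a small Python parser. This is a simplified illustration under stated assumptions, not the NetAnalysis parser: the row text is reconstructed from the description above rather than copied from Figure 3, the names `parse_row`, `COLUMNS` and `cells` are my own, the visit count is assumed to be written in decimal, and escaped characters inside values are ignored. A field written as `^column^cell` is treated as a reference to a data cell, and one written as `^column=value` as a value held in the Index record itself.

```python
import re

# Column IDs named in the walk-through (hexadecimal, as written in the file).
COLUMNS = {
    "82": "URL", "83": "Referrer", "84": "LastVisitDate", "85": "FirstVisitDate",
    "86": "VisitCount", "87": "Name", "88": "Hostname", "8A": "Typed",
}

ROW_RE = re.compile(r"\[\s*(-?[0-9A-Fa-f]+)((?:\s*\([^)]*\))*)\s*\]")
FIELD_RE = re.compile(r"\(\^([0-9A-Fa-f]+)([\^=])([^)]*)\)")


def parse_row(text, cells):
    """Parse one Index record and resolve its cell references."""
    match = ROW_RE.search(text)
    if match is None:
        raise ValueError("no row record found")
    record = {"RowID": match.group(1)}
    for column, kind, value in FIELD_RE.findall(match.group(2)):
        name = COLUMNS.get(column.upper(), column)
        # '^' marks a reference to a data cell; '=' marks an inline value.
        record[name] = cells.get(value) if kind == "^" else value
    # A record with no field 86 has been visited exactly once.
    record["VisitCount"] = int(record.get("VisitCount", 1))
    # Typed is a Boolean flag: 1 = True, otherwise False.
    record["Typed"] = record.get("Typed") == "1"
    return record


# Hypothetical reconstruction of the MySpace record; only cell B1 is filled in.
row = "[-6E2F(^82^B1)(^84^27F42)(^85^BAF8)(^88^16F)(^87^DA)(^86=84)(^8A=1)]"
cells = {"B1": "http://www.myspace.com/"}
record = parse_row(row, cells)
print(record["URL"], record["VisitCount"], record["Typed"])
```

References to cells that are not in the dictionary resolve to `None` here, which makes missing data easy to spot rather than silently dropping the record.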
The data from this record has been gathered together in Figure 9. The Name field relates to the Page Title and is stored in pseudo Unicode format with $00 representing 0x00 values.
According to the testimony during the trial, this record was not recovered by the second tool.
Visit Count Discrepancy
At various times during the trial, the prosecution referred to a visit to a page (“http://www.sci-spot.com/Chemistry/chloroform.htm”) which allegedly took place at 15:16:13 hours (local time) on 21st March 2008. This record was recovered by the second forensic tool and indicated a visit count of 84. This visit was as a result of a Google search for “how to make chloroform”.
This evidence contradicts the data recovered by NetAnalysis which showed a single visit at 19:16:34 hours UTC (15:16:34 hours local time). Figure 9 shows a visit to MySpace, which has been verified manually above, and shows 84 visits as of 21st March 2008 at 15:16:13 hours (local time). This is the record highlighted in NetAnalysis in Figure 2.
The Mork record containing “http://www.sci-spot.com/Chemistry/chloroform.htm” is identified as record 174EF. The Index record from the original file is highlighted and shown in Figure 10 below.
The entire record is contained within square brackets. The highlighted line above shows the full record. The first field 82 (“URL”) is stored in cell 27F4B, as shown in Figure 11.
The second field 84 (“LastVisitDate”) is stored in cell 27F4C, as shown in Figure 12 (2008-03-21 19:16:34 UTC / 2008-03-21 15:16:34 Local Time). Once again, this integer represents the number of microseconds from 1st January 1970, 00:00:00 UTC.
The third field 85 (“FirstVisitDate”) is stored in cell 27F4C. This is the same cell value as for (“LastVisitDate”) and indicates this is the first visit to this web site during the scope of the current recorded history. The First and Last visit times are the same.
The fourth field 83 (“Referrer”) is stored in cell 27F49, as shown in Figure 13.
The referrer field is very interesting from a forensic point of view as it shows the referring page. As the HTTP GET is sent to the web server for a page, the browser also sends the referring page as part of the request. This allows web masters to log the route by which visitors land on their pages. Mozilla Firefox records this information for each record. It is therefore relatively easy to track the actions of a user from page to page. In this case, the referring site was a Google search for “how to make chloroform”. With this information (which NetAnalysis shows in the “Referral URL” Column) there really is no need to “guess” how a user arrived at a specific page.
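As a rough illustration of how search terms can be read straight from a referrer, the following Python sketch pulls the `q` parameter out of a Google search URL. The URL shown is hypothetical (the actual referrer string is in Figure 13 and is not reproduced here), and `search_terms` is a name of my own choosing.

```python
from urllib.parse import parse_qs, urlsplit


def search_terms(referrer):
    """Return the Google search phrase from a referrer URL, or None."""
    parts = urlsplit(referrer)
    if "google." not in parts.netloc:
        return None
    terms = parse_qs(parts.query).get("q")
    return terms[0] if terms else None


# Hypothetical referrer of the general shape a Google search produces.
example = "http://www.google.com/search?hl=en&q=how+to+make+chloroform"
print(search_terms(example))
```

The standard library handles the URL decoding, so a `+` in the query string comes back as a space.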
The fifth field 88 (“Hostname”) is stored in cell 27F4D, as shown in Figure 14.
The last field 87 (“Name”) is stored in cell 27F4E, as shown in Figure 15. The decoded value for this string is “New Page 1”.
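The “Name” decoding can be sketched as follows. This is a minimal example, assuming the title is 16-bit text in little-endian byte order with each non-printable byte written as `$` followed by two hexadecimal digits; the encoded string below is my own reconstruction of how “New Page 1” would look in that form, not a copy of Figure 15, and `decode_mork_name` is a hypothetical helper.

```python
import re

ESCAPE_RE = re.compile(rb"\$([0-9A-Fa-f]{2})")


def decode_mork_name(raw, byte_order="LE"):
    """Replace $XX escapes with raw bytes, then decode as 16-bit text."""
    data = ESCAPE_RE.sub(
        lambda m: bytes([int(m.group(1), 16)]), raw.encode("latin-1")
    )
    codec = "utf-16-le" if byte_order == "LE" else "utf-16-be"
    return data.decode(codec)


# Reconstructed encoding of the page title (an assumption for illustration).
encoded = "N$00e$00w$00 $00P$00a$00g$00e$00 $001$00"
print(decode_mork_name(encoded))
```

In a real file the byte order should be taken from the “ByteOrder” field rather than assumed.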
Once again, I have gathered together the data for this record and presented it in a table format for easy review. This can be seen in Figure 16.
There are two critical points to make with this record. Firstly, there is no field 86 (“VisitCount”) therefore this URL has only been visited once (not 84 times). This is further corroborated by the fact that field 85 (“FirstVisitDate”) shows the exact same date/time as the “LastVisitDate”. The second point is that the visit was recorded at 15:16:34 hours (local time) and NOT at 15:16:13 hours as was stated during the trial (from the report produced by the second forensic tool).
Validity of the Recovered File
With the release of NetAnalysis v1.50 (current version v1.52), the Mork database parser was completely re-written from scratch (as were the other parsing modules). This was primarily to make the code easier to migrate and maintain and to ensure we were recovering as much data as possible. I tested the current release of NetAnalysis v1.52 against the Casey Anthony data. I know from manually examining the data, there are 9,075 individual Index records. Loading the data into NetAnalysis resulted in 9,060 records being recovered. This initially caused me some concern. However, further examination of the data revealed that there was nothing to be concerned about. There were 15 records which had missing “URL” cells; 14 of these records also had missing “LastVisitDate” cells.
If there are missing data cells within the file, this is a strong indicator that the file is not intact.
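A check of this kind is simple to express. The sketch below is illustrative only: it assumes the Index records and data cells have already been parsed into dictionaries, and the names `find_incomplete_rows`, `rows` and `cells` (and the sample IDs) are hypothetical. It reports every record whose “URL” or “LastVisitDate” reference points at a cell that does not exist in the file.

```python
def find_incomplete_rows(rows, cells, required=("URL", "LastVisitDate")):
    """Map each row ID to the required fields whose cells are missing.

    rows:  {row_id: {field_name: cell_id}}
    cells: {cell_id: value}
    """
    incomplete = {}
    for row_id, refs in rows.items():
        missing = [field for field in required if refs.get(field) not in cells]
        if missing:
            incomplete[row_id] = missing
    return incomplete


# Toy data with made-up IDs: row "2" points at a URL cell that is absent.
rows = {
    "1": {"URL": "A1", "LastVisitDate": "A2"},
    "2": {"URL": "B1", "LastVisitDate": "A2"},
    "3": {"URL": "C1", "LastVisitDate": "C2"},
}
cells = {"A1": "http://www.example.com/", "A2": "1206126973000000"}
print(find_incomplete_rows(rows, cells))
```

Counting the records returned by such a check against the total number of Index records is how the figures above reconcile: 9,075 records less 15 incomplete ones leaves 9,060.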
There are a number of conclusions to be drawn from the digital evidence presented in this trial; however, I will leave this to the members of the digital forensic community. Forensic tool validation is certainly at the forefront of our thoughts. Whilst it may not be possible to validate a tool, it is possible to validate the results against known data sets. If two forensic tools produce completely different results, this should at least warrant further investigation.