9

How does forensic software detect deleted files?

When a file is deleted on an NTFS volume, its pointer in the MFT is removed and the file is no longer accessible from the OS. If the disk is fragmented, how can software like Autopsy or Recuva find where each fragment of a single file is located, and how can it put the fragments in the correct order to "reproduce" the file as it was before it was deleted?

I've noticed that when you delete something with CCleaner, the first thing it says is "Wipe MFT free space". My guess is that the MFT works like a linked list: when you delete a pointer, the nodes remain in memory but are no longer accessible, and this is how forensic software detects deleted files.

user46850
  • See http://security.stackexchange.com/q/8965/5336 as well – Canadian Luke May 22 '14 at 22:07
  • Keep in mind that we also have journaling file systems, so data (and in particular link lists, etc) is often written to the disk in a way that duplicates a lot of information, including pointer nodes. Even if you delete the file, if it has been modified many times it's quite possible that an older version and its linked list exists elsewhere on the disk. – Adam Davis May 22 '14 at 23:39

4 Answers

11

There are many ways it can be done. The easiest, in large part, is to follow the link pointers to each of the chunks, but that is by no means the only way. (In many file systems, the MFT isn't the only source of those links either.)

At a lower level, a tool can identify all the chunks and try to match them up by content, provided the files have an internal structure that lets one chunk be matched to another. If the pointers are removed, that won't work for all files, since some don't have much of a pattern to them. It will work with enough of them that it's still a major concern, though, especially since even a large file probably isn't stored in more than a few dozen large pieces unless your drive is highly fragmented.

Basically, there are a ton of different ways you can try piecing stuff together based on either the physical structure of the drive (contiguous blocks are generally preferred if available), file system features (such as forward and reverse block links) or file structure features, which vary from file to file.

Short of total destruction, there are any number of possible ways to recover the file. In some cases, simply removing the pointers may be enough, but a truly determined analysis can likely still put the jigsaw together by looking for fragments that make sense together, particularly if they are looking for something in particular.

AJ Henderson
  • So my guess that the MFT works like a list is correct? – user46850 May 22 '14 at 13:42
  • I forget if it is the MFT or actually at the end of each cluster, but yes, there is a linked list that ties the clusters used in a file together in many file systems. The exact mechanisms vary by file system. – AJ Henderson May 22 '14 at 13:46
  • Thanks for your answer. So on a fragmented disk, simply overwriting the entries in the MFT should be enough for large files, instead of overwriting the whole file. Also, I would be really happy if you have any links that explain more ways of recovery. – user46850 May 22 '14 at 13:48
  • @user46850 - not necessarily. Some file systems also put forward or forward and reverse pointers at the beginning of each block of clusters used. It depends on the file system. It also depends on the file format and if there is anything that will help piece clusters together even if they aren't linked to each other. – AJ Henderson May 22 '14 at 13:55
  • There is also generally a backup MFT somewhere. Regardless, you can still piece together information so the **only** way to securely delete a file is to change every hex value. – Matthew Peters May 22 '14 at 14:07
  • @MatthewPeters - right, the only way to be sure is to actually destroy the data. It might be enough, but there may be something in the file format (or it may have simply used a contiguous block) that may let it be recovered still. Getting rid of the pointers *MAY* prevent recovery, but it just as likely (if not more likely) may not. – AJ Henderson May 22 '14 at 14:12
  • @Aj, yeah for small files there may not even be a pointer! Also, any kid with just an hour or two of training can recover most deleted files so IMHO unless you do actually rewrite the data, whatever else you do to *delete* a file just provides a false sense of security. – Matthew Peters May 22 '14 at 14:19
  • The decent tools don't care whether you have an MFT or not. If it exists, recovery is faster and potentially more accurate, but without it you can still recover everything that is not overwritten. – Rory Alsop May 22 '14 at 15:14
  • To "ensure" destruction, use a file wiper, so there's nothing left to find (it scrambles all the bits of the file with random bits repeatedly). NTFS doesn't really "delete" a file, it simply sets a "deleted" flag, then marks the file's physical space as "free." Eventually, when another file needs free space, the bits of the file you had would eventually be overwritten. Recuva, etc shows these files as "poor" quality, meaning some data would be corrupt (part of another file). Until then, the pointer to the file is still valid. – phyrfox Jun 17 '14 at 05:01
  • @MatthewPeters, since we are talking about the MFT I guess we are only talking about NTFS, as it is its specific terminology. In NTFS, the so-called "backup MFT" (i.e. the `$MFTMirr` file) only contains a copy of the first 4 entries of the real `$MFT` therefore it does not contain any relevant copy of interesting file records. If the primary records have been deleted, they could be found in the journal but not in the "backup MFT". – Andrea Lazzarotto Apr 17 '16 at 18:14
  • @RoryAlsop, if you don't have file records you can only perform carving (what you call "decent tools" perform a mix of directory tree reconstruction and carving). This approach does **not** work reliably for fragmented files, hence "you can still recover everything" is quite a bold claim. – Andrea Lazzarotto Apr 17 '16 at 18:16
5

A forensic tool such as FTK Imager is essentially a binary data reader and interpreter. Oversimplified, it reads each value and shows you the raw hexadecimal (or decimal) value and/or the interpreted value (such as text). Google for more examples and explanations of how FTK Imager works.

Notice that a forensic toolkit is merely a tool. Most provide some level of processing to help you determine if what you are seeing is what you want to see.

Something that may help is to understand how file systems work. Here is a well put together book (I know it's old, but it's still relevant). Here is also a brief overview of the NTFS.

Edit: Example Exercise

So here is a super quick and fun way for you to see how all this works. First, I recommend you read through the aforementioned book but regardless you can follow these steps:

  1. Get some form of media to store a file on (I recommend a small 256 MB SD card or the like).
  2. Reformat the media (in Windows, uncheck the 'Quick Format' option and make sure it is formatting as NTFS).
  3. Open the media and create a simple text file with a short name and put your email address in it.
  4. Save the file and check to see if everything looks normal.
  5. Open up the media and just delete the file.
  6. Open FTK Imager, choose 'add evidence item' and select your media.
  7. Now, just look for your email address.
  8. Experiment and learn to your heart's content!

This simple exercise demonstrates how easy it is to find a deleted file (even without its MFT record). I love doing this with my students because you can learn so much just by varying this exercise, and if you combine it with a textbook, bam!
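Step 7 of the exercise can also be done without a GUI. The following is a minimal Python sketch, not how FTK Imager itself works: it assumes you have already saved a raw image of the media to a file, and the function name `find_pattern` and its parameters are made up for illustration. It reads the image in chunks and keeps a small overlap so that a match straddling two chunks is not missed.

```python
def find_pattern(image_path, pattern, chunk_size=1024 * 1024):
    """Return the absolute byte offsets where `pattern` occurs in a raw image.

    `image_path` is assumed to be a raw (dd-style) image of the media.
    """
    if not pattern:
        raise ValueError("pattern must not be empty")
    offsets = []
    overlap = len(pattern) - 1   # bytes carried over between chunks
    with open(image_path, "rb") as image:
        base = 0                 # absolute offset of buf[0] in the image
        tail = b""
        while True:
            chunk = image.read(chunk_size)
            if not chunk:
                break
            buf = tail + chunk
            start = 0
            while True:
                hit = buf.find(pattern, start)
                if hit == -1:
                    break
                offsets.append(base + hit)
                start = hit + 1
            # Keep the last few bytes so a match split across chunks is found.
            tail = buf[-overlap:] if overlap else b""
            base += len(buf) - len(tail)
    return offsets
```

For example, `find_pattern("sdcard.img", b"me@example.com")` (both names are placeholders) would list every offset where the address still sits on the media, whether or not a file record points to it. Note that this only finds text stored as plain bytes; UTF-16 text or compressed files would need a different pattern.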

Matthew Peters
5

In NTFS, all of the metadata is stored in the MFT. This includes names, dates, parent folder, etc. The occupied clusters are also stored there, in a structure called data runs. The clusters storing the file data hold only file data, and there is no linked list that holds info about the next or previous cluster.
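To make the data runs idea concrete, here is a hedged Python sketch of how a run list is commonly decoded. It assumes the usual on-disk encoding: each run starts with a header byte whose low nibble gives the size of the length field and whose high nibble gives the size of the offset field; the offset is signed and relative to the previous run's starting cluster; a zero header byte ends the list. The function name `decode_data_runs` is only for illustration, and extracting the run list from the `$DATA` attribute is left out.

```python
def decode_data_runs(raw):
    """Decode an NTFS run list into (start_cluster, cluster_count) tuples.

    start_cluster is None for a sparse run (no clusters allocated on disk).
    """
    runs = []
    pos = 0
    current_lcn = 0  # starting cluster of the previous non-sparse run
    while pos < len(raw) and raw[pos] != 0x00:
        header = raw[pos]
        length_size = header & 0x0F   # low nibble: bytes in the length field
        offset_size = header >> 4     # high nibble: bytes in the offset field
        pos += 1
        if pos + length_size + offset_size > len(raw):
            raise ValueError("truncated run list")
        count = int.from_bytes(raw[pos:pos + length_size], "little")
        pos += length_size
        if offset_size == 0:
            runs.append((None, count))
        else:
            # The offset is signed: a fragment may lie before the previous one.
            delta = int.from_bytes(raw[pos:pos + offset_size], "little", signed=True)
            current_lcn += delta
            runs.append((current_lcn, count))
        pos += offset_size
    return runs
```

Because the runs are stored in file order, a recovery tool that still has the record can read the fragments back in the right sequence no matter where they sit on the disk. That is why fragmentation alone does not stop recovery while the record survives.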

When a file is deleted (assuming a skip of the recycle bin), there is a single bit in the MFT record that gets turned off. The rest of that record stays in place exactly how it was otherwise. The metadata from a deleted file does not get wiped out until a new file needs to occupy that record slot with its metadata.

The MFT is a contiguous block of clusters made up of records of 1024 bytes each. NTFS uses the first unallocated record (from the top) when it creates a new file.

Forensic tools need only start at the top of the MFT and treat each block of 1024 bytes as a record. If the deleted/allocated bit is on, then it is an allocated file. If it is off, then the file has been deleted.
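A rough Python sketch of that scan follows. It assumes the MFT has already been extracted into a bytes object, that records are 1024 bytes as stated above (a real tool would read the record size from the boot sector), and that the flags field is the 16-bit value at offset 0x16 of a record beginning with the `FILE` signature, with bit 0x0001 meaning "in use" and 0x0002 meaning "directory". The function names are made up for illustration, and the fixup (update sequence) handling a real parser needs is skipped.

```python
import struct

RECORD_SIZE = 1024        # record size as stated in the answer (an assumption here)
FLAG_IN_USE = 0x0001      # set while the record describes an allocated file
FLAG_DIRECTORY = 0x0002   # set when the record describes a directory

def scan_mft(mft_bytes):
    """Yield (record_number, in_use, is_directory) for each FILE record."""
    for offset in range(0, len(mft_bytes) - RECORD_SIZE + 1, RECORD_SIZE):
        record = mft_bytes[offset:offset + RECORD_SIZE]
        if record[:4] != b"FILE":
            continue  # never-used or corrupted slot
        (flags,) = struct.unpack_from("<H", record, 0x16)
        yield (offset // RECORD_SIZE,
               bool(flags & FLAG_IN_USE),
               bool(flags & FLAG_DIRECTORY))

def deleted_records(mft_bytes):
    """Return the record numbers whose in-use bit is cleared."""
    return [number for number, in_use, _ in scan_mft(mft_bytes) if not in_use]
```

Each record number returned by `deleted_records` is a slot whose metadata (name, dates, data runs) is still intact until NTFS reuses it for a new file.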

There was mention in another comment of wiping MFT unallocated space, and this is one way of trying to hide metadata. This involves writing over the records in the MFT that have been marked as deleted.

If that metadata is gone, it makes file recovery more difficult, but not impossible.

WMIF
  • «The MFT is a contiguous block» ... _mostly_ contiguous. It could actually be split in 2 or 3 parts and these would be referenced in the `$DATA` attribute of the `$MFT` file record. :) – Andrea Lazzarotto Apr 17 '16 at 18:17
3

I've had experience with forensic software that ignores the filesystem logic and just greps the disk image for byte patterns that correspond to common file headers or specific file contents.

For example, it can search for all JPEGs, or for files containing the word "hello".

That is one way of recovering "deleted" files.
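As an illustration of the header-matching approach, here is a deliberately naive Python sketch. It assumes JPEG files start with the bytes `FF D8 FF` and end with `FF D9`, and it simply pairs each header with the next footer. The function name `carve_jpegs` and the `max_size` limit are made up for this example; real tools are far more careful.

```python
JPEG_START = b"\xff\xd8\xff"   # start-of-image marker plus first marker byte
JPEG_END = b"\xff\xd9"         # end-of-image marker

def carve_jpegs(data, max_size=20 * 1024 * 1024):
    """Return (start, end) byte ranges in `data` that look like whole JPEGs."""
    found = []
    pos = 0
    while True:
        start = data.find(JPEG_START, pos)
        if start == -1:
            break
        # Look for the footer within max_size bytes of the header.
        end = data.find(JPEG_END, start + len(JPEG_START), start + max_size)
        if end == -1:
            pos = start + 1    # no footer in range, try the next header
            continue
        end += len(JPEG_END)
        found.append((start, end))
        pos = end
    return found
```

Each returned range can be written out as `data[start:end]`. This only works when the file is stored contiguously, and an embedded thumbnail with its own end marker can cut a result short, which is why carving is unreliable for fragmented files.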

user2675345
  • Follow-up question to this. Considering what AJ Henderson said, if we deleted only the header and the EOF of a file, and the file is something like 1 GB, wouldn't it be impossible to find the correct order of the fragments? – user46850 May 22 '14 at 13:56
  • @user46850, nope, this is the beauty of data forensics! – Matthew Peters May 22 '14 at 14:05
  • To clarify, the technique described is called _file carving_. – forest Oct 30 '18 at 03:35