2

I want to write a simple file shredder in C++, i.e. a program that deletes a file so that it is not recoverable. Would the following be a secure and correct way to do that?

(1) Open the file with

ofstream myfile;
// ios::in is included so the existing file is opened without being truncated
myfile.open("deleteme.txt", ios::in | ios::out | ios::binary);

(2) Then run through 10 rounds of something like

unsigned int x = 0;
for (int i = 0; i <= filesize - 1; i++)
{
    myfile.seekp(i, ios::beg);   // output streams use seekp, not seekg
    myfile << (char)x;
    // [put new random character in x]
}

(3) Then simply delete the file

if (remove("deleteme.txt") != 0) {
    cout << "Error deleting file";
} else {
    cout << "File successfully deleted";
}

(I guess one could simply also just delete the file in a console.)

Would it add extra security to rename the file before deletion?

Is there anything else that should be done to completely remove a file so that it cannot be recovered, or is this enough?

EDIT1: Just to clarify, the above code is semi-pseudocode. My main concern is whether overwriting a file byte by byte and then deleting the file is a good way to make it irrecoverable.

EDIT2: I am just interested in the best way to do this by using software alone.

EDIT3: Adding one more thing: I am mostly interested in a level of security that prevents recovery by software methods.

Gilles 'SO- stop being evil'
Thomas
  • I'm pretty sure that doing that will not make the file totally irrecoverable. Not 100% sure though. – Terry Chia Jun 07 '12 at 03:10
  • @TerryChia: So you are saying that the file might still be recoverable? How can I improve on this? – Thomas Jun 07 '12 at 03:13
  • You might also consider writing each bit or byte with actual binary data instead of ASCII through the whole thing. – Matt Jun 07 '12 at 04:29
  • 2
    Can you clarify, are you doing this for an exercise, or are you trying to make a real tool? (Because best practice for secure file destruction involves a wider approach than your question covers.) – Graham Hill Jun 07 '12 at 08:44
  • This is "toy code", unsuitable for real use. It writes data character by character, it seeks before each write even though the file position is right where it's seeking, it assumes the file size will fit in an `int`, and so on. So even if the method was sound, the code (at best) just demonstrates the method. – David Schwartz Jun 07 '12 at 08:47
  • @GrahamHill: I am mostly doing this as an exercise for myself (this is not homework). But if it is possible, I would like to actually write a shredder that could function as a real tool. Are you saying that just overwriting the file a number of times wouldn't be "enough"? – Thomas Jun 07 '12 at 13:00
  • @Matt: Yes, I will update the code. I had intended to do that. – Thomas Jun 07 '12 at 13:01
  • @DavidSchwartz: I am still somewhat new to c++, and this was the only way that I could figure out how to overwrite a file. Are there better ways? – Thomas Jun 07 '12 at 13:02
  • @Thomas Secure deletion tools can work well, but a practical risk assessment will often decide the data is worth more than the disk, and so will move up a gear to physical destruction. And of course, if the data is sensitive enough to securely delete, you should probably have encrypted it already. – Graham Hill Jun 07 '12 at 16:03
  • This does not prevent the recovery of the previous version. This just writes random characters into a file. – Ramhound Jun 07 '12 at 19:21
  • @Ramhound: I am not sure that I understand your comment. Please elaborate. – Thomas Jun 07 '12 at 20:44
  • @Thomas - Do some research on operating systems and file versions and get back with me. – Ramhound Jun 08 '12 at 17:38
  • @Ramhound: Ok, do you have a link to something that I could read? – Thomas Jun 08 '12 at 17:39

4 Answers

6

Sounds like a fun project. I know you said "simple," but here are my thoughts anyway.

The data you're overwriting the file with isn't random, and a single pass may still leave traces of the original data, depending on the storage medium. For example, magnetic devices can in principle be examined with magnetic force microscopy. Even after ten rounds of the same thing, I'm not sure you're adding any extra benefit.

Renaming and deleting files won't add robust protection because, on most file systems, these operations only change the pointer to the location on the disk where the data is. Deleting a file only removes the pointer, so the disk sees the blocks as "available" even though the file's data is still there. (This is why recovery software works.)

Thorough, secure deletes have the following features (software-based, not considering hardware solutions):

  • PRNG (Pseudorandom number generator) which generates random values to write at every byte of the file's allocated space (also see the encryption feature below).

  • Multiple passes (careful here, depends on the medium)

  • A run with all 0s then all 1s couldn't hurt, either. It may help interrupt a predictable, yet subtle, pattern of randomness (because computers aren't truly random).

  • Some file systems are "Copy-on-write", which is like a type of "revision control." These try to avoid overwriting data already in place. That protection would have to be circumvented.

  • RAID devices mirror changes to a disk onto another disk.

  • Fragmented files may start at one sector and finish at an entirely other part of the disk. Alternatively, anti-fragmentation features may keep redundant copies of the data or relocate them in real time.

  • Solid-state drives manage the disk space differently than magnetic drives. Coupled with wear-leveling, there are technical reasons that make secure wiping a little tricky. (See Wikipedia link below)

  • Encrypting a file before wiping it will help fill it with "random" bits... but really the best answer to secure shredding is to encrypt it before it ever hits the disk.
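
As a rough illustration of the PRNG and multiple-pass items in the list above, here is a minimal C++17 sketch. The function name `overwrite_in_place`, the 64 KiB buffer size, and the choice of `std::mt19937_64` are assumptions made for the example, not something from the question. It writes in chunks rather than byte by byte and does not keep the file size in an `int`. It does nothing about copy-on-write, journaling, RAID, or SSD wear-leveling, so it only shows the mechanics of the overwrite.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <random>
#include <string>
#include <system_error>
#include <vector>

// Hypothetical helper: overwrite `path` in place `passes` times with
// pseudorandom bytes. Returns false on any I/O error.
bool overwrite_in_place(const std::string& path, int passes) {
    assert(passes > 0);

    std::error_code ec;
    const std::uintmax_t size = std::filesystem::file_size(path, ec);
    if (ec) return false;

    // in | out opens the existing file without truncating it.
    std::fstream file(path, std::ios::in | std::ios::out | std::ios::binary);
    if (!file) return false;

    // Assumption: std::mt19937_64 is NOT a cryptographic generator; it only
    // stands in for "some PRNG" in this sketch.
    std::mt19937_64 rng(std::random_device{}());
    std::vector<char> buffer(64 * 1024);

    for (int pass = 0; pass < passes; ++pass) {
        file.seekp(0, std::ios::beg);          // one seek per pass, not per byte
        std::uintmax_t remaining = size;
        while (remaining > 0) {
            const std::size_t chunk = static_cast<std::size_t>(
                std::min<std::uintmax_t>(remaining, buffer.size()));
            for (std::size_t i = 0; i < chunk; ++i)
                buffer[i] = static_cast<char>(rng() & 0xFF);
            file.write(buffer.data(), static_cast<std::streamsize>(chunk));
            if (!file) return false;
            remaining -= chunk;
        }
        // flush() only hands the data to the operating system; it does not
        // guarantee that the bytes have reached the physical medium.
        file.flush();
        if (!file) return false;
    }
    return true;
}
```

A caller would run something like `overwrite_in_place("deleteme.txt", 3)` and only then call `remove`. A real tool would also need a platform-specific sync call after each pass, since successive passes that never leave the OS cache may reach the disk as a single write.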

Don't miss this great question which talks about wiping info on solid-state drives, particularly flash drives. Remember that SSDs wear out after being written enough times.

Look at this article on Wikipedia for some more detailed background info.

By the way...

I'd really like to see a shredder that grants some plausible deniability. It would securely erase the file, then plop remnants of a decoy file in its place, perhaps a file chosen by the user that exists elsewhere on the hard drive. It would appear to be the remnants of a copy of that file which at one time was pointed to by the file system, was deleted by the user, and, depending on the size, was potentially being gradually overwritten by regular use.

If a forensics agent were to examine the portion of a disk where a regular shredder was run, it's easy to tell that it was wiped with random data. But if a decoy file was put in its place, I imagine it would be harder to tell.

Matt
  • 1
    Thanks for the answer. So you are saying that if I want to just focus on the software version (1) overwriting with new "random" data should be done (2) Including overwriting with non-random data as a decoy. So if I do that, will I then get the best that one can do with software alone? How do I check if a program like what I suggest written in c++ will actually overwrite the bytes and not just do "copy on write"? – Thomas Jun 07 '12 at 13:17
  • 1
    I think that would be sufficient, esp. for a "simple" project. Determine the filesystem programmatically. I doubt you'll encounter "copy-on-write" file systems often, if at all, for normal use... one I'd be more concerned with is a journaling file system like EXT3 or EXT4, used frequently by Linux. I forgot to mention that in my answer, but the linked articles should fill you in. Anyway, if the file system you detect uses journaling or revision control, perhaps you can take steps to mitigate any risk... someone else may know better how to do that. – Matt Jun 07 '12 at 13:46
  • Nice exploration of *what's about*! Just: in order to install a decoy, you have to (1) shred all blocks, (2) fill all blocks with 0 (0x00), and finally (3) install the fake. This ensures no suspicious random data is left in the last block! – F. Hauri - Give Up GitHub Mar 30 '13 at 06:45
3

Secure file deletion is more a factor of the filesystem rather than the application. Several filesystems use copy-on-write mechanics, including most famously ZFS and BTRFS, but also to some extent even NTFS with shadow copies. So overwriting a file with random data just allocates some previously-blank space and fills it with randomness. Somewhat pointless in all practicality.

A secure delete program would have to be filesystem-aware, if not even a filesystem driver itself, in order to directly seek out and overwrite the relevant blocks. And even then there's the possibility that even the filesystem doesn't have the final word as to whether data gets overwritten. For example, LVM snapshots and SSD wear-leveling can prevent you from actually overwriting your data, again simply allocating otherwise blank sectors and writing your random data there instead.
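
As a small first step toward being filesystem-aware, a program can at least ask the kernel which filesystem a path lives on before trusting an in-place overwrite. The sketch below is Linux-only and uses `statfs(2)`. The names `FsKind`, `classify`, and `filesystem_of` are made up for the example, and the two magic numbers are copied from the kernel's `<linux/magic.h>` rather than included, so treat them as assumptions to verify. It recognizes only Btrfs and the ext family, and it says nothing about LVM snapshots or SSD wear-leveling, which sit below the filesystem.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <sys/vfs.h>   // statfs(2); Linux-specific

// Hypothetical classification used only in this sketch.
enum class FsKind { Btrfs, ExtFamily, Other, Unknown };

// Values as published in the Linux kernel's <linux/magic.h>.
constexpr std::uint32_t kBtrfsMagic = 0x9123683E;
constexpr std::uint32_t kExtMagic   = 0xEF53;   // shared by ext2, ext3 and ext4

FsKind classify(std::uint32_t magic) {
    if (magic == kBtrfsMagic) return FsKind::Btrfs;
    if (magic == kExtMagic)   return FsKind::ExtFamily;
    return FsKind::Other;
}

// Returns FsKind::Unknown if the path cannot be queried.
FsKind filesystem_of(const std::string& path) {
    struct statfs info{};
    if (statfs(path.c_str(), &info) != 0) return FsKind::Unknown;
    return classify(static_cast<std::uint32_t>(info.f_type));
}

// Copy-on-write filesystems write new data to fresh blocks, so overwriting
// a file in place does not touch the old blocks.
bool is_copy_on_write(FsKind kind) {
    return kind == FsKind::Btrfs;
}
```

A shredder could call `filesystem_of("deleteme.txt")` and refuse to run, or at least warn, when `is_copy_on_write` returns true or the result is `Unknown`. A `false` result is not a guarantee that overwriting works.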

You're much better off, much safer off, wiping the whole drive in one go using DBAN or something similar.

Also, while it is theoretically possible to recover previously overwritten bits using a magnetic force microscope or what have you, the chances of that being practical, even in the most extreme circumstances, are approximately zero. Every bit would have to be manually, individually reconstructed at a comparatively low rate of success and an extremely high cost per bit, if it's even possible at all, which appears extremely unlikely with today's high areal density drives. The original magnetic traces, even before being overwritten, are so weak that modern drives rely heavily on ECC bits just to reliably retrieve your data.

See also: http://www.anti-forensics.com/disk-wiping-one-pass-is-enough

user
tylerl
2

You shouldn't write your own shredder program. Instead, grab an existing well-vetted shredder.

  • For hard-disk shredding, I recommend ATA Secure Erase or DBAN (search this site for details). DBAN is open-source, so you can look at its source.

  • For file shredding, I recommend GNU's shred program (comes with Coreutils). shred is open-source, so you can look at its source, too.

Do understand that file-level shredding is inherently insecure in many circumstances, no matter how carefully written the program is. Given how modern filesystems work, overwriting a file with different bytes does not mean that all copies of the original data on disk will necessarily be overwritten. In some cases, overwriting a file may cause the filesystem to make a copy and modify the copy, leaving the original data stored elsewhere on the hard disk (in "free space") where a file-level shredder cannot delete it. In other cases, remnants of the old data may be left around on the hard disk for other reasons.

D.W.
  • Thanks for the answer. After researching the issue a bit on my own, I have come to the same conclusion: that my suggested program will not really be that secure. But at least it should be better than simply deleting the file with "del filename" (or rm or whatever). From http://www.gnu.org/software/coreutils/manual/html_node/shred-invocation.html it seems that shred also only overwrites the data, like I suggest. Do you have a link to the source code for shred? (I guess I could download the whole coreutils package, but...) – Thomas Jun 08 '12 at 17:37
  • 1
    @Thomas, Yes, I recommend that you download the coreutils package. Alternatively, [look here](http://git.savannah.gnu.org/cgit/coreutils.git/tree/src/shred.c) for the source to shred. – D.W. Jun 08 '12 at 17:42
  • P.S. Be warned that some of the comments in the code [are out of date and no longer accurate about modern disk technology](http://security.stackexchange.com/q/10464/971). – D.W. Jun 08 '12 at 17:44
  • Disagree, you *should* write your own file-shredder so you understand how it works. Look up how the pros do it, and mimic what they're doing. – Mark Buffalo Jan 02 '16 at 02:25
1

As you are aware, simply deleting a file leaves its data on the filesystem, where it may or may not be overwritten later.

However, a single overwrite of a file prevents recovering the data in practice, though there's a small theoretical threat of partial data recovery by analyzing the disk bit by bit with an electron microscope. This threat has never been demonstrated in practice to recover any significant amount of data (e.g., 1 KB or more, i.e. about 8192 bits), as opposed to a few bits under idealized conditions. See [1], [2].

The major problem with secure deletion by overwriting is not the number of wipes; it's that you do not know exactly how your operating system and hardware are handling your data. E.g., you may have initially saved a file to a given sector, but that sector was later determined to be bad, so the drive remapped the data to a different sector. The problem gets worse with solid state drives (SSDs): due to the limitations of flash technology (a limited number of overwrites per block; an entire block must be erased before it is written; the entire block is then rewritten), the hardware tries to avoid rewriting sectors whenever possible for optimal operation.

Solution: if your data needs to be kept secret, use full disk encryption. If the encryption key is compromised, write over the entire disk (though data may still exist in bad sectors) or physically destroy the disk.

dr jimbob