105

After all these articles circulating online about MD5 exploits, I am considering switching to another hash algorithm. As far as I know it's always been the algorithm of choice among numerous DBAs. Is there really that much benefit to using MD5 instead of SHA1, SHA256, SHA384 or SHA512, or is it purely a performance issue?

What other hash do you recommend (taking into consideration data-bound applications as the platform)? I'm using salted hashes currently (MD5 salted hashes). Please consider both MD5 file hashes and password hashes alike.

Zuly Gonzalez
Tawfik Khalifeh
  • 6
    You mentioned salted hashes; does that mean you're talking about password hashing? Password hashing requires different properties from normal hashing, which makes SHA-256 almost as bad as MD5 in this context. – CodesInChaos Sep 08 '12 at 18:51
  • I'm using MD5 hashes to check the integrity of critical files before loading them, and salted MD5 hashes for passwords. – Tawfik Khalifeh Sep 08 '12 at 19:00
  • 5
    Neither is a good choice, but for completely different reasons. – CodesInChaos Sep 08 '12 at 19:01
  • How bad, exactly? I need to fully understand the situation before recoding the whole part; it's a bloody 24 hours at minimum. The code base is like 2K. – Tawfik Khalifeh Sep 08 '12 at 19:03
  • 5
    You probably want to read [How to securely hash passwords](http://security.stackexchange.com/questions/211/how-to-securely-hash-passwords). I think it's one of the most important questions on this site. – Brendan Long Sep 09 '12 at 20:50

7 Answers

125

MD5 for passwords

Using salted MD5 for passwords is a bad idea. Not because of MD5's cryptographic weaknesses, but because it's fast. This means that an attacker can try billions of candidate passwords per second on a single GPU.

What you should use are deliberately slow hash constructions, such as scrypt, bcrypt and PBKDF2. Simple salted SHA-2 is not good enough because, like most general purpose hashes, it's fast. Check out [How to securely hash passwords](http://security.stackexchange.com/questions/211/how-to-securely-hash-passwords) for details on what you should use.
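To make the idea concrete, here is a minimal Python sketch of a slow, salted password hash using PBKDF2 from the standard library's `hashlib`. The function names `hash_password` and `verify_password`, the 16-byte salt and the iteration count are assumptions chosen for illustration, not a vetted configuration; the linked question covers how to pick real parameters.

```python
import hashlib
import hmac
import os

# Assumed work factor for illustration only; tune it to your own hardware.
ITERATIONS = 200_000


def hash_password(password, salt=None):
    """Return (salt, digest) for the password, generating a random salt if needed."""
    if salt is None:
        salt = os.urandom(16)  # per-password random salt, stored next to the digest
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)


if __name__ == "__main__":
    salt, digest = hash_password("correct horse battery staple")
    print(verify_password("correct horse battery staple", salt, digest))
    print(verify_password("wrong guess", salt, digest))
```

The point of the iteration count is that every guess costs the attacker the same deliberate delay it costs you at login, which is what a single fast MD5 or SHA-2 call does not provide.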

MD5 for file integrity

Using MD5 for file integrity may or may not be a practical problem, depending on your exact usage scenario.

The attacks against MD5 are collision attacks, not pre-image attacks. This means an attacker can produce two files with the same hash, if he has control over both of them. But he can't match the hash of an existing file he didn't influence.

I don't know if the attacks apply to your application, but personally I'd start migrating even if you think they don't. It's far too easy to overlook something. Better safe than sorry.

The best solution in this context is SHA-2 (SHA-256) for now. Once SHA-3 gets standardized it will be a good choice too.
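For the file integrity case, a minimal Python sketch of a SHA-256 check could look like the following. The helper names `sha256_of_file` and `file_matches` and the 1 MiB chunk size are assumptions for the example; it also assumes the expected digest comes from a source the attacker cannot modify, otherwise the check proves nothing.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in fixed-size chunks so large files need not fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def file_matches(path, expected_hex):
    """Return True if the file's SHA-256 digest equals the trusted hex digest."""
    return hmac.compare_digest(sha256_of_file(path), expected_hex.lower())
```

Swapping `hashlib.sha256` for another `hashlib` constructor is a one-line change, which keeps a later migration cheap.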

CodesInChaos
  • The numbers are rather scary: [hashcat](http://hashcat.net/hashcat/) can try up to 86.24M combinations/s on 8 threads on Win 7 64-bit (MD5 hash). It's like a new era of password cracking on the loose. Nice answer... – Tawfik Khalifeh Sep 08 '12 at 19:20
  • 4
    @sarepta hashcat is harmless compared to ocl-hashcat, which runs on a GPU. A single GPU can do over 6 billion combinations per second with that. – CodesInChaos Sep 08 '12 at 19:23
  • 1
    A pre-image attack is theoretically possible against MD5; current attacks have a computational complexity of 2^123.4 for a full pre-image, though. – ewanm89 Sep 08 '12 at 23:05
  • 4
    Keep in mind that *partial* pre-image attacks are possible with MD5. You can take an existing file and alter metadata / append junk and generate a collision against a file you generate entirely. That's how the MD5 SSL certificate collision attack works. – Polynomial Sep 10 '12 at 06:03
  • 4
    @ewanm89 - 2^123.4 is infeasible (even with billions of GPUs calculating billions of MD5 hashes per second for billions of years). Yes, it's better than 2^128 by a factor of 24, but the distinction is meaningless for real attacks. (But I agree with the other reasons to avoid MD5.) – dr jimbob Sep 10 '12 at 06:24
  • 2
    @Polynomial I wouldn't call that a partial pre-image. It's rather something like a structured collision. – CodesInChaos Sep 10 '12 at 09:41
  • @CodesInChaos Yeah, that's probably a better description. It was about 7am when I wrote that! – Polynomial Sep 10 '12 at 09:50
  • @drjimbob I didn't say it was feasible yet, though we are getting there; I just pointed out that it exists, and that was a full pre-image attack. As others have pointed out, a partial pre-image is often good enough. At our current rate, with GPU and FPGA hardware accelerating brute force and still getting faster all the time, it probably won't be all that long before a full pre-image becomes feasible. – ewanm89 Sep 10 '12 at 10:25
  • @sarepta My GPU can handle ~279.5M combinations a second for SHA1 and it's a couple of years old now. I hate to think what a nice shiny new GPU can do for the faster MD5... – ewanm89 Sep 10 '12 at 10:29
  • 1
    Using GPUs you can get 33.1 billion hashes a second for MD5. 6-character passwords are instantly toast. – Bradley Kreider Sep 10 '12 at 20:53
  • The authors of the Flame malware managed to generate a chosen-prefix collision in MD5, which is quite scary. I wouldn't recommend this algorithm for any purpose anymore. SHA-3 is standardized by now AFAIK. – buherator Jun 20 '13 at 08:36
  • @buherator I don't think SHA-3 is standardized yet. We know that keccak is the winner of the competition, but we don't know which tweaks NIST will apply to it before it becomes SHA-3. Concerning the insecurity of MD5, not every application relies on collision resistance, so not every application needs to urgently migrate away from MD5. For new projects I certainly wouldn't recommend MD5. – CodesInChaos Jun 20 '13 at 08:45
  • I've seen this reasoning (about the speed of MD5) a number of times, and while I agree that slower hashing is more secure, I don't understand the practical advice against salting. So, I am salting strings with a 20-character salt; the resulting string is at least 28 characters. Now, an attacker can do billions of hashes per second per GPU; let's say he has millions of GPUs running for a year. What is the chance he will get the original string? I can do math; the probability is practically 0. – ivan Oct 04 '17 at 11:21
  • @ivan A salt is stored alongside the password hash and is thus known to the attacker. Password hashing is only the last line of defense, for when an attacker has compromised the server and all its secrets (including encryption keys used to encrypt the password hash, or a pepper). – CodesInChaos Oct 04 '17 at 11:47
38

To complete @CodesInChaos' answer, MD5 is often used because of Tradition, not because of performance. People who deal with databases are not the same people as those who deal with security. They often see no problem in using weak algorithms (e.g. see the joke of an algorithm that MySQL was using for hashing passwords). They use MD5 because they used to use MD5 and are used to using MD5.

Performance is much more often discussed than measured; and yet, logically, there cannot be a performance issue if there is nothing to measure. Using one core of a basic CPU, you can hash more than 400 MB/s with MD5, closer to 300 MB/s with SHA-1, and 150 MB/s with SHA-256. On the other hand, a decent hard disk will yield data at an even lower rate (100 to 120 MB/s would be typical), so the hash function is hardly ever the bottleneck. Consequently, there is no performance issue relative to hashing in databases.
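If you would rather measure than discuss, a rough single-core benchmark takes a few lines of Python. This is only a sketch: the helper name `throughput_mb_per_s` and the 64 MB buffer are assumptions, and the numbers you get depend entirely on your CPU and on how your Python build implements each hash.

```python
import hashlib
import time


def throughput_mb_per_s(name, size_mb=64):
    """Hash size_mb megabytes of zero bytes with the named algorithm and return MB/s."""
    block = b"\x00" * (1024 * 1024)  # one megabyte, reused for every update
    h = hashlib.new(name)
    start = time.perf_counter()
    for _ in range(size_mb):
        h.update(block)
    h.digest()
    elapsed = time.perf_counter() - start
    return size_mb / elapsed


if __name__ == "__main__":
    for name in ("md5", "sha1", "sha256"):
        print(f"{name}: {throughput_mb_per_s(name):.0f} MB/s")
```

Compare the result with the read rate of the storage that feeds the data; if the disk is slower than the hash, the choice of hash function does not affect throughput.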

The usual recommendations, for hash functions, are:

  1. Don't do it. You should not use elementary cryptographic algorithms, but protocols which assemble several algorithms so that they collectively provide some security features (e.g. transfer of data with confidentiality and integrity).

  2. Really, don't do it. For storing passwords (more accurately, password verification tokens), don't make a custom mix of a hash function and salts; use a construction which has been studied specifically for such a use. This normally means bcrypt or PBKDF2.

  3. If a hash function is indeed what does the job, then use SHA-256. Consider using any other function only if some serious problem with SHA-256 (most probably its performance) has been duly detected and measured.

Thomas Pornin
  • I see your answer as: "Don't use hash functions directly -- use a larger system such as TLS for data transfer, certificates for authentication, and/or bcrypt or PBKDF2 for password storage." – Josiah Yoder Oct 16 '19 at 14:05
7

I'm using salted hashes currently (MD5 salted hashes).

If you are salting MD5 hashes, you definitely don't want to be using MD5. It sounds like you need to use PBKDF2 or bcrypt.

As far as I know it's always been the algorithm of choice among numerous DBAs.

That's not a compelling reason.

I have worked with a lot of DBAs who are at least 5 years behind in general technology (not using version control, unformatted Perl scripts for everything, etc). They might have been particularly bad DBAs, but I think it comes with the extremely conservative mindset of not changing things.

Bradley Kreider
6

Just to complement the answers already given (most of which are excellent), we now have a real-world example where a data breach (Ashley Madison) led to the entire password table being leaked. They used bcrypt with a random salt to hash the passwords. A security researcher decided to take those hashes and brute-force them. This was the result:

As a result of all this, bcrypt is putting Herculean demands on anyone trying to crack the Ashley Madison dump for at least two reasons. First, 4,096 hashing iterations require huge amounts of computing power. In Pierce's case, bcrypt limited the speed of his four-GPU cracking rig to a paltry 156 guesses per second. Second, because bcrypt hashes are salted, his rig must guess the plaintext of each hash one at a time, rather than all in unison.

"Yes, that's right, 156 hashes per second," Pierce wrote. "To someone who's used to cracking MD5 passwords, this looks pretty disappointing, but it's bcrypt, so I'll take what I can get."

Pierce gave up once he passed the 4,000 mark. To run all six million hashes in Pierce's limited pool against the RockYou passwords would have required a whopping 19,493 years, he estimated. With a total 36 million hashed passwords in the Ashley Madison dump, it would have taken 116,958 years to complete the job.

At the end of the day, the only ones he was able to crack were ridiculously simple or common passwords (like "123456").

Machavity
  • 1
    Actually, some of them were available as MD5 hashes (either from an older system or something else, I'm not sure). Those were cracked no problem. – Alexander O'Mara Jul 11 '16 at 19:41
0

Yes, MD5 is insecure, and so is SHA-1. I recommend using SHA-256 if the size of the digest is an issue. Remember, if you store it in a BINARY column, it will take less space than if stored in a CHAR column. Just make sure it is done properly. MD5 is about 2.3x faster than SHA-256. More benchmarks are at http://www.cryptopp.com/benchmarks.html
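To see the storage difference, here is a small Python sketch comparing the raw digest length (what a BINARY column would hold) with the hex-encoded length (what a CHAR column would hold). The helper name `digest_sizes` is an assumption for the example.

```python
import hashlib


def digest_sizes(name, data=b"example row"):
    """Return (raw_bytes, hex_chars) for the named hash of the given data."""
    h = hashlib.new(name, data)
    return len(h.digest()), len(h.hexdigest())


if __name__ == "__main__":
    for name in ("md5", "sha1", "sha256"):
        raw, hex_len = digest_sizes(name)
        print(f"{name}: {raw} raw bytes, {hex_len} hex characters")
```

Hex encoding doubles the size: a SHA-256 digest is 32 bytes raw but 64 characters as hex text.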

Matrix
  • 9
    Do not use standard hashes for password storage. They're way too fast. PBKDF2 / bcrypt are the way to go. – Polynomial Sep 10 '12 at 06:05
0

To put it short, it is pretty insecure now because of rainbow tables. A rainbow table is a list of MD5 hashes and their matching strings. So basically I would consider other alternatives, such as SHA1.

Shane
  • Rainbow tables apply almost equally to all unsalted hashes. Migrating to SHA1 or SHA2 would do almost nothing. The correct approach for password hashing is using a salt together with a deliberately slow password hashing function such as PBKDF2, bcrypt or scrypt. Even with fast unsalted hashes it's often cheaper to run a new GPU search compared to using a rainbow table. – CodesInChaos Jun 20 '13 at 15:15
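
The point about salts in the comment above can be sketched in Python. Note the simplification: this uses a plain precomputed dictionary rather than a real rainbow table (which uses hash chains to trade time for space), and the password list, `crack_unsalted` and `salted_md5` are made up for illustration.

```python
import hashlib
import os

# Tiny stand-in for an attacker's wordlist (illustrative only).
COMMON_PASSWORDS = ["123456", "password", "qwerty", "letmein"]

# Computed once, then reusable against every unsalted MD5 hash ever leaked.
LOOKUP = {hashlib.md5(p.encode()).hexdigest(): p for p in COMMON_PASSWORDS}


def crack_unsalted(md5_hex):
    """Return the password if its unsalted MD5 is in the precomputed table."""
    return LOOKUP.get(md5_hex)


def salted_md5(password, salt):
    """MD5 over salt + password; still a fast hash, shown only for contrast."""
    return hashlib.md5(salt + password.encode()).hexdigest()


if __name__ == "__main__":
    # The unsalted hash is found by a single table lookup.
    print(crack_unsalted(hashlib.md5(b"letmein").hexdigest()))
    # The salted hash of the same password misses the precomputed table.
    print(crack_unsalted(salted_md5("letmein", os.urandom(16))))
```

The salt only defeats precomputation. Because the salt is stored with the hash, an attacker can still run the wordlist against each salted MD5 at GPU speed, which is why a deliberately slow function is needed as well.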
-1


You are all talking about insecurity, but none of you has given a straight answer to the question.

I have been working with MD5 and SHA-512; both of them were easy to crack once I had put my little hacker expert on it, until it hit us both like thunder! The simplest and most effective way to stop "mass attacks" like the ones described above is to set a limit on login attempts, meaning:

function checkbrute($user_id, $mysqli) {
    // Current time as a Unix timestamp.
    $now = time();
    // Only login attempts from the past 2 hours are counted.
    $valid_attempts = $now - (2 * 60 * 60);

    // Bind both values as parameters instead of interpolating them into the SQL string.
    $stmt = $mysqli->prepare("SELECT time FROM login_attempts WHERE user_id = ? AND time > ?");
    if (!$stmt) {
        // The statement could not be prepared; return false explicitly rather than null.
        return false;
    }
    $stmt->bind_param('ii', $user_id, $valid_attempts);
    // Execute the prepared query and buffer the result so num_rows is available.
    $stmt->execute();
    $stmt->store_result();
    // True if there have been more than 5 failed logins in the window.
    return $stmt->num_rows > 5;
}


You get my point?
However many combinations the hacker wants to try, he is limited to 5 failures per 2 hours. You can even block the user after X attempts.

jonsca
Dom
  • 8
    You're not accounting for stolen / improperly discarded backups of user databases or user records retrieved en masse through some exploit (say a SQLi). Your answer is pertaining to hacking live databases through and by the server application that is supposed to be the only one accessing such data, one user account at a time, which - if that was the only risk - could be solved much more elegantly and password hashes wouldn't be needed in the first place. Don't believe me? Ask [LinkedIn](http://www.zdnet.com/blog/btl/6-46-million-linkedin-passwords-leaked-online/79290) (among many others) ;) – TildalWave Jun 20 '13 at 08:18
  • 4
    1) I explicitly wrote that MD5 and SHA-2 are not secure as password hashes. 2) There are no known attacks on SHA-512 when used properly. It's a cryptographic hash, not a password hash. 3) You're missing the point of password hashes. The point of a password hash is to protect passwords when your database has been leaked. In that scenario a rate limiter doesn't help at all. – CodesInChaos Jun 20 '13 at 08:28
  • 4
    FWIW, limiting login attempts in this way just makes it very easy to perform a denial of service. – Eric Grange Dec 17 '15 at 11:19
  • @EricGrange: Obviously you limit them for each IP or whatever, not in total. – Evi1M4chine Sep 09 '17 at 05:01
  • 1
    @Evi1M4chine in the era of IPv6, an attacker can easily (and cheaply) have millions of different IPv6 addresses – Eric Grange Sep 28 '17 at 15:56