52

Scenario: a database of hashed and salted passwords, including the salt for each password, is stolen by a malicious user. Passwords are 6-10 characters long and chosen by non-technical users.

Can this malicious user crack these passwords?

My understanding is that MD5 and SHA-1 are not safe anymore, as GPU-assisted password recovery tools can compute billions of these hashes per second per GPU.

What about SHA-256 or SHA-512? Are they safe currently? What about in a few years?

AviD
Seppo Erviälä
  • 1
    [Lifetimes of cryptographic hash functions](http://valerieaurora.org/hash.html) is very interesting. A variation I've heard of is to always use *two* hashing algorithms. That way you can always deprecate one as soon as it's insecure, and you get a time window to migrate all data instances. – l0b0 Jun 22 '11 at 11:18
  • @l0b0 depending on the use case, that can be detrimental to the overall security of the cryptosystem. E.g. using a weak hash as input to a strong hash will still leave the strong hash with very low entropy (if this matters for your situation). – AviD Jun 27 '11 at 20:52

4 Answers

63

The question doesn't state how many rounds of hashing are performed. And the whole answer hinges on that point.

All hash functions are unsafe if you use only one iteration. The hash function, whether it is SHA-1, or one of the SHA-2 family, should be repeated thousands of times. I would consider 10,000 iterations the minimum, and 100,000 iterations is not unreasonable, given the low cost of powerful hardware.
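
To make "repeated thousands of times" concrete, here is a minimal Python sketch using PBKDF2 from the standard library. SHA-256 as the underlying hash and the 16-byte salt are illustrative assumptions; the 100,000-iteration default simply follows the figure above:

```python
import hashlib
import hmac
import os

def hash_password(password: str, iterations: int = 100_000) -> tuple[bytes, bytes, int]:
    """Derive a salted, iterated hash of the password with PBKDF2-HMAC-SHA256."""
    salt = os.urandom(16)  # unique random salt per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return salt, digest, iterations  # store all three alongside the account

def verify_password(password: str, salt: bytes, expected: bytes, iterations: int) -> bool:
    """Re-derive the hash with the stored salt and iteration count, then compare."""
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return hmac.compare_digest(digest, expected)  # constant-time comparison
```

Storing the iteration count next to each hash also lets you raise it later without invalidating existing entries.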

Short passwords are also unsafe. 8 characters should be the minimum, even for low value targets (because users reuse the same password for multiple applications).

With a $150 graphics card, you can perform 680 million SHA-1 hash computations per second. If you use only one round of hashing, all 6-character passwords can be tested in a little over 15 minutes (that's assuming all 94 printable ASCII characters are used). Each additional character multiplies the time by 94, so 7 characters requires one day, 8 characters requires 103 days on this setup. Remember, this scenario is a 14-year-old using his GPU, not an organized criminal with real money.

Now consider the effect of performing multiple iterations. If 1,000 iterations of hashing are performed, the 6-character password space takes almost 12 days instead of 15 minutes. A 7-character space takes 3 years. If 20,000 iterations are used, those numbers go up to 8 months and 60 years, respectively. At this point, even short passwords cannot be exhaustively searched; the attacker has to fall back to a dictionary of "most likely" passwords.
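
Those figures can be checked with a few lines of arithmetic; the 680 million hashes per second rate and the 94-character alphabet are the ones used above, and everything else follows from them:

```python
RATE = 680e6   # SHA-1 hashes per second on a ~$150 GPU (figure from above)
CHARSET = 94   # printable ASCII characters

def search_seconds(length: int, iterations: int = 1) -> float:
    """Seconds to exhaust every password of the given length at the given iteration count."""
    return CHARSET ** length * iterations / RATE

print(search_seconds(6) / 60)                    # ~17 minutes: 6 chars, 1 iteration
print(search_seconds(8) / 86_400)                # ~103 days: 8 chars, 1 iteration
print(search_seconds(6, 1_000) / 86_400)         # ~12 days: 6 chars, 1,000 iterations
print(search_seconds(7, 20_000) / 86_400 / 365)  # ~60 years: 7 chars, 20,000 iterations
```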

nealmcb
erickson
  • It's worth noting that a lot of passwords can be covered with a much smaller set of characters. http://www.r-bloggers.com/character-occurrence-in-passwords/ –  Jun 21 '11 at 16:25
  • But here is my main mistake: I didn't understand that these algorithms are meant to be iterated hundreds of times. –  Jun 21 '11 at 16:26
  • 4
    @Seppo --- **thousands** of times, not hundreds. Even the original PKCS #5 recommends at least 1000 iterations. You should be thinking about tens of thousands. The good thing is, you should be able to update existing password hashes without forcing users to pick new passwords. Obviously if bad guys already have the poorly hashed passwords, it's too late (think about backups, etc.). But if they don't yet, it's not too late to fix things. –  Jun 21 '11 at 16:31
  • 2
    You can also use bcrypt or similar tools, which take a work factor to slow down the hashing process for you. http://codahale.com/how-to-safely-store-a-password/ –  Jun 21 '11 at 18:42
  • 1
    The iteration count of PBKDF2 is the work factor for a hashing solution. Bcrypt is really not so exotic. They both achieve slowness with a tunable key derivation routine. One key difference is that bcrypt actually uses the derived key to encrypt a well-known plain text, and store the resulting cipher text, while most hashing solutions just store the derived key directly. However, I think both approaches are robust. –  Jun 21 '11 at 19:04
  • 1
    Succinctly, the generic cryptographic hash algorithms are designed to be fast. Protecting a password requires a slow cryptographic hash algorithm. A fast cryptographic hash algorithm, iterated 2^16 times, becomes a slow cryptographic hash algorithm. Then add in other requirements such as salts, etc. – yfeldblum Jun 21 '11 at 23:00
  • 2
    "Each additional charcter multiplies the time by 94, [...] 7 characters requires one day, 8 characters requires 7 years" - shouldn't 8 characters require 94 days? – Nick Johnson Jun 22 '11 at 00:52
  • Also, regarding bcrypt: Isn't the point that bcrypt is equally slow on a GPU (or at least has a much smaller speedup)? We can expect attackers to have GPUs, but most servers don't, so anything that reduces the attacker's advantage seems like a good idea. – Nick Johnson Jun 22 '11 at 00:53
  • @nick: yes - @erickson is wrong about 8 characters: 94^8 / (680 × 10^6) / 3600 / 24 ≈ 103 days. As for GPUs and bcrypt, from what I've seen there may not be much published work on optimizing bcrypt for them, but no reason to think it would be very hard. – nealmcb Jun 26 '11 at 19:41
  • After getting another confirmation, I fixed it for 8 characters. – nealmcb Jun 27 '11 at 13:00
  • @nealmcb I believe Thomas Pornin has stated that bcrypt was designed to make GPU and hardware bruteforcing slow. Something about mutating some data to force them to keep hitting main memory? – Bradley Kreider Apr 26 '12 at 17:47
  • The work performed by bcrypt in deriving a key is the same as PBKDF2, so hardware should affect both equally. You may be thinking of scrypt, which is designed to defeat memory optimization. – erickson Apr 26 '12 at 18:09
  • 1
    @erickson: no, there really is a difference. PBKDF2 with SHA-256 is all 32-bit arithmetic operations; bcrypt is pseudorandom accesses (read and writes) in a 4 kB buffer. This makes bcrypt much harder to optimize on a GPU. Scrypt just builds further on that concept, extending the memory buffer to _megabytes_ (because while bcrypt is hostile to GPU, modern FPGA has small embedded RAM blocks which work just fine for that -- but FPGA suffer on scrypt implementations). – Thomas Pornin Feb 09 '13 at 12:45
  • @ThomasPornin I didn't realize that, thanks. I will have to take a closer look at `bcrypt`. – erickson Feb 09 '13 at 19:35
  • 1
    Sad truth is scrypt & bcrypt do not offer much in terms of practical improvements vs brute force: a round takes more time, but you can in practice do less of them, because you are limited in terms of how long you can take to check a hash. GPU/FPGA/ASIC have been getting a lot better at scrypt, and have a fair margin for improvement before being limited by Moore's Law. CPUs are getting much better at SHA-256, while GPUs/FPGA/ASIC are now limited by Moore's Law for SHA-256. The medium/long term outlook is in favor of SHA-256. – Eric Grange Apr 21 '15 at 12:26
  • That's the first time I've seen that information, looks interesting! Got anything to read up on a more detailed explanation? – Ben May 11 '16 at 19:31
  • QUESTION: So this answer assumes that the attacker *knows* the number of hashing rounds that were used right? So, if you can hide your code, is using a low but obscure number, like say 628 hashing rounds instead of a flat number like 1000 or 10000, a secure alternative? – csstudent1418 Aug 23 '20 at 20:43
  • 1
    @csstudent1418 No, that would not add security. First, it's a form of security through obscurity. You should design your system so that it's secure when the algorithms are exposed and only the keys are secret. But more importantly in this case, the attacker can check with each iteration up to some maximum they choose. So if you pick 628, they'll crack the password on their way to 10000. – erickson Aug 23 '20 at 20:59
17

The salt may be considered public practically by definition.

As for what hash to use and how "safe" it is, I would stick with the recommendations NIST makes for U.S. government agencies. As the Wikipedia article says:

> SHA-1 is being retired for most government uses; the U.S. National Institute of Standards and Technology says, "Federal agencies should stop using SHA-1 for...applications that require collision resistance as soon as practical, and must use the SHA-2 family of hash functions for these applications after 2010"

You can speculate about how safe or unsafe various algorithms are, but why bother when you have a spec the U.S. government considers good enough for its own use?

[edit]

By the way, your question appears to reflect a misunderstanding about how these analyses are performed. There is much more to a hash function than its number of bits.

Brute-forcing a 160-bit hash like SHA-1 requires 2^160 effort; that is simply not going to happen, no matter how many "GPUs" you throw at it. In fact it probably will not happen in the lifetime of the universe. (True, brute-forcing a collision requires "only" 2^80 effort. But it also requires 2^80 storage. Good luck with that.)
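
To put 2^160 in perspective, a quick order-of-magnitude estimate; the billion-hashes-per-second-per-GPU rate and the million GPUs are deliberately generous assumptions, not figures from this answer:

```python
hashes_per_second_per_gpu = 1e9   # assumed: one billion hashes per second per GPU
gpus = 1_000_000                  # assumed: a million GPUs working in parallel
seconds_per_year = 3.15e7

years = 2**160 / (hashes_per_second_per_gpu * gpus * seconds_per_year)
print(f"{years:.1e} years")  # about 4.6e25 years; the universe is roughly 1.4e10 years old
```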

The reason MD5 and SHA-1 are considered "possibly unsafe" is that weaknesses have been discovered in each algorithm that might reduce the work way, way below the brute-force effort. The reason for switching to a newer algorithm (like SHA-2) is to avoid those known weaknesses; the reason for using 256 bits is to provide some buffer against unknown weaknesses. It has nothing to do with "GPUs"; to resist brute-force attacks, 160 bits is plenty.

Similar statements hold for all cryptography, by the way.

Now, it is impossible to overstate how much more the folks at NSA know about this stuff than you or I do. They make recommendations for U.S. government agencies and U.S. industry via NIST. So unless your adversary is itself a major world government, the NIST recommendations are more than sufficient. Ignore any self-proclaimed "experts" who try to tell you otherwise; NSA is better at what it does than you can possibly imagine.

If your adversary is a major world government, nobody at SO is qualified to help you. And you have much bigger problems than which hash function to use.

[edit 2]

Of course I am assuming you are already using this function as part of a standard password hashing scheme, like PKCS#5 (also known as "PBKDF2"), and your question was purely about what hash function to use.

PBKDF2 recommends a minimum of 1000 iterations, but you can choose whatever you want. The attacker's brute-force effort goes up linearly with that number, so 10000 iterations vs. 1000 means it will take them 10 times longer to brute-force each password.
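
One practical way to pick that number is to benchmark on your own hardware and raise it until a single derivation costs as much time as you are willing to spend per login. A minimal sketch, where the 100 ms target, the SHA-256 choice, and the doubling search are all illustrative assumptions:

```python
import hashlib
import os
import time

def calibrate_iterations(target_seconds: float = 0.1, start: int = 1_000) -> int:
    """Double the PBKDF2 iteration count until one derivation takes about target_seconds here."""
    salt = os.urandom(16)
    iterations = start
    while True:
        t0 = time.perf_counter()
        hashlib.pbkdf2_hmac("sha256", b"benchmark-password", salt, iterations)
        if time.perf_counter() - t0 >= target_seconds:
            return iterations
        iterations *= 2

print(calibrate_iterations())  # typically in the hundreds of thousands on current CPUs
```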

Nemo
  • 1
    If I can calculate hashes really quickly, what stops me from going through all the possible password combinations that are 8 or fewer characters long? –  Jun 21 '11 at 16:01
  • 1
    Character distribution in passwords isn't random either: http://www.r-bloggers.com/character-occurrence-in-passwords/ –  Jun 21 '11 at 16:01
  • 1
    @Seppo: The salt. You would have to repeat the search for _each_ password you want to crack. Also note that I am assuming you are using a standard like [PKCS#5](http://tools.ietf.org/html/rfc2898), which adds a bunch of iterations to make brute-force attacks difficult even for one password. Basically, my advice is to stick to standard algorithms and modes of operation. –  Jun 21 '11 at 16:03
  • 3
    Brute-forcing a 160-bit secret, where each bit is a separate random variable uniformly distributed across `{0,1}`, takes 2^160 time. Brute-forcing user passwords that have been hashed, where many passwords are 6 letters long, use no capitalization, numerals, or special characters, and are found in the English dictionary, takes significantly less time. – yfeldblum Jun 21 '11 at 23:05
0

The purpose of a salt is to protect against attacks using rainbow tables. If the attacker knows the salt, he/she can calculate a new rainbow table based on the salt value.

So the short answer would be yes. The salt being stolen will allow the attacker to use a rainbow table attack against these credentials (although the attacker has to build a unique rainbow table first).

The long answer depends on how complex these passwords are. If the user uses a simple password (stackoverflow), then the salt being stolen has a bigger impact. If the user uses a complex password ($TACK0verflow), then the salt being stolen has a much smaller impact (as such passwords are unlikely to appear in any rainbow table).
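
For concreteness, a tiny sketch of why per-password salts blunt precomputation; single-round SHA-256 is used only to keep the example short, since as the other answers stress, a real scheme should use many iterations or bcrypt:

```python
import hashlib
import os

password = b"stackoverflow"

# Two users pick the same weak password, but each gets its own random salt,
# so the stored hashes differ and one precomputed table cannot cover both.
for _ in range(2):
    salt = os.urandom(16)
    digest = hashlib.sha256(salt + password).hexdigest()
    print(salt.hex()[:8], digest[:16])
```

A table built for one specific salt only helps against that single entry, which is essentially a brute-force search.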

Jim
    This is incorrect and reflects a misunderstanding of rainbow tables and salts. – President James K. Polk Jun 21 '11 at 14:23
  • 2
    Aren't you supposed to generate a new salt for each password? –  Jun 21 '11 at 14:50
  • @Seppo Erviälä: If you want to do it correctly, yes - but often a programmer's interpretation of a salt is a constant string appended to the password. Hopefully this "salt" is created randomly during installation. Needless to say, those variants would allow an attacker to generate a rainbow table for cracking the passwords. At least the attacker has to generate a new rainbow table. – Robert Jun 21 '11 at 15:12
  • 5
    @GregS: one of the great things about our community is that if you have additional or different information you can provide it simply by posting an answer... – NotMe Jun 21 '11 at 15:27
  • @GregS: It would be nice if you can point out where I went wrong. – Jim Jun 21 '11 at 18:09
  • 10
    The only reason to compute a rainbow table is to *reuse* the effort to compute the table. The cost of computing the table is slightly more than the cost of trying all the possible passwords that the table covers. A salt eliminates the benefit of the rainbow table because every password entry has a different salt. – President James K. Polk Jun 21 '11 at 23:37
  • 3
    Essentially, salts protect the file as a whole, but they do not protect one password. Without salt, you break the whole file in one pass over all possible passwords. With *varying* salts, you break one password in a pass over all possible passwords. Oh, and salts protect against someone noticing that another crypt'd string is the same as theirs on the off chance that they both used the same password. – dannysauer Jun 22 '11 at 23:04
  • 1
    I think Jim's just using the term "rainbow table attack" to mean "brute force search" in the case that unique per-entry salts are used. – Yuliy Jun 23 '11 at 07:12
  • 3
    @Yuliy, if that is how Jim is using the phrase "rainbow table attack", he is using it wrong. That phrase already has an accepted meaning, and it is not the same as "brute-force search". – D.W. Jun 27 '11 at 17:29
0

> If the attacker knows the salt, he/she can calculate a new rainbow table based on the salt value.

It depends on the algorithm being used. According to Coda Hale, MD5 and the SHA family of hashing algorithms are general purpose and meant for speed. In short, he goes on to conclude that these are not sufficient for preventing dictionary attacks and that bcrypt is the way to go, as there is more setup time involved in cracking passwords, drastically slowing down an attacker. Here's another interesting article on bcrypt.
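
For reference, a minimal sketch of what using bcrypt looks like, assuming the third-party `bcrypt` package for Python; the cost factor of 12 is an illustrative choice, not a value from this answer:

```python
import bcrypt  # assumed dependency: pip install bcrypt

password = b"correct horse battery staple"

# gensalt's cost factor sets 2^cost rounds, making each hash deliberately slow.
hashed = bcrypt.hashpw(password, bcrypt.gensalt(rounds=12))

# checkpw re-hashes the candidate using the salt and cost embedded in `hashed`.
print(bcrypt.checkpw(password, hashed))        # True
print(bcrypt.checkpw(b"wrong guess", hashed))  # False
```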

> What about SHA-256 or SHA-512? Are they safe currently? What about in a few years?

Be warned, there are two sides to this argument. There is the bcrypt side and the SHA-256-and-up side. If you believe the SHA family side, then I believe you'll be safe for more than a few years with SHA-256; some even argue that SHA-1 with a salt is still good enough, but if you want to be safe then I would stick to SHA-256. There is a forum post that also claims MD5 still has 5-10 years before it is proven "sufficiently unreliable." Unfortunately I can't post the links to it, because I already reached my maximum of two.

Personally, I'm using SHA-256 with a salt, but I'll definitely have to read more into bcrypt and re-evaluate. Hopefully this helps. This is the type of question you can easily spend countless hours researching, but it's interesting nonetheless.

  • 1
    Here is the [forum](http://www.forensicfocus.com/index.php?name=Forums&file=viewtopic&t=7326) that mentions MD5 still has 5-10 years. Also, the argument that SHA-1 is [still good enough](http://www.crimulus.com/2010/09/30/the-effectiveness-of-salting-a-sha1-hash-then-storing-it-as-a-sha1-hash/). –  Jun 21 '11 at 15:27
  • 1
    Yay, for loopholes! –  Jun 21 '11 at 15:28
  • SHA-256 and a salt are decent; but, sorry, this advice is just bad. MD5 is easy to crack in real time even up to 12 characters. There's really no reason you'd want to push a loophole you already know is going to be deprecated when there are perfectly modern solutions available without much added effort. – That Realtor Programmer Guy Jun 01 '15 at 02:57