40

What's wrong with this code?

$password = "hello";
$password = md5($password);
for ($i = 1; $i < 20; $i++) {
  $password = md5($password);
}

I don't think an attacker with access to the hashes storage would be able to decrypt any password using more than 2 characters.

The attacker would have to decrypt this list of hashes to be able to gain the plaintext password.

69a329523ce1ec88bf63061863d9cb14
0dcd649d4ef5f787e39ddf48d8e625a5
5d6aaee903365197ede6f325eb3716c5
cbe8d0c48ab0ed8d23eacb1621f6c5c3
8fa852c5f5b1d0d6b1cb0fad32596c71
91a84cf929b73800d2ff81da28834c64
45b7d5e4d3fca6a4868d46a941076b72
e5b7d9d10fef132829731255ef644319
b3af6ff5f5c7ae757ca816a6cb62f092
150f3682b2e58d1d0e1f789f9ba06982
3f76626950bf31dbc815c667ca4b2b43
44f4c75517671e12946aab3c8c293f98
442256b098b2d88e93428b08b5155308
7fd8ebc5bdff94f24a10decaa1ab64e8
d04bbc863839b720d932a697f0bf443b
de737c934db23c2d1d1026863e7583a8
d745f6394700c4ab1e9ded0924ea35d2
ce9428b51d3a63431c57a435423877b6
7017f40bdb4f1be1f5fae7dd0fc7b907

With brute force, an attacher should try 3632 (*19) combinations, which is pretty unachievable. Isn't that true?

apaderno
  • 111
  • 5
genesis
  • 718
  • 6
  • 15
  • See also http://stackoverflow.com/questions/420843/need-some-help-understanding-password-salt –  Jul 24 '11 at 14:35
  • 1
    Why is this bad? 1) You didn't design it before you wrote it. 2) There don't appear to be any error checks. 3) You don't provide a check on the strength of the password. 4) You havn't specified the compiler or interpreter options for secure operation. 5) You haven't provided any test vectors. 6) It hasn't been independently reviewed and tested. – this.josh Jul 24 '11 at 23:40
  • 3
    Not 36^32. 16^32. You are looking at hex-digest of the md5 hash, so there are thirty-two characters selected from an 16 character alphabet (0-9,a-f) not 36. – dr jimbob Jul 25 '11 at 14:24
  • @drjimbob: aaah, it'S earier than I thought! Could you add it as an answer? I would like to accept it – genesis Jul 25 '11 at 14:29
  • @genesis, not to nitpick, but I pointed that out in my answer about two hours before dr jimbob's comment. http://security.stackexchange.com/questions/5586/why-do-people-think-that-this-is-bad-way-to-hash-passwords/5627#5627 – user Jul 26 '11 at 08:02
  • because you need a minimum of 21 nested `md5()`s. it's the Golden Rule of Cryptography! (yes i am joking) – geoff Feb 24 '14 at 23:44

15 Answers15

76

The wrong things on your method are:

  • You use way too few iterations (20 is too low, it should be 20000 or more): password processing is still too fast, an attacker with a basic PC will still be able to "try" dozens of millions of passwords per second.
  • There is no salt: an attacker may attack several passwords with very low per-password cost, e.g. with precomputed tables of hashed passwords (in particular rainbow tables).
  • You are in the process of inventing your own cryptography. There is nothing wrong with being inquisitive and trying to understand things, but since there is no sure test for knowing whether a given algorithm is secure or not, inventing your own cryptography is often a recipe for disaster. Don't do it.

What you should do is to use bcrypt; there is a PHP implementation in the Portable PHP password hashing framework.

Thomas Pornin
  • 322,884
  • 58
  • 787
  • 955
  • 2
    more iterationgs don't bring higher security. The oftener you hash, the higher is the probability of hash collisions. I'd say using md5 twice would be better than using md5 once, but if you use it 2, 20 or 20,000 times doesn't really matter. Using it twice might be better, as standard cracking websites like http://www.md5cracker.org/ will not work any longer. – Martin Thoma Jul 24 '11 at 16:39
  • 19
    @moose: collisions are not an issue here -- we work on preimage resistance, not collision resistance, and that's good because MD5 is utterly broken with regards to collisions anyway. Reduction of the output space _might_ be viewed as an issue, but even after 2^64 iterations, the space should still have size about 2^64; the weakness here is not MD5, but the password (i.e. the human brain which can memorize a password). The multiple iterations are there to make _dictionary_ attacks more expensive. – Thomas Pornin Jul 24 '11 at 17:16
  • 2
    Actually, the iteration count in today's software is much higher than 20000 - sometimes even in millions. But solid advice never the less. – Nakedible Jul 24 '11 at 18:00
  • 4
    @moose, that's not correct. Collisions are not a worry here; the likelihood of experiencing a collision in this context is negligible (like, less than the chances of being struck by lightning). For the parameters experienced here (iterating thousands, millions, or billions of times), more iterations *do* bring higher security, and the probability of collisions are negligibly small. – D.W. Aug 18 '11 at 08:11
  • @ThomasPornin Bcrypt is the best option, that's understood, but out of curiosity (situation I have no control over) is 50 iterations of SHA-256 with salting effective? Why or why not? Does that fight rainbow tables, and does the 50 iterations increase computational cost? – temporary_user_name Aug 28 '14 at 16:33
  • I'm actually going to contest the "Bcrypt is the best option" claim being thrown about (by @ThomasPornin and others) here. Bcrypt is better than PBKDF2 (which is just iterated salting and hashing) because it requires an amount of memory that isn't *completely* trivial, but it's still easy to parallelize on modern hardware. RAM is cheap these days. Scrypt or the even-newer Argon2 (winner of a three-year competition for best password hashing algorithm) are better because they are memory-hard as well as compute-hard, and the memory-hardness is tunable (Bcrypt's memory cost is low, and fixed). – CBHacking Apr 22 '18 at 02:48
35

Others have described the limitations of this hashing method; I'd like to point out a conceptual mistake in the question:

I don't think that attacker with my DB would be able to decrypt any password with lenght > 2

Attacker would have to decrypt this list of md5 hashes to be able to gain plain-password:

[list of intermediate results]

The mistake here is thinking that the complexity of the intermediate results provides any protection at all against a brute-force dictionary attack. I think the asker is thinking the attack must work backward, starting from the stored hash, and brute-force each intermediate result in turn.

This isn't true at all; reasonable dictionary attacks will start with possible passwords, and attack the whole 20-hash stack at once. Here's the algorithm sketch:

for each candidate password:
    hash 20 times
    compare with stored hash

Using this to check all possible 3-character passwords (assuming printable ASCII) would require only 20 * 95^3 = 17147500 hashes, which is basically trivial. Using SHA-512 instead of MD5, despite having much larger intermediate values, would be more secure only because each hash takes a little longer to compute.

tl;dr a complex hash function can't save you if the password itself doesn't have enough entropy.

Gordon Davisson
  • 2,601
  • 1
  • 17
  • 13
  • 2
    and how does the cracker know he must run the hash 20 times? – Enrique Feb 19 '12 at 23:46
  • 14
    @Enrique: the cracker might not know that, but you shouldn't count on it. This is [Kerckhoffs's principle/Shannon's maxim](http://en.wikipedia.org/wiki/Kerckhoffs%27_principle): the enemy knows the system. – Gordon Davisson Feb 20 '12 at 03:54
  • OK, but then it is a good idea to hash X times just because if the cracker doesn't know X it will be more difficult to him to crack the hash. Am I wrong? – Enrique Feb 20 '12 at 16:08
  • 1
    @Enrique: No, that just forces the cracker to compare after each hash, which doesn't slow him down much. You should hash many times (thousands, not just 20) so the cracker has much more work to check each possible password (see Thomas and oliland's answers). – Gordon Davisson Feb 21 '12 at 02:02
  • 4
    @Enrique, To be clear, MD5 is the wrong hash for the job. Using bcrypt, and adjusting the repetition parameter until the operation takes `1ms` **minimum** is the way to go. (perhaps even `100ms` won't be that big a deal if this just happens once per login) Using bcrypt will prevent the necessity of installing your own repetition logic, and another feature which is very important (unless there are a limited number of users), which is that each password gets its own salt, requiring one brute force job per user, instead of once for all users together. – 700 Software Mar 08 '12 at 18:56
17

20x MD5 is a fast hashing algorithm, meaning it can generate passwords at an astonishing rate.

Please stop using fast hashing algorithms to store passwords. Even with individual salts; if someone has direct (read: offline) access to your database they can be computed very easily.

This article explains why much better than I can:

http://codahale.com/how-to-safely-store-a-password/

The article heavily mentions BCrypt (with a link to a PHP library), but bear in mind there are other slow hashing algorithms that may suit you.

oliland
  • 279
  • 1
  • 3
  • even 19 * 36 ^ 32 combinations + unknown lenght, characters, salt in last md5 hash is bad? – genesis Jul 24 '11 at 15:51
  • 2
    using a reversible symmetric-key algo to encrypt your password is almost as bad as storing data in plaintext. As if you've compromised the server, there is a good chance that that you've compromised the code/key to decipher data. – dvhh Jul 25 '11 at 02:22
  • 1
    @dvhh - a MD5 cannot be reversed. – Ramhound Mar 08 '12 at 18:15
  • @Ramhound old topic, but with today's computers it is possible to compute hash fast enough for a dictionary or brute force attack, there is even precomputed tables to test against non salted passwords. So yes it cannot be reversed, but if you are motivated enough, you can find out the input that match the md5 fast enough. – dvhh Mar 09 '12 at 05:51
  • 1
    @dvhh from the article: "bcrypt is an adaptive password hashing algorithm which uses the Blowfish keying schedule, not a symmetric encryption algorithm." – oliland Aug 07 '13 at 05:22
11

The problem is that this is a pretty obvious "algorithm", and pretty fast to boot.
It's very likely that there's a precomputed rainbow table available for this "algorithm", and even if there isn't, md5 is fast enough to be able to precompute one in a realistic amount of time.

You should always use an individual salt for each password to avoid this kind of attack.

deceze
  • 715
  • 3
  • 12
  • But what would attacker need in order to get my password(s) fast? And how long would it take with rainbow table? – genesis Jul 24 '11 at 14:35
  • 7
    @gen If you have a rainbow table for the "20xMD5" algorithm, looking up a password is instantaneous. – deceze Jul 24 '11 at 14:38
  • @genesis How long it will take to crack one password? Check this: http://md5hashcracker.appspot.com/ – Chris Ghenea Jul 24 '11 at 14:38
  • @deceze: is there such a table? – genesis Jul 24 '11 at 14:39
  • @gen I can't give you a definite "yes", since I don't know, but I think it's very likely. – deceze Jul 24 '11 at 14:40
  • @Chris: Those “crackers” are just using a lookup table of known (*input*,Hash(*input*)) pairs. – Gumbo Jul 24 '11 at 14:40
  • @deceze: A rainbow table of 30+ chars. Don't think it exists. At least not a complete one – PeeHaa Jul 24 '11 at 14:41
  • @Gumbo. Yes, is true. But the table is growing and growing and growing... – Chris Ghenea Jul 24 '11 at 14:43
  • 1. The fact that the algorithm is *"obvious"* is not a security problem. 2. Using a salt is a good idea, but also more iterations are needed (and PBKDF2, bcrypt, or scrypt would be better). – D.W. Aug 18 '11 at 08:12
  • @D.W. That "obvious" actually went together with the next sentence, that it's not unlikely that a rainbow table already exists for such an algorithm. – deceze Aug 18 '11 at 08:14
  • 1
    Even if it weren't "obvious", and even if no rainbow table already existed, it would still be a bad idea. If your password hash algorithm is crummy, the fact that no one else is using it doesn't make it OK to use the crummy algorithm. – D.W. Aug 18 '11 at 08:22
  • @D.W. And I'm in total agreement with that, in case that wasn't clear. – deceze Aug 18 '11 at 08:26
10

There are four problems with just iterating md5 over and over, no matter how many times you do it.

Computing Power over Time

The first big problem here is that as written it doesn't scale over time to remain secure as computers get faster. What is secure today will be cracked in moments on tomorrow's computers.

Modern secure algorithms like bcrypt and scrypt have built in tuning so that the algorithm can be automatically adjusted to be slower as attacking computers get faster. Since bcrypt is also free and it's still just a simple function call for you, there's no good reason not to use it.

Now, you do have the start of a scaling structure built in to your code. It would be easy to refactor that to run the md5 hash an arbitrary number of times, such that you can tune it to be slower over time. But that's not good enough.

Designed for Failure

The second problem is that md5 is a fundamentally poor choice for a cryptographic hash because it was specifically designed to be fast. The purpose of md5 is for quickly verifying or comparing large files. To accomplish this, the hash needs to be able to be computed quickly and efficiently. This means the implementation and design goals of the algorithm are completely at odds with storing passwords. The chances that at some point we will figure out a way to compute an md5 hash that is orders of magnitude faster than what we can do currently is orders of magnitude higher than we will be able to do the same for sha1 or bcrypt.

Degeneration

The third problem is that hashing algorithms in general tend to degenerate as you iterate them. To understand this, take the original text supplied by the user. The conceptual size of this text is unbounded. There are an infinite number of possible values here. After we hash the text once with md5, we're down to 2128 number of possible values... still very large, but no longer unbounded. But let's cycle this again. md5 is good, but it's not perfect. Those 340 undecillion potential inputs will have some collisions, and yield a number of results that is close, but still somewhat less than, 2128. As you continue to iterate you'll find more collisions, until you eventually end up with a number that, while still large, is significantly less than the conceptual space you thought you were working with.

Cycles

Finally, the fourth problem is some of your original potential inputs will results in cycles: value number 12345 hashes to 98743, which hashes to 67321, which hashes back to 12345, and so on. In other words, some inputs will cycle through the same small set of hash values, and iterating them further won't help. In fact, the more times you run the hash, the more likely a given original input will end up in a cycle.

This goes back to the design of md5. A cryptographic hash could be designed to minimize (not entirely eliminate, but at least minimize) the degeneration and cycle phenomenons, but it wasn't a concern at all for md5.

Conclusion

Any one of these reasons alone is enough to not use md5. There are other perfectly good options available, and they generally use the same interface, so choosing a different one isn't hard. In some platforms, it's as easy as changing an enum value you pass to some "createhash" function. Put all three reasons together, and continuing to use md5 is absolutely insane.

Joel Coehoorn
  • 2,136
  • 1
  • 13
  • 14
  • Citation for the claims that MD5 wasn't designed with any concern about degeneration or cycles? MD5 absolutely was designed as a cryptographic hash function. It's considered broken (collisions are easy, though its preimage resistance has held up reasonably well) but claiming that `A cryptographic hash could be designed to minimize the degeneration and cycle phenomenons[sic], but it wasn't a concern at all for md5` needs some backup. – CBHacking Apr 22 '18 at 02:57
10

Aside from what has already been pointed out in the other answers so far, it seems to me like you have a fundamental misunderstanding sitting there right in your question. The output space of a 128-bit hash function such as MD5 is not 36^32 (about 6.3e49), but 2^128 (about 3.4e38). That's 11 orders of magnitude!

Cryptography is hard. If you don't know precisely what you are doing (and in many cases even if you do), you are much, much better off not trying to design something yourself but rather using a ready-made, tried-and-true solution. For a real life example of how terribly wrong things can go when you don't know exactly what you are doing, look up the Debian OpenSSL key debacle. The early version Netscape PRNG is another example. I'm sure there are plenty of others, too, more or less widely publicized.

user
  • 7,700
  • 2
  • 30
  • 54
8

You have a simple one-way hashing function for passwords, seems secure right? You'd have to run through a lot of guesses to brute force such a system. However, consider a bad case (not even a worst case) scenario. You use this level of security with a very high importance site, or even a site with a lot of users.

Then, one day there's a small security snafu or your site has some previously unknown vulnerability that gets exploited and all of your user account data including hashed passwords is now in the hands of "bad guys". The bad guys go to their low-cost PC with a decent GPU in it and modify a previously existing program to generate hashes so that it does 19-levels of md5 hashing. Then they feed it a well-honed dictionary of common and likely passwords as well as random alphanumeric strings of increasing length. Over time the GPU cranks through generating hashes creating a look up table. At any given time the bad guys can check their generated hash lookup table to the list of hashed passwords and, because you didn't use a per-password salt, easily find matches. Over time as the GPU continues to work away more and more passwords are revealed, until only the highest strength passwords remain.

Wedge
  • 323
  • 1
  • 3
6

MD5 is not the best hash to use nowadays for security nowadays; hashes can be computed too quickly (though the biggest flaw with md5 is the ease in generating collisions). Its only 128 bits (16^32 = 2^128 ~ 10^38); try sha-256 (2^256 ~ 10^77; its 2^128 times more keys) or sha-512. Key strengthening is a good practice (e.g., it takes 20 times longer to generate a rainbow table); but still 20 times longer isn't that long (e.g., use 20 machines and it takes just as long), but is done best with a random salt. Would be much better to key strengthen say 100000 times.

$password = "hello";
$salt = random_str(); // generate some relatively short random str
$password = sha256($password);
for($i=1;$i<100000;$i++){
    $password = sha256($salt + $password);
}
$sep = "|";
$password_scheme = "SHA256x100k";
$password = $salt + $sep + $password_scheme + $sep + $password;

Using a pseudo-code language where + concatenates strings and random_str() is a function that generates a short random string. The purpose of the random salt is that if an attacker sees your source code, sees your password-hashes, they need to do a generate a separate rainbow table for each different salt (or one for each password). So now, instead of just having to generate one rainbow table to get all the users passwords, they have to generate a rainbow table to get just one password. Also its a good idea to document the password scheme in the hash, so you may update it as necessary e.g., migrate from SHA256x100k to SHA256x1k if you need to be less CPU intensive or decide to move to a different hash later.

Granted its not the best idea to come up with your own custom crypto-method if you aren't a crypto expert. You definitely leave the opportunity for subtle security holes like timing attacks even with seemingly secure algorithms. bcrypt is probably your best bet.

Note: MD5 is particularly vulnerable to collision attacks, but you don't really have to worry about collisions in preimage attacks (which is the method of attack against password hashes). An attacker got a list of password hashes (h) somehow, and learned the hash routine (md5 x 20 or salted sha256 x 100k), and is trying to get any message m, such that hash_routine(m) = h, to let them into your system.

What collision vulnerability makes you worry about is that if you have a hash(m1) = hash(m2) when m1 != m2; so if you download something where people have checked that file m1 is safe and want to make sure that you actually downloaded m1 you compare hash(m1) to the public md5 hash. If there's a malicious version m2 with the same md5 hash then you cannot be sure m1 is safe by checking its md5 sum.

dr jimbob
  • 38,936
  • 8
  • 92
  • 162
  • 1
    Chances are that the input entropy (in the password) is less than the raw output space of the hash function, so switching hash functions *just* because of the output hash size isn't going to make a dramatic difference in security. – user Jul 26 '11 at 08:07
  • @Michael, I agree. The primary benefit of SHA-256/512 is that its slower. To get to 128-bits of entropy you'd need a complicated password like 22 random characters from a 62 character keyboard; e.g., pvLfXpJ4qbgSEBohcmZyvz; an 8 char password would only have ~47 bits of security (and you'd need 43 chars for 256 bits). Even though its probably precautionary, for some secure applications people could want to use crazy passwords and with the cost of disk space nowadays there's no reason to save the bytes. (E.g., crypt in /etc/shadow tends to use sha-256/512 nowadays). – dr jimbob Jul 26 '11 at 13:49
5

Rainbow tables function because multiple systems use the similar schemes for handling data. While it is well-regarded that adding a salt should be a universal feature of password hashing, adding a number of rounds also assists in defeating rainbow tables. To wit: a rainbow table for any subset of characters in MD5 cannot result in a match for any password in your system except by accidental collision. The reduction function of the rainbow table will convert each hash generated to a string that matches the letter, number, and symbol pattern that the table is designed for. The moment a hash is generated by your method that incudes a symbol outside that range (a virtual certainty as the input to your hash function is the out put of a hash), it will prevent the table from working.

That fact that your system is simple or even publicly known doesn't matter so much as that defeating a rainbow table only requires that your hashes cannot have been created in great number by the method used to generate that table.

Wikipedia has considered the issue of hash strength over time as their user database is well-known, very old, and used various hashing methods that are now insecure. The solution discussed was key stretching across the several methods. To calculate a password there, the hash method version is looked up, and the password calculated according to that. Upon login with an old version hash, it would be updated to the newer version.

Thomas mentioned that you should use more rounds for your hashing. What hasn't been talked about is how you determine how many rounds to use. The answer to this is fuzzy, but it basically boils down to, "How many iterations can I perform for the login load on my system?" If you have 10,000 users logging in every minute, you might be willing to devote a 25% cpu load to that. For that, pick a number of iterations that allows you to do roughly 700 calculations per second.

Jeff Ferland
  • 38,170
  • 9
  • 94
  • 172
  • 1
    The mental exercise to determine whether a rainbow table for 25 hash iterations will cover a system that has 20 hash iterations involves some mental work I'm not willing to do on Sunday morning, but if anybody wants to chime in on the idea, that'd be appreciated. – Jeff Ferland Jul 24 '11 at 15:56
  • 1
    A rainbow table can only pay off if you're to attack a certain password hashing algorithm more than once. Time spent for the reduction function and time lost to generating overlap make sequential guessing more efficient and more likely to succeed (a percentage of hashes never reached by design). – Jeff Ferland Aug 18 '11 at 11:38
  • 1
    Rainbow tables provide benefit when I want to crack passwords on one machine, then go crack passwords on another machine later and use that written-to-disk for time trade-off. If I'm only going to use it once (or even 2-3 times because of the inefficiency of generating chains), then it's not worth it to build the whole table first. The benefit for multiple instances of the same hash only appears when they are attacked at different points in time. What I mentioned in my last comment was looking up a simple lookup that would be used in regular brute forcing. – Jeff Ferland Aug 18 '11 at 23:01
  • 1
    Jeff, OK, I misunderstood earlier what you were trying to say. Sorry. Maybe what you're trying to say is that if no one else has built a rainbow table for this hash method, then I can't re-use someone else's work to precompute the rainbow table. OK, got it. I still think it is misleading to lead off with that arguable advantage when the scheme has so many more-serious security problems. And (separate issue) I'm not certain it is impossible to re-use existing rainbow tables (as you suggest), or that it's a good idea to deploy a security mechanism on the assumption only one system will use it. – D.W. Aug 18 '11 at 23:11
  • Right. I've got to go back and reword my answer in the next few days because it obviously caused confusion (right doesn't matter if a bunch of people in the know don't understand what I said). Good discussion, in a way. – Jeff Ferland Aug 19 '11 at 04:46
4

No, you're wrong. The problem with md5 hashes is that there is a relatively big chance on collisions: there are many strings that output the same hash. And since there are just 36^32 possibilities, which can all be tried in about 35 hours I believe (and it will get a result way sooner because there is a big chance for collisions), it is not considered a good hash anymore. Not to mention that there probably exists a rainbow table for 20 md5 hashes. Also, there is people that say md5 is actually reversible, but I'm not sure about that.

There are two ways to make your passwords harder to hack:

  1. Use a static salt. This basically makes rainbow tables ineffective, because the hacker can no longer use just English words, since your salt (which the hacker hopefully doesn't know) is making up a large part of the string. Try long salts consisting of a lot of special characters;
  2. Use a better, longer hashing algorithm like whirlpool or sha512. This obviously greatly increases the amount of possibilities;
  3. Hash your string multiple (x) times. This is one of the things you've done right: if the hacker knows your salt and the amount of times you hash (so basically has access to your source code and database), this makes sure it takes him x times longer to get results;
  4. Create a string-dependant salt. This is a salt randomly generated for every string you save in the database and stored along with it. This makes sure the hacker has to loop every possibility of the hash for every password, instead of being able to do it once for every passwords you have stored. If you have 500 passwords stored in the database, this way it takes the hacker 500 times longer to hack all passwords.

Hope this helped. :)

Frog
  • 172
  • 1
  • 4
  • Sorry, I didn't think about 20x hashed md5, so it isn't 36 ^ 32 but 19 * 36 ^ 32 + unknown lenght, characters, salt in last md5 hash. Am I right? – genesis Jul 24 '11 at 15:49
  • Do you mean that you think the amount of possibilities increases when you hash more? No that's not true: there are 36^32 possibilities, because that's just the amount of different strings you can get in the 32 characters length. So hashing 20 times doesn't protect against collisions, but it does make sure it takes the hacker way longer to try out all possibilities. Therefore it's the more better. – Frog Jul 24 '11 at 18:22
  • 2
    Ugh. A md5 hash (in hexdigest form) has thirty-two hexadecimal characters (16 choices - 0-9 and a-f). So 16^32 = 2^128 ~ 3.4 x 10^38 unique md5 hashes, not 36^32 (I assume you thought 10 digits, 26 chars). If you have a setup with a million processors that can each generate hashes for billion inputs per second, it would still take you ~10^16 years to create hashes for all 2^128 md5-style initial hashes. It would be much better to just get a list of likely passwords (such a list is much shorter (say 10^12 passwords) and a cluster could generate the necessary rainbow table in seconds). – dr jimbob Jul 25 '11 at 14:21
  • 3
    @Frog, your answer contains a number of errors. Your claim about collisions is not correct: in this application, the chance of a collision is miniscule. Also, there are other errors in your answer (e.g., 36^32 is not correct; using a sha512 or whirlpool doesn't make much of a difference, and the length of the output of the hash function doesn't matter; using a static salt is not a good idea, instead you should use a random salt chosen independently for each password to be hashed). – D.W. Aug 18 '11 at 08:16
  • @D.W.: Even a small improvement makes a difference, so it's always better to use a sha512 or whirlpool instead of md5. I do think using a static salt is a good idea, because that way the hacker needs access to both the source code and the database, while otherwise access to just the database is good enough. – Frog Aug 18 '11 at 11:05
  • 2
    @Frog, no, that is incorrect. There is no meaningful security difference due to a 128-bit hash output vs a 512-bit hash output. It's not even a small improvement; the improvement is statistically indistinguishable from zero. The claim *"This obviously greatly increases the amount of possibilities"* is simply wrong. As for the salt, if you have a per-user salt, there is no added value of a static salt. If you don't have a per-user salt, you are in trouble, and a static salt won't get you out of trouble. So, either way, the static salt is useless. – D.W. Aug 18 '11 at 21:46
3

Generally, there is nothing wrong with hashing a value multiple times to make it more robust against brute force attacks. In fact, this is an already applied technique known as key stretching.

The only objections to your example is that you should use a cryptographically strong hashing algorithm as well as hashing scheme that includes a salt for more entropy and resistance to rainbow table attacks. At best an already proven password storage scheme like crypt that was specifically designed for passwords.

Gumbo
  • 2,003
  • 1
  • 13
  • 18
2

I don't think that attacker ... would be able to

You are assuming you know how attackers work.

You are assuming you know all the tricks attackers use, and that you have successfully defended against each and every one of them.

You are assuming there's nothing in the published research, knowledge that is available to anyone who cares to learn, that could help even a moderately skilled attacker break your security.

You are assuming that the attacker is unwilling even to perform a brute-force attack with stolen compute. (Have you considered the scenario where an attacker is patient and spins up a background job on someone else's server, a server that the attacker cracked into and isn't paying for, for the purpose of cracking your password file over a month-long period?)

This usually is a very bad assumption.

yfeldblum
  • 2,817
  • 21
  • 13
0

I understand this question has already been here for a little while and it has some excellent answers. How ever I think there is one problem that haven't been addressed. Using a strong algorithm (from a library) is the way to go, since those methods have been tested. But here is what i see:

I don't think that attacker with my DB would be able to decrypt any password with lenght > 2

The problem is not creating the most impossible algorithm to de-crypt. The problem is securing your database. For example if the attacker has a hold of your database, i don't think he/she will care much about the passwords in it.

Ibu
  • 101
  • 1
  • 2
    Actually, passwords are often reused between multiple websites. So a password for one website is not only useful for this website itself, but also for other ones. – Paŭlo Ebermann Oct 09 '11 at 12:30
0

Related links:

makerofthings7
  • 50,488
  • 54
  • 253
  • 542
-5

Trouble with your method is that with each call on md5 you are enlarging the collision space of your passwords (no math proof about it, only naive understanding). As cool as this may look, don't try to make your security more complicated than it need.Chaining hashing algo is as sound as using multiple pseudo random to try generate a secure random. You could get simple security that is secure enough if you apply common sense practice.

Anyway usual advice occurs, use salt use an hashing algo that have a large enough collision space. If you want more security you should consider them at other levels, like server security, Managing staff security.

Anyway if you want to apply other advices of using bcrypt ( which is as bad as it requires to get the key to decipher the entire password list [O(n) once you get the password] , against looking up to a rainbow table against each entry [O(n*m)] ), I would at least advise you to use a public key cipher as the key to decrypt your data would not be needed in your input verification (use the key to encrypt the same way as you hash your data).

dvhh
  • 91
  • 1
  • 7
    bcrypt is not a symmetric encryption mechanism. It is a one-way password storage system that happens to use a symmetric block cipher in the implementation. To check a password using bcrypt you do not decipher anything. Chaining hashes is just fine and it doesn't enlarge the collision space. – Cameron Skinner Jul 25 '11 at 06:48