25

Say we have a hash of a password. The password can be considered to be made of of totally random characters and has a fixed length of N. The hash is SHA1(password+salt), where the salt is of length M.

How large must M+N be in order to resist a rainbow-table based bruteforce attack?

  • Scenario #1: Consider the attacker to have access to state-of-the art computational resources and storage space, e.g a government.

  • Scenario #2: Consider the attacker to have more limited resources, ($10K if we want to be more specific) to spend on equipment or cloud-based services.

Related question: How much is the complexity reduced if the total length (M+N) are known to the attacker?

AviD
  • 72,708
  • 22
  • 137
  • 218
mhswende
  • 856
  • 1
  • 7
  • 9
  • Just to point something about *salt* : if the salt is totally hidden and secured, to anyone outside it'll be part of the password. Since a good salt is also random bytes, no one will be able to tell "there are N bytes of password, M bytes of salt". The complexity is only reduced when the attacker have the salt, not just it's lenght. – woliveirajr Mar 23 '12 at 11:52
  • I agree. The distinction in my question between password and salt is rather pointless - but I left it there in order to prevent comments like "hey, you should use a salt also". However, the complexity should be reduced if the attacker knows the exact length of the total password (M+N). I have updated the question to be more precise about this. – mhswende Mar 23 '12 at 11:59
  • Is there a reason why you care about such a strange choice for the hashing scheme? Are you talking about a legacy system? Also once you use a decently sized salt (say 64bit+) rainbow tables offer no advantage over direct brute-force. – CodesInChaos Mar 23 '12 at 12:07
  • 1
    Strange choice? It is one that I've seen in use pretty often, that's the only reason. And what does the size of 64bit+ come from? What's behind that figure? – mhswende Mar 23 '12 at 13:12
  • Strange choice, because passwords are typically hashed with PBKDF2, bcypt or scrypt. Schemes like `SHA1(password+salt)` are only used by people who don't know what they're doing. The exact size of the salt doesn't matter much, but 64 bits are certainly enough to make rainbowtables an inferior choice over normal brute-force. The table only amortizes itself if you use it to attack more than 2^64 hashes, which you won't. – CodesInChaos Mar 23 '12 at 17:46
  • I've done an estimation of the password length that can be broken when using PBKDF2 with 1000 iterations and a sufficiently long salt. http://security.stackexchange.com/questions/12114/aes-encryption-choice-of-password/12121#12121 It's about brute-force, not rainbowtables, since those aren't relevant in practice. – CodesInChaos Mar 23 '12 at 17:50
  • 3
    @CodeInChaos I'd wager that in actual fact, typically passwords are hashed with MD5, less so SHA1, even less so with salts, and even less so using PBKDF2. While they *should* be using the scheme you mentioned, chances are, they aren't. ;) – Steve Mar 23 '12 at 18:01
  • 4
    @SteveS Agree. I'd even wager that plain text password storage is more common than any hashing more complex than MD5. – mhswende Mar 24 '12 at 20:44

5 Answers5

34

Let's say they randomly chose alphanumeric (A-Za-z0-9 no symbols) for both salt and password; e.g., the sample space is (62)^M possible salts and (62)^N passwords.

Say they have a million GPUs in a farm at their disposal that can each generate a billion hashes a second (assuming a simple MD5 or SHA type hashes - bcrypt or PBKDF based hashes are much slower). So for a given salt, they can crack a 8 character password in 0.2 seconds (200 000 GPU-seconds), a 10 character password in 14 minutes (26 GPU-years), a 12-character password in 37 days (100 000 GPU-years), a 16-character in password in 1.5 million years (1.5 trillion GPU years).

Or another way to think about it; a typical GPU to crank out a billion hashes a second uses ~200 W. So if electricity costs you $0.10 per kWHr and neglecting start up costs, a GPU-hour costs $0.02. So an 8-character password costs ~$1, 10-character $4600, 12-character $18 million, 16 character - $260 trillion (the world's money supply is on the order of $10 trillion). (And this neglects the cost of buying/maintaining a million GPUs -- just electricity).

As for a rainbow table -- at some point it has to be constructed. So you want a complete rainbow table for all 4 character prefixes; and want to cover all passwords up to 8 characters (e.g., total of 12 char password) it will take 100 000 GPU years (37 days of a million GPUs) and cost about $18 million in electric bill.

Basically when M+N > ~12 for random alphanumerics it starts to become unfeasible (e.g., is unfeasible at M+N = 16).

EDIT: New summary table, where I list password length and type (e.g., 8 A-Za-z0-9 means 8 character uppercase + lowercase + numeric password).
Also included is a google application-specific password (16 random lowercase letters) though obviously google type password typically would have to be attacked online (not like an offline hash attacked by a GPU farm).

  PW Length  | # of PW | PW Entropy |       GPU-time | Electricity Cost at $0.10/kW-hr
-------------------------------------------------------------------------------------------
 8 A-Za-z0-9 | 2x10^14 | 47.6 bits  |       60 hours |                  $1
10 A-Za-z0-9 | 8x10^17 | 59.5 bits  |       26 years |              $4 600
12 A-Za-z0-9 | 3x10^21 | 71.4 bits  |   100 000 years|         $18 000 000 ($18 million)
14 A-Za-z0-9 | 1x10^25 | 83.4 bits  | 390 million yrs|     $69 000 000 000 ($69 billion)
16 A-Za-z0-9 | 5x10^28 | 95.2 bits  |1500 billion yrs|$260 000 000 000 000 ($260 trillion)
-------------------------------------------------------------------------------------------
16 a-z       | 4x10^22 | 75.2 bits  | 1.4 million yrs|        $242 000 000 ($242 million)
dr jimbob
  • 38,936
  • 8
  • 92
  • 162
  • 2
    Beautiful answer. Although if you're considering governments, you didnt take into account deficit spending ;-) – AviD Mar 25 '12 at 20:12
  • @AviD - There is a limit that even governments can get away with. I don't count those governments that don't hold free elections, those are the exception to the rules, in those cases even those rulers are limited to how much money they actually have. – Ramhound Mar 30 '12 at 12:35
  • 3
    @Ramhound very offtopic, but I think a certain western government is proving that there are no real limits. Quick - how many zeros in a trillion? ;-) – AviD Mar 30 '12 at 13:11
  • 1
    The unfeasibility arises when the cost/effort of the brute force exceeds the value of potentially brute forcing. Hard to imagine any password worth trillions of dollars. Much simpler to imagine a nefarious gov't just coercing passwords out of people ( [relevant xkcd](http://xkcd.com/538/) ) or paying tens of millions of dollars to introduce subtle back doors (say by introducing to a [compiled open-source compiler](http://www.ece.cmu.edu/~ganger/712.fall02/papers/p761-thompson.pdf) )/poor randomization/poor hashing/keyloggers/services that log password attempts expecting reuse), etc. – dr jimbob Mar 30 '12 at 15:52
  • (But then again, I don't have enough faith in the competency of gov't workers to be competent enough to have the foresight to pull this sort of attack off). – dr jimbob Mar 30 '12 at 15:54
  • @everyone can anyone please explain (a noob) like me how does calculations in 2 paragraph by dr jimbob works. e.g ...they can crack a 8 character password in 0.2 seconds (200 000 GPU-seconds), a 10 character password in 14 minutes (26 GPU-years), a 12-character password in 37 days (100 000 GPU-years), a 16-character in password in 1.5 million years (1.5 trillion GPU years). I'm really confused about this. – user1167026 Apr 10 '12 at 11:36
  • @user1167026 - I assumed they were given the salt and know that the password is just alphanumerics (A-Z, a-z, 0-9)=26+26+10=62 symbols. Hence there are N=62^8 (N~2x10^14) 8-char passwords. If you have one GPU that calculates at rate of r=1 billion hashes/sec=10^9 hashes/sec then it will take a GPU N/r = 2x10^14/(10^9) ~ 2x10^5 seconds = 200 000 GPU seconds. If you have a server farm with one million GPUs (10^6 GPUs), it will take 200 000/10^6 = 0.2 seconds for your server farm, assuming you've programmed the farm to do work in parallel. – dr jimbob Apr 10 '12 at 14:28
  • @user1167026. If the 62^8 part confuses you see http://en.wikipedia.org/wiki/Exponentiation#Combinatorial_interpretation. Basically when you select an m-letter word from a n-letter alphabet (with repetition) there are n^m possibilities. E.g., if you had 3 symbols (a,b,c) and a 2 letter password; you'd have to try 3^2=9 passwords (aa,ab,ac,ba,bb,bc,ca,cb,cc). If you add a letter to the password, then each password from the previous set (e.g., aa) has three possibilities for the new letter with has three choices for the last letter (aaa,aab,aac), so there are now 9*3=3^3=27 passwords. – dr jimbob Apr 10 '12 at 14:46
  • 2
    @AviD "_a certain western government is proving that there are no real limits._" Unless someone in the government has found new sources of energy (and does not want to share), there are physical limits. – curiousguy Sep 01 '12 at 09:15
  • Strange enough, the possibility of a botnet made of infected machines is not said here... as it is a common method for DDoS I see no reason someone would not do a 'neutral' infection to use the computing power to create rainbow tables/brute force passwords which would obviously drop the energy cost. (there would be a central place to store the results obviously, but that doesn't seems a real obstacle for this use). Am I fantasying here ? – Tensibai Aug 25 '16 at 15:29
  • 3
    @Tensibai Yes a botnet can defray the costs. But users will notice if their fancy GPU is running 100% and using 200 W of power (e.g., loud fans, electricity bill shoots up). If you have a million computers in the botnet (all assumed to have high end GPUs), it would take 390 years of current GPU time to break a 14 char pw, with each user getting an electricity bill of about $70,000. There are probably more effective ways to use the botnet (e.g., advertising click fraud, DDoS, spam, etc.) that won't be as obvious to the owner (who will remove it from botnet once they detect and nuke from orbit). – dr jimbob Aug 25 '16 at 15:57
  • I noticed that you talk about constructing a rainbow table, but don't take into account that the cost of the construction is about 1.7N hash computations with N potential passwords. So the costs would be even bigger. – SaPropper Jan 13 '22 at 17:19
9

I apologize ahead of time because this answer won't be a direct answer to your question, but I wanted to post it so that people know that the question you're asking is really only an academic one. The real answer to this question is: don't use sha1, use bcrypt. The sha1's and md5's of the world were designed to be fast and fast is not a good quality for hashing passwords. Bcrypt is designed to allow configuration of the amount of time it takes to hash a password, so you can slow down the hashing process just enough that users won't notice the difference, but it will take orders of magnitude more time for an attacker to generate a rainbow table or brute force a password. For more information on bcrypt and why you should use it, see this question.

In addition to all of this, bcrypt also includes a salting scheme. Because of the reasons mentioned above, you can be pretty certain that the 20 byte salts mentioned by woliveirajr will be viable well into the future.

TwentyMiles
  • 869
  • 5
  • 8
4

There are rainbow tables, and there is brute force... these are two distinct things.

A rainbow table is a special case of a big precomputed table of hashed passwords. As such, a rainbow table can "invert" a hash, i.e. recover a matching password, only if, at some point during the table construction, that password was considered and hashed. Correspondingly, if a password can be found with a rainbow table, then it could have been found with a simple dictionary attack which would have cost no more than the table building. The "rainbow" does not change that; in fact, the rainbow thing actually makes the table more expensive to build, by a factor about 1.7 (that's because the table construction tends to consider and hash several times the same passwords, and that's rather unavoidable).

A consequence is that no rainbow table is worth the effort unless it can be applied at least twice. We use salts precisely to prevent that from happening. The salt can be viewed as making a variant of the hash function, each new salt implying a new variant. A precomputed table is worth anything only if it was precomputed with the same variant (the same salt) than the hash value which is to be attacked. If no salt value is used more than once, then the intelligent attacker will not waste his time building Kleenex rainbow tables; he will just run a dictionary attack.


We consider that the attacker knows the salt. Why ? Because the server knows it, and the attack model is that the attacker could obtain a dump of the server database. Whatever the server knows, the attacker knows too. Thus, when attacking a hashed password, the salt length or contents do not matter (the attacker must include the salt value in his computations, but no salt length will make his task easier or harder).

Thus, we only have the password as line of defence. If we want to know how much the attack costs, then this becomes economics, and, as such, some complexity arises. In particular, we want to know if we are talking about an attacker who is after one very valuable password (say, the password which protects the main computer of the alien force which is about to obliterate Earth), or an attacker who makes a living out of breaking many passwords. In the latter case, the hardware costs become negligible with regards to power consumption.

If we take @jimbob's estimates, hardware which computes 109 hashes per second uses 200W worth of power, and power comes at $0.1 per kWh (note that power cost includes cooling: every Watt spent on computation also becomes heat, which must be somehow dissipated). This gives us 1.8*1014 hash values per dollar. From that, we obtain the following:

  • With $10K, an attacker can try 1.8*1018 hash values, which is more or less the number of possible passwords of 10 alphanumeric characters (uppercase, lowercase and digits).

  • With 683.7 billions of dollars (that's the total US military budget in 2010), an attacker could try about 1.23*1026 hash values, corresponding to about 14.5 alphanumeric characters. Let me add that this figure corresponds to the yearly output of about one hundred nuclear power plants, so that cracking effort would hardly be inconspicuous.

Conclusion: with 15 random alphanumeric characters, your passwords will resist even implausible enemies, even if you totally botched the hashing by using a single invocation of SHA-1, instead of using bcrypt or PBKDF2 with a high iteration count, as you should do. Note that this is valid only for random characters, not at all for the kind of characters you may come up with in the privacy of your brain. Human brains are not good at all at randomness.

Thomas Pornin
  • 322,884
  • 58
  • 787
  • 955
3

Rainbow tables represent a trade-off between CPU time and storage, so in theory the answer to this question is unknowable. It depends on which side of the trade-off your attacker favours: if they've got lots of CPU time available then upon encountering a new salt value the attacker can start computing a new table, so it's effectively worthless (this extreme is equivalent to the attacker being able to brute force a key without using precomputed tables). If they've got lots of storage then they can precompute tables for any value of the salt, but it'll take them a long time so the bigger the better.

Of course, things that don't work in theory often work very well in practice. The real world injects itself at this point and tells us that for many attackers, both storage and CPU time are limited. The longer the salt, the more attackers do not have the resources to mount a successful attack. Accepting that relationship, then there's an inequality that provides an upper bound on your choice of salt size:

The total resources on your side required to work with the salted hashes must be less than the resources required to satisfy your application's other requirements.

  • What I'm interested in, basically, is what the upper bound on M+N is that can be bruteforced **in practice, today**, by a dedicated attacker (as per the scenarios). Perhaps the focus that I put on rainbow table was wrong. – mhswende Mar 23 '12 at 13:29
  • @mhswende - In practice today is a totally different question. Many users will choose low entropy/reuse passwords that are vulnerable to dictionary attacks. E.g., if a malicious attacker runs a large web app and has been collecting passwords for a few years (and have say a 10 million common passwords in your dictionary), then if you have the hash and it was something simple like md5/sha, and the password is something someone else has used in your large dictionary; it will take you under a second to crack with a modern GPU. – dr jimbob Mar 24 '12 at 04:19
  • Yes, naturally, that's why I specifically pointed out that the passwords were considered random in the context of the question. – mhswende Mar 24 '12 at 20:30
0

This is my understanding, I apologise if this is incorrect in anyway.

The use of a salt with passwords makes rainbow tables almost useless for attacks in most instances. With password length P and a salt length of S you would need space P * S with naïve brute force look ups (adding a random salt you have to create S tables for each password). Rainbow tables reduce the space required to store the lookups for the password but they don't eliminate the multiplication by S (my Uni maths isn't good enough to give you the correct Big O notation; that can be an exercise for the reader :P).

The calculation can then be thought of as the length of your accepted character set (c) raised to the power of password length (p) multiplied by the number of possible salts (s).

(c^p) * s

Suffice it to say, this can grow to be a very large number very quickly.

Scenario 1

Could a government build a system suitably large to break a password + salt scheme using rainbow tables? This depends on the size of passwords, character set available and salt size. Above a certain size you start hitting physical limits (can you actually encode all the permutations on a device using the matter/energy at hand?). Governments often have greater resources than most institutions but even they can't exhaust these physical limits.

The only way to get around this would be to try and use intuition or prior knowledge to reduce the search space. For a commonly used system, peoples' passwords will be below a certain length so only bother checking those. They also will probably only use a subset of the character set available. You can use these to concentrate what hashes are used to populate your rainbow table (the p and c). But you will always hit against the fact that you multiply by the random salt (s).

You do have a random salt don't you!? This is where people start getting worried about the entropy of random numbers. It is pretty hard to exploit when non-random data is used where real random data is needed but this is one of those situations where it comes into play. Consider the previous problem where you've managed to reduce the (p^c) by making intuitions about the nature of the password. If we know something about how 'random' the salts are (or can influence them in some way) we can start reducing the size of s. We may know (or can influence) the s to be between a certain set of values and therefore drastically reduce the permutations on the rainbow tables. If an attacker can get this below reasonable physical limits then a government (that was so inclined) could use rainbow tables to crack the passwords.

(I hope this has answered your related question as well)

Scenario 2

At current 2012 levels I would say that, unless you were incredibly lucky, the likelihood of you being able to crack password + salt hashes (ignoring technical weaknesses in your implementation) as incredibly slim. This assumes you use reasonable lengths for p, c and s.

See http://chargen.matasano.com/chargen/2007/9/7/enough-with-the-rainbow-tables-what-you-need-to-know-about-s.html and http://www.codinghorror.com/blog/2007/09/rainbow-hash-cracking.html for some interesting reading on this matter (that I almost certainly have used and abused incorrectly).

webtoe
  • 453
  • 2
  • 3