15

Assuming that the password is stored hashed and salted, and that it is a string of random characters, is there a point where adding to password length doesn't add security?

Since the hash will have a constant length, there are a finite number of possible hashes. So I would assume that at some very long length, the probability that a password's hash collides with the hash of a shorter password starts to climb, until we get to a length where all passwords have collisions with shorter passwords.

Eric G
  • 9,701
  • 4
  • 31
  • 59
Peter
  • 253
  • 2
  • 4

4 Answers4

17

The strength of a password is a measure of what it could have been. That's how passwords work: they are something which a given user remembers, but is unknown to outsiders. How much unknown ? That's the tricky point.

If you consider a password consisting of 7 random letters (uppercase and lowercase) and digits, where each sign is chosen randomly, uniformly, and independently of each other, then there are 627 = 3521614606208 possible passwords and they all have the same probability of being chosen; so, for attackers, the best possible strategy is to try them all (no order is better than any other) until a match is found. On average, attackers will try half the space of possible password (hence 1760807303104, in this case) before hitting the right one.

So the strength of a password does not come from what it is, and in particular does not come from its length. The password length has no direct relation to password security. What makes a password strong is its randomness; and that's, strictly speaking, a property of the password generation method, not of the password itself. The indirect relation of length with security is that you need some room to exercise randomness: while a very long password is not necessarily secure, a very short password is necessarily insecure.

Moreover, if you create your password by accumulating randomly chosen elements (random letters, random words...) then randomness is increased by accumulating more of them, as long as they are chosen independently of each other (you don't get more randomness if you repeat four times the same word). In that sense, you may have the impression that length makes strength; but that's an illusion. Randomness makes strength -- and just happens to need space.

How much randomness do you need ? Enough to deter attackers. How many passwords an attacker can try per second depends a lot of the context. If the password is the PIN code for a smart card, then the attacker can try 3 codes, and no more (after 3 wrong PIN, the card locks itself permanently). If the attacker must talk to a Web server for each password try, then he will be able to try as many passwords per second as the server is willing to accept, based on the server CPU power and configuration. If the attacker knows a hash of the password, then he can try as many as he can based on his budget (for a simple hash function, this can range in the billions per second).

For each context, there is a limit to the number of passwords the attacker may try. If his chances of hitting the right password within his budget are too low, the attacker will deem it "not worth the effort", and then you win. You do not need to get more randomness than this point.

Since hash functions have a finite length, there is also another limit beyond which it is useless to increase password randomness, but it is much farther. It is not about collisions, but about preimages. Let's put figures under it. A realistic attacker, with a lot of budget, and faced with a simple hash function, may try up to about 270 potential passwords. We are talking about a budget of some millions of dollars here, and also a rather patient attacker (he will accept to spend a few weeks on that password). There are good reasons why this figure won't raise indefinitely with technological advances. Let's consider that the hash function has an output of n bits (e.g. n = 128 for MD5). The attacker's goal is to find a password which, when hashed, yields a given hash value; he does not need the password which was initially used, just a matching password. He has two attacking strategies:

  1. He can try "potential passwords" that the human user may have come up with.
  2. He can try random passwords until a match is found.

Strategy 1 will have cost P/2 where P is (more or less) the number of potential passwords that the user may have chosen. If P is beyond 271 then that's too expensive for the attacker.

Strategy 2 will have cost 2n-1. If n is greater than 71 then that's too expensive for the attacker.

Even the oldest, most basic hash functions like MD5 have an output size which is more than enough to make strategy 2 unworkable (MD2, MD4, MD5 have output size 128 bits; SHA-1 and RIPEMD-160 offer 160 bits; SHA-256 has 256 bits). In other words, the fixed hash output size could limit security only if it was below 70 bits or so.

For more on the subject of password randomness (often called "entropy"), see this answer. For more on correct ways to do password hashing (in particular, to do slow hashing, with salts to prevent parallel attacks), see that answer.

Tom Leek
  • 170,038
  • 29
  • 342
  • 480
4

To expand a tiny bit on Tom's excellent answer, the hard limit for useful length, assuming random characters, is the size of the resulting hash (eg: 128bits is about 21 random characters, assuming you use upper, lower, numbers, and basic symbols, and just English characters).

Real passwords, though, are much more bounded by construction mnemonics, and the ability of people to remember them. As Tom said, what you really want is 128 bits of random entropy (assuming 128bit hash), which can take far more than 21 characters to generate in a password with that much entropy which someone can also actually remember. In practice, few user-memorized passwords will have this much entropy.

Nick
  • 199
  • 4
-1

Absolutely, I've given a thought to that too. Specially after reading this paper:

How to Break MD5 and Other Hash Functions (PDF)

It's a study made by security researchers Wang and Yu, who developed a technique to force collissions between md5 hashes (And according to them HAVAL, MD4, RIPEMD and SHA0)... And it is a widely studied topic when it comes to colliding certificate hashes

Although, I believe, hash collissions are an issue primarly for certificates and files rather than passwords, since they are so short (comparing it to a file) their collissions are far less likely.

If we are talking strictly about passwords, this is the reason why authentication is slowly migrating to two factor auth. Passwords and their human keepers are simply unreliable for proper authentication. Not because of their weakness and hash-collission risks, but because of a much simpler factor: the humans behind them.

Hendrik Brummermann
  • 27,158
  • 6
  • 80
  • 121
Ryoku
  • 101
  • 3
    Collisions on hash functions have nothing whatsoever to do with password security. When passwords are hashed, they rely on _preimage resistance_, which is something completely different. – Tom Leek Mar 26 '13 at 00:13
-4

short answer: no, there is no point at which there is no benefit. as long as your password is totally random, the more characters you have, the harder it will be for the password to be cracked using brute force. that being said, there are many ways to get access to a password without brute force, where it really doesnt matter what your password is. as far as recent studies have said though, passwords that are 8 characters long are totally unsecure, and might as well be 1 character long. passwords that are more than 16 characters but non random are also unsecure. if you want a totally secure password for brute force, a random set of 32 characters, numbers, and special characters will pretty much be unbruteforceable without a new generation of technology, as that would require somewhere in the vicinity of 93^32 possible passwords, which is basically ridiculous

Xicor
  • 1
  • 6
    "... passwords that are 8 characters long are totally unsecure, and might as well be 1 character long." **You are so wrong**. Please read Tom Leek's answer. A 8 character long string of lower/upper case (as explained up there) would contain 67^8 possibilities, which depending on the hashing algorithm would take a very long time to bruteforce. On the other hand, a 1 character long of lower/upper case would contain 67^1. A *slight* difference. – Simon Sep 06 '13 at 17:38
  • 1
    Yea you obviously haven't dealt with users. First of all users can't recall random sequences of characters, if they can't recall a password they resort to writing it down, which is a lot worse than a password which has some form of logic to it. Second of all, using 8 characters containing, upper, lower letters, numbers and signs are still considered secure enough for non-administrative accounts. I always advice them to make a sentence they can remember from which they derive the pw. Second of all successfully brute-forcing the password will also entirely depend on the used hashing algorithm. – Lucas Kauffman Sep 06 '13 at 17:41
  • 1
    Note that writing down a password _can_ be a good idea, if it makes it possible for the user to use a good random password (randomness is _everything_ in password security). BUT the piece of paper must then be adequately protected; for most situation, "adequate protection" means "in the user's wallet, next to his cash" (people take care of their cash). – Tom Leek Sep 06 '13 at 17:44
  • @TomLeek and not on a postit on their screen ^^ – Lucas Kauffman Sep 06 '13 at 17:45
  • @LucasKauffman Or below the keyboard; that's a classic. In the classic movie [WarGames](http://www.imdb.com/title/tt0086567/), there is a password which is very cunningly concealed in... one of the desk drawers. – Tom Leek Sep 06 '13 at 17:49
  • @Xicor also if, for instance, PBKDF2 is used as hashing algorithm and your password is longer than 64 bytes, it will be truncated to 20 bytes (if using sha1) because otherwise the key would be larger than the blocksize. – Lucas Kauffman Sep 06 '13 at 17:52
  • 1
    I have written down every single one of my passwords. I use a password manager which stores the passwords in a heavily encrypted container (*legitimately* decrypting that file on commodity hardware takes a little over a second, and that's *knowing* the passphrase; under the hood, it's AES-256), and use long enough and random enough passwords that the issue can never reasonably be either hash output size or password entropy. This isn't perfect, but it definitely beats having to memorize hundreds of different passwords, including for sites and services I use once a year or even less. @TomLeek – user Sep 06 '13 at 23:06