7

Let's assume an organisation that generates ~100 Passwords a week; these Passwords are 32-Characters long and randomly generated. The consultants that need passwords generated use a Password-generator that

  • generates 20 Passwords upon request
  • the consultant then chooses one out of 20 suggestions

One requirement from our "security advisor" is to prevent generated-password-reuse, e.g. guarantee the uniqueness of each password. I'm asked to find a solution, and while I know that I would have to use bcrypt to store the hashes of each generated password I personally think this is a bad idea. An attacker that gets a hold on the ever-generated-pw-db (although just the hopefully secure hashes) has a nice dictionary.

Is there a solution on how to do this (store all generated passwords) securely?

is there one KILLER-argument (like rainbow-tables for simple hashing) to say that this solution is bad from a security POV?

castorio
  • 95
  • 5

3 Answers3

14

Let's do mathematics. I like mathematics.

You generate passwords in a space of size n = 6232 (32 characters chosen randomly, uniformly and independently of each other, in an alphabet of size 62, i.e. all uppercase letters, lowercase letters, and digits). Suppose that you have generated k such passwords; k is easily estimated since you will produce about 2000 potential passwords per week (100 "true passwords" but each of them is chosen among 20 fresh random passwords).

It is easily seen that k will remain much smaller than the square root of n (indeed, k will be in the vicinity of the square root of n only after about 457 thousands of billions of billions of years). We can then approximate the probability that you ever obtain a collision (i.e. the same candidate password shows up twice) to be about k2/2n. Assuming that you keep your system going for a hundred years (which is already preposterous), i.e. about 5218 weeks, this yields the probability of close to 2.4·10-44.

Now compare that with the probability of your security advisor being mauled by a gorilla escaped from a zoo; that probability is at least 10-18 per day (I have made this comparison before, and then again because I love this example; and the actual probability is higher than that, but let's be conservative). Therefore, on average, before hitting a collision, your security advisor will be visited by at least 4.7·1025 murderous gorillas, i.e. 47 millions of billions of billions of gorillas (since that far exceeds the number of living gorillas, one has to assume that some of them will come several times). If the advisor still worries about the collision and not the gorillas, then you have demonstrated that this advisor is no smarter than a baboon.

To sum up, it is easy to ensure that no password is reused: the Universe already makes it so. By doing nothing special you are already ensuring this absence of collision, with a success rate which is billions (of billions) of times better than your success at protecting your security advisor (and his children) from gorilla attacks (and you are probably already doing a good job at that, too).


Now let's suppose that your security advisor is unmoved by mathematics, because he believes in pretty pictures more than in pretty figures (in that respect, he would be no different from the average chimpanzee, so at least there is some kind of ironic consistency here). Setting up a system to filter out "duplicates" would be a waste of money, and, crucially, a way which can be easily demonstrated (e.g. to the boss of that "security advisor") to be a waste of money (be sure to make him aware of that fact); however, he may still insist on it, because he suggested it first and reneging on his own past word would endanger his perceived alpha-male status (again, chimpanzees come to mind). This unfortunately happens a lot with humans. Then your job, after having duly warned your hierarchy of the complete inanity of it (leave written traces ! They will protect you when somebody with more than three brain cells does an audit), is to make sure that, at least, this duplicate-detection system does not weaken the system. The duplicate detection is wasteful; let it not be harmful.

The space of possible passwords is very large, a bit more than 2190. All passwords are equiprobable, so you have about 190 bits of entropy, which is huge. For such passwords you DON'T NEED good password hashing because all of the features of password hashing functions (salts, iterations...) are ways to cope with the inherent weakness of "normal" passwords, i.e. the fact that they have low entropy, and thus can be brute forced (this is called a dictionary attack). However, your passwords have way too much entropy for brute force to be even remotely applicable to them (even if you somehow hire the gorillas to try passwords, you will run out of gorillas much before you run out of possible passwords). Therefore, simple hashing would work: store the hash values in a database, which will allow you to do a fast "already used" check.

However, there are systems out there who "use passwords" by actually hashing them and using the hash value as "the passwords". Windows in particular: when it uses a password, it actually works over the MD4 hash value (128 bits) computed over the password. Therefore, if you store hashed passwords and happen to use the same hash function as some other password-consuming system, then you are actually storing "cleartext" passwords, and that's bad.

To avoid this problem, use "your own" hash function, which in practice means HMAC. The hashing process would look like this (in C#/.NET):

using System.Security.Cryptography;
using System.Text;

// ...
static byte[] hmacKey = Encoding.UTF8.GetBytes("my security advisor is a chimpanzee");

static byte[] HashPassword(string password)
{
    HMAC h = HMAC.Create();
    h.Key = hmacKey;
    return h.ComputeHash(Encoding.UTF8.GetBytes(password));
}

It is quite improbable that any other system out there uses the same password hashing function (i.e. HMAC with the same key), so storing these "hash" values in a database would be safe, and allow duplicate detection.


Now the killer argument: why, oh why, would you want to detect duplicates ? A system which detect duplicates is also, inherently, a system which can tell what passwords are (or have been) in use. Such a system can only harm security: it helps attacker know whether a given password is "worth trying" on a system. Fortunately, the sheer size of the space of possible passwords in your case voids that harm. However, in all generality, insisting on detecting duplicates is also insisting on weakening the protection offered by the passwords. This is bad, bordering on malicious.

It would be relatively easy to suggest, with some careful wording, that the security advisor is trying to plant a backdoor (of course, not a very good backdoor, because he is not good at it, but intent is what matters in criminal cases). For instance, consider the following setup: you design a system which helps users generate 4-digit PIN codes for their bank smart card. The smart card will block itself after three wrong PIN codes, so an attacker who steals the card has only probability 1/3333 to guess it. However, if there is a system which generates random PIN codes and avoids duplicates then the attacker could hijack that system (e.g. dump its database with some SQL injection attack) and then let it generate 9999 PIN codes, which will all, by construction, be different from the PIN code that the smart card user obtained. This reveals the user's PIN: it is the only one, among the 10000 possible PIN codes, that the anti-duplicate system will not generate.

That's the core of it: when preventing duplicates for passwords has any detectable effect (not your case here, because duplicates won't happen in your lifetime or even in the Sun's lifetime, for that matter), then that effect is negative. This is a BAD IDEA. It smells of foul play.

Thomas Pornin
  • 322,884
  • 58
  • 787
  • 955
2

What exactly is the reason your "security advisor" provided for his requirement? If he cannot justify the requirement, ask him to shove off (In a polite fashion of course).

Think about it, you are using a CSPRNG. With 62^32 possible combinations, it is very unlikely that you will generate a repeat password. You are wasting precious disk space on something with zero benefit.

If you must implement this requirement, just store them like you would any normal password. bcrypt is a strong hash. An attacker will gain nothing if he manages to compromise a database full of 32 character long randomly generated passwords. It is very possible he might waste a lot of computing power trying to crack it (This might be considered a benefit if you like wasting the attacker's time for no good reason).

  • the reason is to guarantee the uniqueness of each password, nut just by possibility (62^32) but by storing and checking each generated password against this database. we had a security-audit (with even more silly recommendations ... ) and our bosses believe those powerpoint-presentations more than our experience. we even showed a calculation on how many thousends of years are needed to generate a password twice, even if we generate 100 password/second, 24/7, sic – castorio Sep 06 '13 at 09:41
  • @castorio So ask yourself *why* is it important to guarantee uniqueness? But sure, if it's unavoidable just do it. Like I mentioned, there won't be a security drawback. –  Sep 06 '13 at 09:55
  • @castorio Still, maybe you can convince your bosses by telling them how much additional resources this requirement will need (hard disk space, processing power to hash and compare every time a random password is generated...). –  Sep 06 '13 at 09:56
  • terry, i'm on your site here, thanx for your suggestions. btw, resources are not the problem. – castorio Sep 06 '13 at 11:54
-2

The best choice in this setting is to store all previous hashes of passwords. Also, please do not use bcrypt. It is good and safe, but new systems (which care about security margins) should employ different techniques such as scrypt. Which actually looks like will be replaced soon as well.

More info (a very nice roundup): http://www.unlimitednovelty.com/2012/03/dont-use-bcrypt.html

lkk
  • 77
  • 1
  • 1
  • 3
    That link is pretty misleading. `pbkdf2` is generally considered to be not as good as `bcrypt`. `scrypt`, while theoretically better is still not as time tested as `bcrypt` and does not have good implementations available in quite a few languages. Cryptography is one of those fields that does not adhere to Barney's One Rule "New is always better". For more info on password hashing algorithms, see http://security.stackexchange.com/a/31846/10211 –  Sep 06 '13 at 11:43
  • 1
    i'd also go for the mature bcrypt. i read a lot of good threads and canonical questions on *.SE, and when it co mes to security i'd always prefer mature and accepted solutions, no bleeding edge – castorio Sep 06 '13 at 12:02
  • Mature but is it studied? Researched? Can you please point me to some references behind bcrypt's safety? I know it's very good solution and hence I wrote that, but scrypt offers some additional nice properties. What do you mean by "tested"? Just because md5 is being used on more machines that sha512 is does not mean it is more tested -- well, this is a corner case because md5 is actually broken, not the case of bcrypt. Please look also on http://1.bp.blogspot.com/-orQpwmyYHTM/T2gG3HZwEPI/AAAAAAAAAf4/9qro2cgObzU/s1600/AoZCPIECQAI7gvp.png – lkk Sep 06 '13 at 18:24
  • Also,I do not adhere to Barney's rule. Please do not suggest I am doing so. This is unfair. Instead, you could point me to actually constructive criticism of scrypt. Just because something is new does not mean it's bad (e.g. SHA-3, sponge construction) either. – lkk Sep 06 '13 at 18:29