12

I always thought that salts are simply used to prevent rainbow tables from being used. Others have suggested they should be unique on a per-account basis. Currently I am using a value from a config file as the salt. In the past I did md5(salt + password), but now I use .NET PBKDF2 via (pass, salt_from_config).

Have I been doing this wrong, or less securely than I could? Should salts be unique, or do they just need to be unpredictable so that no one can generate a rainbow table ahead of time?

curiousguy

3 Answers

13

I posted an answer that explains this on another question, which should give you a good background on all the major security concerns around password storage.

To answer your question more directly - a salt:

  • must be unique.
  • should be unpredictable.
  • should be unknown to potential attackers.

The problem with schemes like H(pass + username) is that the attacker knows the salt. So, whilst his ability to crack every password in the database with a single rainbow table is gone, he can create a rainbow table for key user accounts. This allows him to compute the rainbow table for a privileged user account (e.g. admin) ahead-of-time, then immediately use it once he breaches the database.

This is a problem, because it gives you no time to react to the breach. Seconds after you get alerted about the attack, the attacker has logged into your admin account. If the attacker is forced to crack the password after the breach, it gives you time to lock down privileged accounts and change passwords.

You could also look into a salt + pepper scheme, where the pepper is a second salt stored outside the database, e.g. in code. This helps protect against the SQL injection model of attack, where the bad guy only has access to the database.

Best practice is something like this (feel free to omit the pepper):

hash = KDF(pass + pepper, salt, workFactor)

Where:

  • KDF is a strong, slow key-derivation function, such as bcrypt or PBKDF2.
  • pass is a strong password (enforce password policies!)
  • pepper is a long random constant value stored in your code.
  • salt is a long random unique value stored with the password in the database.
  • workFactor is appropriate for your hardware and security requirements.
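
The formula above can be sketched in Python. This is only a minimal illustration, assuming PBKDF2-HMAC-SHA256 from the standard library as the KDF and plain concatenation for `pass + pepper`. The names `PEPPER`, `WORK_FACTOR`, `hash_password` and `verify_password` are made up for the example, and the iteration count is a placeholder rather than a recommendation.

```python
import hashlib
import hmac
import secrets
from typing import Optional, Tuple

# Assumption: in a real deployment the pepper is a constant loaded from code
# or configuration kept outside the database. It is generated here only so
# that the sketch runs on its own.
PEPPER = secrets.token_bytes(32)
WORK_FACTOR = 100_000  # illustrative iteration count; tune for your hardware


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, hash) where hash = KDF(pass + pepper, salt, workFactor)."""
    if salt is None:
        salt = secrets.token_bytes(16)  # fresh random salt for each password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8") + PEPPER, salt, WORK_FACTOR
    )
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)
```

The salt and hash would be stored together in the user's database row, while the pepper never touches the database.
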
Polynomial
  • +1. I'm curious: what if I mix the salt and pepper instead of pepper + pass? I assume it's the same idea. What if the salt is hardcoded within the app and the pepper mixed with the salt is the row ID? This solves unique and unknown (hardcoded in the app or selected in a config file). However, I don't understand the unpredictable part. As long as the attacker can't guess the salt, is it considered unpredictable? `hash = KDF(pass, salt ^ userid, workFactor)` –  Jul 22 '12 at 21:26
  • Well, the `userID` is predictable if it's an auto-increment. Can you guarantee to *never* leak a user's ID? The reason for mixing pass + pepper is that the salt for bcrypt is automatically generated, so it's easier to implement. You should not hard-code the salt, since it is an integral part of the security of bcrypt. The same goes for PBKDF2, where the salt is mixed into the HMAC chain (the password acts as the HMAC key). It *must* be unique. The pepper is only there for obfuscation purposes, as an extra defense against the SQL injection model of attackers. – Polynomial Jul 23 '12 at 06:00
  • OK, there is one more thing I am not clear on. Why can't I leak the pepper if the salt is hidden? How is it an extra defense against SQL injection when the salt isn't in SQL in the first place? Finally, why should I not hard-code the salt? (Actually I'll have it in a config file, but I'm sure you considered that the same problem.) –  Jul 23 '12 at 08:24
  • The way I see it, you can't see the salt if you only have SQL access. The pepper being the user ID will force the attacker to generate more hashes, because salt + pepper is different per user. And because the salt is not known or predictable until the attacker gets access to the salt info on the server, he can't start the attack ahead of time. –  Jul 23 '12 at 08:28
  • Making the salt hidden just turns it into a pepper. The point of a salt is to make each hash unique. The point of a pepper is to make it impossible to crack passwords with SQL-only access. You're trying to protect yourself against **all** attackers. All you're doing, in your scheme, is making the salt the pepper and vice versa, then using a predictable salt. It provides no benefit at all. In fact, it's less secure. Just generate a random unpredictable salt, and have a constant long random pepper. – Polynomial Jul 23 '12 at 09:14
  • Too much to discuss in comments, so I posted stuff to a pastebin: http://pastebin.com/raw.php?i=Hein4cy0 – Polynomial Jul 23 '12 at 09:30
  • I see; I didn't know lots of software generates salts. I'm not using any that does. I did notice that the replies' salt + pepper was different from my understanding. What I still don't understand is: if the pepper is unknown and can't be accessed via SQL, then why do I care if the salt is predictable? The first point said IF the pepper is not secret. Well, it is secret (because you'd need to access the app config file to get that data), so does that mean I CAN use a predictable salt? An attacker still needs to generate a table per user and can't start until they know the pepper. –  Jul 23 '12 at 21:45
  • The only extra protection I see is that you're pretty much storing extra data in the DB to use with the hash. Why would one bother? If the attacker has access to the config file, he'll have access to the DB (via the user/pass in the config), so he'll have access to that extra data. (If this is via SSH, then DB reads won't show up in server logs. If the config data is accessed via an exploit on the web, then the web logs show it like an SQL attack.) If I used a signed ipaddr+date as a salt (both should be unknown), would that satisfy your conditions? But I'm still not convinced the salt needs to be random while the pepper is secret. –  Jul 23 '12 at 21:51
  • That's not true. An attacker may have limited read-only access to files, e.g. through a web server bug. If the SQL server is properly protected (e.g. via firewall) from the internet side, there's no way he can use the database credentials. The idea is to protect against both models of attack, enforcing that they have to have access to the database *and* the code. – Polynomial Jul 24 '12 at 05:53
  • Security is about defense in depth. You want to choose approaches that provide *overlapping* layers of protection. It does not cost you anything extra to generate a random salt on user password creation, so *why would you do anything else*? Any time you do a good-enough half-measure when it would have been just as easy to do it absolutely correctly, you've failed to provide overlapping security measures. – Stephen Touset Oct 29 '12 at 16:18
  • Also, I slightly prefer `KDF(HMAC(key, password), salt, work_factor)` (where `key` is the "pepper", a cryptographically random series of bytes of appropriate length for the HMAC hash, and the `KDF` is bcrypt). – Stephen Touset Oct 29 '12 at 16:21
  • @StephenTouset The `pass + pepper` section is simply a combination operation. HMAC is a valid combination, as is concatenation. It depends on what you feel more secure with. – Polynomial Oct 29 '12 at 16:30
8

Salts should be unique for every password.

If you are using the same salt for every password, an attacker could simply generate a rainbow table using that particular salt and crack most of the passwords in your database.
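
A small sketch of why a shared salt helps the attacker, assuming PBKDF2-HMAC-SHA256 as the hash and a tiny made-up user table. The user names, passwords and iteration count are illustrative only; the count is kept low so the demo runs quickly.

```python
import hashlib
import secrets

ITERATIONS = 1_000  # deliberately low for the demo; not a recommendation


def kdf(password: str, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


# Hypothetical user table: alice and carol happen to share a password.
passwords = {"alice": "letmein", "bob": "hunter2", "carol": "letmein"}

# Case 1: one salt for the whole database.
shared_salt = secrets.token_bytes(16)
shared_db = {user: kdf(pw, shared_salt) for user, pw in passwords.items()}

# A single KDF call per guess tests that guess against every account at once.
guess_hash = kdf("letmein", shared_salt)
cracked_shared = sorted(u for u, h in shared_db.items() if h == guess_hash)

# Case 2: a fresh random salt per password.
unique_db = {}
for user, pw in passwords.items():
    salt = secrets.token_bytes(16)
    unique_db[user] = (salt, kdf(pw, salt))

# The same guess now costs one KDF call per account.
kdf_calls = 0
cracked_unique = []
for user, (salt, stored) in sorted(unique_db.items()):
    kdf_calls += 1
    if kdf("letmein", salt) == stored:
        cracked_unique.append(user)
```

With the shared salt, identical passwords also produce identical stored hashes, so anyone reading the table can see which accounts share a password. With per-password salts, both effects disappear.
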

With regard to the comments about PBKDF2 and GPU acceleration, I would like to point you to this link right here at Security.SE, where Thomas Pornin gave an excellent answer.

  • If I am using PBKDF2, wouldn't that take years? –  Jul 21 '12 at 02:54
  • No. Why would it? Such a task can be done in parallel using GPU clusters. Having the same salt means that only ONE rainbow table needs to be generated, and most of the passwords in your database would be cracked. –  Jul 21 '12 at 02:59
  • Hmm, +1. I thought the PBKDF2 algorithm couldn't be run (efficiently?) on GPUs. It appears it can. –  Jul 21 '12 at 03:16
  • PBKDF2 usually uses SHA256, which can be done easily on GPUs. Bcrypt and especially Scrypt are much harder to accelerate that way, which is exactly why they are recommended. – Matrix Jul 21 '12 at 06:26
  • PBKDF2 is still pretty good. GPUs accelerate it massively, but your security margin is still greater than your performance hit. Even if the GPU can do SHA256 ten thousand times faster than your CPU, you only have to compute one hash, but the attacker has to compute *billions*. I agree that bcrypt is better, though. – Polynomial Jul 21 '12 at 08:11
  • With regards to all the comments about PBKDF2, I edited my answer. –  Jul 22 '12 at 09:16
  • The issue with non-unique salts is that if any of your users share the same password (hint: they do), cracking one is equivalent to cracking them all. You're making the attacker's job easier. – Stephen Touset Oct 29 '12 at 18:08
  • @StephenTouset: Suppose one has a database with 50,000 users and their passwords, and there are only 50 different salts (uniformly distributed). If one discovers the password of one of them (e.g. user #42) by some means, the shared salts will allow one to quickly determine, for the 1,000 users whose salt matches user #42's, whether their password matches his, but what would that buy the attacker? An attacker who knows user #42's password can readily check whether the password of any of those users matches it, regardless of whether the salts match. The biggest reason for salt... – supercat Sep 13 '14 at 16:26
  • @StephenTouset: ...is that the existence of multiple users with matching salts *but different passwords* will provide an attacker with more chances to get a "hit" on each password he tries. If an attacker wants to see if anyone has "Valjean" as a password, but has no idea who would have it, the attacker will have to rerun the password algorithm for each different salt that is used. On the other hand, if an attacker finds an account Jean24601 on one service that has "Valjean" as a password, the attacker can try that password on every Jean24601 account he can find, regardless of salting. – supercat Sep 13 '14 at 16:34
1

Exact requirements for a salt depend on the password hashing algorithm, but for the usual methods (bcrypt, PBKDF2...) the only requirement is that the salt is unique. Or at least as unique as is practical; the odd collision is not a big issue as long as it does not happen often and cannot be forced from the outside.

Uniqueness is worldwide; it is not sufficient for salts to be unique in a given server. Two distinct servers, using the same hashing algorithm, should have distinct salts too.

A relatively common and cheap way to get worldwide uniqueness is to generate salts from a cryptographically strong PRNG, with sufficient length (16 random bytes are sufficient). That's what bcrypt does. If the PRNG is biased, then you need a somewhat longer salt to achieve uniqueness. If the PRNG is weak because of too small a seed or internal state, then uniqueness will not be satisfactorily obtained that way.

The user name is not a good salt, for two reasons:

  • The same user name can occur in several servers (e.g. each server might have an "Administrator" account, under that exact name);
  • The user does not change his name when he changes his password, leading to salt reuse.
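
By contrast, drawing the salt from a cryptographically strong PRNG could look like the sketch below. It assumes Python's `secrets` module as the source of randomness. The helper names are invented for the example, and the collision estimate is the usual birthday-bound approximation, which is only valid while the probability is small.

```python
import secrets

SALT_BYTES = 16  # the length the answer above calls sufficient


def new_salt() -> bytes:
    """Draw a salt from the operating system's CSPRNG."""
    return secrets.token_bytes(SALT_BYTES)


def collision_probability(num_salts: int, salt_bytes: int = SALT_BYTES) -> float:
    """Birthday-bound approximation: p is about n^2 / 2^(bits + 1) for small p."""
    bits = 8 * salt_bytes
    return num_salts**2 / 2 ** (bits + 1)
```

Under this approximation, even a trillion 16-byte salts generated worldwide leave the chance of any collision far below one in a trillion, which is why no coordination between servers is needed.
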

Salts are not the same thing as keys (which are secret and must remain confidential) or initialization vectors (IVs are "starting points" for some algorithms, and may have additional requirements such as uniformity and unpredictability in the case of CBC encryption). There is normally no problem in giving away your salt values; anyway, whoever recomputes the hash value from the password must know the salt. Therefore, publication of salts is intrinsic to password-based encryption of files (the salt is then encoded in the file header). The salt is also necessarily published in authentication protocols where the hashing occurs on the client (that's quite rare in Web contexts, because JavaScript is too slow). There is no point in needlessly publishing the salts, but keeping them secret does not really enhance security either.
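
To illustrate that the salt simply travels in the clear next to the hash, here is a rough sketch of a self-describing stored record. The `iterations$salt$hash` layout and the function names are assumptions made up for this example, not a standard format, and PBKDF2-HMAC-SHA256 stands in for whichever password hash is actually used.

```python
import base64
import hashlib
import hmac
import secrets


def make_record(password: str, iterations: int = 100_000) -> str:
    """Build 'iterations$salt$hash' with the salt stored unencrypted."""
    salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return "$".join(
        [
            str(iterations),
            base64.b64encode(salt).decode("ascii"),
            base64.b64encode(digest).decode("ascii"),
        ]
    )


def check_record(password: str, record: str) -> bool:
    """Read the salt back out of the record and recompute the hash."""
    iterations_text, salt_b64, digest_b64 = record.split("$")
    salt = base64.b64decode(salt_b64)
    expected = base64.b64decode(digest_b64)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, int(iterations_text)
    )
    return hmac.compare_digest(digest, expected)
```

Anyone who can read the record can read the salt, which is fine: the verifier needs it, and the security rests on the cost of the hash and the strength of the password.
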

In this answer, a fringe scenario is evoked: an attacker learns the salt beforehand, prepares a big precomputed table, then enacts the actual attack which reveals the password hash. This does not make it easier for the attacker to break the password; in fact, this increases his effort (he has to produce a full table instead of stopping when the password is found, so that's double cost on average; if the table is of the rainbow persuasion, an additional 1.7 factor enters for table generation; and there are storage costs). What it changes is the dynamics: this shortens the time between the break-in (the hash value is stolen) and the password recovery. This is an edge case, so don't sweat it. If you use password hashing for storage in an authentication protocol, where hashing occurs server-side, then you just store the salt with the hash value and the salts will be as confidential as the hash values themselves, and that's good. In other cases (e.g. password-based file encryption), salts will be more "public", but that's not critical in any way, so don't go about adding extra complexity to keep the salts secret (extra complexity is bad, and much worse than public salts).
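
The cost comparison in the paragraph above can be put into a few lines of arithmetic. This is a simplified model, assuming the password is equally likely to be anywhere in the candidate list and counting only hash evaluations (storage is ignored). The function name and dictionary keys are invented for the example.

```python
def attack_costs(candidates: int, rainbow_factor: float = 1.7) -> dict:
    """Hash evaluations needed under each strategy, for a given candidate count."""
    return {
        # Attack after the breach: stop as soon as the password is found,
        # so on average half the candidates are tried.
        "direct_search": candidates / 2,
        # Precompute before the breach: every candidate must be hashed.
        "full_table": float(candidates),
        # Rainbow table: the answer's additional 1.7 factor for generation.
        "rainbow_table": rainbow_factor * candidates,
    }
```

In this model the precomputed table costs twice the average direct search, and the rainbow variant about 3.4 times. What the attacker buys with that extra work is timing, not a cheaper attack.
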

Thomas Pornin
  • How much advantage is there to using a stateful PRNG as opposed to concatenating a timestamp with a unique identifier for some resource that could only be associated with one password per second [e.g. a session ID concatenated with a site-specific GUID]? – supercat Sep 28 '14 at 17:31