
I have seen examples of password hashing that were: H(username + salt + password). What is the purpose of adding username? Is there any purpose?

JustinLovinger
  • Related: [Is it acceptable to generate salts with a hash of the username?](https://security.stackexchange.com/q/39492/165253) – forest Mar 09 '18 at 10:06
  • Quite possibly, the authors have confused "salt" and "pepper"? It _does_ arguably make sense (though it's still not a good approach) to add the username if you mistakenly used pepper instead of (rather than in addition to) salt. Adding the username to pepper-only will at least make finding users with the same password on the same machine harder (which pepper-only otherwise doesn't prevent). – Damon Mar 09 '18 at 15:19
  • To generate a unique hash result. Suppose the same password has already been saved by another user? Then the username is needed to make the hash unique to a particular user. – mfathirirhas Mar 11 '18 at 14:25
  • @mfathirirhas Salts are unique per-user. – forest Mar 15 '18 at 04:15

4 Answers


No, there is no purpose. This is security theater*. The purpose of a salt is to make parallel attacks against all hashed passwords infeasible, and to break rainbow tables. Adding the username in there does not improve that behavior or increase any other aspect of security. In a purely cryptographic sense, there is no downside either. Practically speaking, though, there is a small one: you will run into trouble if a username ever changes, and you have to maintain a more complex, non-standard system. More complexity means more bugs.

The properties of salts and usernames

You might think that the username itself may not be public, so it couldn't hurt to use it as an additional secret, but the fact is that the database will likely already contain the username in plaintext, invalidating this already questionable benefit. People should stick to pre-existing authentication techniques. But let's look at the properties of each of these objects:

A salt is:

  • Not secret - Salts are stored in plaintext.

  • Secure - They are generated randomly and are long.

  • Unique - Every user's salt is intentionally different.

A password is:

  • Secret - Assuming they are not put on a sticky note.

  • Secure - If it's not hunter2. The password should be good.

  • Unique - Ideally, at least, but not all passwords are ideal.

Now compare this to a username. A username is:

  • Not secret - They are public or at least stored in plaintext.

  • Not secure - No one thinks to choose a long and complex username.

  • Not unique - Usernames are often shared among sites.

The traits of a good salt

Now, what exactly does a salt do? In general, a good salt provides three benefits:

  1. It prevents an attacker from attacking every user's hash at once. The attacker is no longer able to hash a candidate password and test it against every single entry at once. They are forced to re-compute any given candidate password for each user's hash. This benefit grows linearly with the number of distinct target hash entries. Of course, salts are still important even if only a single hash is in need of protection, as I explain below.

  2. It makes rainbow tables infeasible. A rainbow table is a highly-optimized precomputed table that matches passwords to hashes. They take up less space than a gigantic lookup table, but take a lot of time to generate (a space-time trade-off). In order for rainbow tables to work, a given password must always resolve to the same hash. Salts break this assumption, making rainbow tables impractical as they would have to have a new entry for each possible salt.

  3. It prevents targeted precomputation through rainbow tables, at least when the salt is truly random. An attacker can only begin the attack after they get their hands on the hashes (and with them, the salts). If the salt is already public but the hash is not, then the attack can be optimized by generating a rainbow table for that specific salt. This is one reason why WPA2 is such an ugly protocol. The salt is the ESSID (network name), so someone can begin the attack on their target's router before they ever get their hands on the 4-way handshake.
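To make the first two points concrete, here is a minimal sketch. It assumes PBKDF2 from Python's standard library, and the iteration count, user names, and password are purely illustrative. It only shows that the same password hashed under two random salts produces two unrelated hashes.

```python
import hashlib
import os

ITERATIONS = 100_000  # illustrative only, not a tuning recommendation


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a hash from a password and a per-user salt with PBKDF2."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


# Two hypothetical users who happened to choose the same password.
salt_alice = os.urandom(16)
salt_bob = os.urandom(16)
hash_alice = hash_password("hunter2", salt_alice)
hash_bob = hash_password("hunter2", salt_bob)

# The stored hashes differ, so one guess cannot be checked against both rows
# at once, and a table precomputed without the salt matches neither of them.
print(hash_alice == hash_bob)
```

An attacker testing a candidate password has to call `hash_password` once per stored salt, which is where the linear cost in the first point comes from.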

So what possible benefit would concatenating a value before hashing have when this value is public, insecure, and re-used? It doesn't end up requiring the attacker to dig for more information. It doesn't add to the security of a salt. It doesn't increase the complexity of the password. There is no benefit.

Proper password hashing

So what should they do to increase security? They can use a KDF such as PBKDF2, bcrypt, scrypt, or argon2 instead of a single hash. They can add a pepper, which is a random global value stored outside of the database and added to the password and salt, making it necessary to steal the pepper to attempt to attack the hashes rather than simply dump the database using SQLi.
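As a rough sketch of what that could look like, assuming Python's standard library only: the pepper here is mixed in with HMAC before stretching with PBKDF2, which is one common way to do it and not the only one. The names `PEPPER`, `hash_password`, and `verify_password` are hypothetical, and the iteration count is illustrative. In a real deployment the pepper would be loaded from somewhere outside the database instead of being generated at import time.

```python
import hashlib
import hmac
import os

# Hypothetical pepper. Generated here only so the sketch runs on its own;
# a real one would be loaded from a secret store outside the database.
PEPPER = os.urandom(32)
ITERATIONS = 100_000  # illustrative only


def hash_password(password: str, salt: bytes, pepper: bytes = PEPPER) -> bytes:
    """Key the password with the pepper, then stretch it with the salt."""
    peppered = hmac.new(pepper, password.encode("utf-8"), hashlib.sha256).digest()
    return hashlib.pbkdf2_hmac("sha256", peppered, salt, ITERATIONS)


def verify_password(password: str, salt: bytes, stored: bytes,
                    pepper: bytes = PEPPER) -> bool:
    """Recompute the hash and compare in constant time."""
    return hmac.compare_digest(hash_password(password, salt, pepper), stored)


salt = os.urandom(16)
stored = hash_password("correct horse", salt)
print(verify_password("correct horse", salt, stored))
```

Someone who dumps only the `salt` and `stored` columns cannot test guesses without also obtaining the pepper.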

EDIT: As some of the comments point out, there is one contrived scenario where the username would be beneficial to add into the mix. That scenario would be one where the implementation is broken badly enough that the salt is not actually a salt, and the username is the only unique or semi-unique per-user value in the database, in which case mixing in the username would be better than nothing. But really, if you have no salt, you should start using one instead of trying to use usernames. Use real security and don't be half-assed when your users' safety is on the line.

* In this context, I am defining security theater as the practice of implementing security measures that do not actually improve security in any meaningful way and are only present to provide the illusion of better security.

Glorfindel
forest
  • Comments are not for extended discussion; this conversation has been [moved to chat](https://chat.stackexchange.com/rooms/74220/discussion-on-answer-by-forest-why-add-username-to-salt-before-hashing-a-passwor). – Rory Alsop Mar 08 '18 at 22:00
  • 163
    "No one thinks to choose a long and complex username." I disagree with this. – Kf6Jg'dAHj-'zNw2FDthNCwkh '-4x Mar 08 '18 at 23:26
  • username is not necessarily known. One can build a system where no usernames are stored at the server side or only its hash will be used as ID. By adding username to the salt, weak passwords become (slightly) stronger (/longer). – goteguru Mar 09 '18 at 12:48
  • @goteguru That is explained in https://security.stackexchange.com/q/25374/165253, which I linked to. – forest Mar 09 '18 at 12:49
  • 12
    "Usernames are often shared among sites." Had to laugh, because this is *not* in contrast with passwords, which are also commonly shared between sites. Since salts are the only guaranteed-unique part, we devs had better get that right! – jpaugh Mar 09 '18 at 14:47
  • Random number generation failures can be very insidious because everything appears to work but security can be significantly reduced. Including the username in the hash provides some defense in such a scenario. – Peter Green Mar 09 '18 at 23:28
  • 1
    @PeterGreen I'm not sure why that would be the case. If the salt is using a bad RNG, it's still not all that bad. – forest Mar 10 '18 at 01:50
  • 1
    There's one other benefit to including a salt that you don't mention: It prevents copying a known password's hash on top of an unknown one, giving seemingly legitimate access to another user's account. I forget the name of that type of attack, though. – Bobson Mar 11 '18 at 00:20
  • @Bobson Length extension attack? It requires a bit more than just copying the hash over though. – forest Mar 11 '18 at 03:21
  • +1 for good answer, wish I could give another +1 for that bash.org reference! :D – Ian Kemp Mar 11 '18 at 19:03
  • `[Rainbow tables] take up more space than a simple dictionary of passwords...` In fact, rainbow tables are so highly compressed that even a single _bit_ encodes for multiple password/hash mappings. The Wikipedia article you referred to explains how that is possible. – mgr326639 Mar 12 '18 at 09:33
  • If you go to http://project-rainbowcrack.com/table.htm you will see that the third MD5 rainbow table (just an example) contains the info on 221,919,451,578,090 pw's/hashes. Storing this uncompressed would mean 8 bytes for the pw + 16 bytes for the hash = 24 bytes. So this would take 5.3*10^15 bytes to store. (in fact slightly more if I had not ignored the <8 char passwords for simplicity). But the rainbow table you download is only 160 Gb. That's a factor of 31000 difference! You'll never achieve that kind of compression with LZMA on a file containing the same [a-zA-Z0-9] strings. – mgr326639 Mar 12 '18 at 10:02
  • @mgr326639 That's a good point, though I wasn't talking about dictionaries that included the hash (as that would be a lookup table). I was thinking of rainbow tables that only held completely unrelated passwords delimited with `\0` or `\n` (smaller dictionaries especially). I updated my answer. – forest Mar 12 '18 at 10:04
  • Actually, I'm sure a radix tree of just the passwords themselves would be far smaller than a rainbow table (of course LZMA doesn't generate trees), so a dictionary could still be smaller. I'm a little too lazy to do the math to find out exactly how big a compact radix tree of everything in `[a-zA-Z0-9]{8}` would be, though. – forest Mar 12 '18 at 10:18

If the salt is sufficiently random as it is supposed to be then adding the username does not add additional protection. But it also does not cause any harm apart from making the code more complex which increases the likelihood of bugs. When doing a code review it might be taken as a sign that the developer does not fully understand what he is doing.

If instead the salt is mostly static or derived from the password, then the username might help, since in this case the username essentially serves the purpose the salt failed to serve. But this is not the recommended way, since the salt is supposed to be random and a username is not.

See "Is it a good idea to use the user's username as a salt when hashing a password like hash(username_str + password_str)?" and "What should be used as a salt?" for a deeper discussion of what a good salt should be and why the username is not a good salt. See also "Why are salted hashes more secure for password storage?" to understand the purpose of the salt in the first place.

Steffen Ullrich
  • 3
    If the salt is derived from the password, then the implementer has more problems on their hand. If they are able to add the username, they should be able to fix the salt. – forest Mar 08 '18 at 07:27
  • 1
    @forest: Like I said: if the developer is not using a random salt he does not fully understand what he is doing. And in this case the developer might come up with the wildest ideas of how to use a non-unique salt, like using a username or deriving the salt from the password or both "just to be sure". See [Do salts have to be random, or just unique and unknown?](https://security.stackexchange.com/questions/41617/) or [Is it ok to derive a salt from a password...](https://stackoverflow.com/questions/10051716/) for examples of how the concept of a salt is often misunderstood. – Steffen Ullrich Mar 08 '18 at 07:36
  • 2
    Wait a second... If the salt is derived from the password, then you don't have any salt at all, just a slightly different hash generation algorithm that is again vulnerable to rainbow tables. Just slightly different rainbow tables. – gnasher729 Mar 10 '18 at 23:32

As @Damien_The_Unbeliever pointed out in a comment, it can prevent user impersonation in a scenario where the system has been partially compromised.

Imagine the following scenario. Somehow, an attacker has gotten read/write access to the db table used for login, containing usernames, password hashes, and salts -- perhaps via a SQL injection attack. This attacker has no other elevated access to the system, but wants to impersonate a user in the system in an untraceable (or hard to trace) way.

  • First, the attacker records the original password hash and salt for the victim's account.
  • Next, the attacker registers their own account, and copies the password hash and salt from their account into the victim's account.
  • Now, the attacker can log in to the victim's account using the attacker's own password.
  • When the attacker is done, they restore the victim's password hash and salt, making it difficult for the victim to realize what has happened.

While in some systems, an attacker with this level of access could simply generate a password hash using the victim's username, salt, and an arbitrary password (rather than copying their own), this is not possible if the system uses a pepper in addition to the salt (a globally-defined constant added to the input for all password hashes, which is stored someplace different from the password hashes and salts). Without compromising the pepper, the only way the attacker in this scenario can set a password hash with a known password for the victim is to copy one generated by the system, as described above.

Note that two-factor authentication likely can't even prevent this. If the attacker also has access to the 2FA initialization codes for users (perhaps stored in the same db table), the attacker's code can be temporarily written into the victim's account as well.

In contrast, if the username is used in calculation of the password hash, this particular impersonation attack will not work. Even once the attacker copies their password hash, salt, and 2FA code into the victim's account, using the attacker's password on the victim's account will not result in the same password hash as for the attacker's account, and so the login will fail.
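A rough simulation of the scenario above, under stated assumptions: the "database" is a Python dict, the pepper is a hypothetical server-side secret the attacker cannot read, and the function names and the length-prefixed way of mixing in the username are made up for illustration. It is a sketch of the idea, not a recommended scheme.

```python
import hashlib
import hmac
import os

PEPPER = os.urandom(32)  # hypothetical secret stored outside the database


def digest(username: str, password: str, salt: bytes, bind_username: bool) -> bytes:
    """Hash a password, optionally binding the hash to the username."""
    material = password.encode("utf-8")
    if bind_username:
        name = username.encode("utf-8")
        # Length prefix keeps the username/password boundary unambiguous.
        material = len(name).to_bytes(2, "big") + name + material
    keyed = hmac.new(PEPPER, material, hashlib.sha256).digest()
    return hashlib.pbkdf2_hmac("sha256", keyed, salt, 50_000)


def make_record(username: str, password: str, bind_username: bool) -> dict:
    salt = os.urandom(16)
    return {"salt": salt, "hash": digest(username, password, salt, bind_username)}


def login(db: dict, username: str, password: str, bind_username: bool) -> bool:
    record = db[username]
    candidate = digest(username, password, record["salt"], bind_username)
    return hmac.compare_digest(candidate, record["hash"])


def swap_attack_succeeds(bind_username: bool) -> bool:
    db = {
        "victim": make_record("victim", "victim-secret", bind_username),
        "attacker": make_record("attacker", "attacker-secret", bind_username),
    }
    # Write access to the table: copy the attacker's salt and hash over the victim's.
    db["victim"] = dict(db["attacker"])
    return login(db, "victim", "attacker-secret", bind_username)


print(swap_attack_succeeds(bind_username=False))
print(swap_attack_succeeds(bind_username=True))
```

Without the username in the hash input, the copied row verifies under the attacker's own password. With it, the recomputed hash for "victim" no longer matches the row that was generated for "attacker".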

It is debatable whether the benefits are worth the additional complexity and possibility of bugs of course, since this scenario requires the attacker to have already seriously compromised the system, and in this scenario the attacker doesn't recover the user's actual password. But, it can be argued that this is an additional protection.

zacronos
  • It honestly doesn't make sense. You can't copy and paste hashes, and if you have a compromised system, you could trivially impersonate users anyway. I can't think of any scenario where it would prevent user impersonation or even make it remotely harder. – forest Mar 08 '18 at 14:40
  • @forest You can copy-and-paste hashes just like any other data, if you have the access this scenario postulates, so I don't see what you mean by that. The scenario is fairly well described here: it's one mechanism of user impersonation in a partially compromised system. There might well be other routes - working out the hash algorithm used and generating your own hash, or bypassing the authentication altogether somehow - but they likely require more than the database to be compromised. – IMSoP Mar 08 '18 at 14:48
  • 1
    @IMSoP A system where the database is already compromised with read/write access is compromised badly enough that you can already impersonate users. Simply set your user to be an administrator. Simply change the user ID of your user. You could do anything. – forest Mar 08 '18 at 14:51
  • 3
    @forest If an attacker is able to execute SQL injection attacks, but otherwise has not compromised the system, then no, it is not trivial to impersonate a user. I've added some clarification, and a link to a SO post explaining why "pepper" is useful in addition to just salt. (It is for the exact same scenario, where a compromised db does not imply the entire system is compromised.) If this were not a scenario people care about, then no one would use peppers in their hashes. – zacronos Mar 08 '18 at 14:54
  • @zacronos I agree that a pepper can be important. I fail to see why mixing in the username would be useful in any way (since a pepper would already solve the issue of an attacker who can do extensive SQLi but does not have raw filesystem access). A pepper is a standard, well-known technique to prevent this. Hashing in a username is not. – forest Mar 08 '18 at 14:55
  • 2
    @forest It helps in exactly the way this answer says it helps: it prevents one method where a user with *write* access could impersonate a user. As you say, there *might* be other ways to do that, by messing with other records in the database, and this answer makes clear that the benefit is debatable, but it does restrict that specific attack, and force the attacker to find a different route. – IMSoP Mar 08 '18 at 14:58
  • And what if the attacker temporarily swaps usernames with the victim? I don't know if I'd go so far as to say there's no benefit, but it's so limited and contrived I don't see that it's ever really worth the trouble. – AndrolGenhald Mar 08 '18 at 15:05
  • 2
    @AndrolGenhald agreed, and I acknowledged in my answer that it may not be worth the trouble. Swapping usernames may be an option, though if there is any denormalization taking place in the data or logging at all, simply swapping username may not provide perfect impersonation. The question posted by OP was not whether it is worth doing this, but simply whether it has any purpose, and if so what it could be. – zacronos Mar 08 '18 at 15:14
  • It's worth noting that what this answer describes is the same sort of thing that [encryption with associated data](https://security.stackexchange.com/questions/179273/what-is-the-purpose-of-associated-authenticated-data-in-aead) is often used for. And in fact, some specialized password hashing functions take associated data as an optional argument additional to the salt. – Luis Casillas Mar 08 '18 at 19:16
  • 1
    I think the point here (and why this answer isn't very useful) is that if someone has read-write access to the database, **you have already lost!** Don't make weird security practices for instances like this, instead use parameterized queries (or avoid SQL altogether) and stop them from compromising the database in the first place! – NH. Mar 08 '18 at 20:17
  • 3
    The attack described in this answer strikes me as more plausible if we reimagine it as insider threat. Imagine a company that has a single-sign-on system backed by a database instance managed by a rogue admin that has read/write access to the password database, but not to other assets they want to get into (including, critically, the mapping between SSO identities and permissions). Such a rogue admin could gain access to other, more privileged accounts by overwriting a user's password entry to substitute in their own password hash. – Luis Casillas Mar 08 '18 at 20:50
  • 1
    @NH. While I understand the point you're making (and acknowledged that in the answer), many common security practices are based on the idea that mitigation is (or at least can be) worthwhile. One could similarly (and incorrectly) argue that adding a salt to the calculation of a password hash shouldn't be necessary -- just secure your password hashes so they are never compromised, and the salt is unnecessary! Adding a pepper to the password hash calculation is also only useful for mitigation when the db is compromised, yet it is an accepted practice. – zacronos Mar 09 '18 at 20:08
  • That's because those mitigate ubiquitous security issues, not contrived ones. So yes, a sysadmin has to protect users from their own mistakes, but the sysadmin should not need to come up with a solution for an issue that _they_ created. They should instead fix that issue. – forest Mar 10 '18 at 06:15
  • Could this fall into the category of "[Defense in depth](https://en.wikipedia.org/wiki/Defense_in_depth_(computing))"? – Jonathan Cross Mar 12 '18 at 11:08
  • 1
    @JonathanCross Yes, it would. It is not security through obscurity because it doesn't rely on the attacker being unaware of how things are done. It *is* making (some) things harder for an attacker who has already partially compromised the system, rather than giving up at that point. – zacronos Mar 13 '18 at 12:59
  • @zacronos Ah I thought Jonathan's comment was on a different answer. My bad, will delete my previous comment. My issue with the answer is pretty much what NH says. This seems like a weird security practice that can be better accomplished with application-level code. – forest Mar 14 '18 at 03:14

For starters, password hashing should be done with a dedicated password hashing function such as bcrypt, Argon2, scrypt or PBKDF2. In that case, you don't have to deal with concatenating the salt and password like a primitive; the function takes those as separate arguments. See "How to securely hash passwords?", one of the top Q&A's on this site for this topic.

The purpose of the salt in password hashing is to randomize the hash function that's applied to each password entry. So three rules that generally apply to modern specialized password hashes are these:

  1. Every time you enroll a new password you should generate a fresh salt. (Note this means that when a user changes their password, you should generate a fresh salt for that password, not reuse the salt for the old one. Salts are bound to states of a password database entry, not to users.)
  2. The salt's value should be independent of the password itself; knowledge of the password should be of no help in guessing the salt, or otherwise an attacker could use such knowledge to precompute an attack table. An obvious example here is you shouldn't use the password itself as the salt. (That's less dumb than it might sound, since you wouldn't store the salt alongside the password in this case, but it is still dumb, because two users with the same password would have the same hash.)
  3. The chance that two salts are equal to each other should be very low. Preferably not just within your application, but globally across the whole world.

Now we can answer your question by making these observations:

  • A salt that's made up of a sufficient number of random bytes satisfies these criteria already. 16 random bytes (from a cryptographically strong random number generator) sounds sensible.
  • Concatenating the username to such a random salt doesn't help.
  • If the salts are being generated in some way that's predictable but somewhat nonrepeating (for example, as a counter that's incremented each time you enroll a new password, or as a timestamp), concatenating additional values like the username or your site's domain name can help make them closer to globally unique. But using random salts would be simpler (no need to keep track of a persistent counter state).
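
As a small sketch of the first observation and of rule 1, assuming Python's `secrets` module for the random bytes and PBKDF2 as a stand-in for whichever password hashing function you actually use: `enroll_password` is a hypothetical helper that draws a fresh 16-byte salt on every call, so a password change never reuses the old salt.

```python
import hashlib
import secrets

ITERATIONS = 100_000  # illustrative only


def enroll_password(password: str) -> dict:
    """Create a new database entry state with a freshly generated salt."""
    salt = secrets.token_bytes(16)  # 16 bytes from a cryptographically strong RNG
    hashed = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return {"salt": salt, "hash": hashed}


# Initial enrollment, then a password change for the same user.
old_entry = enroll_password("first password")
new_entry = enroll_password("second password")
print(old_entry["salt"] != new_entry["salt"])
```

No counter, timestamp, or username is needed here; the randomness alone makes a repeated salt extremely unlikely.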
Luis Casillas