42

Everyone knows that if they have a system that requires a password to log in, they should be storing a hashed & salted copy of the required password, rather than the password in plaintext.

What I started to wonder today is: why don't they also store the user ID in a similarly hashed & salted form?

To me this would seem logical because I can't see any drawbacks, and if the db was compromised, the attackers would need to "crack" the password and the username hash before they could compromise that account. It would also mean that if usernames were hashed and salted email addresses, they would be more protected from being sold on to SPAMmers.

Grezzo
  • 2
    Because you would likely need to store the salt? What happens when you forget your username and/or password? That would be horrible usability. There is a reason this isn't done. – Ramhound Dec 13 '12 at 12:29
  • 29
    Usernames are not for authentication, merely for identification. Treating them with any kind of secure protocol in mind is asking for trouble - it's just as important to identify what *does not* need to be kept private as it is to identify what *does*. A clear distinction is important. – lynks Dec 13 '12 at 14:22
  • 10
    "everyone knows that if they have a system that requires a password to log in, they should be storing a hashed & salted copy of the required password, rather than the password in plaintext." *no*, ***they don't***. None of that is obvious to someone who is new to security. Even the difference between encryption and hashing is beyond what some people can understand. – zzzzBov Dec 13 '12 at 16:37
  • 1
    @zzzzBov OK, perhaps I meant "everyone who understands security" – Grezzo Dec 13 '12 at 17:19
  • 2
    @lynks but wouldn't it be nice if when a database was compromised, they didn't get all the email addresses of the users. Not only can they SPAM these addresses, they can also target them with site specific phishing attacks – Grezzo Dec 13 '12 at 22:54
  • @Ramhound I'm sure you are right that there's a reason that this isn't done. I just want to know why. If we used a site wide salt for the usernames (I know this would mean it would be possible to attack all hashes in one sweep using brute force) then you could still lookup the username by hashing it before performing the query – Grezzo Dec 13 '12 at 22:56
  • For one second I thought it could be a good idea, but what company never needs to send an email to its users? – maaartinus May 22 '14 at 21:42
  • typically when storing data to a database, the data is not hashed and salted. hashing is only done when there is a need (such as to protect a password). what would the need to hash usernames be? in other words, the premise of your question is erroneous ... you're assuming there should be a reason NOT to hash usernames, when not hashing data is the default behavior. – user428517 Jun 08 '15 at 21:05
  • Could the reason be to deter timing attacks even further? Since the username is typically checked with a string comparison, one could theoretically deter people from gauging their guesses (in microseconds) at usernames too (which tend to be email addresses). For example, if someone wants to attack email-address usernames with 10,000,000 password guesses each, using something like password_verify() in PHP for the username could make the timing of failures even more erratic, and thereby harder to analyze. People still want to email their community, but that does not mean email addresses have to be in ... – Anthony Rutledge Feb 01 '17 at 12:57
  • ... the table used for logging in. – Anthony Rutledge Feb 01 '17 at 13:01
  • But, each username should have a unique salt ... – Anthony Rutledge Feb 01 '17 at 13:03
  • Basically, the reason I see most people poo-pooing this idea is "because that's not the way it was done in the past. Furthermore, here are the reasons and assumptions the past is built on." There is a cost to logging in, and this technique, no matter what your view, is on average more computationally expensive. – Anthony Rutledge Feb 01 '17 at 13:55
  • Collisions are possible, but not probable given good programming logic in terms of when to store the hashed username. – Anthony Rutledge Feb 01 '17 at 14:00
  • a great many sites (like stackexchange) display the user names to the public in things like comments and posts. then what is the point in hashing user names? – Skaperen Aug 14 '18 at 01:51

9 Answers

73

You see that thing up there where it displays your username? They can't do that if the username is stored hashed now can they?

One word, usability.

  • 27
    The snark is strong in this one ;) – Polynomial Dec 13 '12 at 14:44
  • 75
    Welcome, 21840c1a3e3db69e01445c8782a99f9b. – Thomas Dec 13 '12 at 16:29
  • 7
    But they could if you had a login name and a displayed name, like you do in facebook and many other services. In fact don't we log in to stack exchange using an email address, but a Nickname is displayed – Grezzo Dec 13 '12 at 17:28
  • 1
    @Thomas, yep, you got me. – lynks Dec 13 '12 at 17:29
  • @Thomas good one! – Grezzo Dec 13 '12 at 17:39
  • 19
    Actually you could. After a successful login store the unhashed username in the session. – Freiheit Dec 13 '12 at 18:00
  • 1
    ... But if security of the username is an issue, you don't want to transmit it from client to server in plaintext. Double-hashing is the order of the day; hash it on the client side, then send it over to the server which hashes again. Now, the rest of the user info, including a copy of the username or display name, could be encrypted with a PBKDF based on all or part of the hash passed in by the client; then you simply decrypt it after verifying the password, and put the display name in the session for use by webpages. – KeithS Dec 13 '12 at 18:40
  • 2
    I'm not convinced the username couldn't be displayed client side only. – wim Dec 14 '12 at 00:40
  • @KeithS Doesn't that seem to be overly complicated for something that isn't meant to be a secret anyway? –  Dec 14 '12 at 02:25
  • 3
    Well it was posited that if the username were an e-mail address, dumping the DB would provide someone with a long list of valid e-mail addresses. It follows that the username in that case would be sensitive. – KeithS Dec 14 '12 at 04:08
  • @KeithS My first thought was "The e-mail address does however need to be stored somehow in order to make a password reset mechanism work", but then I realised that since said mechanism requires you to enter your address you might as well use its hash... interesting idea. Of course you won't be able to spam your users with newsletters then – Tobias Kienzler Dec 14 '12 at 07:49
  • 1
    @TobiasKienzler instead of a 'push' system for newsletters, where the site operator or content producer instigates delivery, they could use more of a 'pull' mechanism, where users subscribe to a 'feed' whether it is RSSish or some other format/mechanism. – JustinC Dec 14 '12 at 21:34
  • 1
    @JustinC That would be better indeed. Just wish those "content" producers agreed... – Tobias Kienzler Dec 15 '12 at 09:18
  • 2
    For those that were curious, `echo -n "gotcha" | md5` `21840c1a3e3db69e01445c8782a99f9b` – Mike Weller Dec 19 '12 at 16:21
  • 1
    Here's another word. Imagination. – Anthony Rutledge Feb 01 '17 at 18:10
  • This answer is clearly incorrect, as there's no rule forcing display names and usernames to be the same, as others have pointed out before me. Please stop upvoting it. – airstrike May 11 '17 at 13:53
41

While what Terry is saying is true, sometimes login systems actually hash the username (but without salt). They have you pick a login name and a display name. The login name is stored hashed (without salt because you need to be able to look it up) and the password is salted. The display name is different from your login name (because this should be kept secret as well) and is shown where needed.

Even when an attacker sees your name, he will be unable to attach it to your login name. While I say it can be salted, there is actually no need to do this. The most important part is just to keep it secret. If the database gets compromised, stuff like your email address or name will still be there for the attacker to use if he wants to stage new attacks on your other accounts.
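
As a rough illustration of the lookup described above, here is a minimal Python sketch. It assumes SHA-256 as the unsalted hash, a plain dict standing in for the database table, and lower-casing as the normalisation step; none of that comes from a real system, and the names `hash_login_name`, `register` and `find_account` are hypothetical.

```python
import hashlib

def hash_login_name(login_name: str) -> str:
    """Deterministic, unsalted hash: the same name always maps to the same key."""
    normalized = login_name.strip().lower()  # assumed normalisation, for illustration only
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

# Stand-in for the users table: hashed login name -> account record.
accounts = {}

def register(login_name, display_name, password_hash):
    key = hash_login_name(login_name)
    if key in accounts:
        raise ValueError("that login name already exists")
    accounts[key] = {"display_name": display_name, "password_hash": password_hash}

def find_account(login_name):
    # Hash what the user typed and look the hash up; the plaintext is never stored.
    return accounts.get(hash_login_name(login_name))

register("alice@example.com", "Alice", "<salted password hash goes here>")
print(find_account("alice@example.com")["display_name"])
print(find_account("bob@example.com"))
```

Because the hash is deterministic, both the lookup and the "already exists" check stay a single keyed access. The price is that identical login names always produce identical hashes.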

Neil G
Lucas Kauffman
  • 5
    Yeah, I've seen this done. It's more of an obscurity mechanism really, because if the password is hashed strongly then it shouldn't be necessary at all. – Polynomial Dec 13 '12 at 14:46
  • 1
    @Polynomial It may not be necessary, but if login names were something like email addresses, it could be a good way of trying to hide a user's email address if the database were to get dumped, no? – Grezzo Dec 13 '12 at 17:32
  • 2
    mmm good point, you can hash the email too if you don't plan to send any emails to your users. Password resets by email could still be done I guess. – Lucas Kauffman Dec 13 '12 at 19:45
  • 1
    @LucasKauffman I guess that's a good point, it helps to be able to email your users. That would certainly mean hashing it before storing could not be done! – Grezzo Dec 13 '12 at 23:00
  • I think this answer and its comments are the best for me. Thanks. – Grezzo Dec 13 '12 at 23:02
  • "The logging name is stored hashed (without salt because you need to be able to look it up)" How would you look it up? Rainbow tables? Or am I missing something? – Honza Brabec Dec 15 '12 at 13:26
  • Store the username hashed; when someone logs in, hash the username and look it up in the db to find the hash of the corresponding password. – Lucas Kauffman Dec 15 '12 at 13:38
  • You can't look up the username if it's hashed (that's by definition). You can only test if the provided username/password (both hashed, salted or not) are there in the database and are related to each other. The only thing you can look up is the display name. – Mario Awad Dec 17 '12 at 19:27
  • @MarioAwad you are wrong, I'm afraid. – Lucas Kauffman Dec 17 '12 at 19:42
  • @LucasKauffman which part of my comment you think is wrong? Please elaborate, maybe we'll all learn something new here. Thanks. – Mario Awad Dec 17 '12 at 20:53
  • You can lookup a username in a database even if it's hashed. Once the user introduces his username you just hash it. Once it's hashed you can look it up in the database. If you would be using a salt, then this would not be possible as you would first need to look up the salt which would not be possible as the hash you need to lookup already contains the salt. – Lucas Kauffman Dec 17 '12 at 21:41
  • It looks like we are both saying the same thing but using different terminology. When I say lookup I mean getting the username and displaying it when you mean just checking if it's properly provided or not. Cheers. – Mario Awad Dec 22 '12 at 13:27
  • This doesn't make much sense to me. It means basically using the username as a "password extension". Just use a stronger password instead, and avoid confusing fields. – o0'. Jun 22 '14 at 14:07
  • 4
    @Grezzo: Actually, you could still email users after hashing their email address. Just request that they provide their email address when requesting a password reset or other request, hash that email against their stored hash to confirm their identity, and then send the password reset or reply to the email address they just gave you. Since you don't have their email stored, you can only send them email with their cooperation, but unless you plan on spamming them with sales offers, I don't see that as a bad thing. – Mark Ripley Nov 04 '16 at 10:46
  • 1
    Hashing a username prevents enumeration exploits of your login system. Basically, most front-end brute force attacks fail if they don't have a valid username to start with, so hashing it with a site wide salt means that a successful attack against your database won't enumerate your user accounts. A person would need to hack your application's file system to get your salt, which is often much harder. If you don't hash your usernames, then an enumeration attack could be followed by a front-end barrage of checks for weak passwords against those usernames. – Nosajimiki Aug 21 '18 at 15:44
15

Generally usernames are not considered secure, they are identity, not authentication. It's good to not reveal what usernames are valid, but would be worse if you happened to have a collision. You could still work around this by looking at all matching usernames for a password hash that matches, but that's kind of messy.

Realistically, if you otherwise have good password security and limits on login attempts, a complete list of usernames offers little practical value to an attacker. Its main benefit would be phishing, but if your official correspondence has any information in it, then that information can't be hashed and they'd get it if they compromised your DB anyway.

Also, usability like Terry said. It's far easier to find your account if they can see usernames. You don't gain enough by trying to secure an identifier to justify it in most contexts.

AJ Henderson
  • Looking up by password has one problem: what do you do if they are the same (and how do you do a reset)? If you prohibit such a situation then you effectively disclose the password. – Maciej Piechotka Dec 13 '12 at 15:34
  • @MaciejPiechotka - Yeah, if you have a double collision, that is also a problem, though the chances of a double collision are pretty remarkably small. Also, if you did prevent a double collision, you wouldn't be disclosing the username it corresponded to, so you'd still have a pretty hard time making use of the information, but it is still a valid observation that it would be telling them that some user has a password which would resolve with their password, but the chances of an attacker managing to hit that case are pretty (cryptographically secure) remote. – AJ Henderson Dec 13 '12 at 15:39
  • 2
    FWIW if there was a collision, the second user would be met with a "that username already exists" message upon attempting to register. – panofsteel Mar 03 '16 at 20:39
  • @MaciejPiechotka That's why you look up the identifier first. It follows that if the identifier is present, only one password (hash) should be associated with it. If one were looking up the hash of a login identifier and using a timing attack safe way of doing the comparison, how could this hurt? – Anthony Rutledge Feb 01 '17 at 13:20
7

While others have well pointed out that there are few (if any) advantages, I would take issue with your claim that there aren't any drawbacks. If you store just the hashed username, then searching for the username is easy. If you store a salted, hashed username then searching becomes a bit more problematic.

Let's assume that if we build some SQL table containing usernames and (hashed) passwords and tell the SQL server to index the username column that it will do some sort of binary search or some other magic. We could have a table that looks like:

Username  |  Password
test      |  j9lnvqjAuhNJs

(This is the old-school unix crypt(3) hash just for simplicity and brevity.)

If you store your usernames in plaintext, retrieving the (hashed) password for a user is a simple SQL call. Let's say you want to validate the credentials for a user who typed in the username test:

SELECT password FROM users WHERE username='test';

Simple enough. Now if we were to store the usernames in the same format as the passwords, our table ends up looking like this:

Username       |  Password
M1CAtvzDdJDGU  |  j9lnvqjAuhNJs

Now when a user types in their username of test, how do you validate the password? A binary search is useless here, since you don't even know the salt you used to store the username. Instead, you need to iterate over each username in the database, crypting the given username with the salt for that username and comparing it to the stored (hashed) username to see if it matches. Youch!

And what if you took some good precautions and used a nice slow hash like bcrypt instead of good old Unix crypt? Double youch!

As you can imagine, there are some serious drawbacks to storing a salted hashed username instead of just plaintext.
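
To make the cost concrete, here is a minimal Python sketch of the scan described above. It assumes PBKDF2-HMAC-SHA256 in place of crypt or bcrypt (only because it ships with the standard library), a deliberately low iteration count, and a list standing in for the table; `salted_hash` and `find_row` are hypothetical names.

```python
import hashlib
import hmac
import os

ITERATIONS = 1_000  # kept low so the demo runs quickly; real settings are far higher

def salted_hash(value: str, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", value.encode("utf-8"), salt, ITERATIONS)

# Each row stores (salt, salted hash of the username); the plaintext name is not kept.
rows = []
for name in ["alice", "bob", "carol", "test"]:
    salt = os.urandom(16)
    rows.append((salt, salted_hash(name, salt)))

def find_row(username: str):
    """Return (row index, hashes computed), or (None, hashes computed) if absent."""
    hashes_computed = 0
    for index, (salt, stored) in enumerate(rows):
        # The salt is only known per row, so the typed name must be re-hashed for every row.
        hashes_computed += 1
        if hmac.compare_digest(salted_hash(username, salt), stored):
            return index, hashes_computed
    return None, hashes_computed

print(find_row("test"))     # (3, 4): found only after hashing against all four rows
print(find_row("mallory"))  # (None, 4): a miss always costs one hash per row
```

No index can help here, because there is no single value to search for. The work grows linearly with the number of accounts and is multiplied by however slow the hash is.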

  • but if you don't salt it, or use a site wide salt for the username you could use the same select command, but instead of searching for the plaintext username, you could search for the hash of it. Or am I missing something? – Grezzo Dec 13 '12 at 17:25
  • 1
    If you use a site-wide salt, you might as well not salt at all. http://crypto.stackexchange.com/questions/1855/passwords-with-same-salt-what-does-this-mean – Edward Thomson Dec 13 '12 at 17:29
  • 4
    That's not true. It is true that with a site specific salt they could attack all hashes at the same time using brute force, but without a salt attackers could use an unsalted rainbow table which would be much less effort. And after all, isn't security about making it too much effort to attack, not really about making it impossible? – Grezzo Dec 13 '12 at 17:35
  • If you assume that my email address exists in some rainbow table of unsalted `bcrypt`s somewhere, then yes, you're right. – Edward Thomson Dec 13 '12 at 17:39
  • That's a fair point I suppose. I doubt my email address is in any rainbow tables as it's quite a number of characters. I guess that's probably true of most people's, including yours. Having said that, there must be tons of email addresses that are [a-zA-Z0-9]{8}@hotmail.com. That wouldn't require a huge rainbow table, though I have no idea (and doubt) if anyone has ever made one like that – Grezzo Dec 13 '12 at 23:10
  • I am enjoying this post. I wish more people would talk about the threats hashing the username hinders. Without using a timing attack safe method of comparison, hashing the username is a zero sum gain. – Anthony Rutledge Feb 01 '17 at 13:29
4

If you hashed and salted the username, how would the system know that new accounts had a unique username, without iterating through all existing records and hashing the new username with every single existing salt?

SilverlightFox
3

Your idea is noble and the question interesting. Now, I believe you either did not think of usability at all while bringing up the question or you missed the point of hashing (or maybe you just misspelled 'encryption').

Hashing is irreversible, unless you have supercomputers to brute force things or rainbow tables to search for hashes. Hence the usability would go downhill if you were to hash the usernames/emails used for logins. If, however, one were to 'encrypt' the same using a predefined key in the program (the one which checks the username) itself, then it might be a bit more secure. However, once again, an encryption key stored directly in a program is just as good as no key at all. These are the prime reasons usernames are not hashed or encrypted.

AJ Henderson
  • 1
    I definitely meant hashing it, not encrypting it. If we also have a displayed name, then there is no usability sacrifice, is there? – Grezzo Dec 13 '12 at 17:30
  • @Grezzo - If you have a display name stored separately, the majority of people are going to make their display name closely resemble their username. If you generate their username, then it reduces usability. – AJ Henderson Dec 14 '12 at 14:18
  • Let us [continue this discussion in chat](http://chat.stackexchange.com/rooms/52884/discussion-between-aj-henderson-and-anthony-rutledge). – AJ Henderson Feb 01 '17 at 18:14
3

I think the most likely reason is that hashing the usernames along with the password doesn't actually give any extra protection.

We encourage users to create difficult and complex passwords, making them harder to crack. Any database of hashed usernames could be cracked in minutes with a basic dictionary... unless you require your usernames to have at least 6 alphanumeric characters ;)
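
A minimal Python sketch of that dictionary attack, assuming the usernames were stored as unsalted SHA-256 hashes. The "leaked" names and the wordlist are made up for the example, and `unsalted_hash` and `crack` are hypothetical names.

```python
import hashlib

def unsalted_hash(username: str) -> str:
    return hashlib.sha256(username.encode("utf-8")).hexdigest()

# Pretend this column was dumped from a compromised database.
leaked_hashes = {unsalted_hash(name) for name in ["admin", "jsmith", "x9$Qv!7pLr2"]}

# A tiny made-up wordlist; a real attacker would try millions of candidates.
wordlist = ["root", "admin", "test", "jsmith", "guest"]

def crack(hashes, candidates):
    """Hash each candidate once and check it against every leaked hash at the same time."""
    recovered = {}
    for candidate in candidates:
        digest = unsalted_hash(candidate)
        if digest in hashes:
            recovered[digest] = candidate
    return recovered

recovered = crack(leaked_hashes, wordlist)
print(sorted(recovered.values()))  # ['admin', 'jsmith']
```

The two guessable names fall to a five-word list, while the random-looking one survives. Most real usernames resemble the first kind.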

devel
  • 2
    It would give a very small amount of extra protection, but admittedly not much. It could give the benefit though that it obscures users usernames, which are likely to be email addresses which could be used for targeted phishing attacks. – Grezzo Dec 13 '12 at 23:13
1

This idea falls in line with two tactics: 1) security by obscurity, 2) extra mitigation of timing attacks (if you use a timing safe comparison function). You cannot use a unique salt to do this effectively, but it is a good idea. This would allow one to separate the table that holds login information from the one that holds "person" information. Two tables then: person (could hold plain text email addresses) and user (could hold a hashed username {site wide salted} and a hashed and uniquely salted password). Looking up the user by hashed username is not a problem.
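
The two-table layout could be sketched roughly as follows in Python with an in-memory SQLite database. This is only an illustration under several assumptions: HMAC-SHA256 with a site-wide key stands in for the "site wide salted" hash, PBKDF2 stands in for the password hash, the login table is named `users`, and `username_token`, `register` and `login` are hypothetical names.

```python
import hashlib
import hmac
import os
import sqlite3

# Assumption: a site-wide secret kept outside the database (e.g. in server config).
SITE_KEY = os.urandom(32)

def username_token(username: str) -> str:
    """Keyed, deterministic hash of the login name, so it can still be looked up."""
    return hmac.new(SITE_KEY, username.lower().encode("utf-8"), hashlib.sha256).hexdigest()

def password_hash(password: str, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, email TEXT)")
db.execute("CREATE TABLE users (username_hash TEXT PRIMARY KEY, salt BLOB, pw_hash BLOB)")

def register(email: str, password: str) -> None:
    salt = os.urandom(16)  # unique salt per password
    db.execute("INSERT INTO person (email) VALUES (?)", (email,))
    db.execute("INSERT INTO users VALUES (?, ?, ?)",
               (username_token(email), salt, password_hash(password, salt)))

def login(email: str, password: str) -> bool:
    row = db.execute("SELECT salt, pw_hash FROM users WHERE username_hash = ?",
                     (username_token(email),)).fetchone()
    if row is None:
        return False
    salt, stored = row
    # Timing-safe comparison of the password hashes.
    return hmac.compare_digest(password_hash(password, salt), stored)

register("alice@example.com", "correct horse")
print(login("alice@example.com", "correct horse"), login("alice@example.com", "wrong"))
```

The sketch does not link rows in `person` to rows in `users`. It also returns early for unknown names, which is itself a timing signal that a real system would probably want to mask by hashing a dummy password.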

1

I think this is an excellent idea. As expressed by Anthony Rutledge; this can provide security through obscurity/obfuscation and if using a key derivation function, such as PBKDF2, there is, I presume, increased difficulty, in orders of magnitude, for a would-be attacker attempting to compromise a stolen database.

For example, an offline brute-force attack attempted on a stolen database would require finding hash collisions for both username and password, notwithstanding the fact that the salting of the username in the method, as I mentioned above, would render any identified username collisions moot.

This means that attackers would need to commandeer your machine and also identify and understand your source code, since the salt wouldn't exist anywhere within your database. It adds multiple layers of additional defense, without adding much - if any - significant increase in computational load for the web application.

In regards to issues with hashing: the username can be salted with a proprietary server-side hash, based on the actual username/email.

That would mean the salt is known, albeit programmatically. And the script that calculates and returns the salt can reside on the machine in an out-of-band fashion, relative to the web service - such that any method for generating the salt could be inaccessible from any web-facing entry points.
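
One way to read this is that the salt is computed from the username with a secret that never touches the database. A minimal Python sketch under that reading follows. The use of HMAC-SHA256 to derive the salt, the PBKDF2 iteration count, and the names `SERVER_SECRET`, `derive_salt` and `username_hash` are all assumptions made for illustration.

```python
import hashlib
import hmac

# Assumption: this secret lives outside the database and outside the web root.
SERVER_SECRET = b"replace-with-a-long-random-secret"

def derive_salt(username: str) -> bytes:
    """Per-user salt computed on the fly from the username and the server secret."""
    return hmac.new(SERVER_SECRET, username.lower().encode("utf-8"), hashlib.sha256).digest()

def username_hash(username: str) -> str:
    # The salt is recomputed, never stored, so the result is still usable as a lookup key.
    salt = derive_salt(username)
    digest = hashlib.pbkdf2_hmac("sha256", username.lower().encode("utf-8"), salt, 10_000)
    return digest.hex()

print(username_hash("alice@example.com") == username_hash("Alice@Example.com"))  # True
print(username_hash("alice@example.com") == username_hash("bob@example.com"))    # False
```

The output is deterministic for a given secret, so lookups and uniqueness checks keep working. An attacker who has only the database cannot recompute the hashes without also obtaining the secret.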

schroeder
James