17

I am wondering about the security implications of using email as the main identifier via which a user logs in (with an associated password, of course). Can I treat it as I would a normal username or should extra precautions be taken, like 'sanitizing' the email address before comparing it for uniqueness by removing any + parts (e.g. somebody+note@example.org)?

The one true implication I can think of right away is to be careful not to make it publicly checkable whether an email was registered in the system (e.g. by failed password checks).

Gregor Petrin
  • Have you thought about adding a magic value to each email account, so that it masks the original? – Saladin Mar 27 '13 at 13:57
  • No, I really just wanted some feedback in order not to miss a potential issue as I couldn't come up with any serious ones (besides the failed password checks revealing if an email is registered). And a quick search didn't turn up anything written directly on this subject so it seemed a legitimate question. – Gregor Petrin Mar 27 '13 at 14:18
  • I agree, it's a thought-provoking question. – Saladin Mar 27 '13 at 14:24
  • 1
    @Saladin - I think adding some magic value would counteract the intended purpose to have the account be easily remembered and globally unique. Hiding the original is desirable, but can't really be done without counteracting the reason for using it in the first place. – AJ Henderson Mar 27 '13 at 14:27
  • Thanks, AJ, for responding. I was thinking that this masking remains internal: e.g. abc@me.com is what is known to the public, but for an attacker to get in using this email as the username, he would have to find the translated magic value that gets appended on the fly. This value can then be hashed and compared with the backend user DB. – Saladin Mar 27 '13 at 14:34
  • To explain further: this magic value is session-based and for one-time use only; the hash for it can be precomputed and stored. – Saladin Mar 27 '13 at 15:05
  • I strongly object to the notion that stripping an arbitrary part from the e-mail address "sanitizes" it. While it is true that *many* e-mail systems treat the `+` character as special, this is merely an informal convention. What's next, "sanitizing" the address by removing all the characters `a`, `4`, `%` and `Q`? (a completely arbitrary choice) – Piskvor left the building Mar 27 '13 at 18:25
  • Maybe off topic, but: the main problem with email as a username is that my email address can change. When that happens I usually want to keep the username the same. – Hennes Mar 28 '13 at 10:52
  • A bit off topic for a security-related discussion, but a valid point nonetheless. A service I've been using for a long time distinguishes between my creating & contact email. It feels a bit weird entering my old college email address to log in every time :) – Gregor Petrin Mar 28 '13 at 22:59

5 Answers

19

The email is usually a good thing as a username because:

  • the user remembers it;
  • it is unique worldwide, thus simplifying the management of collisions (if one user wants to use the same login name as another, then one of them made a mistake);
  • it can be coupled with an "email verification" system which is convenient if you want the server to be able to contact users in case of emergency.

Nominally, at least the right half of the email address (the domain name, after the "@" sign) is case-insensitive, so you should normalize that part to lowercase. This is easy, since it is supposed to be a valid domain name and is hence limited to a subset of ASCII. (Note: take care to use what .NET calls the invariant culture and Java terms the root locale; otherwise, your code will break in Turkey, where locale-aware case mapping does not convert "I" to "i".)

For what is on the left, case sensitivity depends on the receiving site. Most sites are case insensitive for that part too, and it seems "reasonable" to do lowercase normalization, because it is improbable that a given site is both case sensitive and uses case to distinguish between distinct people (i.e. that bob@example.com and BOB@example.com are both valid addresses for two different Bobs). Thus, I suggest lowercase normalization of the whole address for comparison purposes (i.e. to decide which user we are talking about); but keep the address "as is" if you ever want to send back an email to the user, or even if you want to show it to the user (e.g. as a "Welcome, Bob@example.com" banner -- Bob might be quite fond of his uppercase 'B').

About the "+" sign: From your point of view, that's part of the address. This "+" is handled on some sites as a way for each user to generate a lot of functionally equivalent addresses: Bob will be able to use bob+work@example.com, bob+home@example.com, bob+the-ultimate-warlord@example.com... All emails sent to any of these addresses end up in Bob's mailbox but, in Bob's eyes, they are still distinct addresses which Bob types as such. Bob expects the addresses to be considered distinct. So your handling of the "+" depends on what you really want:

  • If you just want a unique "login name" so that management of collisions is easy, then leave the address "as is"; don't do anything special with the "+".
  • If you want to enforce uniqueness of accounts per human user (i.e. you don't want Bob to be able to create one million distinct accounts), you may want to remove the characters from the "+" sign up to (but not including) the "@" sign, there again for comparison purposes only. But don't believe this rule will deter most Bobs; obtaining zillions of email addresses without a "+" is easy and cheap (the ultimate way being to buy a domain and rent a server to host it).
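As a rough illustration of the second option, here is a minimal Python sketch of a comparison key that ignores the "+something" part. The function name `plus_collapsed_key` is made up for this example, and the code assumes the simple unquoted form of the local part; a quoted local part that contains a "+" would need real parsing. The key is meant for uniqueness checks only, never for sending mail.

```python
def plus_collapsed_key(address: str) -> str:
    """Return a comparison key that drops a '+tag' suffix from the local part.

    Hypothetical helper: it only understands plain, unquoted local parts.
    """
    # Split on the LAST '@' so that the domain is always the right-hand part.
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError("not an email address: %r" % address)
    # Keep everything before the first '+'; the tag itself is discarded.
    base, _, _tag = local.partition("+")
    if not base:
        # A local part that starts with '+' has nothing left to keep,
        # so leave it untouched rather than produce an empty local part.
        base = local
    return base + "@" + domain
```

With this, `bob+work@example.com` and `bob+home@example.com` both map to `bob@example.com` for the uniqueness check, while the stored address stays exactly as Bob typed it.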

Summary: keep the address as entered at registration time and use it "as is" for display and for sending emails. For comparisons (i.e. locating the user entry in the table of users, e.g. upon login), normalize the email to lowercase (with an invariant culture).
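The summary could look roughly like this in Python. This is only a sketch: `UserRecord`, `lookup_key` and `register` are illustrative names, and the in-memory dict stands in for a real users table. Python's `str.lower()` does not consult the process locale, so it plays the role of the "invariant culture" here.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class UserRecord:
    email_as_entered: str  # used for display and for sending mail
    email_key: str         # used only to locate the user


def lookup_key(address: str) -> str:
    """Lowercase the whole address for comparison purposes."""
    local, sep, domain = address.strip().rpartition("@")
    if not sep or not local or not domain:
        raise ValueError("not an email address: %r" % address)
    # str.lower() is locale-independent in Python.
    return local.lower() + "@" + domain.lower()


def register(users: Dict[str, UserRecord], address: str) -> UserRecord:
    """Store the address as typed, indexed by its normalized key."""
    key = lookup_key(address)
    if key in users:
        raise KeyError("address already registered")
    record = UserRecord(email_as_entered=address.strip(), email_key=key)
    users[key] = record
    return record
```

Registering `Bob@Example.COM` and later logging in as `bob@example.com` finds the same record, and the welcome banner can still show the address with Bob's uppercase 'B'.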

Thomas Pornin
  • 1
    `"the user remembers it;"` I use a unique email address for every website, so this is not at all practical for me. `"it is unique worldwide"` So is my username, or else I'd choose another. `"it can be coupled with an "email verification""` As if websites otherwise won't want my email address. But I'm going off topic; good post as always. – Luc Mar 27 '13 at 13:50
  • Isn't coupling the username to the method of emergency contact bad (in the real world where people reuse passwords) since it makes it an unreliable means of emergency contact when a compromise occurs? – AJ Henderson Mar 27 '13 at 14:24
  • 1
    `"since it is supposed to be a valid domain name, hence limited to a subset of ASCII"` Domain registration using non-Latin scripts has been available since 2011; unless there are additional restrictions on email addresses, your advice is no longer correct. Even if there still is an ASCII restriction today, assuming it will always be present is IMO dangerous. – Dan Is Fiddling By Firelight Mar 27 '13 at 15:38
  • 1
    Technically, internationalized domain names are encoded with [Punycode](http://en.wikipedia.org/wiki/Punycode), so they still are, at the machine level, ASCII. But, indeed, some users will want to type their email address _with the non-Latin characters_ so, in order to support that, someone has to do the Punycode conversion at some point. – Thomas Pornin Mar 27 '13 at 15:50
  • Case-sensitivity in local part is *uncommon* nowadays, but not *improbable*. Bob@example.com and bob@example.com may be, [as far as the respective RFC goes](https://tools.ietf.org/html/rfc5321#section-2.3.11), two different e-mail addresses. **It is unwise to *assume* identity because it's less work for me**, never mind that the RFC specifically prohibits this: "the local-part MUST be interpreted and assigned semantics only by the host specified in the domain part of the address." You seem to be contradicting yourself: should the local part be munged (`[bB]ob`), or not (`bob+something`)? – Piskvor left the building Mar 27 '13 at 18:32
  • 4
    I am trying to follow how the _user_ thinks -- that's the key to a good user interface. Most users I encounter are case-insensitive in their head, but consider the '+something' to actually matter, so they will expect "the machine" to accept what they type regardless of case, but to honour differences in the '+something'. – Thomas Pornin Mar 27 '13 at 20:11
  • 1
    @Luc '"it is unique worldwide" So is my username'. How do you know? Have you created accounts with your user name at every single web site that takes user names? – Martin Brown Jan 06 '14 at 16:30
  • @MartinBrown Good morning mister edge-case ;). I'm pretty sure no one else uses my handle 'lucb1e'. If it's not available somewhere, it's more likely someone is trying to impersonate me than anything else. (Or I simply had an account already.) So far at least I've never had collisions, and it's been quite a few years of registering anywhere from Twitter to Gmail to, well, anywhere. – Luc Jan 07 '14 at 08:00
  • [§4.1.2](https://tools.ietf.org/html/rfc5321#page-42) states that "While the above definition for Local-part is relatively permissive, for maximum interoperability, a host that expects to receive mail SHOULD avoid defining mailboxes where the Local-part requires (or uses) the Quoted-string form **or where the Local-part is case- sensitive.**" So while strictly permissible, the RFC strongly suggests that receiving implementations not impose case sensitivity on the mailbox portion of the address. IMO that gives application writers some ground to stand on. – Ben Collins Jun 26 '15 at 20:18
5

The main risk to using e-mails as usernames is that it gives up a secondary means of contacting the user. Many (most?) users use the same password for multiple things. If the username and password are compromised and the username is their e-mail, then their e-mail account will also likely end up compromised. This can make account recovery very difficult unless you use something like an SMS based recovery system.

AJ Henderson
  • I don't think using the email address as username will significantly increase how many users use the same password on your website as for their email account. Nearly all websites require an email address to be entered anyway. – Luc Mar 27 '13 at 13:51
  • @Luc - right, it isn't that it would make more people use the same password. It is that if the e-mail is disclosed to someone who has compromised an account, then the e-mail cannot be reliably used for account recovery. I.e., I have an account abc@123.com. I use the same password for my e-mail. My account gets compromised and I go to try and recover it. How do you verify it is me? Well, you could send me an e-mail, but since the attacker knows my e-mail, he can take over my e-mail as well. It's safer to have a decoupled (side channel) means to connect with a user. – AJ Henderson Mar 27 '13 at 14:22
  • A decent solution is to allow users to provide additional ways of contacting them, and in my case we do support several OAuth credential providers to be registered in addition to (or instead of) the email. In case of a security breach, customers can be contacted through these channels and one hopes not all were compromised. Something like a phone number would be even better because it relies on something physical, but that would require a lot of resources for a small product, so I am hoping the email providers offer this and customers can first restore their emails and then our site's credentials. – Gregor Petrin Mar 28 '13 at 08:46
  • @GregorPetrin - I agree, that's why I mentioned the SMS system, though any alternate contact means would work. There are existing systems that will let you send messages to SMS for what I think is reasonably cheap. Maybe even free for lower volume. – AJ Henderson Mar 28 '13 at 13:07
3

The email address is often a bad thing as a username (for highly sensitive functionality such as banking) because:

  1. It allows someone to maliciously cause someone else's account to be locked (if I know your email address and you bank with Barclays, I can lock you out of your account by repeatedly attempting wrong passwords).
  2. An attacker can try hacked username and password combinations from other sites on the banking site.
  3. It makes it possible to iterate through many accounts, trying a few common passwords on each.
  4. It reduces the amount of information an attacker needs to know (essentially the user id is a bit of information you are expected to write down or otherwise record but that still extends the effective length of the password).
  5. Email addresses can and do get re-used. You don't want someone who happens to get re-allocated a used email address to be able to carry out reset password requests etc.

This all needs balancing against a user being able to remember their username. On many sites there is a "send me my username" function linked to an email address. Others have already covered the positives of using email as the user id, though.

Andy Boura
2

Can I treat it as I would a normal username or should extra precautions be taken, like 'sanitizing' the email address before comparing it for uniqueness by removing any + parts (e.g. somebody+note@example.org)?

See "I Knew How To Validate An Email Address Until I Read The RFC" for more information, but I wouldn't go about sanitising the address, as believe it or not the following are all valid:

  • "Abc\@def"@example.com
  • "Fred Bloggs"@example.com
  • "Joe\Blow"@example.com
  • "Abc@def"@example.com
  • customer/department=shipping@example.com
  • $A12345@example.com
  • !def!xyz%abc@example.com
  • _somename@example.com

Using the email address as a username can lead to username enumeration if implemented incorrectly, and it is more of a target than arbitrary usernames - the attacker is more likely to know valid ones since they are likely to be the public email addresses of users. However, this is a solvable problem.
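One way to avoid enumeration at login is to make the failure path identical whether or not the address exists. The Python sketch below is only an illustration under assumptions: the names (`login`, `make_user`, `GENERIC_FAILURE`), the in-memory dict and the PBKDF2 parameters are made up for the example and are not a recommendation for a specific password hashing setup.

```python
import hashlib
import hmac
import os
from typing import Dict, Tuple

GENERIC_FAILURE = "Invalid email or password."


def _hash(password: str, salt: bytes) -> bytes:
    # Illustrative parameters only.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


# Dummy credentials so that unknown addresses cost the same work as known ones.
_DUMMY_SALT = os.urandom(16)
_DUMMY_HASH = _hash("dummy", _DUMMY_SALT)


def make_user(password: str) -> Tuple[bytes, bytes]:
    salt = os.urandom(16)
    return salt, _hash(password, salt)


def login(users: Dict[str, Tuple[bytes, bytes]], email: str, password: str) -> Tuple[bool, str]:
    """Return (success, message); the failure message never says which part was wrong."""
    known = email in users
    salt, stored = users.get(email, (_DUMMY_SALT, _DUMMY_HASH))
    candidate = _hash(password, salt)  # always computed, even for unknown users
    ok = hmac.compare_digest(candidate, stored) and known
    return (True, "OK") if ok else (False, GENERIC_FAILURE)
```

The same idea applies to registration and password reset forms, which should also avoid confirming on screen whether an address is already in the system.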

Account lockout DoS can be mitigated by throttling repeated failed login attempts or password resets by both email address and IP address individually.
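A minimal sketch of that kind of throttling, in Python, might look as follows. It keeps separate sliding-window failure counters per email address and per IP address; the class name, the thresholds and the in-memory storage are all assumptions for the example (a real deployment would likely use shared storage and tuned limits).

```python
import time
from collections import defaultdict, deque


class FailureThrottle:
    """Counts recent failures per email and per IP, each on its own."""

    def __init__(self, max_failures=5, window_seconds=300.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self.clock = clock
        self._failures = defaultdict(deque)  # key -> timestamps of failures

    def _recent(self, key, now):
        timestamps = self._failures[key]
        # Drop failures that have fallen out of the window.
        while timestamps and now - timestamps[0] >= self.window_seconds:
            timestamps.popleft()
        return len(timestamps)

    def is_blocked(self, email, ip):
        now = self.clock()
        too_many_for_email = self._recent(("email", email), now) >= self.max_failures
        too_many_for_ip = self._recent(("ip", ip), now) >= self.max_failures
        return too_many_for_email or too_many_for_ip

    def record_failure(self, email, ip):
        now = self.clock()
        self._failures[("email", email)].append(now)
        self._failures[("ip", ip)].append(now)
```

Because the block is temporary and expires with the window, an attacker who hammers one address only slows logins to it for a while instead of locking the owner out for good.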

You should also validate the email address of all new users by getting them to click a link in an activation email. This ensures the email address is the user's and that they can reset access in future if need be. You can combine registration with the password reset system in order to kill two birds with one stone.
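For the activation link, one possible approach (not the only one) is a signed, expiring token embedded in the URL. The Python sketch below uses an HMAC over the address and an expiry time; `SECRET_KEY`, the function names and the token layout are assumptions made for this example, and a random token stored server-side would work just as well.

```python
import hashlib
import hmac
import time

# Assumption: a random secret that lives only on the server.
SECRET_KEY = b"replace-with-a-random-server-side-secret"


def _sign(payload: str) -> str:
    return hmac.new(SECRET_KEY, payload.encode("utf-8"), hashlib.sha256).hexdigest()


def make_activation_token(email, now=None, ttl_seconds=86400):
    """Build 'email|expiry|signature' to embed in the activation link."""
    issued = time.time() if now is None else now
    payload = "%s|%d" % (email, int(issued + ttl_seconds))
    return payload + "|" + _sign(payload)


def verify_activation_token(token, now=None):
    """Return the email if the token is authentic and not expired, else None."""
    try:
        payload, signature = token.rsplit("|", 1)
        email, expires_text = payload.rsplit("|", 1)
        expires = int(expires_text)
    except ValueError:
        return None
    # Compare as bytes in constant time.
    if not hmac.compare_digest(signature.encode("utf-8"), _sign(payload).encode("utf-8")):
        return None
    current = time.time() if now is None else now
    if current > expires:
        return None
    return email
```

The same token mechanism can back the password reset email, which is one way to share the code path between registration and reset.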

SilverlightFox
1

Email as username is a poor idea, because a significant class of potential users has no long-term control over their email. ISPs come and go, and for one reason or another email addresses become unusable or unavailable.

ddyer