8

I was thinking about this the other day after reading about making safe passwords. You have a few options:

The first would be adding numbers, or something other than just a word:

Password15068 or Pa55w0rd15068 or Passw0rd15068

But I believe the general consensus is that the more random, L337 speaky you get, the harder it is to remember.

The next method I read about was taking 4-5 words and combining them into a phrase or a single word. Say our 5 words are "dog" "walks" "in" "the" "yard". Possible passwords could be:

Dog walks in the yard or Dowainthya or some other jumbled combo...

But this poses the same problem of being hard to remember. So it got me thinking (I don't think I'd ever do it): is cracking software advanced enough to treat an emoji as 1 character, or would it read it out in Unicode? Another thing: say I use this emoji: 💰. It generates this code: xn--ss8h. So would the password "Password💰" be read as 9 letters/nums? Or 17?

Password💰 or Passwordxn--ss8h

If it were read as the Unicode text, that would add extra letters for a cracker to guess (assuming it's a brute force attack). And even if the BF cracker found the password Passwordxn--ss8h, wouldn't entering that in a password field fail to log you in, because it's not the matching emoji?

This is obviously a theoretical situation, and I don't think I'd ever actually condone this, but I think it could definitely help in the future when, instead of the password happyfacemoneybags, you could use the two emoji themselves or xn--ss8h7u, and the system would only let you log in if the correct icons are in the slot.


Update 1

This is in reference to running the possible passwords through a password checker.

Control Password: hellohello

hellohello generates a score of: 0% and a complexity of "very weak".

Some key things to note:

  • I gain a bonus of +40 due to length
  • I am deducted a total of -52 due to letters only, consecutive lowercase and repeat words

hellohello with the emoji added generates a score of: 42% and a complexity of "good".

Some key things to note:

  • I gain a bonus of +64 due to symbols, lowercase, and length
  • I am deducted a total of -24 due to consecutive lowercase and repeat words

HelloHello with the emoji added generates a score of: 80% and a complexity of "very strong".

Some key things to note:

  • I gain a bonus of +94 due to symbols, lowercase, uppercase, and length
  • I am deducted a total of -18 due to consecutive lowercase and repeat words

HeLLoHeLLo with the emoji added generates a score of: 88% and a complexity of "very strong".

Some key things to note:

  • I gain a bonus of +98 due to symbols, lowercase, uppercase, and length
  • I am deducted a total of -10 due to consecutive lowercase and repeat words

Based on these results, adding the symbol clearly makes the password score as a little more secure, but I am not sure what algorithm this password checker is based on.

knocked loose
  • 2
    “ this generates to this code xn--ss8h”... This is incorrect. This only applies to Punycode (an encoding for URLs) and only when the emoji is the only character in the string. Most systems would receive the emoji as UTF-8 (`0xF0`, `0x9F`, `0x92`, `0xB0` in this case), and then apply the hashing algorithm as usual. – Arturo Torres Sánchez Oct 19 '16 at 00:11
  • You've opened Pandora's box... – user2497 Feb 22 '19 at 13:50

4 Answers

12

The difficulty of cracking a password is measured by the entropy in the process used to generate it. For most passwords, this is the base-2 logarithm of the number of possible symbols raised to the power of the number of symbols in the password. For instance, a randomly chosen password consisting of 8 lowercase letters has log2(26 ^ 8) ~ 38 bits of entropy in it.

This method can be extended to "correct horse battery staple"-style passwords. In this case, each word is itself a symbol, chosen from a dictionary of the 10,000 most common words. A password generated randomly this way would have log2(10,000 ^ 4) ~ 53 bits of entropy, and be significantly harder to break.
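To make the arithmetic concrete, here is a minimal sketch of that calculation. The function name `entropy_bits` is made up for illustration, and it only holds under the assumption that every symbol is chosen uniformly at random.

```cpp
#include <cmath>

// Entropy in bits of a password built from `length` symbols, each picked
// uniformly at random from `alphabet_size` possibilities:
// log2(alphabet_size ^ length) == length * log2(alphabet_size).
// (Hypothetical helper; not from any library.)
double entropy_bits(double alphabet_size, int length) {
    return length * std::log2(alphabet_size);
}
```

Calling `entropy_bits(26, 8)` gives roughly 37.6 bits for the lowercase example, and `entropy_bits(10000, 4)` gives roughly 53.2 bits for the four-word example.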

Hopefully by seeing these two examples, you can see that dramatically increasing the base has a far, far smaller effect on password entropy than modestly increasing the exponent. This is important, because by adding additional symbols to choose from (uppercase letters, digits, symbols, and at the extreme, non-Latin Unicode code points like emoji, RTL markers, kanji, etc.), you're increasing the base and not the exponent. This will generally make your password harder to crack, and may even make it more memorable to you, but increasing the length (and thus the exponent) will help even more.

For instance, by including lowercase letters (26), uppercase letters (26), digits (10), common symbols (32), and a handful of emoji (say 200, for example's sake), a random eight-character password will have log2((26 + 26 + 10 + 32 + 200) ^ 8) ~ 66 bits of entropy. On the other hand, simply doubling the length of a password consisting only of lowercase letters would result in a password with log2(26 ^ 16) ~ 75 bits of entropy, which is over five hundred times more difficult to crack.
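The "over five hundred times" figure can be checked with a short sketch like the one below. The helper name `search_space_ratio` is hypothetical, and it assumes both passwords are generated uniformly at random.

```cpp
#include <cmath>

// How many times larger scheme B's search space is than scheme A's.
// Each scheme is (alphabet size, password length); the ratio is
// 2 ^ (bits_b - bits_a). (Hypothetical helper for illustration.)
double search_space_ratio(double alphabet_a, int length_a,
                          double alphabet_b, int length_b) {
    const double bits_a = length_a * std::log2(alphabet_a);
    const double bits_b = length_b * std::log2(alphabet_b);
    return std::exp2(bits_b - bits_a);
}
```

Here `search_space_ratio(294, 8, 26, 16)` compares the eight-character mixed password (294 symbols) with the sixteen-character lowercase one, and comes out somewhere between 500 and 1000.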

While adding emoji can make a password stronger, it's not as effective as simply increasing the length. It won't turn a terrible password like Password15068 into something meaningfully more difficult to crack. And it simply won't work in many real-world password input fields (some will reject it outright, and worse, many others will silently ignore the character). Furthermore, a two-character password made of two emoji will never be secure, for the simple reason that there are only a handful of emoji to choose from; even if you assume ten thousand of them, you're left with a password with only log2(10,000 ^ 2) ~ 27 bits of entropy.

Your best bet is to simply use a password manager. Humans are terrible at memorizing complex passwords, but computers are great at it. Password managers give you the benefit of strong, unique passwords for every site you visit and remove the weakest component from the chain: the human.

Update: I didn't make this clear in the original post, but linear increases in entropy represent exponential increases in difficulty to crack. A password with n bits of entropy requires 2^(n-1) operations on average to crack, so each time the entropy increases by one, the effort required to break the password doubles.
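As a rough sketch of that relationship (the function names and the guess rate are assumptions for illustration, not measurements of any real cracking rig):

```cpp
#include <cmath>

// Average number of guesses needed to find a password with `bits` bits of
// entropy: half of the 2^bits search space, i.e. 2^(bits - 1).
double average_guesses(double bits) {
    return std::exp2(bits - 1.0);
}

// Average time to crack, given a hypothetical guess rate chosen by the caller.
double average_seconds(double bits, double guesses_per_second) {
    return average_guesses(bits) / guesses_per_second;
}
```

Whatever rate you plug in, `average_guesses(n + 1)` is always twice `average_guesses(n)`, which is the doubling described above.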

Stephen Touset
  • Agreed. It won't make a terrible password better. – Mark Buffalo Oct 29 '15 at 19:25
  • So if there is a capital in the password, wouldn't you have to use `52` since it could be either upper or lowercase? `a` registers differently than `A`. – knocked loose Oct 29 '15 at 19:27
  • 2
    Yes, but I specifically mentioned lowercase letters in the example I gave. The entire point of my answer is that increasing the symbol count (e.g., 26 to 52) doesn't increase entropy as quickly as slightly increasing the length (e.g., 8 to 10). – Stephen Touset Oct 29 '15 at 19:29
  • 1
    Not to have sour grapes, but it's generally considered good practice to withhold "accepting" an answer to a question for enough time to give multiple people a chance to provide answers. Accepting too early discourages additional answers. – Stephen Touset Oct 29 '15 at 19:40
  • 1
    @StephenTouset I found your information on password length and entropy rather informative, and appreciate it. I learned something here. I knew longer passwords were always much better, but I hadn't really been able to put it into words until now. – Mark Buffalo Oct 29 '15 at 19:50
3

The purpose of complex passwords is to defend against brute force attacks. Unless you are being directly targeted by an attacker, it's likely that the attacker will just try easy passwords and permutations of them (where "easy" still covers tens of thousands of passwords), then move on to someone else. Those easy passwords are usually based on dictionaries of common (pass)words, rather than on trying every possible combination of characters. At the moment, because almost nobody uses emojis in their passwords, using them is likely to make your password unique and therefore incredibly unlikely to be cracked (as long as it's stored in a remotely secure way), simply because it's obscure. Even if an attacker did add emojis to their dictionary, there are enough emojis in Unicode that, unless you pick something as obvious as ☺, it would still be stronger than most other characters that you might put in your password. So yes, emojis do make your password safer.

However, if you do include an emoji, that doesn't mean that you can just not follow other password guidelines, particularly using a different password on each site. There are some sites that, against all best practices, store users' passwords in plain text or a format that can be converted to plain text. If you used the same password everywhere and one of those sites experienced a security breach, that would make the benefit of including an emoji in that password be precisely zero, because the attacker can now just log into other sites.

Another thing is say I use this emoji: 💰 this generates to this code xn--ss8h so would the password "Password💰" be read as 9 letters/nums? or 17?

Unless the service that your password is for is doing weird things, it would be 9 characters or 12 bytes, depending on the attack used. The xn--ss8h is just a representation of that character that allows it to be used in things like URLs, whereas in reality it is just a number: the code point U+1F4B0, which UTF-8 encodes as the bytes 0xF09F92B0. So if an attacker was determined to get your password, they would either brute force it by trying every character (including emoji), or they would try it byte by byte (in which case it's 0xF0, 0x9F, 0x92, 0xB0).
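The difference between counting characters and counting bytes can be sketched as follows. Both function names are made up for this example, and the encoder assumes its input is a valid code point (it does not reject surrogates or values above U+10FFFF).

```cpp
#include <cstddef>
#include <string>

// Encode one Unicode code point as UTF-8 bytes.
// Assumes `cp` is a valid code point; no validation is done.
std::string utf8_encode(char32_t cp) {
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Count code points in a UTF-8 string: every byte that is not a
// continuation byte (10xxxxxx) starts a new code point.
std::size_t count_code_points(const std::string& s) {
    std::size_t n = 0;
    for (unsigned char c : s) {
        if ((c & 0xC0) != 0x80) ++n;
    }
    return n;
}
```

For `"Password" + utf8_encode(0x1F4B0)`, `size()` reports 12 bytes while `count_code_points` reports 9, and the last four bytes are 0xF0, 0x9F, 0x92, 0xB0.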

JackW
1

If we assume that the method described by Randall Munroe in this XKCD comic strip is sufficient, then we can determine how many emojis you need to make a sufficiently strong password out of only emojis. And as it turns out, emojis are quite comparable to words. There are 3,178 emojis in the Unicode standard (~11.6 bits), while Munroe rated each word at around 11 bits (2,048 word dictionary).
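A small sketch of that comparison, assuming each emoji (or word) is picked uniformly at random; `symbols_needed` is a hypothetical helper name, and 44 bits is simply four words at 11 bits each:

```cpp
#include <cmath>

// Smallest number of randomly chosen symbols from an alphabet of
// `alphabet_size` that reaches at least `target_bits` of entropy.
// (Hypothetical helper for illustration.)
int symbols_needed(double target_bits, double alphabet_size) {
    return static_cast<int>(std::ceil(target_bits / std::log2(alphabet_size)));
}
```

With a 44-bit target, `symbols_needed(44, 3178)` is 4 emoji, whereas `symbols_needed(44, 26)` is 10 lowercase letters.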

tl;dr: "️♐" is at least as good as "horse battery staple correct".

As others have mentioned though, it's very much a possibility that websites will misbehave, either by not allowing the password, or by accidentally weakening it.

Fax
0

DISCLAIMER: I am not a cryptography expert or anything, but here's how I would try an attack against this kind of thing.

This is obviously a theoretical situation, and I don't think I'd ever actually condone this, but I think it could definitely help in the future when, instead of the password happyfacemoneybags, you could use the two emoji themselves or xn--ss8h7u, and the system would only let you log in if the correct icons are in the slot.

I'm having trouble understanding what you mean by, "will only let you log in if the correct icons are in the slot."

It's possible it could "help" only by making passwords longer, unless emojis are only output as lower-case characters. Here's why:

If you have an emoji unicode reference output to xn--ss8h as a string, which is all lower case, then that means you can't use X, N, S, or H. So that reduces the amount of time I need to crack your password.

Regarding two-emoji passwords, I would break them almost instantly, if not instantly.

  1. Take the full Emoji list.
  2. Create a struct table / whatever. Here's some C++ semi-pseudo code.

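    #include <string>

    // One emoji and the text form the question proposes for it.
    struct EmojiDef {
        std::string face;   // the emoji itself, as UTF-8
        std::string value;  // its Punycode form, e.g. "xn--ss8h"
    };

    static const EmojiDef EmojiTab[] = {
        {"\U0001F4B0", "xn--ss8h"},  // U+1F4B0 is the actual Unicode value; xn--ss8h would be your formatting
        // etc... one row per emoji
    };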
    
  3. Loop through the table, and generate all the possible passwords.

  4. Brute force attack using the generated list of passwords.
  5. Wreak havoc.

As you can see, this may actually make it worse, especially in the case of two emojis. They can be represented in a structure table quite easily. Since you are removing several possible characters by making them all lower case, it would probably make it easier.
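Steps 3 and 4 above can be sketched like this for the two-emoji case. The function name `all_pairs` is made up for illustration, and the emoji list is assumed to be a plain vector of UTF-8 strings.

```cpp
#include <string>
#include <vector>

// Generate every ordered two-emoji password from the given list.
// For n emoji this yields n * n candidates.
std::vector<std::string> all_pairs(const std::vector<std::string>& emoji) {
    std::vector<std::string> out;
    out.reserve(emoji.size() * emoji.size());
    for (const auto& first : emoji) {
        for (const auto& second : emoji) {
            out.push_back(first + second);
        }
    }
    return out;
}
```

Taking the 3,178 emoji figure quoted in another answer on this page, that is 3,178 × 3,178, or roughly 10.1 million candidates to try.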

I'm not sure at this point whether or not there are programming languages that will allow you to create Unicode references in either lower case, or upper case, or mixed case, but in your example, I believe you're reducing the number of characters needed to perform a successful brute force attack. Let me try to explain it better:

xn--ss8h7u can't be xN--sS8h7u, can it? Can it be XN--SS8h7u? If you can only have lower case characters when referencing the emoji Unicode, then you're screwed!

UPDATE

If my password is 💰hello, I can only access it as 💰hello; using xn--ss8hhello would not let me log in. So let's take that into consideration.

Remember the structure table? That was just for converting the unicode character to your string output. I will still crack low character emojis very quickly using array comparison loops.

UPDATE 2

Now I'm seeing your point... but if you introduce each emoji like this, you're also likely reducing the amount of necessary characters to test against. Each emoji could be referenced as a single string test, but there are thousands of them, so you're actually increasing the strength of your passwords by quite a bit, if you use a combination of emojis, mixed-case, etc. For simple passwords, this would still be very crackable. You simply change the structure table to an array/list, and insert every single reference, then loop through it, and try to attach it to a weak password. Something stupid like this:

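for (const auto& e : emoji)          // every emoji reference
{
    for (const auto& word : dict)    // every weak password
    {
        if (checkPassword(e + word)) { writeFound(e + word); }  // emoji before
        if (checkPassword(word + e)) { writeFound(word + e); }  // emoji after
    }
}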

UPDATE 3

hellohello generates a score of: 42% and a complexity of "good".

HelloHello generates a score of: 80% and a complexity of "very strong".

HelloHello generates a score of: 80% and a complexity of "very strong".

HeLLoHeLLo generates a score of: 88% and a complexity of "very strong".

Comparing a string array of emojis (either before or after) to a badpassword in a brute force table, generates a score of: all of your passwords cracked in an hour or less. Trying this:

  • emoji + badpassword
  • badpassword + emoji

...would be defeated very quickly through string concatenation of the unicode reference plus the bad password (either before or after), against the input.
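A self-contained sketch of that concatenation attack might look like the following. The name `crack_weak_plus_emoji` and the `matches` callback are hypothetical stand-ins for whatever check an attacker actually has (for example, comparing against a leaked hash).

```cpp
#include <functional>
#include <string>
#include <vector>

// Try every weak password with every emoji prepended or appended.
// Returns the first candidate accepted by `matches`, or an empty string
// if none is. At most 2 * emoji.size() * weak.size() checks are made.
std::string crack_weak_plus_emoji(
    const std::vector<std::string>& emoji,
    const std::vector<std::string>& weak,
    const std::function<bool(const std::string&)>& matches) {
    for (const auto& word : weak) {
        for (const auto& e : emoji) {
            if (matches(e + word)) return e + word;  // emoji + badpassword
            if (matches(word + e)) return word + e;  // badpassword + emoji
        }
    }
    return "";
}
```

The work grows only linearly with the number of emoji, so tacking one emoji onto a password that is already in a dictionary adds very little.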

CONCLUSION

Yes, I believe so, but only if used in conjunction with typical best practices for passwords, and by using more than one of them. At that point, a lot of people suggest using a password manager instead.

Maybe someone with more knowledge should chime in, but I wouldn't trust several parts of this system at first glance, especially if you wanted to use two-emoji passwords. Or even 10, or 20 in a row. That essentially becomes a collection of numbers represented in an array, and smaller number-based passwords are among the most insecure, and easy to crack.

Mark Buffalo
  • 1
    Hmm, good points. With Unicode, doesn't there have to be a command to convert `xn--ss8h` from text to an icon? Let me change my password on something real quick and see if I can log in with the Unicode text. So I used a locally hosted site that didn't restrict the content of the password to do this. If my password is `💰hello`, I can only access it as `💰hello`; using `xn--ss8hhello` would not let me log in. So let's take that into consideration. – knocked loose Oct 29 '15 at 18:32
  • 1
    So with your 2nd update, I decided to test the theory. I am using [this site](http://www.passwordmeter.com/) to check password strengths. The only thing is that it has to be 8 chars or more, so I'll use `hellohello` as the password in question. (I'll update my question). – knocked loose Oct 29 '15 at 19:00
  • I would still crack your two-emoji passwords instantly, and I'm not even a cryptologist. You can output Unicode characters in almost any programming language. Parsing all of them would be no problem. You'd need to use a mix of emojis and the usual password length/etc. Some would suggest password managers to make this unnecessary, though. – Mark Buffalo Oct 29 '15 at 19:06
  • 1
    Yes, I understand that the 'emoji story' password would be easiest to crack, mostly due to length, but now I am considering tacking it on to an already insecure password, such as `hellohello`. – knocked loose Oct 29 '15 at 19:13
  • No, that wouldn't help. See my latest update with the ghetto `for loop`. Add the emoji before or after, and check against all insecure passwords in a database. You'll loop through the array, testing all emojis using simple string concatenation. `"emoji + weakpassword";`. `"weakpassword + emoji";`, etc. It will only help if you combine them with best password practices. – Mark Buffalo Oct 29 '15 at 19:15
  • 1
    Ah, sorry, I didn't catch the loop. I'm not really a back end coder so I didn't quite understand it. But yes, I see how it would just crack for the weakpassword, then concatenate the emoji options on to find the *secure* password. However, how would you, or in this case the computer, determine that it has cracked the weak password and should start trying emojis? Or even how would it know what space contained an emoji? – knocked loose Oct 29 '15 at 19:19
  • Eventually, it will be known. Maybe someone will attempt it for fun, maybe your server gets hacked and they find your code. Maybe others will try what you did with emojis, and reveal what they're doing to others. You're relying on security through obscurity here. I've already defeated your weak+emoji, emoji+weak passwords while barely trying. Imagine a skilled cracker... bad news bears. It wouldn't be likely for you to use normal non-emoji Unicode, as most people simply can't type those. You should only include them with a strong password to begin with. – Mark Buffalo Oct 29 '15 at 19:22
  • 1
    Hm, I could definitely see how it wouldn't help an already terrible password. Thanks for your answer, I'll be able to sleep peacefully tonight knowing that I shouldn't start including emojis in passwords :) – knocked loose Oct 29 '15 at 19:30