95

On Chrome, if you open a sign-up page, it will offer to fill in and remember a generated password. I did this and was offered the following sequence of generated passwords:

suCipAytAyswed0
LUnhefcerAnAcg2
it2drosharkEweo
UndosnAiHigcir0
AKDySwaybficMi5
DorrIfewfAidty5
MeecradGosdovl9
KasEsacHuhyflo4
OngouHemNikEyd0

In all of these, there is only one digit, and in all of them except the third the digit is at the end. The chance of me getting that many with this exact pattern is extremely low, and I've seen it in other passwords generated by Chrome, so I'm confident it would continue if I generated some more.

This just seems weird. Given that these passwords are not supposed to be remembered, wouldn't it be easier to generate each one as a random collection of letters and numbers than to enforce some strange pattern? Why would Chrome do something else?

Azteca
gngdb

4 Answers

184

Conor's answer is a good starting point, but if you dig into Chromium's source the situation starts to look a little bleaker (but still better than not using a password manager at all).

Chrome 68 (current version as of August 1st, 2018)

Up through version 68, Chrome follows FIPS 181 to generate a 15-character pronounceable password allowing uppercase letters, lowercase letters, and numbers. If the result doesn't contain both an uppercase letter and a number, it changes the first lowercase letter to uppercase, and changes the last lowercase character to a random digit.
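
The fix-up step described above can be sketched roughly as follows. This is a minimal illustration written from the prose description, not Chromium's code: the function name `force_fix_password` and the use of Python's `secrets` module are assumptions made for the example.

```python
import secrets
import string


def force_fix_password(password: str) -> str:
    """Hypothetical re-creation of the fix-up step described in the text."""
    has_upper = any(c in string.ascii_uppercase for c in password)
    has_digit = any(c in string.digits for c in password)
    if has_upper and has_digit:
        return password  # already contains both, nothing to fix

    chars = list(password)
    # Change the first lowercase letter to uppercase.
    for i, c in enumerate(chars):
        if c in string.ascii_lowercase:
            chars[i] = c.upper()
            break
    # Change the last remaining lowercase letter to a random digit.
    for i in range(len(chars) - 1, -1, -1):
        if chars[i] in string.ascii_lowercase:
            chars[i] = secrets.choice(string.digits)
            break
    return "".join(chars)


print(force_fix_password("sucipaytayswedo"))
```

Under this reading, any password that needs fixing ends in a digit whenever its last character was a lowercase letter, which matches the pattern in the question.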

Unfortunately the entropy of a FIPS 181 password is pretty hard to calculate, as it generates variable length syllables rather than characters, and there are a bunch of rules dictating whether or not a syllable is allowed.

The non-uniformity has severe implications. A 1994 paper (page 192) estimated that to break into 1 out of 100 accounts with 8-character passwords, an attacker would only have to try 1.6 million passwords. Even if increasing the length from 8 to 15 doubles the entropy, that's still probably under 60 bits of entropy on average¹, though this is improved slightly due to capitalization.

The original standard doesn't appear to support uppercase letters or numbers, and the implementation² only capitalizes the first letter of a syllable with a 50% chance (interestingly y is replaced with w in the array of characters checked, so y will never be capitalized). This means that rather than adding about 1 bit of entropy per letter, capitalization only adds 1 bit per syllable. The number of syllables isn't constant, so it's hard to determine how much entropy this actually adds, but given the scarcity of single letter syllables it almost certainly adds less than 8 bits on average.

Numbers and symbols are supported by turning each single letter syllable alternately into a digit or symbol with 50% chance (though the symbol feature isn't used). Unfortunately, as you have noticed, single letter syllables are uncommon³, so ForceFixPassword usually ends up swapping out the last lowercase letter for a digit.

There may be more issues, but I'm getting a bit tired of looking at it. In short, this isn't a very good method of generating passwords, and the entropy is significantly less than one would expect for the length. In practice this is still probably OK for the average user, as it means they won't be using their favorite low-entropy password in 20 different places, but breaking it is definitely possible for a determined and funded attacker with a fast hash (i.e. not a good password hash) of the password.

Chrome 69 (scheduled for release in September)

Things are looking much better in Chrome 69. The character set is upper and lowercase letters, numbers, and the symbols -_.:! with the following removed for readability:

  • l (lowercase letter L)
  • I (capital letter i)
  • 1 (digit one)
  • O (capital letter o)
  • 0 (digit zero)
  • o (lowercase letter O)

The generation works by adding a random character from each class until the minimum count for said class is met. By default and as currently used this is one lowercase character, one uppercase character, and one digit.

Then the rest of the password is filled with random characters evenly chosen from all character classes (respecting a maximum count for each class, currently unused).

Finally, since characters were added to the beginning of the password from predictable classes in order to satisfy requirements, the string is randomly shuffled. The shuffling happens up to 5 times if two dashes or underscores are adjacent in order to improve readability, so this will very slightly reduce the entropy, but the reduction is so slight as to be unnoticeable (and removing dash or underscore from the allowed symbols would be worse).
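
The three steps above (required characters, random fill, shuffle) can be sketched as follows. This is an illustration based only on the description in this answer, not Chromium's implementation. The names are hypothetical, the fill step is assumed to pick uniformly from the combined 61-character set, the unused per-class maximum counts are omitted, and "two dashes or underscores are adjacent" is read here as a repeated `--` or `__`.

```python
import secrets
import string

# Character classes with the look-alike characters removed.
LOWER = "".join(c for c in string.ascii_lowercase if c not in "lo")
UPPER = "".join(c for c in string.ascii_uppercase if c not in "IO")
DIGITS = "".join(c for c in string.digits if c not in "01")
SYMBOLS = "-_.:!"
ALL_CHARS = LOWER + UPPER + DIGITS + SYMBOLS  # 61 characters

REQUIRED_CLASSES = (LOWER, UPPER, DIGITS)  # minimum count of one each
MAX_SHUFFLES = 5


def hard_to_read(chars) -> bool:
    """Assumed readability check: a doubled dash or doubled underscore."""
    return any(a == b and a in "-_" for a, b in zip(chars, chars[1:]))


def generate_password(length: int = 15) -> str:
    rng = secrets.SystemRandom()  # OS-backed CSPRNG
    # Step 1: one character from each required class.
    chars = [rng.choice(cls) for cls in REQUIRED_CLASSES]
    # Step 2: fill the rest from the whole character set.
    chars += [rng.choice(ALL_CHARS) for _ in range(length - len(chars))]
    # Step 3: shuffle so the required characters are not always first,
    # reshuffling (up to MAX_SHUFFLES times in total) if hard to read.
    for _ in range(MAX_SHUFFLES):
        rng.shuffle(chars)
        if not hard_to_read(chars):
            break
    return "".join(chars)


print(generate_password())
```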

With 61 possible characters a fully random password would have log2(61^15) = 88.96 bits of entropy. Using inclusion-exclusion to account for the required characters, I come up with 88.77 bits of entropy:

61^15          all possible passwords
-53^15         passwords without digits 2-9 (0 and 1 are excluded)
-37^15         passwords without lowercase letters (l and o excluded)
-37^15         passwords without uppercase letters (I and O excluded)
+29^15         add back passwords excluded twice for lack of digit and lowercase
+29^15         add back passwords excluded twice for lack of digit and uppercase
+13^15         add back passwords excluded twice for lack of lowercase and uppercase
-5^15          remove all-symbol passwords that were excluded then added back
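
For anyone who wants to check the arithmetic, the inclusion-exclusion sum above can be evaluated with exact integers. This is just a verification sketch; the variable names are mine.

```python
import math

LENGTH = 15
LOWER, UPPER, DIGITS, SYMBOLS = 24, 24, 8, 5  # class sizes after exclusions
TOTAL = LOWER + UPPER + DIGITS + SYMBOLS      # 61


def count(alphabet_size: int) -> int:
    """Number of length-15 strings over an alphabet of the given size."""
    return alphabet_size ** LENGTH


valid = (
    count(TOTAL)
    - count(TOTAL - DIGITS)    # 53^15: no digit
    - count(TOTAL - LOWER)     # 37^15: no lowercase
    - count(TOTAL - UPPER)     # 37^15: no uppercase
    + count(UPPER + SYMBOLS)   # 29^15: no digit and no lowercase
    + count(LOWER + SYMBOLS)   # 29^15: no digit and no uppercase
    + count(DIGITS + SYMBOLS)  # 13^15: no lowercase and no uppercase
    - count(SYMBOLS)           # 5^15: symbols only
)

full_bits = math.log2(count(TOTAL))
valid_bits = math.log2(valid)
print(f"unconstrained: {full_bits:.2f} bits")
print(f"with required classes: {valid_bits:.2f} bits")
```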

The extra shuffling will shave off a fraction of a bit as well, but I don't have time to calculate it right now. In the end, the password should have over 88 bits of entropy, which is pretty good.

The old generator still exists in version 69, but when I tested the dev build it was using the new one. Whether or not there's any way to use the old generator I don't know.


1. Average entropy isn't necessarily useful with non-uniform distributions. The original 1975 paper gives an example (pages 29-30) of a generator that produces a single password (e.g. "password") with a 50% chance, and a high entropy password otherwise. The average entropy may be high, but there's still a 50% chance the password will be guessed immediately. Even so, extrapolating from the 1994 analysis, I believe it should still have well over 40 bits in the worst case.

2. The implementation isn't actually Chrome's, but is taken from the APG program, with minor modifications for compatibility.

3. Testing with apg reveals that single letter syllables actually occur in about 33% of passwords, but 70-75% of those only have a single letter syllable at the end.

AndrolGenhald
  • 7
    Great job on the math, and have upvoted this answer, but I really think that "bleak" is overstating the problem. You're adding "only" 30 bits of entropy. That still makes it literally about 1 billion times harder to guess the password. I'd say that's far from perfect, but you're safe from anything except an offline attack using leaked password hashes. And, even if the password is compromised in an offline attack, they can't use the same password to break into other sites because the password isn't re-used on other sites. – Patrick M Aug 01 '18 at 21:58
  • 1
    @PatrickM Yeah, I'm considering adding a bit more about how it's still probably ok for the average user. Since the non-uniform password distribution is difficult to calculate I'm not actually that confident it's over 50 bits of entropy, but that's still better than people who would otherwise have a much worse password. – AndrolGenhald Aug 01 '18 at 22:00
  • 4
    It should be pointed out that 40 bits of entropy, while low for a generated password, is still probably _orders of magnitude_ better than a password chosen by a user, as many users may choose things like "secret" or "password"... – sleske Aug 03 '18 at 09:48
38

Let me start with an important and accurate caveat:

http://dilbert.com/strip/2001-10-25

The math

Looking at the actual odds, there are 62 possible characters (a-zA-Z0-9) and therefore assuming all are equally distributed that means that any given character has a ~16% chance of being a digit.

You have shown us 136 characters in total, which means that on average there should be ~21 digits instead of the 9 you have shown. Running through 1000 random trials I get a mean of 21.5 digits in such a run with a standard deviation of 4.3. That means that your 136 character string with only 9 digits is about a 3-sigma outlier (only a 0.3% chance of being due to random chance). Of course, this presumes that you didn't have any examples with more numbers that you left out of your example.
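
The same estimate can be reproduced analytically rather than by simulation. The sketch below assumes, as above, that each of the 136 characters is independently a digit with probability 10/62; the variable names are mine.

```python
import math

n = 136        # characters observed
p = 10 / 62    # chance a uniformly chosen alphanumeric character is a digit
observed = 9   # digits actually seen

mean = n * p
sd = math.sqrt(n * p * (1 - p))
z = (mean - observed) / sd

# Exact binomial probability of seeing `observed` or fewer digits.
tail = sum(
    math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(observed + 1)
)

print(f"expected digits: {mean:.1f} +/- {sd:.1f}")
print(f"observed is {z:.1f} standard deviations below the mean")
print(f"P(X <= {observed}) = {tail:.5f}")
```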

The conclusion

This suggests that the lack of numbers is not just random. That could mean a couple of things: either the password generation algorithm Google is using just sticks a digit on the end, or there is something wrong with their password generation algorithm or source of entropy. Without looking at their source it is hard to say which of those cases is more likely. Of course, it isn't as simple as just sticking a digit on the end, because you have an example with a digit towards the beginning. So if this is an intentional result of their password generation algorithm, then their algorithm is very strange, which makes me think something else is going on.

Update: Per Androl's answer above, Chrome's password generation algorithm is indeed doing something very strange.

The answer

Of course none of this answers your actual question: do these passwords have enough entropy? Ignoring the digit of unknown randomness, we can think of this at least as a string of random letters of length 14. Such a string (presuming they are using a good CSPRNG) would have 1.06e24 possible values, or ~80 bits of entropy. For comparison, a 15 character long string composed of letters or digits would have ~90 bits of entropy. Is that "enough" entropy? Well, that is a much harder question to answer. It depends: is that password then stored on a website that uses MD5 and leaks its database to offline cracking rigs? Is it stored on a website that stores passwords in plain text (in which case no amount of entropy matters)? How about a website that secures your password with modern best practices? This answer gives a useful comparison and suggests that for plain SHA-256 a password with 80 bits of entropy is effectively uncrackable (with lots of caveats):

https://security.stackexchange.com/a/168511/149676

As a result, I would say that even if they aren't properly randomizing digits the password is long enough that there is more than enough entropy for the foreseeable future.
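
The two entropy figures quoted above are straightforward to recompute. The sketch assumes every character is chosen uniformly and independently, which is the idealized case rather than what Chrome actually does.

```python
import math

# 14 random letters (a-z, A-Z), ignoring the digit of unknown randomness.
letters_14 = 52 ** 14
bits_letters_14 = math.log2(letters_14)

# 15 random characters drawn from letters and digits.
bits_alnum_15 = 15 * math.log2(62)

print(f"52^14 = {letters_14:.3g} values, {bits_letters_14:.1f} bits")
print(f"62^15 -> {bits_alnum_15:.1f} bits")
```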

Conor Mancone
  • 4
    I can tell you that even with a "weak" hash like MD5, the keyspace for a brute force attack is simply too massive. Even if we assume there's only ever one digit and it's always at the end, that's still (52^14)*10 combinations. Even at a trillion hashes per second, it would still take eons to crack. – Mr. Llama Aug 01 '18 at 17:53
  • 3
    @Mr.Llama Indeed. I was just trying to get the idea across that "high" entropy or "enough" entropy can be context-specific. – Conor Mancone Aug 01 '18 at 17:58
  • The only way to get further might be to dig into the code of Chromium, but my C++ isn't good enough to find the code that generates passwords. – gngdb Aug 01 '18 at 18:13
  • Your math is wrong. OP didn't **predict** a single number per password, OP retrospectively found a pattern. That's like saying "what's the chance that exactly this combination of human beings is alive at exactly this moment?". 100%. One will always find a pattern if looking hard enough. Retrospectively saying "this was unlikely" is wrong math. – DonQuiKong Aug 02 '18 at 11:40
  • @DonQuiKong "This is unlikely" is the only correct math. I calculated the odds that that many passwords would have only one digit assuming that digits and letters were uniformly distributed, showed that the odds were low (~0.3%) and therefore concluded that it was "unlikely". Further digging from AndrolGenhald showed that indeed the password is not made from uniformly distributed letters and numbers. I understand that people are good at finding patterns (in fact I have comments and answers just like that on this site), but in this case there really is something behind it. – Conor Mancone Aug 02 '18 at 12:55
  • @ConorMancone I'm not saying there isn't anything behind it. The thing is, you didn't prove that. It would be unlikely if predicted. But there's a bunch of other very unlikely random patterns that could have occurred by chance, then you could have calculated that they were unlikely and therefore not by chance and you would be wrong. Yes, this time you're right, but your argument isn't, it's just luckily right *this time*. – DonQuiKong Aug 02 '18 at 13:07
  • @DonQuiKong again, I don't disagree with your general point but your specific application here. I've left plenty of answers and comments about how just because something doesn't "look" random doesn't mean it isn't random (indeed, that is the whole point of the link at the top of my answer, which was my first thought). – Conor Mancone Aug 02 '18 at 13:19
  • However, a little bit of intuition isn't a bad thing. Looking at the list of generated passwords all but one have a single digit at the end. If there was just one or two like that then I would have stopped at "it just looks random because people like finding patterns". But that many? The odds of that are extremely low, much lower than the 0.3% I calculated for having just one digit. – Conor Mancone Aug 02 '18 at 13:19
  • @ConorMancone but the chance of that happening doesn't matter. All of the passwords contain at least one English word. What's the chance of that happening? There's regularly someone winning the lottery. The chance of exactly that person winning is way lower than the chance of having digits at the end in a few passwords. Are they all cheating? The thing is, retrospectively, the chance doesn't prove anything. And still, you are arguing it does. Your conclusion is correct, the argument is not. It's the typical "there must be a god because the chance of humans evolving is minuscule" argument. – DonQuiKong Aug 02 '18 at 13:44
  • The problem is, what about all those people that didn't get a combination with so many digits at the end. It's impossible to calculate the probability, because you don't have all data. Some day, someone will come here and say "google chrome suggested "Pickachu1" as a password, did it get hacked?" And you can then calculate the chance to be almost 0 and say "yes". And you'd still be wrong. I do agree that intuition is important in security, but the argument you're using here is plainly wrong. It doesn't prove anything. And it just educates people the wrong way. – DonQuiKong Aug 02 '18 at 13:49
  • Let us [continue this discussion in chat](https://chat.stackexchange.com/rooms/81057/discussion-between-conor-mancone-and-donquikong). – Conor Mancone Aug 02 '18 at 14:22
7

You have two wrong assumptions in your question that lead to the answer of why Chrome generates passwords like this and not, say, p^&7+4+{ZgfnP#P/.

First, you assume that the passwords are not supposed to be remembered and should thus be completely random. They are not. Chrome, like many random password generators, creates pronounceable passwords, as outlined in detail in AndrolGenhald's answer.

Using pronounceable passwords instead of gibberish reduces input errors, allows the password to be passed on to another person over the phone, for example, and makes it easier to remember them. Why do you assume that the Chrome passwords are not supposed to be remembered? Very few people these days visit the Internet with only one device.

Secondly, you assume that passwords need to be complicated. This is a common misconception that has been refuted many times, most officially by the withdrawal of the complexity recommendations in NIST SP-800 (see, e.g. https://www.passwordping.com/surprising-new-password-guidelines-nist/).

Passwords should, most of all, be long. Chrome satisfies that condition nicely (12 characters is a current recommended minimum for non-critical web applications).

In light of these facts, the passwords generated by Chrome satisfy good, modern password standards. Improvements can be discussed and are certainly possible, but they are certainly on the right track there.

Tom
  • 1
    Do you have a reference for the 12 characters minimum? It seems reasonable, but it doesn't seem to be widely discussed; one of the few actual references I could find for that being a recommended minimum is https://www.cnil.fr/sites/default/files/atoms/files/recommandation_passwords_en.pdf section `I`, subsection `1`, case `1` (PDF page 3). – user Aug 02 '18 at 09:38
  • I don't have a reference in my head. Discussions are widespread at the moment, with too many people still stuck in the old ways, the same way astrology still isn't dead. The exact length depends on your scenario and threat analysis; 12 is just a reasonable blanket number. Some people recommend 16 or even 20. That sounds very long but actually isn't if you allow people to use regular words or phrases instead of gibberish, so that they can use regular typing speeds (which makes shoulder surfing more difficult, etc.) – Tom Aug 02 '18 at 11:58
  • 5
    I also had no idea that Chrome was generating passwords that were supposed to be memorable. But if you have a bunch of these types of passwords, I challenge anyone to be able to remember which one goes with which site. They're only barely pronounceable, and there's nothing mnemonic that relates them. I think most people will need to depend on a password manager. – Barmar Aug 02 '18 at 13:33
  • 2
    Maybe I'm just not used to them, but I find the Chrome passwords just as easy to remember as a 20 character uniformly random alphanumeric password (i.e. not at all until after a week or so of typing them) – timuzhti Aug 03 '18 at 04:14
2

Note that this is only a guess on my part, but it's possible that this is an attempt to make the passwords easier to remember and/or transcribe into a password input box. In addition to the odd number placement, I notice that the provided examples seem to be mostly pronounceable, and I've seen other password generators do something similar (e.g. LastPass has an option to make a generated password "Easy to say"). There are public algorithms to do this (e.g. https://exyr.org/2011/random-pronounceable-passwords/ ).

Making them pronounceable and having seemingly non-random number placement obviously lessens the entropy, but without knowing the actual algorithm in use, it's impossible to say by how much. It's possible that the Chrome developers have assessed the generated entropy and decided that the usability benefits outweigh the lost entropy, and that the remaining entropy is still relatively secure.

loneboat