What is the best way to calculate true password entropy for human created passwords?

Question

Okay, I know it might seem this has already been beaten to death but, hear me out. I am including a fairly good password strength algorithm for my app for users on sign-up. This one, which I've copied (with minor adjustments). I also want to give a ROUGH metric in addition to the strength tester. I want to calculate and communicate users' password entropy by cost to crack in the same way 1Password has here. I think this can communicate well to users in a way that is real to them.

Here is a common problem which leads to my question, password entropy. I will give users a switch to flip, whether the password is human-created or machine random. Now machine random has its own set of entropy calculation issues such as whether it is a totally random sequence, is it a symbol-separated word sequence chosen from a 307,111 word list, etc, etc. I've got that covered. The trouble is some human passwords seem stronger than machine crypto random:

Issue with standard password entropy calc methods:

1Password machine random - rmrgKDAyeY = 57.37 bits entropy
Human created non-random - isAwtheSUN = 57.37 bits entropy

Obviously, this would not be a good estimation...

I tried using log(pow(2500, 4))/log(2) => 4 words, 2500 possible combinations based on people using easier-to-remember words, as a percentage of the average human vocabulary of about 20,000 and this gave a resulting entropy of 45.15. This seems pretty reasonable. But I need to hear from the pros and looking for other ideas.

What metrics could be used to calculate human-created passwords so the result is much less secure looking than machine randoms?

Keeping in mind I'm after entropy only so to give users a cost-to-crack estimate. I know nobody but us cares about entropy.

[Humans are bad at random generation](https://crypto.stackexchange.com/q/87978/18298), which is why we prefer [Diceware-like](https://www.eff.org/dice) password mechanisms. Besides, entropy is a measure of the quality of the source, not of the output, which has zero entropy. Strength is a better term. — kelalaka, Oct 08 '22 at 21:26
I agree 100%, I think every security-aware and caring person would. But I know not all users use machine-generated passwords. That's why I am asking about this and why I'm enforcing a higher degree of security for passwords. I'm doing it smartly, not unreasonably; see the first link above^^^ — RobbB, Oct 09 '22 at 03:26
Potential duplicate: https://security.stackexchange.com/questions/33914/computing-entropy-for-a-passphrase — schroeder, Oct 09 '22 at 07:57
_"This one, which I've copied (with minor adjustments)."_ - And, like many strength checkers, it fails hard on passphrases. It classifies the reasonably strong passphrase `beach mommy tray zen` (about 50 bits of entropy) as "weak", while it has roughly the same strength as 8 random ascii characters (including punctuation). — marcelm, Oct 09 '22 at 09:31
Would you believe there is a chance that your perfectly entropic machine will sometimes generate the password: "qwertyuiop"? As @kelalaka says, it's more about the quality of the source, not the output. — Gregory Currie, Oct 09 '22 at 10:35
@marcelm, usually a passphrase includes some symbol separators, im okay with `beach mommy tray zen` having a low score. `beach-mommy-tray-zen` gives a very high score. @Gregory - I agree very much, this is why strength calculator needs new internals and why I want mine to be really good — RobbB, Oct 09 '22 at 17:59
@RobbB What, why? `beach mommy tray zen` and `beach-mommy-tray-zen` are completely equivalent from a security perspective. — marcelm, Oct 09 '22 at 19:48
touche! i stand corrected- the calculator does not consider the space very secure, considers it a lowercase letter. I will adjust that in my own calculator — RobbB, Oct 09 '22 at 20:34
Whatever metric one comes up with would probably have some rather glaring weaknesses, as there are way too many ways human laziness can come up with bad passwords, especially when you show that metric to users or try to enforce it. — NotThatGuy, Oct 10 '22 at 08:13
Yeah, that thing is completely bogus. `vFjIbHvWsI` is "good" while `hyxgvdmehwvxj` is supposedly "very weak". — AndreKR, Oct 10 '22 at 15:46
I doubt it's possible to genuinely answer this problem. For example, is `19450706` an eight-digit random number generated using a cryptographic-quality RNG (very high entropy), or is it my birthday (very low entropy)? What's the entropy if it's the date of an event of personal significance? Historical significance? On the other hand, does your threat model consider targeted attacks? A birthday may have very low entropy w.r.t. a targeted attack while being equivalent to ~5-6 genuinely random digits for an *un*targeted attack. — Matthew, Oct 10 '22 at 17:05
So really there is a hell of a lot more to consider than just human generated/crypto random generated. Users that use personal data in passwords are asking for it. — RobbB, Oct 11 '22 at 05:20
@Matthew: correct. No algorithm can determine the entropy difference between `beach mommy tray zen` and `correct horse battery staple`. That requires real-world knowledge, and therefore is time-dependent. — MSalters, Oct 11 '22 at 12:11
Imagine you use words from your first language. They will look like strong passwords to me, while they are not. — peterh, Oct 11 '22 at 15:27

score 38 · Answer 1 · answered Oct 08 '22 at 19:36

There isn't really a "true" level of entropy for passwords - all you can ever do is estimate. And this is especially true when the only information you have is the password itself.

Imagine a password like RobertAmazonMonday. On the face of it, it's three "random" words, so you can have a guess at the entropy based on your assumptions about the average size of someone's vocabulary. But if that password was set by a guy called Robert on his first (Mon)day working at Amazon, then that completely changes the situation.

So I'd question what the problem is that you're really trying to solve. If it's just giving users an idea of how strong their password is, then taking an approach where you look at length/character set initially and then reduce the strength estimate for strings within it (such as words, common patterns, their username, etc) like existing tools (such as KeePass) do could work.

But is giving them a number sensible here? Does a user care that their password has an estimated entropy of 57.23 bits vs 53.86 bits? Can they make a meaningful decision based on that information? Is the difference between your password taking 500 trillion years and 600 trillion years to crack with some arbitrary work factor (because you're using a decent hashing algorithm) relevant?

answered Oct 08 '22 at 19:36

Gh0stFish

Very good points. On the third paragraph I might add that users will not be interested in these numbers, i agree. They will be more interested in cost to crack such as what 1Password has done as linked above. I will edit my original post to make it more clear that I'm after the cost-to-crack metric, but will get this from the entropy. I think this is a good metric users might care about. Especially when protecting high value data. – RobbB Oct 08 '22 at 23:21
I guess what i'm really trying to get at is getting closer to a real entropy for human created passwords so that they aren't estimated to have an abnormally high entropy and therefore cost-to-crack versus a computer crypto random gen. But this involves knowing how passwords are cracked and engineering from there... – RobbB Oct 08 '22 at 23:26
@RobbB I think the key thing is to pick out strings from the password, and then to treat those differently. So `Rober` is calculated as five random characters (`52^5`), but `Robert` is one word (out of ~30k). You can see this is basically how KeePass does it - when you type `Rober` it shows 28 bits of entropy, but when you add the `t` that drops to 15 bits. It's a bit crude, but certainly better than treating them as fully random. – Gh0stFish Oct 09 '22 at 08:40
Couldn't agree more! and as of yet it seems zxcvbn is the only calc that does exactly this. Tried entering some passwords into the demo as linked in comments below and it works exactly as you've stated above. If it senses zero words or dictionary matches, it flips into random mode and gives different results. – RobbB Oct 09 '22 at 18:04
@RobbB "But this involves knowing how passwords are cracked": exactly. Entropy is effectively a measure of unpredictability — which depends on _how_ you predict. Prediction algorithms get better over time as they understand the data they work on; but of course, as people change how they choose passwords, predictors have to change to match. So password entropy isn't really a fixed quantity… – gidds Oct 09 '22 at 23:10
@gidds It is, but estimations of it aren't. – user253751 Oct 10 '22 at 13:31
@user253751 Entropy is *contextual*: given the string "a", you might calculate it as one of 26 one-letter strings, one of 52 case-sensitive one-letter strings, one of 127 one-character ASCII strings, etc. It's meaningless to say that one of those gives the "true" entropy, and the others are "estimations", unless you have a reason to choose that particular context. The most useful context to judge a password's entropy is not actually how it was generated, but *how it is going to be attacked*; new attacks mean new contexts, so new calculations of entropy. – IMSoP Oct 10 '22 at 13:48
@IMSoP the "true" entropy depends on whether you ***actually did*** pick it as one of 26 one-letter strings, 52 case-sensitive one-letter strings, etc! – user253751 Oct 10 '22 at 13:51
Mind you, maybe we don't care about the true entropy, but the entropy according to password-cracking algorithms. Which is rarely below the true entropy. – user253751 Oct 10 '22 at 13:51
@user253751 I don't agree that the method of selection is any more "true" than the method of cracking. If I pick a string of random letters, and come up with a dictionary word, the useful measure of entropy is the length of word list likely to contain it, because there's a very good chance that an attacker will use such a list before iterating random combinations of letters. – IMSoP Oct 10 '22 at 14:35
@IMSoP "Mind you, maybe we don't care about the true entropy, but the entropy according to password-cracking algorithms. Which is rarely below the true entropy." <- your chance of randomly picking a dictionary word is low – user253751 Oct 10 '22 at 14:36
I think this is where this is all going. Reverse engineer from a cracking algorithm and based on successful attacks. – RobbB Oct 11 '22 at 05:23
The fundamental problem with the number is the same as implementing any solution based on [xkcd's correct horse battery staple](https://dropbox.tech/cms/content/dam/dropbox/tech-blog/en-us/2012/04/password_strength.png): assuming the input is random. Humans are a terrible source of randomness. The correct horse battery staple trick only works when none of those words were your idea. You need a source of randomness. Without that, the numbers shown by this are wrong. But it encourages people to try harder, so everyone's just gonna shrug and put up with it. – candied_orange Oct 11 '22 at 14:32

user2233709 · Accepted Answer · 2022-10-08T19:02:29.787

19

I’d suggest you have a look at zxcvbn. It uses a dictionary and identifies common words, other common patterns and common substitutions to provide a fairly good estimation of the entropy of the process that produces the password.
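To illustrate the general idea of scoring dictionary matches differently from random characters, here is a minimal sketch. It is not zxcvbn's actual implementation, which matches far more patterns. The word list, the assumed dictionary size of 30,000 (taken from a comment earlier in this thread), and the name `estimate_bits` are all illustrative assumptions, and capitalisation is ignored entirely.

```python
import math

# Hypothetical, tiny word list; a real estimator would load a large ranked dictionary.
WORDS = {"robert", "amazon", "monday", "beach", "mommy", "tray", "zen"}
DICTIONARY_SIZE = 30_000   # assumed size of the attacker's word list
CHARSET_SIZE = 52          # upper- and lower-case letters for unmatched characters


def estimate_bits(password: str) -> float:
    """Cheapest way to build the password from dictionary words and random characters."""
    n = len(password)
    best = [0.0] + [math.inf] * n          # best[i] = fewest bits needed for password[:i]
    for end in range(1, n + 1):
        # Option 1: treat the last character as a random pick from the charset.
        best[end] = best[end - 1] + math.log2(CHARSET_SIZE)
        # Option 2: end with a dictionary word (case-insensitive match).
        for start in range(end):
            if password[start:end].lower() in WORDS:
                cost = math.log2(DICTIONARY_SIZE)
                best[end] = min(best[end], best[start] + cost)
    return best[n]


print(estimate_bits("RobertAmazonMonday"))  # three dictionary words
print(estimate_bits("rmrgKDAyeY"))          # no matches, scored as random letters
```

Under these assumptions the 18-character `RobertAmazonMonday` scores lower than the 10-character random string, because it is costed as three word choices rather than eighteen character choices.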

edited Oct 08 '22 at 19:02

answered Oct 08 '22 at 18:50

user2233709

Some very interesting & smart metrics to work with in this package – RobbB Oct 08 '22 at 23:36
`zxcvbn` is pretty great, but I don't think this answer does it justice. It would be good to explain _why_ it works the way it does. Perhaps also link to [this blog post from its creators, detailing how it works](https://dropbox.tech/security/zxcvbn-realistic-password-strength-estimation), and [this interactive example](https://lowe.github.io/tryzxcvbn/). – marcelm Oct 09 '22 at 09:28
Thanks @marcelm for the comment. I just read the blog post; I did not know about it. Feel free to edit and improve my answer, or to write your own. – user2233709 Oct 09 '22 at 09:44
very good info here, will attempt to implement this in the next little while. This should be the correct answer but it needs more examples and substance like @marcelm has shown. – RobbB Oct 09 '22 at 17:50
@RobbB I understand your point. But when I read your question, I remembered that tool I used a few years ago and thought it might be interesting for you. That’s why I wrote that answer. I did not really remember much details about how `zxcvbn` works. Now I read @marcelm’s comment and Dan Wheeler’s blog post, but I wouldn’t feel comfortable copying parts of his post in my answer. if marcelm (or, even better, Dan Wheeler) feels like writing a more detailed answer about zxcvbn, I’ll be happy to upvote it and delete my answer. If someone wants to edit and improve my answer, that’s fine as well. – user2233709 Oct 09 '22 at 22:56
Fair enough, out of all the answers this one pointed me in the correct direction and answered the question most precisely (including @marcelm comment with the blog post and interactive demo). This is the accepted answer unless someone else wanted to scoop it with a more descriptive answer. Thanks for the help and resource :) – RobbB Oct 09 '22 at 23:33

score 17 · Answer 3 · answered Oct 10 '22 at 17:10

One password does not have entropy. A method to generate a password has entropy.

This is quite clear in the table in the 1Password page you linked to. They give entropy of the method ("3 word, constant separator", "8 char, uppercase, lowercase, digits"...), not the password. The password given is just one example.

So if the method is "15 characters picked randomly from a set of 72 values" (upper and lower basic Latin letters + 10 digits + 10 symbols), then the entropy is 92.55 bits. Whether one actual result generated this way ends up being abcdefghijklmno or something that looks truly random, the entropy is the same (for a brute-force crack, which is what entropy is about, abcdefghijklmno is as random as any other password).

From a password, you can try to guess how it was generated. So if you see 15 lowercase letters, you can estimate the method to be "15 characters picked randomly from a set of 26 values", and the entropy to be 70.5 bits. If you detect uppercase letters, or digits, or symbols, you'll change your estimate of what the method is, and what the associated entropy is. But the password does not have entropy. The method to generate it has.

But your guess is just a guess. Maybe the password was actually generated as a set of three 5-letter words taken from a list of 1000 words. Entropy 29.9 bits. Or a set of three 5-letter words taken from a list of 100 000 words (entropy 49.83 bits). Or anything else.
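The figures quoted in this answer all follow from the same formula, picks × log2(pool size), for a method that draws uniformly and independently. A small sketch that reproduces them (the function name `method_bits` is just an illustrative choice):

```python
import math


def method_bits(pool_size: int, picks: int) -> float:
    """Entropy in bits of a method making `picks` uniform, independent draws from `pool_size` options."""
    return picks * math.log2(pool_size)


methods = {
    "15 chars from 72 symbols": method_bits(72, 15),     # about 92.55 bits
    "15 lowercase letters": method_bits(26, 15),         # about 70.5 bits
    "3 words from 1,000": method_bits(1_000, 3),         # about 29.9 bits
    "3 words from 100,000": method_bits(100_000, 3),     # about 49.83 bits
}
for name, bits in methods.items():
    print(f"{name}: {bits:.2f} bits")
```

Note that the function never looks at an actual password: the result depends only on the assumed method, which is the point being made above.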

Or, more likely than not, the password was generated from a limited set of words and numbers. Say, some significant name (spouse, children, company...), some significant number (date of birth, of marriage...), a random special character thrown in there, a random character uppercased, and a few possible permutations (name+number+symbol or name+symbol+number, etc.). If you know the user, entropy is probably less than 10 bits. If you don't know the user, then it does increase, but remains quite low (I'd think less than 40 bits).

That's why people do dictionary attacks rather than brute force attacks. Because most people don't generate random passwords.

So no, you cannot determine the entropy from a single password. You can make guesses, but that's about it.

First things first:

- If you can avoid having to deal with any passwords yourself, do so! Rely on some other platform for authentication. Make it their problem.
- Check passwords against the top-whatever list of most common passwords.
- Check password hashes against known password leaks.
- Make sure the password is long enough.

I don't think much else really makes a difference, and anything more can be counterproductive.
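The three password checks listed above could be sketched roughly as follows. The common-password set, the leaked-hash set, and the minimum length of 12 are hypothetical stand-ins; a real deployment would load a large common-password list and a breach corpus, and would pick its own length policy.

```python
import hashlib

# Hypothetical stand-ins for real data sources.
COMMON_PASSWORDS = {"123456", "password", "qwerty", "letmein"}
LEAKED_SHA1 = {hashlib.sha1(b"hunter2").hexdigest()}
MIN_LENGTH = 12  # assumed policy value, not a recommendation from this answer


def password_problems(password: str) -> list[str]:
    """Return a list of reasons to reject the password (empty if none found)."""
    problems = []
    if len(password) < MIN_LENGTH:
        problems.append("too short")
    if password.lower() in COMMON_PASSWORDS:
        problems.append("common password")
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest()
    if digest in LEAKED_SHA1:
        problems.append("found in a known leak")
    return problems


print(password_problems("password"))
print(password_problems("beach mommy tray zen"))
```

SHA-1 is used here only as a lookup key against a leak list, on the assumption that the list is distributed as SHA-1 hashes; it is not a suggestion for how to store passwords.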

score 2 · Answer 4 · answered Oct 11 '22 at 12:39

The true level of entropy depends on the probability that the attacker will consider a given password. Since nobody knows in advance how the attacker acts, this level cannot be reliably known.

1. If the attacker brute-forces by iterating through all characters, both passwords have equal entropy.
2. If the attacker uses a dictionary and combines words/characters into a password (trying longer words first), then isAwtheSUN will have a lower entropy than rmrgKDAyeY.
3. If the attacker uses a password list, the entropy will depend on whether either rmrgKDAyeY or isAwtheSUN is part of the list, and on its position. It's even perfectly possible that rmrgKDAyeY has lower entropy than isAwtheSUN.

The character count corresponds to the entropy for approach #1. Modern cracking techniques rely on approaches #2 and #3, so it's commonly considered that character count is a poor entropy metric, and an entropy calculation based on #2 gives more plausible results.

Still, it's important to understand that any password entropy metric remains an estimation, and its "quality" may change at any moment. It's possible that once quantum computers become available to password crackers, the metric based on character count will once again become "state of the art".

score 2 · Answer 5 · answered Oct 11 '22 at 18:06

As noted by others, passwords don't individually "have entropy". Entropy is a property of their relationship to a larger distribution of password choices, possibly through a method by which they were generated.

Conceptually, the "best" way to measure entropy of passwords is with an optimized compression algorithm (which is largely a matter of an optimized dictionary) that best compresses the passwords "known to be weak" (known to appear widely). You could approximate something like this using existing known sets of compromised passwords, natural language dictionaries, and advanced statistical modeling (aka "AI" language models) of relationships between words. Then, the number of bits in the compressed version of a given password might be a reasonable estimate of "its entropy".
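As a very rough illustration of the compression idea, the sketch below uses `zlib` with a preset dictionary built from a hypothetical "known weak" corpus. General-purpose DEFLATE is a poor stand-in for the optimized model described above, and the result is rounded to whole bytes, so the numbers are only useful for relative comparison. The corpus contents and function names are assumptions for illustration.

```python
import zlib

# Hypothetical "known weak" corpus; a real one would be built from leaked-password
# sets and natural-language word lists.
WEAK_CORPUS = b"\n".join([
    b"password123", b"qwertyuiop", b"letmein", b"iloveyou",
    b"correct horse battery staple",
])


def _compressed_size(data: bytes) -> int:
    """Size in bytes of `data` compressed with the weak corpus as preset dictionary."""
    compressor = zlib.compressobj(level=9, zdict=WEAK_CORPUS)
    return len(compressor.compress(data) + compressor.flush())


# Fixed container overhead (header, dictionary id, checksum), measured on empty input.
BASELINE = _compressed_size(b"")


def compressed_bits(password: str) -> int:
    """Crude estimate: bits needed beyond the fixed overhead to encode the password."""
    return 8 * (_compressed_size(password.encode("utf-8")) - BASELINE)


print(compressed_bits("password123"))  # appears verbatim in the corpus
print(compressed_bits("rmrgKDAyeY"))   # shares nothing with the corpus
```

A password that appears in the corpus is encoded as a short back-reference, so it needs fewer bits than a string the corpus does not help with.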

However, even then, it will not tell you whether there are external correlations between the password and its owner, the context in which it's used, etc. that reduce the real effective entropy an attacker would be working with.

score 0 · Answer 6 · answered Oct 11 '22 at 17:41

Your password has entropy relative to a password guesser. If guesser X guesses your password in 2^n attempts, you could say you have n bits of entropy relative to that guesser. You could implement several password guessers and try them all.

A problem is that if your entropy is high, then guessing the right password will take a long time. If it takes you an hour to find the entropy of a password that a guesser takes an hour to crack, that's not too useful. However, you have a huge advantage: you know the password. So if you know, for example, that the password guesser will now guess 123,456,789 eight-letter passwords, and yours is nine letters, you don't have to actually try those 123 million passwords. And if your password starts with K, you can likewise skip testing all the nine-letter passwords starting with A to J. So you might be able to determine very quickly how many attempts it takes to reach your password, given that you know it.
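For the simplest possible guesser, one that tries every string in order of increasing length and then alphabetically, the shortcut described above amounts to computing the password's position directly. A minimal sketch, assuming an uppercase-only alphabet (the function names are illustrative):

```python
import math
import string

ALPHABET = string.ascii_uppercase   # assumed guesser alphabet: A-Z, shortest strings first


def brute_force_rank(password: str, alphabet: str = ALPHABET) -> int:
    """1-based position of `password` in length-then-alphabetical brute-force order."""
    base = len(alphabet)
    n = len(password)
    shorter = sum(base ** k for k in range(1, n))   # every strictly shorter string is tried first
    index = 0
    for ch in password:
        # Raises ValueError if the guesser's alphabet does not contain the character.
        index = index * base + alphabet.index(ch)
    return shorter + index + 1


def bits_relative_to_guesser(rank: int) -> float:
    """Entropy in bits relative to a guesser that needs `rank` attempts."""
    return math.log2(rank)


rank = brute_force_rank("KANGAROOS")
print(rank, bits_relative_to_guesser(rank))
```

Smarter guessers (dictionary or password-list based) would need their own rank functions, but the principle of computing the position without replaying every guess is the same.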

In addition, entropy is one thing. The time it takes to test whether a password guess is correct is another thing. For example, iOS passcodes supposedly take 80 ms to test one passcode. A six-digit passcode would take up to almost a day to crack.
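The arithmetic behind that estimate, as a small sketch (the 80 ms figure is the one quoted above, not a measured value):

```python
def worst_case_seconds(keyspace: int, seconds_per_guess: float) -> float:
    """Time to try every candidate once at a fixed cost per guess."""
    return keyspace * seconds_per_guess


seconds = worst_case_seconds(10 ** 6, 0.080)   # six digits, 80 ms per attempt
print(seconds / 3600)                          # about 22.2 hours
```

On average an attacker finds the passcode after trying about half the keyspace, so the expected time is roughly half the worst case.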
