Strength of variable-length generated password

Question

I am contributing to the Word Sequencer plugin for KeePass password manager, which can generate diceware-style passwords using a high-quality PRNG. Something in particular I'm working on is estimating the strength of passwords generated using the tool. I'm having a little trouble figuring out how to account for one of the configuration options, which can set one of the words in the sequence to have a probability of appearing in the generated password or not; i.e. an option to make the password a randomized length.

For the sake of example, suppose you're choosing 2 words from a wordlist of 8 words for your password (obviously you'd actually want a much larger wordlist/number of words, this is just a toy example). If you always choose both words, then the entropy of the password is:

lg(8*8) = lg(64) = 6

or alternatively:

lg(8) + lg(8) = 3+3 = 6

Now, say that you've configured the second word to not appear sometimes. Thus you have a chance of a one-word password (8 possible) or a two-word password (64 possible) for a total of:

lg(8 + 8*8) = lg(9 * 8) = lg(9) + lg(8)

...which should be a tiny bit more than the previous entropy of 6. This should be the entropy if an attacker was guessing JUST THIS ONE PASSWORD and he or she knows exactly how the password was generated.

But it doesn't actually matter if the password COULD have been 2 words long. If it's only 1 word long, and the attacker just guesses all 1-word passwords, the possibility of a second word doesn't really make the one-word password any stronger. So assuming there is a 25% chance of including the second word, maybe a better strength estimate would be the entropy of the expected value of the password space:

lg(8) + lg(3/4 * 0 + 1/4 * 8) = 3 + 1 = 4

Or, maybe it would be the expected value of the entropy itself:

lg(8) + [3/4 * 0 + 1/4 * lg(8)] = 3.75
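
For reference, the three candidate figures above can be reproduced with a short Python sketch. The variable names are illustrative only and are not taken from the plugin; the second figure follows the formula exactly as written above, counting an absent second word as contributing 0 choices.

```python
from math import log2

WORDLIST_SIZE = 8      # toy wordlist from the example
P_SECOND_WORD = 0.25   # assumed chance that the second word is included

# 1. lg of the total number of possible passwords (one-word plus two-word).
total_space_bits = log2(WORDLIST_SIZE + WORDLIST_SIZE ** 2)

# 2. lg of the expected number of choices, as written in the question
#    (an absent second word is counted as 0 choices).
expected_space_bits = log2(WORDLIST_SIZE) + log2(
    (1 - P_SECOND_WORD) * 0 + P_SECOND_WORD * WORDLIST_SIZE
)

# 3. Expected value of the entropy itself.
expected_entropy_bits = log2(WORDLIST_SIZE) + (
    (1 - P_SECOND_WORD) * 0 + P_SECOND_WORD * log2(WORDLIST_SIZE)
)

print(f"lg of total space:     {total_space_bits:.2f} bits")
print(f"lg of expected space:  {expected_space_bits:.2f} bits")
print(f"expected entropy:      {expected_entropy_bits:.2f} bits")
```

The first figure comes out slightly above 6 bits, while the other two fall between the 1-word and 2-word strengths.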

So my question is: which method of calculating the expected entropy of this generated password is correct?

Should I treat the random length as adding additional possible passwords, thus slightly increasing the strength of a 2-word password?
Should I treat the random length as possibly decreasing the length of the password, so I take the entropy of the average number of password choices, for a strength somewhere between a 1-word and 2-word password?
Or should I take the expected value of the entropy, again for an in-between strength?
Maybe it's something else entirely?

If in fact the "correct" calculation would decrease the password strength when the option is enabled, then I guess that raises a follow-up question: is there any useful reason to even have this option, if it's just going to reduce the work an attacker must do on average?

Neil Smithline · Accepted Answer · 2015-12-04T23:19:56.337

Using fewer words from a diceware-style list is equivalent to sometimes allowing passwords with fewer characters. As per the accepted answer to "Would allowing shorter passwords sometimes be more secure?":

The benefit gained by forcing increased length far outweighs the number of possible passwords lost.

The same holds true for shorter diceware-style passwords. So always use the exact length and never something shorter.

EDIT: Note that the problem is not that allowing shorter passphrases reduces the entropy of the method. It actually increases it slightly. But allowing shorter passphrases dramatically reduces the effort needed to guess the ones that come out short. If your diceware word list has X words and your passphrase has a length of N words, there are X^N possibilities. A passphrase that is one word shorter has only X^(N-1) possibilities, a factor of X fewer. So these passphrases will be much more vulnerable than the ones that are just one word longer. The small gain in entropy is not worth the risk to these shorter passphrases.
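
To put rough numbers on this, here is a minimal Python sketch. The wordlist size (7776, the standard Diceware list) and the length N = 6 are assumptions chosen for illustration; neither value comes from the question.

```python
from math import log2

X = 7776  # assumed wordlist size (standard Diceware list)
N = 6     # assumed full passphrase length, in words

full = X ** N            # number of passphrases with exactly N words
shorter = X ** (N - 1)   # number of passphrases with exactly N - 1 words

# Allowing both lengths barely enlarges the total search space...
gain_bits = log2((full + shorter) / full)

# ...but every short passphrase sits in the first 1/(X + 1) of the search
# for an attacker who tries the shorter passphrases first.
short_fraction = shorter / (full + shorter)

print(f"gain from allowing both lengths: {gain_bits:.6f} bits")
print(f"short passphrases are {full // shorter}x fewer than full-length ones")
print(f"fraction of the space an attacker searches to cover them: {short_fraction:.6f}")
```

The gain in bits is a tiny fraction of one bit, while a passphrase that comes out short loses a full word's worth of strength.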

Consider this: if allowing passphrases that are 1 word shorter improved security, then the argument could be repeated to prove that allowing passphrases that are 2 words shorter improved security, and so on. But allowing ever-shorter passphrases is clearly a bad idea. So too is shortening a passphrase by 1 word.

edited Dec 04 '15 at 23:19

answered Dec 04 '15 at 22:38

OK, so you'd recommend removing the option entirely it sounds like. For curiosity's sake (and in case the project owner wants to hang onto that option anyway), do you have insight into the correct entropy calculation? – Ben Dec 04 '15 at 22:57
I responded with an edit @Ben. – Neil Smithline Dec 04 '15 at 23:20
OK, so you're saying the entropy of the method increases, but the generated password may be weaker, so the method is a bad idea. To me that says the "expected entropy" of the password itself is probably a better measure than the real entropy of the method since it will show that the method is weaker even though it adds possible password choices. – Ben Dec 04 '15 at 23:36
That sounds right. – Neil Smithline Dec 04 '15 at 23:39
I used the following thought experiment to really convince myself, inspired by your closing paragraph. Consider a password generator that has a 1% chance of generating an 18-word diceware password (huge overkill), but the other 99% of the time always generates the single word "password". Obviously this would be hugely insecure. The entropy considering all possible passwords would be about 232 bits. But the expected value of the entropy (.99*0 + .01*232) would be about 2 bits, which is a much more accurate depiction of the "strength" of passwords generated with this hypothetical method. – Ben Dec 09 '15 at 15:46
@Ben - I would say that the minimum (expected) entropy would be your wisest rating - it doesn't matter if the password "could" be strong if a smart attacker, who checks all 5-word combinations before looking at 6-word combinations, is going to find it on an earlier cracking pass. – Anti-weakpasswords Feb 07 '16 at 23:42
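
The figures in the thought experiment above can be checked with a short Python sketch. The wordlist size of 7776 is an assumption (the standard Diceware list). The last value is the min-entropy of the generator, which is one possible reading of the "minimum" rating suggested in the final comment; that reading is an interpretation, not something the commenters state.

```python
from math import log2

WORDLIST_SIZE = 7776   # assumed standard Diceware list size
P_STRONG = 0.01        # chance of getting the 18-word passphrase
WORDS = 18

# Entropy of the 18-word passphrase on its own.
strong_bits = WORDS * log2(WORDLIST_SIZE)

# Expected value of the entropy: the fixed word "password" contributes 0 bits.
expected_bits = (1 - P_STRONG) * 0 + P_STRONG * strong_bits

# Min-entropy: set by the single most likely output, here "password".
min_entropy_bits = -log2(1 - P_STRONG)

print(f"18-word passphrase:  {strong_bits:.1f} bits")
print(f"expected entropy:    {expected_bits:.2f} bits")
print(f"min-entropy:         {min_entropy_bits:.4f} bits")
```

Both the expected entropy and the min-entropy are tiny compared with the 18-word figure, which matches the intuition that this generator is weak.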
