13

I find that truly random diceware passphrases, more often than not, either contain a word that is easily misspelled or have an illogical word order. I think there are three ways to make a diceware passphrase more memorable:

  1. Throw out passphrases until you get one you can remember
  2. Throw out individual words that are difficult for you to remember
  3. Rearrange words so that the passphrase is easier to remember

Of course, the issue is that all three options reduce the bits of entropy. Number 2 can be avoided by editing the diceware list manually, but that is too much work for most people. I still believe that the resulting passphrase will be better than many other options and useful for most purposes. However, I am interested in how many bits of entropy remain after each of these methods.

The entropy of a truly random 4-word passphrase is log2(7776*7776*7776*7776) = 51.7 bits. The worst case for option 3 is that there is only one logical way to arrange the words. In this case, I believe that the entropy is log2( (7776*7776*7776*7776) / (4*3*2*1) ) = 47.1 bits. I am not sure what formula to use for 1 & 2. For example, if I throw out a word three times, how many bits of entropy are left? I think 1 is much more ambiguous. On paper, it should not reduce entropy, but clearly it does.
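For reference, the two figures above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic; the constant and variable names are my own, and it assumes the standard 7776-word list and a 4-word passphrase.

```python
from math import factorial, log2

WORDLIST_SIZE = 7776   # standard diceware list: 6**5 entries
WORDS = 4              # words per passphrase (assumed, as in the example above)

# Entropy of a uniformly random passphrase: log2(7776**4)
baseline_bits = WORDS * log2(WORDLIST_SIZE)

# Worst case for rearranging: all 4! orderings collapse to a single one
reorder_loss_bits = log2(factorial(WORDS))
worst_case_bits = baseline_bits - reorder_loss_bits

print(f"{baseline_bits:.1f}")    # 51.7
print(f"{worst_case_bits:.1f}")  # 47.1
```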

tony
  • 283
  • 1
  • 8
  • Your question illustrates the tradeoff between cost (effort to generate, remember, and recover) of a passphrase and strength (entropy) of the passphrase. To decide, I think it is helpful to determine the value of the asset(s) the passphrase is meant to protect. – this.josh Sep 08 '11 at 21:44
  • 2
    I realize that there is a tradeoff but I was interested in a formula to calculate the change in entropy for these kinds of changes. This would let me determine the real strength of a password, so I can weigh the tradeoffs. – tony Sep 08 '11 at 22:46
  • 1
    For point 2. I would tend to simply state that you're in effect shortening the list from 7776 words to less (say, 7542), so the entropy would drop to log2(7542^4). That's assuming the number of difficult-to-remember words is known (which it probably isn't if you're considering different persons) – Joubarc Sep 09 '11 at 09:34
  • @Joubarc, your approach is valid, but see my answer for why it may be tricky to use correctly in practice. – D.W. Sep 09 '11 at 17:19
  • 2
    @this.josh, I disagree with your criticism. tony has a perfectly good question: how much does the entropy decrease, if he chooses passphrases as he describes? That's a totally separable and independent question from how much entropy is *needed* in any single situation. I think tony's question is good, relevant, well-defined, and useful. – D.W. Sep 09 '11 at 17:21
  • just make sure that you roll all 1s, and then your password will be "a a a a a". Very easy to remember. – endolith Jul 16 '16 at 01:55

2 Answers

13
  1. Throw out passphrases until you get one you can remember -

    If you look at 16 passphrases, and keep the one you like best, then you've reduced your entropy by at most 4 bits (log2 16 = 4). The intuition is similar to what you gave for rearranging: the worst case is that there is logically only 1 of those 16 passphrases that someone is likely to choose.

    In general, if you look at N passphrases and keep the one you like best, then you've reduced your entropy by at most log2 N bits. If you throw out N-1 passphrases and keep the Nth, then this is basically the same situation, so you can estimate that you've probably reduced your entropy by at worst log2 N bits.

  2. Throw out individual words that are difficult for you to remember -

    Let me suggest two different methods to calculate how much entropy you've lost:

    • You could think of this as taking your 7776-word dictionary and throwing out a bunch of words that are too difficult for you to remember. If you throw out 1000 words, you've reduced from a 7776-word dictionary to a 6776-word dictionary, so you've reduced the entropy per word from about 12.9 bits to 12.7 bits, or the total entropy of your passphrase from about 51.7 bits to about 50.9 bits. Not much of a reduction.

      Unfortunately, this calculation method is a bit tricky to use in practice. To estimate the entropy this way, you'd have to peruse the entire word list and count how many words are too difficult for you to remember, or scan a random sample of the word list and use that to estimate what fraction are too difficult to remember. (Also, this calculation assumes that whether a word is too difficult to remember is independent of the rest of the passphrase.) It would not be valid to choose a random passphrase, reject it because one of the words was too difficult, choose a second passphrase, keep that one, and then reason that you've reduced the dictionary from 7776 words to 7775 words so your entropy goes from 51.6993 to 51.6985 bits. That would not be valid reasoning, because it doesn't count how many other words might be in the dictionary that you also would have rejected, if they had been selected.

    • You could think of this as just another reason that you might throw out a passphrase. Then we're back in the situation we already analyzed above ("Throw out passphrases until you get one you can remember").

  3. Rearrange words so that the passphrase is easier to remember -

    You've already covered this in your question, and yes, your reasoning is accurate. You've reduced entropy by at most log2 4! = 4.6 bits.
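The bounds for methods 1 and 2 can be sketched in Python as follows. The function names are made up for illustration, and the sample figures passed to `estimate_rejected_words` (26 rejected out of 200 sampled) are purely hypothetical; the other numbers are the ones used in this answer.

```python
from math import log2

def bits_lost_by_rejection(candidates_seen):
    """Worst-case loss when keeping one passphrase out of N candidates."""
    return log2(candidates_seen)

def bits_after_shrinking(wordlist_size, rejected_words, words_per_phrase):
    """Entropy when every word is drawn from the reduced word list."""
    return words_per_phrase * log2(wordlist_size - rejected_words)

def estimate_rejected_words(sample_size, sample_rejected, wordlist_size):
    """Scale the rejected fraction of a random sample up to the whole list."""
    return round(wordlist_size * sample_rejected / sample_size)

print(bits_lost_by_rejection(16))                      # 4.0
print(round(bits_after_shrinking(7776, 1000, 4), 1))   # 50.9

# Hypothetical sample: 26 of 200 randomly sampled words judged too hard
print(estimate_rejected_words(200, 26, 7776))          # 1011
```

The sample-based estimate is only as good as the sample: it assumes the sample was drawn at random from the list and that you judge every word the same way you would when it turns up in a real passphrase.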

D.W.
  • 98,860
  • 33
  • 271
  • 588
  • This is a great answer. However, is it correct to assume that 1 only holds true if you decide beforehand to look at 16 passphrases and select the easiest? If you select the 16th one because it is the first memorable one, then are you in the same situation as 2? Are you just looking for the possibly much smaller subset of acceptable passphrases? – tony Sep 10 '11 at 16:26
  • @tony, yes, strictly speaking, I think you're right. However, for method 1, I *think* the reasoning I gave is a pretty good approximation. (In contrast, for method 2, I know it is not a good approximation.) – D.W. Sep 10 '11 at 19:48
4

Entropy is about probabilities, so there are averages everywhere.

For method 1, if you throw away on average n-1 passwords and keep the n-th, then the number of possible passphrases is divided by n (in bits, you lose log2 n bits of entropy). This assumes that the attacker can accurately model what a "hard to remember password" is, i.e. predict whether you will throw away a password or keep it. In practice, an attacker will not predict your reactions with 100% accuracy, but it is hard to quantify how successful he will be at it, so it seems safer to simply assume that the attacker can do that.
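A toy brute-force check of the method 1 reasoning, as a sketch only: the 8-word list and the "memorable" rule below are invented for illustration, and the rule stands in for whatever the attacker is assumed to know about your taste. Drawing passphrases until the first acceptable one gives a uniform choice over the acceptable set, so the remaining entropy is log2 of that set's size, and the expected number of draws is the inverse of the accepted fraction.

```python
from itertools import product
from math import log2

def rejection_sampling_entropy(words, phrase_length, is_memorable):
    """Return (baseline_bits, remaining_bits, expected_draws) by enumeration."""
    all_phrases = list(product(words, repeat=phrase_length))
    accepted = [p for p in all_phrases if is_memorable(p)]
    baseline_bits = log2(len(all_phrases))
    # Keeping the first acceptable draw is uniform over the accepted set
    remaining_bits = log2(len(accepted))
    # Number of draws is geometric with success rate len(accepted)/len(all_phrases)
    expected_draws = len(all_phrases) / len(accepted)
    return baseline_bits, remaining_bits, expected_draws

# Hypothetical toy list and acceptance rule, not taken from diceware
TOY_WORDS = ["ash", "bog", "cup", "dim", "elk", "fig", "gum", "hen"]

def starts_with_easy_word(phrase):
    return phrase[0] in {"ash", "bog"}

baseline_bits, remaining_bits, expected_draws = rejection_sampling_entropy(
    TOY_WORDS, 2, starts_with_easy_word
)
print(baseline_bits, remaining_bits, expected_draws)  # 6.0 4.0 4.0
```

Here a quarter of the passphrases are acceptable, so you draw 4 times on average and lose log2 4 = 2 bits, matching the log2 n rule.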

For method 2, the attacker will simply use a reduced list from which he removed all the words that you removed too. If, for instance, the list had x words, you removed y of them, and your passphrases consist of m words, then the number of possible passphrases is reduced by a factor (x/(x-y))^m, i.e. you lose m * log2(x/(x-y)) bits of entropy (that's just using a shorter list, in fact).

For method 3, rearranging the words will reduce the number of possible passphrases by a factor of up to m!, i.e. cost up to log2(m!) bits of entropy, if your passphrases have m words (that's factorial of m, i.e. 6 for m=3, 24 for m=4, and so on). The idea is that each passphrase is part of a set of m! passphrases which use the same words in some other order, and you always stick to the one ordering that you prefer, and the attacker can model that too. There again, this depends on how good the attacker is at guessing how your brain operates; since we cannot give a good measure for that, we just suppose that the attacker is "very good".
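The m! bound can be checked by brute force on a small example. The sketch below makes two loud assumptions: the 6-word list is made up, and "the ordering you prefer" is modelled as plain alphabetical order, which is just a stand-in for a preference the attacker knows perfectly. It enumerates every passphrase, maps each to its preferred ordering, and measures both the Shannon entropy and the min-entropy (the attacker's best single guess) of the result.

```python
from collections import Counter
from itertools import product
from math import factorial, log2

def canonical_order_entropy(words, phrase_length):
    """Return (baseline_bits, shannon_bits, min_entropy_bits) after reordering."""
    total = len(words) ** phrase_length
    # Assumption: the preferred ordering is alphabetical (sorted) order
    counts = Counter(tuple(sorted(p)) for p in product(words, repeat=phrase_length))
    probs = [c / total for c in counts.values()]
    shannon_bits = -sum(p * log2(p) for p in probs)
    min_entropy_bits = -log2(max(probs))
    return log2(total), shannon_bits, min_entropy_bits

# Hypothetical toy list, small enough to enumerate (6**3 = 216 passphrases)
TOY_WORDS = ["ash", "bog", "cup", "dim", "elk", "fig"]
M = 3

baseline_bits, shannon_bits, min_entropy_bits = canonical_order_entropy(TOY_WORDS, M)
max_loss_bits = log2(factorial(M))

print(f"baseline     {baseline_bits:.3f} bits")
print(f"Shannon      {shannon_bits:.3f} bits")
print(f"min-entropy  {min_entropy_bits:.3f} bits")
print(f"log2(m!)     {max_loss_bits:.3f} bits")
```

In this model the min-entropy drops by exactly log2(m!), because a passphrase made of m distinct words is reached by m! different dice rolls. The Shannon entropy drops by somewhat less, since passphrases with repeated words have fewer than m! orderings.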

Thomas Pornin
  • 322,884
  • 58
  • 787
  • 955