51

I recently investigated best practices for passwords, and the overwhelming majority of sources recommended using a password manager. This is great advice, but it is not usable in every situation. In certain situations, such as OS login, disk decryption, or unlocking the password manager itself, I cannot let a password manager "type my password in for me".

As such, I have looked at the second-best alternative, which seems to be Diceware and passphrases. What had me stumped was this answer to a related question, which hinted that Diceware was superior. An excerpt from the answer:

Passphrases are great (Diceware is better) for locking password managers, [...]

Emphasis mine

What confuses me is why this claim that Diceware is supposedly superior? I used zxcvbn to compare the strength of the two example passwords below and it seemed as if the passphrase was more secure than the Diceware password. Further, the Passphrase generates a visual image, although nonsensical, which is easy to remember. The only disadvantage I can imagine is that the passphrase takes longer to type, which is a marginal disadvantage considering it would only need to be typed once before a password manager can be used again.


Examples

Diceware

Diceware is the process of rolling a set of dice, which would indicate a random word from a pre-defined list. Depending on the desired security, more words are chosen.

An example outcome of a Diceware process might be the password:

cleft cam synod lacy yr wok
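
For illustration, the step from dice to a word can be sketched in Python. This is only a sketch under assumptions: five rolls per word and a list of 6^5 = 7776 entries. `rolls_to_index` is a hypothetical helper name, not part of any Diceware tool.

```python
def rolls_to_index(rolls):
    """Map five dice rolls (each 1-6) to a position in a 7776-entry list."""
    if len(rolls) != 5 or any(r not in range(1, 7) for r in rolls):
        raise ValueError("expected five rolls, each between 1 and 6")
    index = 0
    for r in rolls:
        # Treat the rolls as digits of a base-6 number (roll 1 -> digit 0).
        index = index * 6 + (r - 1)
    return index

print(6 ** 5)                             # 7776
print(rolls_to_index([1, 1, 1, 1, 1]))    # 0
print(rolls_to_index([6, 6, 6, 6, 6]))    # 7775
```

Each group of five rolls selects one word, so a six-word result like the one above would take thirty rolls under this assumption.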

Passphrases

A passphrase is in essence a sentence, which makes sense to the user and hopefully nobody else. It might make grammatical sense, but is very unlikely to make semantic sense.

An example of a passphrase would be:

Blue Light shines from the small Bunny onto the Lake.
curiousguy
  • 5,038
  • 3
  • 25
  • 27
  • 19
    ``why this claim that Diceware is supposedly superior`` Because it is more random than a sentence someone thinks of themselves – Kevin Apr 24 '19 at 11:29
  • 3
    As a note, some password managers and disk encryption allow for a hardware key, which is going to be better than *any* memorized option (though for truly optimal security, you could use a hardware key alongside a secure memorized secret). – Larkeith Apr 24 '19 at 13:34
  • 1
    @Larkeith Yes, I know that a key file on a thumb drive is possible, but it was merely an example. And as you mentioned, that key file should probably be encrypted with a strong passphrase, leading us back to square one. –  Apr 24 '19 at 13:35
  • Technically passphrases should not be "phrases" at all, because phrases are supposed to follow the rules of the English language (syntax) and English doesn't provide much entropy (only about 1 bit per character for valid sentences). So "good convenience store" has less entropy than "grumpy ruin store". Diceware (or similar methods) help you build non-sentences using words that otherwise you might not be inclined to choose, making sure you have enough entropy. – reed Apr 24 '19 at 15:27
  • 4
    One comment about your question, you are considering Diceware and passphrases separate things. They aren't. Diceware generates passphrases, albeit ones composed of random words. You're talking about 'natural language passphrases' when you describe passphrases. Both are the same category of secrets, just different variations. – PwdRsch Apr 24 '19 at 16:18
  • 44
    side note: Almost all estimators of password strength **are complete and utter nonsense**. The assumed mathematical complexity rarely exists in real life, and brute-forcing is seldom your main threat. When I register for new sites that have a "password strength" estimate on their register form, I typically try for giggles how good they think AAAaaa123!!! is. Surprise, it quite often is apparently the best password they've ever seen. They also consider 200 random letters a weak password because it doesn't have a number... – Tom Apr 24 '19 at 16:21
  • @Tom Yes, I agree. But then again 200x`a` seems like a decent enough password too –  Apr 24 '19 at 17:04
  • 1
    All password-strength meters, including zxcvbn, are wrong. The meters work by examining a password and trying to guess what pattern was used to create it; this only has value if the meter's guess is the same as the pattern being used by an attacker to generate passwords. – Mark Apr 24 '19 at 21:53
  • 1
    Not using a password manager because it can't type your OS password in for you would be a terrible reason not to use a password manager, you can still very easily, for example, show the password on your phone and hand type it into your bios screen or whatever (which are easy to type if you're using Diceware techniques) – Brian Leishman Apr 25 '19 at 12:39
  • You're comparing a 6 word passphrase with a 10 word passphrase. I don't need zxcvbn to tell me which is stronger – slebetman Apr 26 '19 at 03:50
  • That zxcvbn site is interesting, but it highly over-estimates its pattern matching. I put in a random 12 digit number, generated by random.org, and it estimated the guesses at 10^9 instead of 10^12. A factor of 1000 off. It claims to have seen a date pattern in the password, despite it being randomly generated. – Steve Sether Apr 26 '19 at 21:59
  • 1
    @SteveSether The point is, a password cracker doesn't care whether the password was randomly generated or not; all that matters is how it can be cracked as quickly as possible. If it happens to contain a date pattern that was randomly generated, that last fact makes no difference at all. An RNG *can* output `123456` just as likely as it can output any other 6-digit combination. – Marc.2377 Apr 27 '19 at 02:21
  • @Marc.2377 The "date pattern" was a year. I tried several times with several random numeric passwords and each time it claimed to find a date pattern in the password. If you look hard enough, you'll find a "pattern" in anything. That doesn't mean the pattern actually exists. – Steve Sether Apr 28 '19 at 03:19

4 Answers

76

Most people that use passphrases, use passphrases wrong.

The remark that Diceware is better probably comes from the fact that, when people use passphrases, they usually take a well-known or otherwise logically structured sentence and use that. "Mary had a little lamb" is a terrible passphrase because it is one of a few billion well-known phrases that a computer can run through in a short amount of time. I know this works pretty well because I tried it.

Diceware is just random words. It's as good as any other randomly generated set of words, assuming you use a good source of randomness: for Diceware, you should use dice, which is a reasonably good source. Digital password generators are usually also good, though homebrew implementations might use an insecure random generator by mistake.
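
As a minimal sketch of the digital alternative, the snippet below picks words with Python's `secrets` module, which draws from the operating system's cryptographic random source, rather than the `random` module, which is not meant for secrets. The eight-word `WORDS` list and the `random_passphrase` helper are hypothetical stand-ins; a real list would have thousands of entries.

```python
import secrets

# Hypothetical stand-in list for illustration only; far too small for real use.
WORDS = ["cleft", "cam", "synod", "lacy", "yr", "wok", "blue", "lake"]

def random_passphrase(words, count):
    """Pick `count` words independently and uniformly using a CSPRNG."""
    return " ".join(secrets.choice(words) for _ in range(count))

phrase = random_passphrase(WORDS, 6)
print(phrase)
```

The important design choice is the random source: swapping `secrets.choice` for `random.choice` would be the kind of homebrew mistake mentioned above.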

We know that any random passphrase is good because it's basic math. There are two properties to a passphrase:

  • Dictionary size
  • Number of words in the phrase

The 'randomness' of a passphrase is simple to calculate: dictionary_size ^ words_in_phrase, where ^ is exponentiation. A passphrase of 3 words with a dictionary of 8000 words is 8000^3= 512 billion possible phrases. So an attacker, when guessing the phrase, would have to try 256 billion phrases (on average) before s/he gets it right. To compare with a password of similar strength: a random password using 7 characters, consisting of a-z and A-Z, has a "dictionary size" of 52 (26 + 26) and a "number of words" of 7, making 52^7= ~1028 billion possible passwords. It is well-known that 7 characters is pretty insecure, even when randomly generated.
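
The arithmetic in this paragraph can be checked with a few lines of Python. `possible_phrases` is just an illustrative name for the dictionary_size ^ words_in_phrase formula.

```python
def possible_phrases(dictionary_size, words_in_phrase):
    """Number of distinct phrases: dictionary_size ^ words_in_phrase."""
    return dictionary_size ** words_in_phrase

phrases = possible_phrases(8000, 3)
passwords = possible_phrases(52, 7)

print(phrases)         # 512000000000 (512 billion)
print(phrases // 2)    # 256000000000 (average guesses needed)
print(passwords)       # 1028071702528 (about 1028 billion)
```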

For randomness, it's the more the better up until about 128 bits of entropy. A little more than that helps buffer against cryptographic weakenings of algorithms, but really, you don't want to memorize 128 bits of entropy anyway. Let's say we want to go for 80 bits of entropy, which is a good compromise for almost anything.

To convert "number of possible values" to "bits of entropy", we need to use this formula: log(n)/log(2), where n is the number of possible values. So if you have 26 possible values (1 letter), that would be log(26)/log(2)= ~4.7 bits of entropy. That makes sense because you need 5 bits to store a letter: the number 26 is 11010 in binary.
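
A short sketch of that conversion in Python (the function name `entropy_bits` is my own choice):

```python
import math

def entropy_bits(n):
    """Bits of entropy for n equally likely values: log(n)/log(2)."""
    return math.log(n) / math.log(2)

print(round(entropy_bits(26), 1))   # 4.7
print(bin(26))                      # 0b11010
print((26 - 1).bit_length())        # 5 bits cover the 26 values 0..25
```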

A dictionary of 8000 words needs about 7 words to get above the desired 80 bits:
log(8000^7)/log(2)= ~90.8 bits of entropy. Six words would be:
log(8000^6)/log(2)= ~77.8 bits of entropy.

A large dictionary helps a lot, compared to the relatively small Diceware dictionary of 7776 words. The Oxford English Dictionary has 600k words. With that many words, a phrase of four randomly chosen words is almost enough:
log(600 000^4)/log(2)= ~76.8 bits of entropy.

But at 600 thousand words, that includes very obscure and long words. A dictionary with words that you can reasonably remember might have a hundred thousand or so. Instead of the seven words that we need with Diceware, we need five words in our phrase when selecting randomly from a dictionary of 100k words:
log(100 000^5)/log(2)= ~83.0 bits of entropy.

Adding one more word to your phrase helps more than adding ten thousand words to your dictionary, so length beats complexity, but a good solution balances the two. Diceware seems a little small to me, but perhaps they tested with different sizes and found this to be a good balance. I am not a linguist :).

Just for comparison, a password (consisting of a-z, A-Z, and 0-9) needs 14 characters to reach the same strength: log(62^14)/log(2)= ~83.4 bits of entropy.
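
To tie the comparisons together, here is a small sketch that computes how many words (or characters) are needed to pass an 80-bit target for each dictionary size mentioned above. `units_needed` is a hypothetical helper, and it assumes every word or character is chosen uniformly at random.

```python
import math

def units_needed(dictionary_size, target_bits=80):
    """Smallest number of random picks whose entropy reaches target_bits."""
    return math.ceil(target_bits / math.log2(dictionary_size))

for label, size in [
    ("Diceware list (7776 words)", 7776),
    ("8000-word dictionary", 8000),
    ("100k-word dictionary", 100_000),
    ("600k-word dictionary", 600_000),
    ("a-z, A-Z, 0-9 characters", 62),
]:
    print(label, "->", units_needed(size))
```

Under these assumptions it gives 7, 7, 5, 5 and 14 respectively, which matches the figures above (four words from 600k entries fall just short of 80 bits).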

peterg
  • 3
  • 2
Luc
  • 32,378
  • 8
  • 75
  • 137
  • Comments are not for extended discussion; this conversation has been [moved to chat](https://chat.stackexchange.com/rooms/92942/discussion-on-answer-by-luc-is-diceware-more-secure-than-a-long-passphrase). – Rory Alsop Apr 27 '19 at 18:41
  • 1
    Just a note on Diceware size: it has to be a power of 6, because it is DICEware :) And six times more words would not fit the criteria very well (not very similar, not very uncommon, not very long). – Petar Donchev Jul 03 '20 at 08:26
11

Passwords should be easy to remember and hard to guess. As AviD once said, security at the expense of usability comes at the expense of security. A passphrase is easy to remember because it has some sort of meaning to the user, even though it might seem random at first. In terms of usability, a passphrase is superior: you don't need dice and a list of words; you can think of a passphrase yourself and remember it more easily.

However, using dice and a list of words makes for a nearly fully random password. There is no link to the user, whereas a passphrase is, most of the time (unless truly random), made up of something related to the user.

Any online password checker can only estimate how hard it would be for a computer to guess a password, whereas a sentence (or passphrase in this case) might be more easily guessed by another human. In your example, the Diceware-generated password is shorter than the passphrase (though still very long by today's security standards), but as you stated yourself, you can create longer passwords when you want to.

I wouldn't say diceware is always superior, but it definitely is more random and can still have the same length as a passphrase which makes it superior in certain cases.

Kevin
  • 1,653
  • 10
  • 20
  • "it definitely is more random" More precisely, you know *exactly* how random a diceware password is: it always has exactly `log2((vocabulary_size)^(num_words))` bits of entropy. A passphrase's generation method isn't well defined; it's just a phrase selected from the set of the first few phrases the user thought of *somehow*. So you need to consider what sort of generation methods an attacker might use to guess it, which in turn considers both how common the precise phrase is, how common the words in it are, and how grammatical the sentence is. You can't really *know* how good a passphrase is. – Ray Apr 26 '19 at 14:48
  • You have to consider dice bias: most dice are likely to have a sizeable bias, on the order of 10%, toward one of the numbers. However, the effect is not that devastating and not systematic (the bias is unlikely to be predictable from die to die). – Petar Donchev Jul 03 '20 at 08:40
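
To put a rough number on the dice-bias remark, here is a sketch that compares the min-entropy of a fair die with a biased one. It assumes one particular reading of "10% bias", namely that one face is 10% more likely than it would be on a fair die and the other five faces share the remainder equally; that reading and the helper name `min_entropy_bits` are my assumptions, not something stated in the comment.

```python
import math

def min_entropy_bits(probabilities):
    """Min-entropy in bits, determined by the single most likely outcome."""
    return -math.log2(max(probabilities))

fair = [1 / 6] * 6

# Assumed bias model: one face is 10% more likely than on a fair die.
p_favoured = 1.1 / 6
biased = [p_favoured] + [(1 - p_favoured) / 5] * 5

print(round(min_entropy_bits(fair), 3))     # 2.585
print(round(min_entropy_bits(biased), 3))   # 2.447
```

Under this model the worst case costs about 0.14 bits per roll, which fits the remark that the effect is not devastating.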
11

The statement you cite, that Diceware is "better" than passphrases, doesn't come with a theoretical justification, which makes it tricky to assess. But I can come up with one such justification: Diceware comes with a procedure for generating passphrases at random with dice, and this guarantees that the outputs generated have at least some minimum amount of entropy (difficulty of guessing). Since log2(6) is about 2.6, Diceware gives you roughly 2.6 bits of entropy per die roll.

On the other hand, there is no obvious way of estimating how difficult a long natural language passphrase like "Blue Light shines from the small Bunny onto the Lake" would really be for a password cracker. People usually assume that because it's long that automatically makes it strong, but that's not true. This Ars Technica article about cracking very long passphrases is very instructive in that regard:

[Kevin Young] joined forces with fellow security researcher Josh Dustin, and the cracking duo quickly settled on trying longer strings of words found online. They started small. They took a single article from USA Today, isolated select phrases, and inputted them into their password crackers. Within a few weeks, they expanded their sources to include the entire contents of Wikipedia and the first 15,000 works of Project Gutenberg, which bills itself as the largest single collection of free electronic books. Almost immediately, hashes from Stratfor and other leaks that remained uncracked for months fell. One such password was "crotalus atrox." That's the scientific name for the western diamondback rattlesnake, and it ended up in their word list courtesy of this Wikipedia article. The success was something of an epiphany for Young and Dustin.

"Rather than try a brute force that makes sense to a computer but not to people, let's use human beings because people typically make these long passwords based on things that humans use," Dustin remembered thinking. "I basically utilized the person who wrote the article on Wikipedia to put words together for us."

Almost immediately, a flood of once-stubborn passwords revealed themselves. They included: "Am i ever gonna see your face again?" (36 characters), "in the beginning was the word" (29 characters), "from genesis to revelations" (26), "I cant remember anything" (24), "thereisnofatebutwhatwemake" (26), "givemelibertyorgivemedeath" (26), and "eastofthesunwestofthemoon" (25).

If you just pick long passphrases innocently without any sound theory of why your procedure gives strong passphrases, they might be vulnerable to some attack you just haven't thought of. Whereas Diceware is invulnerable to anything but brute force, because cracking Diceware is at least as hard as guessing 25+ dice rolls.


I used zxcvbn to compare the strength of the two example passwords below and it seemed as if the passphrase was more secure than the Diceware password.

Here I should repeat a point I made more at length in this answer to another question:

  • A password strength meter can conclusively prove that a passphrase is weak;
  • But no such meter can ever prove that a passphrase is strong, because the passphrase might be vulnerable to some attack the meter does not model.

For example, zxcvbn—which is an excellent tool overall, but just isn't designed for the use you're making of it—estimates centuries for this passphrase:

password:   Am i ever gonna see your face again?
guesses_log10:  31.35342
score:  4 / 4
function runtime (ms):  5
guess times:
100 / hour:   centuries (throttled online attack)
10  / second: centuries (unthrottled online attack)
10k / second: centuries (offline attack, slow hash, many cores)
10B / second: centuries (offline attack, fast hash, many cores)

But this is one that I took from the Ars Technica article quote above, so we know it has been cracked in real life. We have independent proof that the zxcvbn estimate is wrong.

zxcvbn's analysis gives cleft cam synod lacy yr wok a guesses_log10 value of 26.22025, which is technically weaker than its estimate for Am i ever gonna see your face again?. But if it's a 6-word Diceware passphrase that we generated by making 30 dice throws, we have independent proof that it has at least log2(6) × 30 ≈ 77.5 bits of entropy (whose corresponding guesses_log10 value would be about 23.3, so zxcvbn is arguably overestimating how strong it is).
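
The conversion from dice throws to bits of entropy and to a guesses_log10 figure can be sketched as follows. It assumes fair dice and five throws per word, and `dice_strength` is a hypothetical helper name. The loop prints the figures for 25 and 30 throws, that is, five and six words under that assumption.

```python
import math

ZXCVBN_GUESSES_LOG10 = 26.22025  # zxcvbn's figure quoted above

def dice_strength(throws):
    """Return (bits of entropy, log10 of possible outcomes) for fair dice."""
    bits = throws * math.log2(6)
    guesses_log10 = throws * math.log10(6)
    return bits, guesses_log10

for throws in (25, 30):
    bits, guesses_log10 = dice_strength(throws)
    print(throws, round(bits, 1), round(guesses_log10, 1),
          guesses_log10 < ZXCVBN_GUESSES_LOG10)
```

In both cases the guaranteed figure comes out below zxcvbn's estimate, which is the sense in which the meter is arguably overestimating.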

For your passphrase Blue Light shines from the small Bunny onto the Lake., we just don't have any independent argument for why it's strong other than your hunch, which is undermined by the fact that you've posted it to Stack Exchange (and thus could now be used as input for an attack like what the Ars article explains). Maybe it is strong, but the philosophy that a system like Diceware embodies is that you shouldn't base your password strength on hunches, but rather, on actual random procedures that give you minimum entropy guarantees.

Luis Casillas
  • 10,361
  • 2
  • 28
  • 42
0

Diceware(tm) is intended to meet several goals:

  • Security
  • Usability
  • Prescriptivity

Security is achieved by random word selection. As others point out, the entropy of any passphrase selected randomly from a list of words is easy to compute: (number of words in passphrase) * log2(number of words in list). Using dice eliminates concern about the quality of computer random number generators.
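
As a quick check, the product form of the formula gives the same result as taking log2 of the total number of possible phrases. The sketch below uses the 7776-word list size and a six-word phrase as example inputs; the function names are my own.

```python
import math

def bits_by_product(num_words, list_size):
    """(number of words in passphrase) * log2(number of words in list)."""
    return num_words * math.log2(list_size)

def bits_by_count(num_words, list_size):
    """log2 of the total number of possible phrases."""
    return math.log2(list_size ** num_words)

a = bits_by_product(6, 7776)
b = bits_by_count(6, 7776)
print(round(a, 2), round(b, 2))
print(math.isclose(a, b))   # True
```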

Usability is enhanced by keeping the words short. The maximum length of Diceware words is 5 characters. That makes a Diceware passphrase easier to enter accurately, particularly on mobile devices. Also many password protected systems limit the length of allowable passphrases. NIST's new version of its password guidelines, Special Publication 800-63B, recommends allowing up to 64 characters, but many systems allow less. Long passphrases generated from a much larger list, say a complete English dictionary, can even exceed the NIST limit.

Prescriptivity. If you are reading Stack Exchange, you might be technically clever enough to make up a passphrase that is actually secure. Maybe. But if you are relying on many other users to create secure passwords, the likelihood that they will all invent secure passwords using the typical guidance is minute. Diceware is completely prescriptive. Anyone can follow the instructions and create a strong passphrase.

user52619
  • 96
  • 1