32

I have read from multiple sources that it might be better to have a password composed of several random words since this is easier to remember than a random sequence of characters. For example this article from Thomas Baekdal. I even see this xkcd comic quite often.

Now, I read this article about a new tool called brainflayer, currently target Bitcoin wallets, that can guess 130000 passwords a second. This makes Bitcoin brainwallets useless. I wonder if a similar tool could be used against all passwords and are passwords such as "this is fun" really as safe as Thomas Baekdal claims?

RoraΖ
  • 12,347
  • 4
  • 51
  • 83
user2563661
  • 431
  • 1
  • 4
  • 5
  • 1
    Relevant? http://security.stackexchange.com/a/6096/64787 – MonkeyZeus Aug 19 '15 at 16:21
  • For future reference, the tool is brainfl*a*yer, not brainflyer. – Nick2253 Aug 19 '15 at 17:32
  • 1
    Even at 130000 guesses a second, an XKCD-style 4-word password would hold up around 2 years on average: https://www.google.com/#q=2%5E44+%2F+(130000%2Fs)+%2F+2 Add another word, and you get this... https://www.google.com/search?q=2%5E44%20%2F%20(130000%2Fs)%20%2F%202 Brainwallet [recommends](https://en.bitcoin.it/wiki/Brainwallet) 13 words. Brainflayer won't be cracking a password like that anytime soon. – Ajedi32 Aug 19 '15 at 21:26
  • 10
    It's important to keep in mind though that "13 random words" is definitely not the same as "a 13 word phrase". The words need to be chosen completely randomly, using a pair of dice or a cryptographically secure random number generator. If you "thought up a few random words yourself", you're doing it wrong. – Ajedi32 Aug 19 '15 at 21:28
  • 4
    @Ajedi32 "Even at 130000 guesses a second, an XKCD-style 4-word password would hold up around 2 years on average" What if you have multiple instances of the software doing the guesses? Or even a botnet? – user2563661 Aug 19 '15 at 22:18
  • @user2563661 Then there'll be a higher number of guesses per second, and you'll need more words in your password in order for it to withstand the attack. If you want to play around with the math yourself, the formula is `2^(bits of entropy) / (guesses per second) / 2 = number of seconds on average required to crack`, and an XKCD-style password has 11 bits of entropy per word. – Ajedi32 Aug 19 '15 at 22:43
  • 3
    user2563661 and Ajedi32, at that point guesses / second is not really a good way to think about it, instead think about guesses / $$. Botnets and GPU rigs are time consuming / expensive to run. I'm pretty confident that the amount of $$ it would take to crack my password is greater than what the hacker can sell my personal data for, so I sleep easy knowing that they are just wasting their own money if they try. (With the exception of banks that force you to have a weak password) – Mike Ounsworth Aug 20 '15 at 00:04
  • @Ajedi32 About a year later, I was able to quadruple the speed of brainflayer and find several four random word passphrases used for brainwallets. Throwing several dozen EC2 instances at problems like this can be quite effective. – ryanc Nov 25 '19 at 08:25

5 Answers5

53

I wrote brainflayer and gave a talk about it at DEFCON. Neither Thomas Baekdal's article nor XKCD's comic apply well to modern offline attacks. I read Thomas's article and his FAQ about it, and it may have been marginally reasonable when he wrote it, it no longer is. A key point is that password cracking attacks have gotten much better since then.

Q: If I cannot write "this is fun" because of the spaces, can I not just write "thisisfun"?

A: Absolutely not! The reason why "this is fun" is 10 times more secure, is simply because it is much longer (11 characters). By removing the spaces, you reduce the length and the complexity substantially. The spaces are effectively special characters, which in itself makes the password much more secure.

Use "this-is-fun" instead.

Password crackers don't try long brute force attacks much - it's all about cracking ROI. A smart cracker will try word combinations with various delimiters, so using spaces, hyphens, underscores or nothing all ends up providing about the same security. Today's cracking methods use wordlists - which can include phrases - and large corpuses of previously compromised passwords along with popularity. This is combined with rule-based permutation and statistical models. Ars Technica posted a great article detailing modern techniques mid-2013, and attacks only get better.

I am also of the (possibly controversial) opinion that it is pointless to talk about guesses per second for offline attacks. A much better way of thinking about it is guesses per dollar. If you want to be pedantic you could add a one-time guesses per second per dollar cost, but the operational cost will tend to dominate. Brainflayer's upper bound on operational cost is 560M guesses per dollar, based on EC2 spot instance benchmarks - with zero one-time cost. It's possible to make these costs many orders of magnitude higher with a "harder" hash function like bcrypt, scrypt, PBKDF2 or, once it is finalized, Argon2.

ryanc
  • 647
  • 5
  • 7
  • +1 great article indeed! Since you linked it, what do you think of `passwords that are easy to remember are sitting ducks in the hands of crackers.`? I think there's a step of logic missing there because, as stated, that means that all diceware passphrases are weak, no matter how long they are. Opinion? – Mike Ounsworth Aug 19 '15 at 15:34
  • 3
    Human *chosen* passwords and passphrases are "sitting ducks" in the hands of crackers. For passwords and passphrases generated by a random process, it is possible to accurately model the entropy of the result. Modeling the entropy of the result allows us to objectively determine if it's secure against any arbitrary attacker. The question is whether such generated passwords and passphrases are "easy" to remember. They can certainly be memorized with a little effort. – ryanc Aug 19 '15 at 16:13
  • 1
    So if I run `secpwgen -p 6`, it generates eg. "avert gmt waxen 2f klein hatch" and states that this has an entropy of 78 bits. How is this worse than a random password with 78 bits of entropy (eg. "K)Z#RI2=#-}==S55")? And is 78 bits sufficient for a brainwallet? – oliver Aug 19 '15 at 16:37
  • 1
    @oliver if you read the rest of this thread, the problem with brainwallet is not that poepe are using weak passwords, it's that the those passwords are not very well protected once you hand them to brainwallet. Better to avoid brainwallet entirely. – Mike Ounsworth Aug 19 '15 at 17:34
  • 1
    78 bits of entropy is 78 bits of security regardless of the representation - density (password vs passphrase) doesn't really matter. That's probably fine to use with warpwallet. I really advise against using traditional brainwallets all together. – ryanc Aug 19 '15 at 19:03
  • "Neither ... or ... don't ...." I'm confused. – TRiG Aug 19 '15 at 19:10
  • @TRiG I got it right the first time, then misread my own answer, edited it to fix, then a reverted it. Is it currently unclear? – ryanc Aug 19 '15 at 23:01
  • Well, it should be *nor*, @ryanc, but it's no longer unclear. – TRiG Aug 19 '15 at 23:17
  • @ryanc I agree that "78 bits of entropy is 78 bits of security regardless of the representation", but doesn't that effectively invalidate the rest of your answer? The point of these methods is to produce passwords with as much entropy as random unreadable sequences of characters, without the random unreadable sequences of characters, and they do that successfully. In either case it's up to the user to have *enough* password. – hobbs Aug 20 '15 at 01:24
  • @hobbs There is not currently any algorithm that can, in the general case, determine the entropy of an arbitrary user provided string. As a consequence, any human generated string has undefined entropy, though it's possible to find an upper bound with something like zxcvbn. A "representation", then, is some mapping (which needn't be reversible) of a ~78 bit number to a string. – ryanc Aug 20 '15 at 03:35
  • 2
    @ryanc it's not meaningful in this context to talk about the entropy of a string, but we can talk about the entropy needed to choose a string randomly from a collection. A string chosen from every sequence of seven common words represents about 78 bits of entropy, as does a string chosen from every sequence of 19 characters in `[A-Za-z0-9]`. – hobbs Aug 20 '15 at 03:47
  • 2
    @hobbs I guess I don't understand your objection, and I feel like we're arguing semantics. When I say a string has 78 bits of entropy I mean that it would take, on average, 2^77 attempts to guess it with knowledge of the generation algorithm. The issue with how humans choose passwords and passphrases is that the human's vocabulary, personality, life experiance, and cultural knowledge are inputs into the "generation algorithm" and these things are difficult to quantify, and our models of these things are *very* complicated and improve unpredictably. – ryanc Aug 20 '15 at 15:09
  • 3
    The XKCD strip explicitly states that it's considering protection against online attacks. When choosing passwords for websotes your protection against offline attacks would be to avoid password reuse. – Taemyr Aug 21 '15 at 07:17
  • @Taemyr The XKCD comic does state that it's for an online attack, however people forget/ignore that part. For online attacks the site should be implementing rate limiting anyway. – ryanc Aug 21 '15 at 14:08
  • XKCDs comic is fine. You just need more than four words now. Look up Diceware, which is basically the same as the xkcd comic. – PyRulez Aug 21 '15 at 14:33
  • 2
    4 words selected from the 4000 most common words is 48 bits. You get 560M guesses per dollar, or 29 bits per dollar. To crack a 4x4k XKCD style password, that is 19 bits of dollars, or half a million dollars. To crack a 5x4k XKCD style password, you need 31 bits of dollars, or 2 billion dollars. To crack a 6x4k XKCD style password you need 43 bits of dollars, or 8 trillion dollars. To crack a 7x8k XKCD style password, you need 62 bits of dollars, or the GDP of the world for approx 60,000 years, all devoted to nothing but cracking the password. And that is with a poor hash defending it. – Yakk Aug 21 '15 at 18:08
  • 1
    "Neither Thomas Baekdal's article nor XKCD's comic apply well to modern offline attacks. " -- I'm just confused by this claim. Is it just that the XKCD style doesn't say "oh, and use another word/bigger dictionary if you want to deal with a faster attack/have a stronger password"? Similarly, the Thomas article says 'The examples below are based on 100 password request per second.' -- 130,000 is only ~1000x faster. The 5 uncommon word password is still strong against that attack (half a billion, instead of half-a-trillion, computer-years). – Yakk Aug 21 '15 at 18:12
  • XKCD is talking about online attacks, not offline. Thomas's article doesn't say. Passwords per second isn't meaningful for offline attacks, it's passwords per dollar. Brainflayer's crack rate on ec2 is 560M passphrases per dollar. The 130k passphrases per second is on my old desktop computer and doesn't really matter. For weak password hashes, you can build a cluster for $10k that can do 150 billion a second for 15 cents an hour in electricity. – ryanc Aug 22 '15 at 18:39
15

The specific attack your talking about had, per the defcon talk, certain specific characteristics that mean that the same attack doesn't really apply to all passwords.

First and most importantly, brainwallets effectively put the password hash in a public location. Usually the first defence against cracking of password hashes is to try and secure the hash itself.

Second up, brainwallet didn't use any of the common password storage algorithms (e.g. PBKDF2 or bcrypt) to protect the hashes. In cases where the attacker gets access to the hash, it's important to try and slow down their attack as much as possible. Password storage algorithms do this by making it "Computationally difficult" for the password hash to be created from the password, slowing down the attack. There's loads more details on these here

Rory McCune
  • 61,541
  • 14
  • 140
  • 221
8

Interesting question. @RoryMcCune nicely addressed the question about brainflyer, so I'd like to address your more open-ended question:

are passwords such as "this is fun" really as safe as Thomas Baekdal claims?

No, no it is not.

My first thoughts are 1) that Thomas Baekdal doesn't explain how he's calculating his time estimates, which makes my inner scientist skeptical, and 2) that his article doesn't take into account Moore's Law - the idea that our computing power is increasing at an alarming rate. The article was written in 2007 and since then both our brunt computing power per $, and ability to effectively use parallel processors (esp. GPGPU's) have gone up leaps and bounds.

I take issue with his statement: "100 years - this is really the limit for most people. Who cares about their password being hacked after they have died?". This is a fallacy; this number is based on the speed of today's computers and assumes that technology will not increase during those 100 years.

In my opinion, passwords are a dying technology since they do not scale with CPU speed (ie, our ability to crack them is increasing exponentially, but our ability to remember longer ones is not). Unfortunately, no single technology has risen yet to replace them. Two good attempts to solve the scaling problem come to mind:

  1. Private key-based authentication. This scales since we can keep increasing the key size as CPU speed increases.

  2. Proof-of-work schemes. This idea is to require the login process to involve something which takes a very large amount of computation. Password stretching / hashing functions like PBKDF2 or bcrypt are examples, as well as challenge-response authentication schemes which include asking the client to do a very long computation which the server can easily verify (like factoring: "here is a very large number n, send me back p, q such that p*q = n"). As processors get faster, you simply increase the amount of work that you're asking for.

Both of these have nice theoretical properties, but are inconvenient for end-users, which is stopping them from getting widespread adoption.

Mike Ounsworth
  • 58,107
  • 21
  • 154
  • 209
  • Password length does NOT need to scale. Only strength of encryption and processing required to check one guess, because you don't bruteforce passwords, you bruteforce their hashed/encrypted form. – Oleg V. Volkov Aug 20 '15 at 12:03
  • @OlegV.Volkov That almost makes sense, slower hashes scales crack time linearly, longer passwords scales crack time exponentially. 1) If your webpage is too slow to load, people will complain, and attackers will always be using better hardware than your average user. 2) Hash speed becomes irrelevant once an attacker has pre-computed a lookup table for your hash function. Rainbow tables for all passwords up to length 8 easily fits in RAM all at once, rainbow tables for all passwords up to length 20 would need 10^29 terabytes just to store it (hint: that's more than all HDDs on earth combined). – Mike Ounsworth Aug 20 '15 at 12:29
  • @MikeOunsworth Consumer hardware also follows Moore's law. The problem right now is the speed ratio between cracker and user is increasing, but increasing memory usage helps. – PyRulez Aug 21 '15 at 03:31
  • @PyRulez Good point. For usability reasons you have to support users on bottom of the line, or even obsolete hardware, while attackers will have top of the line. – Mike Ounsworth Aug 21 '15 at 11:34
  • @MikeOunsworth Of course, this is less of an issue if the hash is done server side. – PyRulez Aug 21 '15 at 13:14
3

Offline attacks are slightly different than online attacks.

In an online attack, rate limiting and the overhead of network transmission means that extremely fast password attempts are not practical.

In an offline attack, when you have a hashed password, you can get much much faster attack rates typically. And you can scale it up (by using cloud computing) in a broad way (which is difficult to do online, without bot-armies).

Still, a speed of 130,000 attempts/second, or 560 million guesses per dollar, is not that hard to defeat. Thomas assumed 100 attempts/second: hitting 100,000/second just makes passwords 1000x less secure. The 5 uncommon word password had a crack time of half a trillion years: after a 1000x speed, this becomes half a billion years.

The XKCD method's makes it easier to analyze the cost of your password. Suppose your attacker can try 1 billion passwords per dollar spent ("30 bits" of password entropy per dollar) (twice as efficient as brainflayer).

Then picking 4 random words from the most common 1000 English words (a 4*1k XKCD password) has 40 bits of entropy, so it takes 40-30 = 10 bits of dollars, or about 1000 dollars, to crack your password "offline".

Up it to 4 random words from the most common 4k English words (a 4*4k XKCD password), and it takes 48-30=18 bits of dollars, or 250,000 dollars to crack your password.

6 random words from the most common 8k English words (6*8k) is 78 bits. It costs 48 bits of dollars to defeat it today, or 250,000,000,000,000$, or 2-4x the annual GDP of the entire world.

Now, if we presume Moore's law will continue to hold with regard to password cracking (every 2 years, the cost of cracking a password halves), and we want to know "how long will our password be 1 million dollars secure), we can do this:

30 (current bits/dollar) + 20 (1 million dollars) = 50.

Take the number of password entropy bits, and subtract 50.

Moore's law states that computation gets half as expensive every 2 years. So double the number of bits remaining, and that is how many years your password should be secure against a 1 million dollar attack.

4*1k is 40 bits, which is less than 50, so it is not secure today against an offline attack.

5*4k is 60 bits, so it is safe for ~20 years against an offline attack.

6*8k is 78 bits, so it is safe for ~56 years against an offline attack.

All of this assumes you let a good quality, secure system do the password pick for you (if you try to pick a password this way, you'll almost certainly get way less entropy).

Yakk
  • 499
  • 2
  • 7
1

SHA-256 and ECDSA are considered very strong currently, but they might be broken in the far future. If that happens, Bitcoin can shift to a stronger algorithm.

(Block hashing algorithm)

So the hashing algorithms used are not an issue. Adding to this, the computational capacities (Block hashing algorithm) required for cracking are not that easy; even if this is questionable already (ASIC mining). The strength of Bitcoin is not your question but take this as a comment to the other answer you have, and actually Knowledge of passphrase == Control of money (Cracking Cryptocurrency Brainwallets)

Coming back to your question properly said, theoretically speaking, the set of human behaviors is more enormous than the universe itself. However, in practice we tend to all share the same practices as others (Mind, society and behavior). As a matter of proof, even with 4 digits PINs, lot of combination are rarely used whereas others are so widely used:

enter image description here

I mentioned this PIN code survey of 4 digits length to make a parallel with a password or a phrase as a password (passphrase): it is not a matter of what is the best password schema to use, (Edward Snowden Explains Why You Should Use Passphrases, Not Passwords) but rather how many people are ready to follow the advice in practice . So what you ask about is really plausible (above table is a good proof). These last days, researchers here in France succeeded to develop a software based on a set of algorithms to identify users following the way they type on a keyboard (an example of how we can model even complicated activities such as typing on a keyboard, so modeling -cracking- passphrases could not be that complicated either).

I think this a race of mouse and cat.

RoraΖ
  • 12,347
  • 4
  • 51
  • 83