52

I have been hearing more and more that the haveibeenpwned password list is a good way to check if a password is strong enough to use or not.

I am confused by this. My understanding is that the haveibeenpwned list comes from accounts which have been compromised, whether because they were stored in plain text, using a weak cipher, or some other reason. This seems to have little to do with password strength to me. There could be very strong passwords that were stored in plain text, and thus compromised, and would really be pretty fine to use as long as they weren't used in combination with the original email/username. The fact that their hashes are known (duh, any particular password's hash is known!) doesn't matter if the place you are storing them is salted. Although it really doesn't hurt to rule out these passwords, as perhaps a hacker would start with this list when brute forcing, and it is easy to choose another one.

But the inverse is where I am concerned - there will always be very easy to crack passwords that aren't on the list. "longishpassword" at this time has not had an account using this password that was hit by a leak. This does not mean however that were a leak of hashes to happen, this password would be safe. It would be very easy to break.

What is the rationale behind checking a password (without an email/username) against the haveibeenpwned list to see if it is worthy to be used? Is this a good use of the list or is it misguided?

edit:

It is way too late to change the scope of the question now, but I just wanted to be clear, this question came from a perspective of checking other people's passwords (for instance when users register on your website, or people in your organisation are given AD accounts) not for validating the strength of a personal password. So any comments saying "just use a password manager" have not been helpful to me.

Nacht
  • 23
    Can you cite a source saying HIBP "is a good way to check if a password is strong enough to use or not"? – schroeder Jun 03 '19 at 09:11
  • 9
    In brief, HIBP has a huge list of *real* passwords, including both strong and weak ones. It is possible that the strong ones are filtered and not used in bruteforce attacks, but it's also possible that it's not worth filtering the list (after all, passwords that look strong might actually be weak and used by more than one user). So attackers might just use the whole list for bruteforcing, and therefore *every* password on that list is going to be at risk. – reed Jun 03 '19 at 09:30
  • 5
    I'd say that if your password was able to be cracked using any hash, it's not a good password. I'd also say that if your password was revealed in plaintext, it's now in the cracking dictionary, and thus a bad password. But I certainly wouldn't say that NOT being in haveIBeenPwned.com means it's a good password. The website primarily exists to show how common account cracking is, and how BAD your password is. It's nearly impossible to show how good a password is, unless it can be demonstrated to have a sufficient amount of entropy by the method used to generate it. – Steve Sether Jun 03 '19 at 19:52
  • 3
    @SteveSether I think the [zxcvbn](https://blogs.dropbox.com/tech/2012/04/zxcvbn-realistic-password-strength-estimation/) password strength meter does a pretty good job. Also the best thing HIBP seems to give the world is knowledge when your account has been compromised, not really anything to do with how good your password is. – Nacht Jun 03 '19 at 23:12
  • 13
    HIBP and pretty much any password strength meters including zxcvbn can tell you when a password is bad; they can't tell you that a password is strong. – Lie Ryan Jun 04 '19 at 02:51
  • @Nacht Password strength meters are just guesses, and often not particularly good ones. They can, and often do see patterns that don't actually exist, or miss patterns that do. As others have pointed out, the only way to know if you have a good password is to understand how much entropy the password generation method produces. Anything else is just a guess at the entropy. – Steve Sether Jun 04 '19 at 15:04
  • 1
    Using or not using a password based on if it's on a leaked list seems like Vizzini's paradox from The Princess Bride. If the attacker expects everyone to not use passwords on the list, maybe they'll only attack passwords NOT on the list, meaning it might actually be better to only use passwords that ARE on the list. But if you think they think you'll think that, you might want to only use passwords NOT on the list. But if you think they'll think you'll think that they think you'll think that, then you might want to only use passwords that ARE on the list. But if you think they'll think... – dwizum Jun 04 '19 at 17:49
  • 3
    @dwizum, the problem is, the users don't know there's a contest of wits at all. Users reuse passwords, plain and simple. There's no need for the attacker to be a mastermind of dizzying intellect. – Ghedipunk Jun 04 '19 at 17:52
  • I agree that's the truth. My comment was mostly in jest, to point out the weakness of "password strength" as a concept in general, since any attempt that tries to beat brute force is essentially based on *assumptions* about human behavior. – dwizum Jun 04 '19 at 18:53
  • There is no need to trust external parties for production passwords; just use a randomly generated password and be done. You can certainly use external tools to validate samples and generation rules (although it's not really needed, nor does it prove anything). Also keep in mind that if you use unique random passwords there is no problem with password hacks as only the already compromised service would be affected. – eckes Jun 05 '19 at 09:41
  • 5
    You're not using HIBP to validate a good password, you're using it to exclude bad ones. You have to use some other method to evaluate password strength. – Neil_UK Jun 05 '19 at 10:00
  • @SteveSether While I agree that password strength meters are usually garbage, if they see a pattern that isn't from the method you used to generate your password, that doesn't necessarily mean the pattern doesn't exist. It could mean that there exists some low-entropy generation method that is *also* capable of generating the same password as the high-entropy method you used. e.g. randomly picking 12 printable ASCII characters has an entropy of ~79, but "the password" is still terrible. A password meter shouldn't be *trusted*, but the user should understand *why* it thinks a password is bad. – Ray Jun 05 '19 at 14:30
  • @Ray While it's strictly true that randomly picking 12 printable ascii characters COULD generate something like "password1234", or some other common pattern, it's exceedingly unlikely to happen. Also, the problem with finding the pattern after the fact is that it's cherry picking. i.e. 74nbtiMM1984Q5IB DOES have the year 1984 in it. It's still a damn good password, but a dumb pattern matcher would say it has a "pattern" in it. If you use the fact you "found a date in it" and reduce the entropy appropriate for that pattern, it's cheating. See "texas sharpshooter" for more information. – Steve Sether Jun 05 '19 at 14:43
  • @SteveSether I agree completely. I say only that the user should understand the reason *why* the password meter dislikes the candidate password. If it's a stupid reason, like in the "74nbtiMM1984Q5IB" example, the user is quite safe in ignoring the recommendation. I suppose that formally, what I'm saying is that the entropy to consider isn't just $-\sum_i P(password_i \mid generator) \log(P(password_i \mid generator))$, but rather something like $-\sum_j \sum_i P(password_i \mid generator_j) P(generator_j) \log(P(password_i \mid generator_j) P(generator_j))$. – Ray Jun 05 '19 at 14:49
  • @Neil_UK That's EXACTLY right. It's a way to filter out passwords that should never be used. – barbecue Jun 05 '19 at 15:23
  • @Ray Then I agree with you. I'd only add that the strength meters I've seen tend to have this sort of definitive quality to them. People tend to want these sort of binary good/bad answers, so strength meters provide this overly simplistic world view. I'd be happier with a system like you describe, where the tool uses less definitive wording like "possible pattern detected", or "entropy guess". It puts the onus of truth back on the question asker rather than the tool.. Much like google search. Google doesn't really give you answers, it just asks you more questions. – Steve Sether Jun 05 '19 at 16:16
  • Many of you are thinking from the perspective of validating passwords for yourself. But that is not the only time you need to validate passwords - you may have to validate passwords for others, for instance users of your site, or of your organisation. In retrospect I should have made it clearer in my question that that was my case. Oh well, it is too late now. – Nacht Jun 05 '19 at 22:53

10 Answers

68

"Strong" has always had the intention of meaning "not guessable". Length and complexity help to make a password more "not guessable", but a long, complex, but commonly used password is just as weak as Pa$$w0rd.

If a password is in the HIBP list, then attackers know that the password has a higher likelihood of being chosen by people, hence, might be used again. So those lists will be hit first.

So, if your password is on the list, then it is "guessable".

If your password is not on the list, then from a dictionary attack approach it is not what others have chosen and, by implication (for as much as that's worth), is "less guessable". Many other factors, of course, can make your password "more guessable", even if it is not on the HIBP list.

As always, a randomly generated password is the most "unguessable" and a maximum length and randomly generated password is extremely difficult to bruteforce. And if you are randomly generating it, then why not go max length?
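To make "randomly generated" concrete, here is a minimal sketch using Python's standard `secrets` module. The alphabet and the default length of 20 are assumptions for illustration; a real site may restrict which characters or lengths it accepts. Note that the entropy figure describes the generation method, not any single output.

```python
import math
import secrets
import string

# Assumption: the target site accepts letters, digits and punctuation.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_password(length: int = 20) -> str:
    """Pick each character independently and uniformly with a CSPRNG."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def entropy_bits(length: int, alphabet_size: int = len(ALPHABET)) -> float:
    """Entropy of the generation method: length * log2(alphabet size)."""
    return length * math.log2(alphabet_size)


if __name__ == "__main__":
    print(generate_password())
    print(f"{entropy_bits(20):.1f} bits for 20 characters")
```

Going longer only adds entropy, which is why there is little reason not to use the maximum length a site allows when a password manager does the remembering.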

schroeder
  • 27
    I think the confusion is compounded by "password strength" often being described by "entropy", and misapplication of Kerckhoffs's principle: the strength of a password is a property not of how you select it, but of how an attacker will attack it. Just as the attacker is trying to guess how the password was selected, the user can try to guess how the attacker will brute force it. – IMSoP Jun 03 '19 at 17:58
  • 14
    You mentioned it, but it may help OP to realize: Even if you created the strongest, most unguessable/crackable password in the universe for www.example.com, and *that* site gets hacked and the passwords are released for folks to buy/download, then that strong password is effectively worthless. All a hacker has to do (and likely will do) is download the "known cracked website passwords" and loop through that. It'll find that super strong password and try it - so they don't even have to *guess* the password. It's on a list. Therefore, worthless. – BruceWayne Jun 03 '19 at 21:29
  • @BruceWayne I think your comment there is what has helped me understand. I have been comparing this HIBP method to zxcvbn, and thought that since zxcvbn contains a list of commonly-used words in password, it would solve this problem. But I see now that this is not enough. It would seem both of these validations are necessary. – Nacht Jun 03 '19 at 23:17
  • I suppose I was thinking, if HIBP is not good enough on its own, what good is it at all? But I see that it is indispensable. – Nacht Jun 03 '19 at 23:18
  • "As always, a randomly generated password is the most "unguessable"" statistically, yes. Unless you happen to randomly generate words that are in the dictionary or on such lists. A modicum of common sense is still required. – Mast Jun 04 '19 at 12:21
  • Wouldn't always going max length make it lower entropy than if you have a random length (but still some arbitrarily high enough minimum length)? – Drew Jun 04 '19 at 16:22
  • 1
    @Drew - only by a very trivial amount. Even if you restrict the alphabet to single-case ascii letters, you will only lose about 1 part in 27 of the possible passwords. – Martin Bonner supports Monica Jun 04 '19 at 17:01
  • 1
    @Mast The chance of choosing a 20 character password at random, and ending up with a sequence of words is non-zero, but I think it is negligible. – Martin Bonner supports Monica Jun 04 '19 at 17:02
  • 1
    Random characters are poor passwords since they are also unrememberable – Stian Jun 04 '19 at 17:08
  • 6
    @StianYttervik Not if you use a password manager; in that case, you only need to memorise one master password, and all the services you use only see unique, long, random strings of characters. The tradeoff is that the password manager is a single point of failure, but it's less vulnerable than using the same memorised password directly for multiple services, where a breach in one would compromise your accounts on all the others. – IMSoP Jun 04 '19 at 17:21
  • @IMSoP The most important thing to remember is that a password manager typically invalidates a lot of two factor authentication systems. If you have your password in a password manager accessible on your phone and you get two factor authentication on your phone it's essentially "one thing you have" rather than "two things you have". In those cases it's better to have "something weak you know" together with "something you have" (your phone). – David Mulder Jun 05 '19 at 09:07
  • @DavidMulder Only if you have your password manager stored locally on your phone in an unencrypted form, with no authentication required to unlock it. Otherwise, it's "something you have" (your phone), plus "something strong you know" (your master password) or "something you are" (your fingerprint) to unlock the password manager. – IMSoP Jun 05 '19 at 09:25
  • @IMSoP One typically assumes the "something you have" is compromised. So whether that's a case of "installed malware" or "device stolen" (hot, so stolen and straight away inspected and used). Plus, fingerprint is left everywhere, so that's not really "something you have" anyway. No need to have it saved in unencrypted form if you will happily unencrypt it on the fly for the malware. – David Mulder Jun 05 '19 at 10:48
  • 1
    @DavidMulder I disagree, but this should be discussed on a separate question. – IMSoP Jun 05 '19 at 10:49
30

To answer this question properly, you need to think like the hacker who wants to work out your password.

But to avoid having to dive straight into a mathsy way of thinking, let's start instead by thinking about a competitor on the Lego Movie game show "Where are my pants?"

Obviously, when the competitor wants to find their clothes, the first thing they'll do is go to their wardrobe. If that doesn't prove fruitful, they might check their drawers, followed by the chair in the corner of the room, followed by the laundry basket, and perhaps the dog's basket if the dog is of the naughty pants-stealing sort. That'll all happen before they start looking in the fridge.

What's going on here is of course that the competitor will look in the most likely places first. They could have systematically worked through every square foot of the house in a grid, in which case they would on average have to check half the house. On the other hand with this strategy they have a good chance of getting it on the first go, and certainly wouldn't expect to cover half the house.

A hacker ideally wants to do the same thing. Suppose they know that the password they are after is 8 lowercase letters long. They could try working through them one at a time, but there are 208,827,064,576 possible options, so a given completely random guess has about a 1 in 208 billion chance of being right. On the other hand, it's well known that "password" is the most common password (except when it's banned). In fact, looking at the data from haveibeenpwned, the chance of the right answer being "password" is about 1 in 151. Not 151 billion, just 151. So that's over a billion times more likely than some random guess, and they'd be stupid not to start with it. (And obviously, since you want your password not to be found, you want to avoid picking what they'd start with.)
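The arithmetic in that paragraph can be checked in a few lines. This is just a sketch of the comparison; the 1-in-151 figure is taken as given from the text, not recomputed from the HIBP data.

```python
LOWERCASE_LETTERS = 26
LENGTH = 8

# Number of possible 8-letter lowercase passwords.
keyspace = LOWERCASE_LETTERS ** LENGTH

# Chance that one uniformly random guess is right.
p_random_guess = 1 / keyspace

# Chance that the answer is "password", as quoted in the text above.
p_password = 1 / 151

# How many times more likely the "password" guess is to succeed.
advantage = p_password / p_random_guess

print(f"keyspace:  {keyspace:,}")
print(f"advantage: {advantage:,.0f}x")
```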

Now, the question is whether that generalises beyond "password." Is it worth their while working through a list of leaked passwords? For a bit of information, consider this quote from the original release write up.

I moved on to the Anti Public list which contained 562,077,488 rows with 457,962,538 unique email addresses. This gave me a further 96,684,629 unique passwords not already in the Exploit.in data. Looking at it the other way, 83% of the passwords in that set had already been seen before.

What that tells us is that, roughly speaking, a randomly selected password has a better than 80% chance of featuring in the list. The list has a few hundred million entries, compared with a few hundred billion options for random 8 letter passwords. So, roughly speaking our hacker trying 8 letter passwords would have a 0.1% chance without the list in the time they could get an 80% chance with the list. Obviously they'd want to use it. And again, you might as well avoid it. After all, you still have hundreds of billions of options to choose from, and you can get thousands of billions by just going to nine letters!
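A rough sketch of that comparison follows. The list size of 300 million is an assumption standing in for "a few hundred million", and the 80% hit rate is the rounded figure from the quote; neither is an exact property of the HIBP data.

```python
KEYSPACE_8 = 26 ** 8  # random 8-letter lowercase passwords
KEYSPACE_9 = 26 ** 9  # one more letter

LIST_SIZE = 300_000_000  # assumption: "a few hundred million" entries
LIST_HIT_RATE = 0.80     # rounded from the 83% overlap quoted above

# Success chance of brute force after as many guesses as the list has entries.
brute_force_hit_rate = LIST_SIZE / KEYSPACE_8

print(f"brute force, same effort: {brute_force_hit_rate:.2%}")
print(f"leaked list:              {LIST_HIT_RATE:.0%}")
print(f"list is ~{LIST_HIT_RATE / brute_force_hit_rate:,.0f}x more effective")
print(f"9-letter keyspace:        {KEYSPACE_9:,}")
```

With these assumed numbers the brute-force figure comes out near the 0.1% mentioned in the text, which is why an attacker would reach for the list first.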

That's the justification for checking the list.

Now your first worry is that "there will always be very easy to crack passwords that aren't on the list." That may be true. For example, "kvym" is not on the list. It's only 4 letters. There are only half a million passwords that are 4 lowercase letters or shorter, so if people are likely to prefer short passwords then a hacker would blaze through them in a fraction of the time it would take to finish the leaks list. It's likely that they'd try both.
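The "half a million" figure is easy to verify by summing the lowercase keyspace over every length up to four (a small sketch, nothing more):

```python
LOWERCASE_LETTERS = 26
MAX_LENGTH = 4

# Count every lowercase password of length 1, 2, 3 and 4.
short_passwords = sum(LOWERCASE_LETTERS ** n for n in range(1, MAX_LENGTH + 1))

print(f"{short_passwords:,}")  # 475,254
```

That is several hundred times smaller than a leaked list with hundreds of millions of entries, so exhausting all short passwords costs an attacker almost nothing extra.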

The answer to that is obvious. Use both rules. Don't use a password that has appeared in a breach, and don't use a password that is very short. If you have a random password of any significant length, you have more than enough options that a hacker has no shortcut way to find.

Josiah
27

It's definitely one of your validation steps, but can't be fully relied on.

Given the fact that most users reuse passwords, and build passwords using a relatively small base of words, a dictionary attack is a particularly effective means of guessing passwords. Since HIBP is regularly updated, it will have many passwords in frequent use, and thus probable candidates that a dictionary attacker would try. Thus, it is a good starting point to check. However, just because your password is not in the list, it doesn't mean your password won't be guessed easily. It's just that known passwords would be high on their list of passwords to try along with text mined from the internet, combinations of words with digits/symbols, transpositions, etc. As more password leaks happen, HIBP and other such tools become more useful, and hackers' lists of passwords to try become more effective to them as well.

I was quite surprised to see some passwords I know are quite easily guessed and are definitely being used in multiple sites, not on the HIBP list, so I can vouch for it not being the determinant of password strength (just like the example in the question). However, if I have come up with what I think is a strong password, and it's on the list, I would definitely not use it.

tech_enthusiast
  • 1
    This answer seems to best sum up what I have learned from the answers and comments on this question. Thanks Kristopher – Nacht Jun 05 '19 at 05:59
15

Others go into why it's a good idea. I'll take a different direction.

From a compliance standpoint, the relevant NIST standard, NIST Special Publication 800-63, Digital Identity Guidelines, specifically requires that when users set their passwords, they shall be checked against a list of previously compromised passwords. The relevant section is SP 800-63B, Authentication and Lifecycle Management, section 5.1.1.2, which says:

When processing requests to establish and change memorized secrets, verifiers SHALL compare the prospective secrets against a list that contains values known to be commonly-used, expected, or compromised. For example, the list MAY include, but is not limited to:

  • Passwords obtained from previous breach corpuses.
  • Dictionary words.
  • Repetitive or sequential characters (e.g. ‘aaaaaa’, ‘1234abcd’).
  • Context-specific words, such as the name of the service, the username, and derivatives thereof.

If the chosen secret is found in the list, the CSP or verifier SHALL advise the subscriber that they need to select a different secret, SHALL provide the reason for rejection, and SHALL require the subscriber to choose a different value.

By definition, anything found via the Pwned Passwords API is a "value known to be [...] compromised."
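As a rough illustration of the four example categories in the quoted section, a verifier-side check might look like the sketch below. The function names, the run length of 4, and the in-memory sets are all hypothetical choices for the example; the standard does not prescribe an implementation, and a real breach corpus would not fit in a Python set.

```python
from typing import Iterable, List, Set


def has_repetition_or_sequence(password: str, run: int = 4) -> bool:
    """True if `run` consecutive characters repeat or step by +1 / -1."""
    codes = [ord(c) for c in password.lower()]
    for start in range(len(codes) - run + 1):
        window = codes[start:start + run]
        steps = {b - a for a, b in zip(window, window[1:])}
        if steps in ({0}, {1}, {-1}):
            return True
    return False


def rejection_reasons(
    password: str,
    breached: Set[str],
    dictionary: Set[str],
    context_words: Iterable[str],
) -> List[str]:
    """Return the reasons to reject `password`; an empty list means accept."""
    reasons = []
    lowered = password.lower()
    if password in breached:
        reasons.append("found in a breach corpus")
    if lowered in dictionary:
        reasons.append("dictionary word")
    if has_repetition_or_sequence(password):
        reasons.append("repetitive or sequential characters")
    if any(word and word.lower() in lowered for word in context_words):
        reasons.append("contains a context-specific word")
    return reasons


if __name__ == "__main__":
    print(rejection_reasons("1234abcd", set(), set(), []))
```

Returning the reasons rather than a bare boolean matches the requirement to tell the subscriber why the secret was rejected.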

If your organization has to worry about compliance, be aware that the two main standards for passwords are incompatible. The Payment Card Industry Data Security Standard (PCI-DSS) says that passwords must be changed every 90 days, must be a combination of character classes (at minimum letters and digits), etc., while the NIST standard says that passwords should not arbitrarily expire based on dates, and should not have complex rules about the class of characters allowed, but should be flexible enough to allow users to use any combination of character classes.

It is up to your organization to determine which standard to comply with, of course.

If you are developing for an agency under the US Department of Commerce, you must follow the NIST standards, full stop. It's the law. (And with all things regarding the law, check with your organization's legal department, don't trust me blindly.)

If you are working on any system that processes payment information, you are very strongly encouraged to follow the PCI-DSS. If you just have a web store, and are using a third party payment processor, then this doesn't apply to you. It does not have the weight of law, but you should check with your lawyers, as not following the PCI-DSS may expose you to being found negligent if things go wrong.

If none of these apply, then for me, the NIST standards make the most sense. Have several thorough discussions with your security team, do research, and figure out what makes the most sense to you.

As an example of figuring out what makes the most sense to you, in my organization, we do not reject passwords that had less than 10 hits in the Pwned Passwords API. We still show a warning message letting the user know that, even though the password was seen in a breach, we still accepted it. And, that they should consider switching to using a password manager to generate truly random passwords. I'm lucky enough to be in an organization where we can talk to the users, and we can have honest discussions about password management. Others will have to adjust their approaches to meet the needs of their organization.
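The policy described in that paragraph boils down to a three-way decision on the breach count. A minimal sketch, where the threshold of 10 comes from the text and the names (`Verdict`, `verdict_for`) are invented for the example:

```python
from enum import Enum


class Verdict(Enum):
    ACCEPT = "accept"
    ACCEPT_WITH_WARNING = "accept_with_warning"
    REJECT = "reject"


# From the text: passwords with fewer than 10 hits are still accepted.
REJECT_THRESHOLD = 10


def verdict_for(breach_count: int, threshold: int = REJECT_THRESHOLD) -> Verdict:
    """Map a Pwned Passwords hit count to an accept/warn/reject decision."""
    if breach_count <= 0:
        return Verdict.ACCEPT
    if breach_count < threshold:
        return Verdict.ACCEPT_WITH_WARNING
    return Verdict.REJECT


if __name__ == "__main__":
    for count in (0, 3, 10, 50_000):
        print(count, verdict_for(count).value)
```

Keeping the threshold as a parameter makes it easy for another organization to pick a stricter value, including 1, which rejects anything ever seen in a breach.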

Ghedipunk
  • 1
    Although NIST is _in_ Commerce, under FISMA its standards apply (with some lag) to all Federal government and contractor systems except 'national security' systems which are under NSA instead -- and NSA _mostly_ aligns its standards with NIST. To what extent anybody else (state/local, foreign, or private) should follow them is a matter of judgement. – dave_thompson_085 Jun 05 '19 at 00:20
  • @dave_thompson_085, thanks for the clarification. I'll look up FISMA and see if I can work it into the answer. And yes, following the NIST standards only makes it easier for US federal agencies -- the rest of us still have to do our own thinking. – Ghedipunk Jun 05 '19 at 15:43
4

Let's do the math:

Let's say every person on earth has used ~1000 passwords so far. That makes approximately 10 trillion passwords, which is ~2^43 if I am not mistaken. Choosing any existing password at random is thus about as good as a truly random password of 7-8 case-sensitive letters (or about 9 lowercase-only letters). Not very good. See this answer.
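Here is a quick sketch of that estimate. The population figure is an assumption (roughly the 2019 world population), and the equivalent random-password length depends on which alphabet you compare against, so a few are shown.

```python
import math

PEOPLE = 7_700_000_000   # assumption: rough world population in 2019
PASSWORDS_EACH = 1_000   # from the text above

total = PEOPLE * PASSWORDS_EACH  # about 7.7 trillion, rounded up to 10 trillion
bits = math.log2(10 ** 13)       # size of the "every password ever" pool in bits


def equivalent_length(pool_bits: float, alphabet_size: int) -> float:
    """Length of a uniformly random password with the same number of bits."""
    return pool_bits / math.log2(alphabet_size)


print(f"pool size: about 2^{bits:.1f}")
for name, size in (("lowercase", 26), ("mixed case", 52), ("letters+digits", 62)):
    print(f"{name:>15}: {equivalent_length(bits, size):.1f} characters")
```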

That basically means that, in theory, not only should one not reuse a password, one should not reuse a password that has been used by anyone ever. Passwords that have been used before are basically one big dictionary attack waiting to happen.

Michael
kutschkem
2

I have to admit I'm a bit lost in what strong means nowadays. I like to think that strong means a complex and long password. But that doesn't make a good password since it can possibly still be guessed easily.

As you already note: "a hacker would start with this list when brute forcing". So if your password occurs in this list, your password will be quickly guessed and this means it is not a good password.

There's an explanation on the website when you enter a string that's not in the list:

This password wasn't found in any of the Pwned Passwords loaded into Have I Been Pwned. That doesn't necessarily mean it's a good password, merely that it's not indexed on this site.

Using the HIBP list is a way of checking how easily your password will be guessed, but is not an indication of its strength. You need to use a password strength checker for this, which often will not check the leaked password lists. The HIBP password list and a password strength checker complement each other.

LVDV
  • 2
    Password strength checkers have very limited utility as they assume certain criteria for brute-forcing and may not check any dictionary at all. They are useful for illustrative purposes, but not for choosing a strong password. – schroeder Jun 03 '19 at 09:51
  • 1
    @schroeder I don't see how your comment adds to what I already said. Can you explain? – LVDV Jun 03 '19 at 11:23
  • 4
    "You need to use a password strength checker for [an indication of its strength]" - password strength checkers should not be used for this and are not good at determining strength. They are illustrative at best, good for learning the basics of the effects of making certain changes to passwords. I just Googled "password strength checker" and the top hit returned "very strong, 82%" for the input of `Pa$$w0rd`. – schroeder Jun 03 '19 at 12:07
  • 1
    Checking strength is not the thing to do. The thing to do is to generate passwords that have strength. – schroeder Jun 03 '19 at 12:08
  • In the beginning of my answer I explained what I understand under strong, which is length and complexity. A password strength checker can help you define this. Pa$$w0rd is a strong password by my definition (although a bit short), but it is predictable which makes it a bad and ineffective pw. I can have a strong password according to your definition of length, complexity and predictability, but it still wouldn't be a good password if I've been using it for 10 years and for 50 different sites. That's why I prefer the simple term "good" when talking about the final effectiveness of a password. – LVDV Jun 03 '19 at 12:37
  • Even without dictionaries, even without knowledge that people often replace characters by ch4r4cter$ that look similar in order to fulfill some password requirements or believing that makes it hard to crack, there is an inherent problem: The result depends completely on the assumptions about the space of possible passwords and maybe even the exact brute-force attack against it - in which order it traverses that space. And on top of that, it comes on some kind of ordinal scale that looks nearly meaningless to me, or even worse - as quoted - pretends to map precisely to a numeral scale ("82%"). – Leif Willerts Jul 11 '19 at 22:08
1

Once a password is sent to some random password checking site, it is no longer secure. Using such sites is definitely not a good idea with passwords you use (or are going to use).

There is nothing preventing such a site from adding the password you tested directly to a wordlist and then selling it to hackers.

Again: using such sites with real passwords is IMHO a very bad idea.

Firzen
  • 5
    This is sort of not a problem with HIBP. You can certainly go to [the website](https://haveibeenpwned.com/Passwords) and input your password. In this case I agree with you. But if you use the HIBP API you can be fairly sure your password is still secure. The API uses the [k-anonymity](https://en.wikipedia.org/wiki/K-anonymity) system where you only need to share the start of a hash of your password. The API returns all hashes that start with that same prefix, leaving you to verify locally whether or not the password has been compromised. HIBP would never know your password. – Michael Hancock Jun 05 '19 at 15:01
  • 1
    @Michael Hancock Yes, if you are not sending actual password, it is way safer of course. :-) But that's not the case of the website. I would only be putting actual password for test into trusted open source, and preferably desktop applications. – Firzen Jun 05 '19 at 15:16
  • 3
    This would be a good answer if it discussed the difference between the API (and its protections against this) and using the website directly. However, the question just talks about *using the list* (not the website) and about *a specific service* (not "some random password checking site"), so IMO this answer jumps to a wrong conclusion. – IMSoP Jun 05 '19 at 17:08
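The k-anonymity exchange described in the comments above can be sketched as follows. Only the first five hex characters of the SHA-1 hash would be sent to the range endpoint (`https://api.pwnedpasswords.com/range/<prefix>`); the response body is a list of `SUFFIX:COUNT` lines that is matched locally. The helper names are made up for this example, the network call is deliberately left out, and the sample body in the usage section is fabricated test data, not real counts.

```python
import hashlib
from typing import Tuple

PREFIX_LEN = 5  # the range endpoint takes the first 5 hex characters


def sha1_prefix_suffix(password: str) -> Tuple[str, str]:
    """Split the uppercase SHA-1 hex digest into (sent prefix, local suffix)."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:PREFIX_LEN], digest[PREFIX_LEN:]


def count_in_range_body(suffix: str, body: str) -> int:
    """Look up `suffix` in a 'SUFFIX:COUNT' response body; 0 if absent."""
    for line in body.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix and count.isdigit():
            return int(count)
    return 0


if __name__ == "__main__":
    prefix, suffix = sha1_prefix_suffix("password")
    print("would send only:", prefix)
    # Fabricated sample body standing in for the server's reply.
    sample_body = f"{'0' * 35}:3\r\n{suffix}:42\r\n"
    print("hits:", count_in_range_body(suffix, sample_body))
```

The full password and the full hash never leave the machine; the server only learns a prefix shared by many unrelated hashes.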
1

There are many good answers on this page, but I don't see anyone considering the concept of credential stuffing.

It relies on the fact that many users have the same username (email address, really) and password on multiple sites. So you can grab a list of username/passwords (similar to what HIBP uses), and simply fire off all the pairs on the list against the web site you want to break into.

By ensuring that none of your users have passwords present in any of the lists known to HIBP, you very effectively block this attack.

Geir Emblemsvag
1

But the inverse is where I am concerned - there will always be very easy to crack passwords that aren't on the list. "longishpassword" at this time has not had an account using this password that was hit by a leak. This does not mean however that were a leak of hashes to happen, this password would be safe. It would be very easy to break.

You are 100% right that absence from HIBP's Pwned Passwords database doesn't guarantee that a password is strong. However, I think you're underestimating the enormous value of checking passwords against the HIBP database. The point is that the case that you're concerned about—a weak password that's not in HIBP's database—is considerably less common than weak passwords that are in the list.

Troy Hunt (the creator of HIBP) writes extensively about his projects, and his 2018 blog entry "86% of Passwords are Terrible (and Other Statistics)" gives what I think should be an extremely eye-opening example (edited for brevity):

But I always wondered - what sort of percentage of passwords would [Pwned Passwords] actually block? I mean if you had 1 million people in your system, is it a quarter of them using previously breached passwords? A half? More? What I needed to test this theory was a data breach that contained plain text passwords, had a significant volume of them and it had to be one I hadn't seen before and didn't form part of the sources I used to create the Pwned Passwords list in the first place.

And then CashCrate [a big breach and leak] came along.

Of those 6.8M records, 2,232,284 of the passwords were in plain text. So to the big question raised earlier, how many of these were already in Pwned Passwords? Or in other words, how many CashCrate subscribers were using terrible passwords already known to have been breached?

In total, there were 1,910,144 passwords out of 2,232,284 already in the Pwned Passwords set. In other words, 86% of subscribers were using passwords already leaked in other data breaches and available to attackers in plain text.

So while you are right to think that Pwned Passwords doesn't solve the whole problem, the volume of low-hanging fruit that it addresses is enormous. Combine it with a scientifically well grounded password strength checker like zxcvbn and you bite off another big chunk:

password:               longishpassword
guesses_log10:          8.09552
score:                  3 / 4
function runtime (ms):  2
guess times:
100 / hour:   centuries (throttled online attack)
10  / second: 5 months (unthrottled online attack)
10k / second: 3 hours (offline attack, slow hash, many cores)
10B / second: less than a second (offline attack, fast hash, many cores)

And after you've knocked off the low-hanging fruit you probably hit rapidly diminishing returns.
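The guess times in the zxcvbn output above follow directly from `guesses_log10` divided by each attack rate. The sketch below redoes that division; the scenario labels and rates are copied from the output, while the function name is just for illustration and is not part of zxcvbn's API.

```python
GUESSES = 10 ** 8.09552  # guesses_log10 from the output above

# Attack rates from the output, converted to guesses per second.
RATES_PER_SECOND = {
    "throttled online (100/hour)": 100 / 3600,
    "unthrottled online (10/s)": 10,
    "offline, slow hash (10k/s)": 10_000,
    "offline, fast hash (10B/s)": 10_000_000_000,
}


def seconds_to_guess(guesses: float, rate_per_second: float) -> float:
    """Time needed to make `guesses` attempts at the given rate."""
    return guesses / rate_per_second


if __name__ == "__main__":
    for scenario, rate in RATES_PER_SECOND.items():
        seconds = seconds_to_guess(GUESSES, rate)
        print(f"{scenario:<30} {seconds:,.2f} s  ({seconds / 86400:,.1f} days)")
```

Seen this way, "longishpassword" holds up only while the attacker is rate limited; once hashes leak, the offline rows are the ones that matter.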

Luis Casillas
-2

I would argue that simply ruling out billions of perhaps strong passwords because they appear on this list from previous breaches is not necessarily useful. In the context of your environment, it might just make it very hard to select a password when billions are excluded already, especially if people have to remember it for one reason or another (for example, because they can't use a password manager).

I think this should also be put in the context of whether you also employ MFA, in which case knowing the password only gets you so far. Also, brute-force attacks can be effectively countered by employing account lockout rules for wrong password entries.

bfloriang
  • 4
    A password in any breach is no longer a strong password. You must always assume that an attacker tries passwords from breaches first. First, the attacker does not judge whether each password is in the breach because it is a common one or because it is one of the stronger ones. Second, you may be the user whose password is in the breach. Third, it is cheap to extend a wordlist with a million passwords, and it has a huge benefit for an attacker. So you must assume that any password in a breach will be tried. – allo Jun 04 '19 at 11:15
  • I agree with @allo here and would even emphasize this: If a password appears in a list from a breach, I consider it compromised. If an attacker has access to a hashed passwords file (or can access the account without rate limit), they will certainly try password lists first. – Dubu Jun 04 '19 at 12:33
  • Treat the master passwords list as a dictionary. A very *good* dictionary that has a high likelihood of containing a password. If you are going to protect against a dictionary attack, then you would *most definitely* want to protect from using the password list as the source of that attack. If you want to protect the users from using `password123` as their password, then you also *most definitely* want to protect against a password list attack that will have stuff way more common than this. – VLAZ Jun 05 '19 at 15:01
  • All of these things don't take into account the (hopefully) setup "number of failed attempts before lockout/delay" it's all very well saying superpassword123! has appeared in the list once is "compromised" but you'd probably not try it as the 1st 100+ attempted passwords, not using it because it's appeared once is a bit crazy (assuming you've not used it yourself before elsewhere, means nothing if someone else has really) [superpassword123! is used as an example, not as a case of a proper smart to use password] – Stephen Jun 06 '19 at 13:03