Earlier, I went to a site that required a number and special character, and it got me thinking – wouldn't that make the password easier to brute force? If you assume most passwords have around 12 characters, wouldn't requiring a number remove about 90% of the possible passwords, making brute force much faster?
-
You forget that the chance of a number or special character appearing in a specific position is about the same (1:58), and this is spread across every position (12 characters). I won't do the math, but requiring a number does not make the password "easier" to brute force. At the end of the day, it's unlikely a good 12-character password would be brute-forced anyway. – Ramhound Feb 08 '13 at 12:25
-
Isn't this a duplicate of http://security.stackexchange.com/q/7198/13909 – MCW Feb 08 '13 at 17:33
-
@MarkC.Wallace they both have to do with passwords, that's the only similarity I see. Considering this _is_ the security site, I'd say two questions about passwords are okay. – tkbx Feb 08 '13 at 18:01
-
Requiring a number in a 12-character password randomly generated from printable ASCII symbols removes 26% of the possible passwords: `(85/95)**12`. (There are 95 printable ASCII characters, and 85 of them aren't numbers, so the chance that all 12 are non-numbers is `(85/95)**12`.) This barely makes a dent in the entropy: it drops from ~78.8 bits to ~78.4 bits. – dr jimbob Feb 08 '13 at 18:30
-
I don't see the difference; they're both questions about how significantly a given constraint on password complexity raises the risk. As you say, they're both on topic, but I think the answers are the same. (which is to say, mathematically, yes, practically, no.) – MCW Feb 08 '13 at 18:42
4 Answers
It removes a lot of passwords only if you consider that all combinations of 12 letters were possible passwords. In practice, users don't consider random passwords; they want to use passwords with a meaning and there are not many of those.
In other words, what matters is not the number of passwords which fit into the password entry field, but the number of distinct passwords that the user chooses among (the "generation process"). Requiring an extra digit enlarges that space for most users. The users for whom the forced digit reduces the space of possible passwords are those who choose really random passwords, and even at 11 letters and 1 digit, these will be strong.
From a purely theoretical standpoint, yes. The set of all strings of a given length will be larger than the set of all strings with any further restriction on them.
But the bank is gambling that the space of passwords that the user would practically choose from is actually much smaller than the set of all possible passwords, and that by mandating a number and special character they are increasing the set of passwords the user will choose from since the user may not have included those restrictions on their own. (All other things equal.)
And while a reasonable percentage of the available passwords will indeed be eliminated by the requirements, this doesn't really matter, practically or theoretically, because password strength is measured logarithmically and 90% doesn't change a lot on the logarithmic scale. Assuming that the restrictions eliminate 90%* of the passwords, so that only 10% of the original set remains, this is a difference of only about 3.3 bits of information (lg(0.1) is about -3.3). At a length of 10 total characters and about 72 possibilities per character (52 alphabetic, 10 numbers, and 10 special characters), the set of all possible character strings has about 61.7 bits of space (10 times lg(72)). So with the restrictions we have about 58.4 bits of password space instead of 61.7 bits. Not a big deal.
(* The OP's 90% estimate is a very large over-estimate for how much the password space is reduced. See these comments or other answers for better approximations of how the password space is actually reduced.)
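As a rough sketch of the logarithmic argument above (the function names are mine, and it assumes every password in the space is equally likely):

```python
import math


def bits_lost(fraction_removed: float) -> float:
    """Bits of entropy lost when a fraction of a uniform password space is removed."""
    return -math.log2(1.0 - fraction_removed)


def total_bits(alphabet_size: int, length: int) -> float:
    """Entropy in bits of a uniformly random password of the given length."""
    return length * math.log2(alphabet_size)


if __name__ == "__main__":
    # Even the pessimistic "90% removed" estimate costs only a few bits.
    print(f"90% removed: {bits_lost(0.90):.2f} bits lost")
    print(f"72 symbols, 10 chars: {total_bits(72, 10):.1f} bits total")
```

Halving the space costs exactly one bit, so even an aggressive restriction only moves the total by a handful of bits.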
-
But realistically, aren't almost all password breaches either brute force (number of characters being the only factor), or social engineering (the password itself not being a factor)? On the off chance that someone could guess someone's password, couldn't they guess it just as easily with a 1 at the end? – tkbx Feb 08 '13 at 02:00
-
For passwords that are not too short, dictionary-based attacks (words from some dictionary plus combinations and variations thereof) are more important than brute force trying every character combination. – mkl Feb 08 '13 at 10:25
-
@B-Con - The actual calculation is straightforward to do; and shows the requirement of a number eliminates about 26% of password; reducing the entropy by about 0.45 bits – dr jimbob Feb 08 '13 at 18:34
-
@drjimbob: It's worth noting that my assumptions and yours differ on password length and character sets, and yours shift the calculation towards less entropy loss. My agreeing with 90% was more targeted at worst case for password lengths and restricted character sets. – B-Con Feb 08 '13 at 19:23
-
@B-Con - If your symbol set with numbers is 72 chars, the chance any given character is a non-number is 62/72 ~ 86%. The chance that all 10 chars in the pw not being numbers is (62/72)^10 ~ 22%; quite different from 90%. (To get about 90% reduction, your password length would need to be about 1). The new password space then shrinks by 22%, so is (1-22%) of the original and since lg(xy) = lg(x)+lg(y) that means the entropy decreases by `0.4 ~ -log(1-.22)` bits. – dr jimbob Feb 08 '13 at 19:54
-
You forgot to account for the special character, (52/72)^10 = 3%. But 3% tells us little about passwords containing both. Instead, I was interested to note that giving the mandatory characters fixed positions gave a trivial lower bound of (10/72)^2 = 0.02 = 2% (since they are not actually fixed it will obviously be much greater) and ln(.02) is -3.9, so we can't lose more than about 4 bits. Since that is probably negligible, almost any estimate is in the right ballpark. The takeaway was that it doesn't matter what the exact restrictions are in this type of case, the impact is negligible. – B-Con Feb 08 '13 at 21:05
-
@B-Con - First, use base-2 log (`lg`) to calculate informational entropy `lg(0.02) ~ -5.6` bits. Next, OP claimed "wouldn't requiring *a number* remove about 90% of the possible passwords", which is wrong -- it removes ~25% (0.5bits) of the passwords using his assumptions. Third, there's 33 keyboard symbols but if you imagine only ten (why?), requiring a number and symbol reduces 72^10 to 72^10-62^10-62^10-52^10, or a reduction of 1 bit. Allowing 33 spec chars it reduces from 95^10 to 95^10-85^10-62^10-52^10 (0.6 bits). At 12-chars (password length OP assumes) this reduces by ~0.45 bits. – dr jimbob Feb 08 '13 at 22:32
-
1) You're right, I hit the wrong log. 2) Probably not what was meant, since it was the numbers and special character restriction they actually encountered, so I'd think that's what they're interested in. 10 is because many password requirements restrict the special characters allowed in passwords, with verbiage similar to "only numbers, letters, and - , _ . + are allowed". I've seen even fewer than 10 special characters permitted. 3) Very true, it is indeed a simple calculation. Remember inclusion/exclusion, though, the (-62^10-62^10-52^10) is double-counting. (Doesn't affect much, though.) – B-Con Feb 08 '13 at 23:34
My answers are "maybe", and "no". If you consider a brute force attempt which covers the complete possible set of passwords:
- if the attacker knows the restriction, then many passwords can be discarded. In one sense this doesn't really make the set smaller: he still has to iterate from " " to "~~~~~~~~~~~9" (for example); it just makes the set sparse
- the restriction eliminates the "low hanging fruit", i.e. straight dictionary passwords
So strictly, yes, if you look at ideal bits of entropy per character, but really only maybe. You can find a more empirical analysis of this in NIST SP-800-63-1 (Appendix A).
Correctly calculating permutations with restrictions is tricky. If we assume 96 valid characters:
96^12 ~= 2^79 ~= 6.13E+23
96^11 * 10 ~= 2^75.8 ~= 6.38E+22
(96^11 * 10 )/96^12 ~= 0.104
That ratio indicates a loss of ~89.5% (with a more realistic set of 64 characters it's about 85%), but that's not the right calculation: it's the number of permutations with a digit in one specific position.
(Aside: I'm disregarding passwords shorter than 12 characters for simplicity as it barely affects the numbers:
96^1 + 96^2 + 96^3 + 96^4 + 96^5 + 96^6 + 96^7 + 96^8 + 96^9 + 96^10 + 96^11
~= 6.45E21
which is ~1% of the total: for passwords from 1-12 characters, 12 characters ones account for ~99%. As the character set size gets smaller the ratio increases, it's ~1.5% for 64 set size, it doesn't change much as password length varies in the range [7-16] in either case.)
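A quick check of that aside, as a sketch with a helper name of my own choosing:

```python
def shorter_share(alphabet_size: int, max_length: int) -> float:
    """Fraction of all passwords of length 1..max_length that are shorter than max_length."""
    shorter = sum(alphabet_size ** k for k in range(1, max_length))
    total = shorter + alphabet_size ** max_length
    return shorter / total


if __name__ == "__main__":
    for size in (96, 64):
        print(f"{size} symbols: {shorter_share(size, 12):.2%} of passwords are shorter than 12")
```

The share works out to roughly 1/N for an alphabet of N symbols, which is why it barely depends on the maximum length.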
There are two obvious ways to calculate the correct number of permutations with a digit-required restriction:
- calculate the permutations for a digit in each place, and sum them
- calculate the size of the total set, subtract the disallowed ones
Option 1 quickly gets out of hand when you try to form permutations with no overlaps. Option 2 is near trivial (at least for simple cases like "must have at least one digit"):
96^12 - 86^12 ~= 2^78 ~= 4.49E+23
(96^12 - 86^12 ) / 96^12 ~= 0.73
This shows that for 12-character passwords chosen from a set of 96 characters you lose ~27% of the total number of permutations with the "at least one digit" restriction. Repeating with a 64-character set for passwords:
( 64^12 - 54^12 ) / 64^12 ~= 87%
So you only lose about 13% in that case.
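Option 2 can be sketched in a few lines. The function names are mine; the brute-force enumerator is only there to sanity-check the formula on a toy alphabet, since enumerating 96^12 strings is out of the question.

```python
from itertools import product


def with_at_least_one_digit(alphabet_size: int, length: int, digits: int = 10) -> int:
    """Count strings with at least one digit: all strings minus the digit-free ones."""
    return alphabet_size ** length - (alphabet_size - digits) ** length


def brute_force_count(alphabet: str, length: int) -> int:
    """Enumerate every string over a tiny alphabet and count those containing a digit."""
    return sum(
        1
        for chars in product(alphabet, repeat=length)
        if any(c.isdigit() for c in chars)
    )


if __name__ == "__main__":
    # Toy cross-check: 2 letters + 2 digits, length 3.
    assert brute_force_count("ab01", 3) == with_at_least_one_digit(4, 3, digits=2)
    for size in (96, 64):
        kept = with_at_least_one_digit(size, 12) / size ** 12
        print(f"{size} symbols: {kept:.1%} of 12-char passwords remain")
```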
Requiring digit & punctuation is a little trickier, my calculation (96 chars, length 12, >=1 digit, >=1 punctuation) is 4.46E+23 or ~73% of 96^12, ever so slightly less than the digit case.
Brute force is not "much faster": the search space shrinks by only ~27%, which is quite acceptable given the benefits.
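The digit-and-punctuation figure can be reproduced with two-set inclusion-exclusion. This is a sketch under my own assumption about how the 96 characters split up (52 letters, 10 digits, 34 punctuation characters including space); the answer does not state the split.

```python
def count_digit_and_punct(letters: int = 52, digits: int = 10, punct: int = 34,
                          length: int = 12) -> int:
    """Count strings containing at least one digit AND at least one punctuation char."""
    total = letters + digits + punct
    all_strings = total ** length
    no_digit = (total - digits) ** length   # strings with no digit at all
    no_punct = (total - punct) ** length    # strings with no punctuation at all
    neither = letters ** length             # counted in both sets above, so add back once
    return all_strings - no_digit - no_punct + neither


if __name__ == "__main__":
    count = count_digit_and_punct()
    print(f"{count:.3e} passwords, {count / 96 ** 12:.1%} of 96^12")
```

The add-back term is what keeps the letters-only passwords from being subtracted twice.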
These sorts of rules are sensible mainly because they prevent many weak passwords in practice and require you to consider numbers and special symbols when generating your password. Any password that is just a dictionary word is automatically excluded, since it needs at least a number and a symbol added somewhere. Yes, you can still construct weak passwords that follow the rules, but you have dramatically increased the space they are drawn from.
Let's do some math, assuming a randomly generated 12-character password. Say you pick each symbol at random out of a set of N symbols; e.g., N=26 if you only allow lowercase letters (also 26 if you only allow uppercase letters); N=52 if you allow both lowercase and uppercase letters; N=95 if you allow all printable ASCII characters, etc. That is, there are 26 lowercase letters, 26 uppercase letters, 10 numbers, and 33 other printable ASCII symbols.
You can easily find that the number of available passwords is N^12 for a 12-character password (there are N choices for the first character, N for the second, third, fourth, etc.), so the number is N*N*N*N*N*N*N*N*N*N*N*N = N^12. The entropy of a password is the base-2 logarithm of the number of possible passwords (that is, lg(N^12)).
Type of password | N | number of 12-char passwords | Entropy (bits)
--------------------------------------------------------------------
number only | 10 | 1 000 000 000 000 | 39.9
lowercase only | 26 | 95 428 956 661 682 176 | 56.4
symbol only | 33 | 1 667 889 514 952 984 961 | 60.5
lower+number | 36 | 4 738 381 338 321 616 896 | 62.0
number+symbol | 43 | 39 959 630 797 262 576 401 | 65.1
lower+upper | 52 | 390 877 006 486 250 192 896 | 68.4
lower+symbol | 59 | 1 779 197 418 239 532 716 881 | 70.6
lower+upper+num | 62 | 3 226 266 762 397 899 821 056 | 71.4
lower+symbol+num | 69 | 11 646 329 922 777 311 412 561 | 73.3
lower+upper+sym | 85 | 142 241 757 136 172 119 140 625 | 76.9
lower+upper+num+sym| 95 | 540 360 087 662 636 962 890 625 | 78.8
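The table rows can be regenerated with a short script. This is a sketch; `CLASS_SIZES` and `table_row` are names I made up, using the class sizes given above.

```python
import math

# Sizes of the four character classes in printable ASCII (space counted as a symbol).
CLASS_SIZES = {"lower": 26, "upper": 26, "number": 10, "symbol": 33}


def table_row(classes, length=12):
    """Return (N, number of passwords, entropy in bits) for the allowed classes."""
    n = sum(CLASS_SIZES[c] for c in classes)
    count = n ** length
    return n, count, math.log2(count)


if __name__ == "__main__":
    for classes in (["number"], ["lower"], ["lower", "number"],
                    ["lower", "upper", "number"],
                    ["lower", "upper", "number", "symbol"]):
        n, count, bits = table_row(classes)
        print(f"{'+'.join(classes):<28} | {n:>2} | {count:>27,} | {bits:.1f}")
```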
Now if you require at least one lowercase letter, uppercase letter, number, and symbol in your 12-character password, that effectively excludes the previous possibilities. Imagine a simplified scenario where passwords may contain only lowercase letters and numbers, and you then require at least one number and one lowercase letter (still forbidding symbols and uppercase letters). The number of 12-char passwords would not be 36^12, but 36^12 - 26^12 - 10^12, removing the all-lowercase and all-number passwords. This shrinks the effective password space by only 2.01 percent, and the entropy of the password barely changes, from 62.04 to 62.01 bits. When brute forcing passwords, entropy matters; a small constant factor like a 2% faster brute force doesn't.
In our case requiring at least one lower+upper+num+sym in our passwords results in:
95^12 - (85^12 + 2*69^12 + 62^12) + (2*59^12 + 52^12 + 43^12 + 2*36^12) - (33^12 + 2*26^12 + 10^12)
= 375596253406523266191360 possible passwords
The signs alternate by inclusion-exclusion: subtract the passwords missing at least one required class, add back those missing two classes (they were subtracted twice), and subtract those missing three. The multiplication by two for some values accounts for how my table only listed the lowercase cases; the matching uppercase cases have to be counted too. This reduces the entropy from 78.8 bits to only 78.3 bits. Yes, it eliminated 30.5% of possible passwords (with ~26 percentage points of that coming from requiring a number), but it made a minimal dent in the intrinsic entropy. Now compare that to a scheme where your password was randomly generated from lowercase letters and numbers only. That has an entropy of 62 bits instead of 78.3 bits, meaning it's roughly 80,000 times easier to brute force.
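For the four-class requirement, a general inclusion-exclusion count can be sketched as follows (the function name is hypothetical). It subtracts the strings missing one class, adds back those missing two, subtracts those missing three, and so on. With the class sizes above it gives about 3.76E+23 passwords, or roughly 78.3 bits.

```python
import math
from itertools import combinations


def count_with_all_classes(class_sizes, length):
    """Count strings of the given length that use every character class at least once."""
    total_symbols = sum(class_sizes)
    count = 0
    for k in range(len(class_sizes) + 1):
        sign = -1 if k % 2 else 1
        # Every way to forbid k of the classes; combinations() picks by position,
        # so the two 26-symbol classes are treated as distinct.
        for missing in combinations(class_sizes, k):
            count += sign * (total_symbols - sum(missing)) ** length
    return count


if __name__ == "__main__":
    sizes = [26, 26, 10, 33]  # lower, upper, number, symbol
    count = count_with_all_classes(sizes, 12)
    print(count, f"{math.log2(count):.1f} bits")
    print(f"eliminated: {1 - count / 95 ** 12:.1%}")
```

The same function reproduces the two-class lowercase-plus-number case, since the "missing both" term is zero there.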
Granted, it's almost always better to just increase password length than to increase the size of your symbol set. E.g., a random 17-character lowercase password (e.g., ajctzdtrtenwutuxc) has ~80 bits of entropy, slightly beating a 12-character password drawn from 95 symbols (e.g., 4eT5*\'W]";vu), which may be harder to remember and type and may have some of its symbols blacklisted.
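The length-versus-alphabet trade-off is easy to play with. Here is a small sketch (helper names are mine) that assumes uniformly random passwords:

```python
import math


def entropy_bits(alphabet_size: int, length: int) -> float:
    """Entropy in bits of a uniformly random password."""
    return length * math.log2(alphabet_size)


def min_length(alphabet_size: int, target_bits: float) -> int:
    """Shortest length whose entropy reaches target_bits with the given alphabet."""
    return math.ceil(target_bits / math.log2(alphabet_size))


if __name__ == "__main__":
    target = entropy_bits(95, 12)
    print(f"12 chars from 95 symbols: {target:.1f} bits")
    print(f"17 lowercase chars: {entropy_bits(26, 17):.1f} bits")
    print(f"lowercase length needed to match: {min_length(26, target)}")
```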