
I was reading about the Principle of Key Sizes and came across this table:

Efforts required to break a key

The deduction I made from this is that, after a certain key size, increasing the key by a single bit doubles the time required for an exhaustive search.

Then I came across this video, How to crack Passwords, in which Prof. Pound explains that your key strength is somewhat related to the character set of your password/key.

For instance, if my password is 8 characters long and its character set is lowercase letters, then the total number of attempts required to crack my password will be 26^8 = 208827064576 (character_set_size ^ key_length) in the worst case.
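
(As a quick sanity check of that arithmetic, here it is in Python; the variable names are just illustrative.)

```python
# Worst-case guesses for an 8-character, lowercase-only password:
# 26 possible characters in each of 8 positions.
charset_size = 26
password_length = 8
print(charset_size ** password_length)  # 208827064576
```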

So I was wondering whether the increase in key size by bits and this character-set idea are related. If yes, then why do most people measure the strength of a key by the number of bits it has, rather than by the character set it uses?

And what is the most common character set used in password/key cracking? Most character tables (ASCII/UTF-8) contain a lot of non-printable characters and/or emoji, so are they excluded during the key search process or not?

Samuel Philipp
Vasu Deo.S
  • This is an answer to a different question, but I believe it might help you anyway: https://security.stackexchange.com/a/208951/10863 Please let me know if this indirectly answered your question. – Luc May 16 '19 at 20:02
  • Keys are random binary (e.g., search every bit as 0 or 1). Passwords are usually printable ASCII. I would generally advise against using non-printable characters and Unicode in passwords due to potential implementation issues, unless you are very confident in how the password input and hashing is handled. Applications may silently filter out Unicode characters, or may not consistently normalize them; e.g. on one device you may type `ñ` represented as `0xC3B1` (U+00F1) and on another as `0x6ECC83` (U+006E (n) plus combining tilde), resulting in a different hash. – dr jimbob May 16 '19 at 21:13
  • Possible duplicate of [Calculating password entropy?](https://security.stackexchange.com/questions/21050/calculating-password-entropy) –  May 17 '19 at 08:02

3 Answers


I think you might be confusing "size/length" with "entropy". The text string 123456, when encoded in UTF-8 or other common encodings (ISO-8859-1 or "Latin-1", etc.), will be 6 × 8 = 48 bits in length. However, its entropy will be... ridiculous. I can't even estimate it; all I know is that any script kiddie will definitely be able to attack you successfully with such a weak password. The "size/length" in bits is related to the encoding used for the text. The "entropy", on the other hand, is related to the randomness, or in other words the number of possible combinations that an attacker will have to try in order to guess it.
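
To make that distinction concrete, here is a small illustrative Python sketch. The entropy figure assumes the attacker only knows it's a 6-digit PIN, which is already generous: in reality 123456 tops every leaked-password list, so its practical entropy is close to zero.

```python
import math

password = "123456"

# "Size/length": how many bits the encoded text occupies
# (8 bits per character in ASCII/Latin-1, and in UTF-8 for these characters).
size_bits = len(password.encode("utf-8")) * 8   # 48

# Upper bound on "entropy", assuming the attacker only knows it is a 6-digit PIN:
keyspace = 10 ** len(password)                  # 1,000,000 candidates
entropy_bits = math.log2(keyspace)              # ~19.93

print(size_bits, keyspace, round(entropy_bits, 2))
```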

A key is just a series of bits: 011001100010101011101... In a 128-bit key of course there are 128 bits. If those bits are truly random, then you will have an extremely strong key, containing 128 bits of entropy. But in reality keys are often (but not always and not necessarily) generated starting from a password or passphrase, because people will need to remember them. How can you remember 128 bits easily? That's impossible.

So people will just pick a password, and then an algorithm called a "key derivation function" (KDF) will transform the password into a key (a series of random-looking bits). So what happens is that an attacker won't try to bruteforce the generated key, because they know that it's impossible to bruteforce 128 random bits in a reasonable time. What the attacker does instead is try to bruteforce your password: for each candidate password, generate the corresponding key and try it. Your password is very likely to contain much less than 128 bits of entropy, and that's why the attacker will bruteforce the password instead of the key. For example, a 12-character password containing random uppercase/lowercase characters and numbers is going to have 62^12 = 3.23 x 10^21, or about 2^71, possible values, i.e. about 71 bits of entropy.
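
As a simplified illustration of a KDF, here is a sketch using PBKDF2 from Python's standard library; the password, salt, iteration count and key length are made-up example values, not recommendations.

```python
import hashlib, math, os

password = b"correct horse battery staple"   # whatever the user picked
salt = os.urandom(16)                         # stored alongside the ciphertext

# Derive a 128-bit (16-byte) key from the password. The iteration count is
# the "key stretching": it makes every password guess more expensive.
key = hashlib.pbkdf2_hmac("sha256", password, salt, 600_000, 16)
print(key.hex(), len(key) * 8)                # 32 hex characters, 128 bits

# Entropy of a random 12-character [A-Za-z0-9] password, as in the answer:
print(round(math.log2(62 ** 12), 2))          # ~71.45 bits
```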

Attackers usually have dictionaries and are going to bruteforce passwords using the most common words and patterns, rather than trying all the possible combinations at random. So for example they might try abc123, dogcatbird11, 5tr0ngPa55w0rd, etc. They might try to include some common symbols in the possible character set, for example question marks, exclamation marks, etc. but leave out other less common characters. In the end, it all depends on how much time and resources the attacker is willing to waste. Bruteforcing all the possible 8-character passwords might be feasible, but as soon as the complexity increases the attacker will have to make some specific choices, unless they feel like waiting for centuries in front of their computer.

reed
  • Why are passwords transformed into a key? Why not simply use the password bits as the key? For example, if the password is `A`, then why can't we use the ASCII/ANSI equivalent of it as a key (01000001)? – Vasu Deo.S May 16 '19 at 20:24
  • Btw, can this key derivation function be considered somewhat like a hash function, which takes input bits and converts them into a series of pseudo-random bits (unique to that input)? – Vasu Deo.S May 16 '19 at 20:32
  • @VasuDeo.S, for several reasons. First of all, keys usually need to be of a specific length, for example exactly 128 bits, or exactly 256 bits, etc. So we need a way to transform every password, of any length, into a key of a specific length. A hash for this would work, yes. However, key derivation functions usually also do "key stretching", used to make it more difficult to bruteforce the password. This is done by making the KDF waste more resources (like computing several rounds of hashes so it takes longer), so the attack will be slowed down. – reed May 16 '19 at 20:37
  • @VasuDeo.S You don't use ASCII as a key because ASCII only has 95 printable characters that each take 8 bits. For a 128-bit key that means you only get `log2(95^16) = 105.117689733` bits of entropy. A KDF can take a longer printable key and turn it into a 128-bit key. You _can_ however encode a 128-bit key as hex using 32 ASCII characters, but that's not really using ASCII as the key, it's just a way to represent the key using ASCII. KDFs that are meant to be used for low-entropy passwords are also intentionally slow, to add "effective entropy" by making them take longer to brute force. – AndrolGenhald May 16 '19 at 20:37
  • @VasuDeo.S, I added some more info at the beginning of my answer, saying that I suspect you might be confusing "size" with "entropy". It might be confusing because often when people talk about "bits" in a password they are referring to its entropy. – reed May 16 '19 at 20:39
  • @reed Correct me if I am wrong: users choose their desired passwords, these passwords get converted into a random key stream (using something like hashing) and that key is then used to encrypt data. If the attacker wants to recover the password, they go through permutations/combinations of passwords, pass each of them through the KDF, obtain its bit stream and compare it with the bit stream derived from the real password. So in general, it's the characters that are bruteforced rather than the bits. – Vasu Deo.S May 16 '19 at 20:49
  • @VasuDeo.S Correct, this works because passwords are often chosen very poorly. – AndrolGenhald May 16 '19 at 20:53
  • Just one last question: how does the attacker get hold of the key stream of the password in the first place? What if he doesn't have it and wants to decrypt a file from scratch? How will he compare the key stream then? – Vasu Deo.S May 16 '19 at 20:54
  • @VasuDeo.S, no, the attacker obviously doesn't have the right key! They need to find the right key, but instead of bruteforcing the key (and with each key trying to decrypt the data, over and over), they bruteforce the password (and for each password they calculate the corresponding key using the KDF, which in turn is used to try to decrypt the data). It's one more step, which seems like a waste of time, but it's not, because bruteforcing the password will generally be much easier (less entropy) than bruteforcing the long random key. (A rough sketch of this loop is shown after these comments.) – reed May 16 '19 at 21:01
  • @VasuDeo.S For passwords used for login, this is generally a database dump of password hashes. For encrypted files, a known plaintext allows detecting whether the decryption is correct (many file formats start with a certain sequence of bytes, some file formats are valid XML, etc). – AndrolGenhald May 16 '19 at 21:02
  • Thanks for taking time to answer my question – Vasu Deo.S May 16 '19 at 21:11
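
To tie the comments above together, here is a rough, purely illustrative sketch of that attacker loop for the password-hash case: run candidate passwords through the same KDF and compare against the stored value. The wordlist, salt and iteration count are made up for the example.

```python
import hashlib, os

# What the defender stored: a salt plus the PBKDF2 hash of the user's (weak) password.
salt = os.urandom(16)
stored = hashlib.pbkdf2_hmac("sha256", b"dogcatbird11", salt, 100_000)

# The attacker can't brute-force the 256-bit output itself, so they run
# candidate passwords through the same KDF and compare.
wordlist = [b"123456", b"password", b"abc123", b"dogcatbird11", b"5tr0ngPa55w0rd"]

for guess in wordlist:
    if hashlib.pbkdf2_hmac("sha256", guess, salt, 100_000) == stored:
        print("cracked:", guess.decode())
        break
```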

The deduction I made from this is that, after a certain key size, increasing the key by a single bit doubles the time required for an exhaustive search.

Not just after a certain key size, that's how it works in general. Assuming a random key, if it's 8 bits there will be 2^8 = 256 possible keys, and it will take about 2^7 = 128 tries on average to guess it (on average you'll guess it after searching about half the keyspace). Each additional bit in the key doubles the keyspace, and thus doubles the expected time to guess the correct key.

For instance, if my password is 8 characters long and its character set is lowercase letters, then the total number of attempts required to crack my password will be 26^8 = 208827064576 (character_set_size ^ key_length) in the worst case.

So I was wondering whether the increase in key size by bits and this character-set idea are related.

Yes, this is the same idea. If you break a key into bytes (sets of 8 bits), then each byte added multiplies the keyspace by 2^8 = 256. For random lowercase passwords, each character added multiplies the password space by 26 (or about 2^4.7).
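
In other words, each random symbol contributes log2(alphabet size) bits of entropy; a quick check in Python:

```python
import math

print(math.log2(256))      # 8.0   bits per random byte
print(math.log2(26))       # ~4.70 bits per random lowercase letter
print(math.log2(26 ** 8))  # ~37.6 bits for the 8-character lowercase example
```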

If yes, then why do most people measure the strength of a key by the number of bits it has, rather than by the character set it uses?

Because keys aren't related to character sets. Keys are random binary. You can represent them as 0s and 1s, or as hexadecimal, or any number of other ways, but what matters is the keyspace, which is directly related to the key length in bits.

And what is the most common character set used in password/key cracking? Most character tables (ASCII/UTF-8) contain a lot of non-printable characters and/or emoji, so are they excluded during the key search process or not?

That depends on the password cracking tool and configuration being used, though printable ASCII is common given that most passwords are ASCII. Many password cracking tools also prioritize common patterns and mutations of previously cracked passwords. Passwords are generally crackable only because they are chosen poorly.

Keys generally aren't attacked directly, as 128 bit keys are impossible to brute-force. An old 64 bit key might be brute-forced directly, in which case there's not really a character set used per se, it's just iterating all possible sequences of 64 bits.

AndrolGenhald

This is confusing because keys don't use UTF-8 or ASCII. The reason key strength is given in bits is that keys are generally raw binary data, conventionally written as hexadecimal. A hexadecimal representation only uses 16 characters (0-9, A-F), one character per 4 bits. Because all of these are usable values, there are no weird characters like ♥ or ╟ to be excluded. As such, a 128-bit key is actually 32 characters of hex, versus the 16 characters you would see in a UTF-8 or ASCII password of equivalent bit-strength.
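
For example, a freshly generated 128-bit key rendered as hex is 32 characters, one per 4 bits (an illustrative Python sketch):

```python
import secrets

key = secrets.token_bytes(16)   # 128 random bits
print(key.hex())                # 32 hex characters, one per 4 bits
print(len(key.hex()) * 4)       # 128
```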

Passwords normally use 8-bit character sets instead, because people want to type the fewest characters possible for the most complexity. Because people think in terms of characters, not bits, password strength is evaluated in terms of easily typed characters (0-9, a-z, A-Z, !@#$%^&*()-_=+{}[]:;"'<>,.?/~`) = 92 characters. Even though 92 is a lot less than the 256 that are technically available, it's still much bigger than the 16 in a hex character set.
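
As a rough comparison of that "bit-density" per typed character (Python; 92 and 16 are the alphabet sizes discussed above):

```python
import math

print(math.log2(92))   # ~6.52 bits per character from the typable set above
print(math.log2(16))   # 4.0   bits per hex character
print(math.log2(256))  # 8.0   bits per character if every byte value were usable
```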

Keys are stored values that you never need to type in; so, a good bit-density is better than a good character density.

Nosajimiki
  • One character per 4 bits? Can you please explain this? I am not able to understand how a single character (e.g. 'a') is represented in a hexadecimal base type, and with how many bits. – Vasu Deo.S May 16 '19 at 19:49
  • Why are you comparing 92 to 16? I could just as easily say keys can be expressed in base 4, so there are only 4 possible characters vs 92, but the comparison is meaningless. What matters is key length, which is independent of how the key is expressed. – AndrolGenhald May 16 '19 at 20:02
  • Each bit you add to a character set doubles the number of values it can represent: 1 bit = 2 options, 2 bits = 4 options, 3 bits = 8 options, etc. Hexadecimal is a 4-bit character set, so the hex binary for 'A' is '1010', but the ASCII binary for 'A' is '01000001'. Same character, but it takes more bits to store. – Nosajimiki May 16 '19 at 20:03
  • @AndrolGenhald I compared 92 to 16 to show why a person would prefer UTF-8 over hex for a password; that is why we use it for passwords even though the usable bit-density is lower (in terms of what a normal user can type in as a password). – Nosajimiki May 16 '19 at 20:15
  • @Nosajimiki Ok, that makes a bit more sense. So you're comparing typing an ASCII password with 92/256 characters used vs hex with 16/16 characters used. – AndrolGenhald May 16 '19 at 20:25
  • Yes, just trying to compare and contrast why we use 4 bit character sets for keys vs 8 bit character sets for passwords in reference to the OP's question: "So I was wondering, whether the increase in Key_Size by bits, and this character set theory is somewhat related or not. If yes, then why most people measure strength of key by the number of bits it has, rather then the character set it uses?" – Nosajimiki May 16 '19 at 20:54