I'd like to know what it means to say "the cryptosystem C uses keys with a length of x bits". I do not understand what the bits length means... doesn't it depend on the encoding? The same word encodes to bit strings of different lengths in utf8, iso and unicode, so is there a general encoding used to define the length of a key? Or does "length of x bits" mean something completely different?
6 Answers
For symmetric algorithms (symmetric encryption, Message Authentication Code), a key is a sequence of bits, such that any sequence of the right length is a possible key. For instance, AES is a symmetric encryption algorithm (specifically, a block cipher) which is defined over keys of 128, 192 and 256 bits: any sequence of 128, 192 or 256 bits can be used as a key. How you encode these bits is not relevant here: regardless of whether you just dump them raw (8 bits per byte), or use Base64, or hexadecimal, or infer them from a character string, or whatever, is up to you.
There are a few gotchas with some algorithms. The prime example is DES, a predecessor to AES. DES is defined to use a 64-bit key. However, if you look at the algorithm definition, you see that only 56 of these bits are used; the other 8 are simply ignored (if you number bits from 1 to 64, these are bits 8, 16, 24, 32, 40, 48, 56 and 64; they are supposed to be "parity bits" depending on the 56 others, but nobody really bothers with setting or checking them). So it is often said that DES has a 56-bit key. This is because key length is related to security: if an algorithm accepts keys of length n bits, then there are 2n possible keys, and thus trying them all (attack known as "exhaustive search" or "brute force") has time proportional to 2n (with n big enough, i.e. more than about 85, this is technologically infeasible). In that sense, DES offers the security of a 56-bit key (256 "really distinct" possible keys). Yet, if you use a library implementing DES, that library will expect DES keys as sequences of 64 bits (often provided as 8 bytes).
Another algorithm with a special rule is RC2. It accepts keys of length 8 to 128 bits (multiple of 8 only); but it also has an extra parameter called effective key length denoted by "T1". In the middle of the processing of the key, an internal value is "reduced" to a sequence of T1 bits, which means that subsequent encryption will depend only on the values of T1 specific bits in that internal value. The resistance of RC2 against exhaustive search is then no more than that offered by a T1-bit key, because one can try all possible sequences of T1 bits for that internal value. Yet, RC2 still has an actual key length (the length of the sequence of bits which is provided as key) which can be greater than T1.
For asymmetric algorithms (also known as public-key cryptography, encompassing asymmetric encryption, digital signatures, some key exchange protocols, and a few more esoteric algorithms), keys work by pairs consisting in a public key and a private key. These keys are mathematical objects with some heavy internal structure. The "key length" is then a conventional measure of the size of one of the involved mathematical objects.
For instance, a RSA public key contains a big integer called the modulus, as well as an other integer (usually small) called the public exponent. When we say a "1024-bit RSA key", we mean that the modulus has length 1024 bits, i.e. is an integer greater than 21023 but lower than 21024. Such an integer could be encoded as a sequence of 1024 bits, i.e. 128 bytes. Yet, the public key must also contain the public exponent, so the actual encoded length will be greater. And the private key is, on a theoretical point of view, knowledge of how the modulus can be factored in prime numbers; the traditional encoding for that knowledge is that of the prime factors, along with a bunch of helper values which could be recomputed from the factors (but that would be slightly expensive) and may help in executing the algorithm faster.
For distinct key types which work over distinct mathematics, other "lengths" are used, so you cannot directly compare security of algorithms by simply comparing the key lengths. A 256-bit ECDSA key is vastly more secure than a 768-bit RSA key. Also, the mathematical structure inherent to public/private key pairs allows for much faster attacks than simply trying out bunch of random bits, and there are many subtle details. See this site for explanations and online calculators for the various set of rules about comparing key sizes that many regulatory organizations have come up with.
- 322,884
- 58
- 787
- 955
Consider the simple encryption of shifting every letter one to the right in the alphabet, so that A becomes B, B becomes C, and so forth. So that: HELLO becomes encrypted as: IFMMP
I can shift one letter to the right, two letters to the right, or up to 25 letters to the right. Consider the "number of letters to the right" as the "key". It takes 5 bites to hold a number between 0 and 25. Therefore, the key size is "5 bits".
Or, instead of shifting to the right, let's say that we randomly re-order the alphabet, and do a one-to-one mapping: ABCDEFGHIJKLMNOPQRSTUVWXYZ RJQFGSKPBTODUZLNHYAVXEMWIC In this case, we look up the letter 'H' in the table, see that it gets transformed into the letter 'P', and that the message: HELLO becomes: PGDDL
Let's say that the random reordering of the alphabet is done with a mathematical sequence based on a 5 bit number, such that every number generates a different sequence. Thus, the 5-bit number is again the "key".
The problem with a 5 bit key is that it only has 32 combinations. If I know the encryption algorithm, I can all 32 keys until I find the right combination. The larger the key, the harder this becomes -- EXPONENTIALLY. A 6 bit key has 64 combinations, a 7 bit key has 128 combinations, and so forth. A 10 bit key has a thousand combinations, a 20 bit key has a million combinations, a 30 bit key has a BILLION combinations.
Let's say that you have a computer that can test a billion keys per second trying to brute force all combinations. That means you can break a 30 bit key in just one second. But, that means it will take you a billion seconds (or 34 years) to break a 60 bit key.
Every 30 bits we add makes it a billion times more difficult. A spy agency like the NSA can crack 60 bit keys using supercomputers, but a 90 bit key is a billion times more difficult to crack, and a 120 bit key would be a further billion times more difficult to crack than a 90 bit key.
That's why older WEP (40 bits) and DES (56 bits) are considered obsolete: we can crack them with desktop computers by trying all combinations. That's why modern crypto, such as AES, uses 128 bits. We can't brute force crack those algorithms.
- 3,893
- 1
- 15
- 14
-
1I would argue that you don't need any "supercomputer" to break an ~60 bit symmetric encryption key these days, just a bit of patience and some money (or these days, most likely even just a bunch of off-the-shelf home PCs will do quite nicely). http://en.wikipedia.org/wiki/Data_Encryption_Standard#Brute_force_attack – user Nov 16 '11 at 13:32
The length of a key in bits nothing more than a specification of its size. A 128 bit key takes up 16 bytes of space -- just the raw bits that the processor uses. There is nothing special, no translation.
An encoding is a mapping of bits to something that has meaning. For example, the 8 bit sequence 01100001 (0x61) in ASCII is the letter "a". Because keys are random data and they have no meaning, encoding only comes into play for translating binary data into something printable, for example BASE64. Thus, instead of "what sequence of bits must I have to write the letter a?" -- which can vary with different encodings, a key is "what's this bit? what's this bit? what's this bit?".
- 38,170
- 9
- 94
- 172
The same word encodes to bit strings of different lengths in utf8, iso and unicode, so is there a general encoding used to define the length of a key?
A key is not a word. It is a sequence of bits. The "general encoding" of which you speak is no encoding.
On another note, there are code-point->string encodings called UTF-8 and ISO-*, but unicode is not an encoding. It is a character set.
Maybe defining some terms might help.
A "character set", for example unicode, is a set of symbols usually identified by integer index. In unicode those symbols are also called code-points (strictly unicode scalar values).
A "string" is a sequence of symbols chosen from an alphabet. Unless otherwise specified, the sequence is of finite length and the alphabet is a subset of unicode but the term can also be used for other alphabets : a "byte string" (where the alphabet is the set of integers representable in 8 bits [0,255] or [-128,127]) or "UTF-16 string" (where the alphabet is the set of integers [0,65535]).
An "encoding" is a (usually reversible) mapping from one kind of string to another. UTF-8 maps "code-point" sequences (aka regular strings) to byte strings.
- 3,873
- 18
- 25
I do not understand what the bits length means... doesn't it depend on the encoding?
Text - a sequence of characters and other glyphs from multiple languages - is an abstraction. Humans know how to deal with text, but computers don't - until we make up a way for computers to do it. That way is called an "encoding" - an encoding is a way to map a sequence of characters to a sequence of bytes and to map a sequence of bytes back to a sequence of characters. In one encoding, a character can be encoded into one bytes, while in another encoding, the same character will be encoded into four bytes.
Simply, text can be encoded into a sequence of bytes. Text is abstract, though, and must be encoded in order to be dealt with.
But a sequence of bytes is just a sequence of bytes, and has got nothing to do with text, and has got no concept of encodings whatsoever.
Sequences of bits, so long as their lengths are always multiples of 8 bits, are equivalent to sequences of bytes. So when someone speaks of a 16-byte key, it's equivalent to a 128-bit key.
When dealing with cryptographic functions, one should always take care to deal with sequences of bits or bytes only. If one has text that one wants to encrypt or hash, he should first take care to encode the text first into a sequence of bytes, either using the same encoding in all cases or noting down which encoding was used in this case. One should only encrypt or hash bytes, never text: and encodings are how to translate text into bytes. Likewise, if one has a key represented in the hex or base64 text encodings, one should take care first to decode the key back into bytes.
- 2,817
- 21
- 13
The actual meaning of that is how many bits will constitute the desired key to proceed with the encryption or decryption algorithms. Suppose the key size is 256 bits meaning that, if you take an integer which is grater than the 2^255 and lower than 2^256. In between the integer you have to take it as a public or private key.
- 9,149
- 11
- 44
- 62