
I recently read about how HTTPS works and I have some questions to clarify. Pardon me if this sounds silly, but I just need to get this clear. Correct me if I am wrong.

I learned that at the beginning of the TLS handshake there is an asymmetric encryption step, where the public key from the certificate is used to encrypt the client-generated key before it is sent to the server, and only the server can decrypt it using its private key.

Subsequent messages (HTTP requests), however, use symmetric encryption with the client-generated key, and both client and server use this key to encrypt and decrypt application data.

There is a famous theory in cryptography saying "Repetition is not good": if a single message is repeated within an encrypted message, it is easy to crack. If this is true, all messages encrypted using the client-generated key will have HTTP/1.x in them, as it is part of both the HTTP request and the response.

So, theoretically, a man in the middle with this knowledge could find patterns in encrypted HTTP requests and responses, locate the HTTP/1.x string in them, and brute-force the client key that was used to encrypt these messages.

Am I correct, or is this utter nonsense? Any answer or guidance would be highly appreciated.

shazin
    This seems to be well-covered by http://security.stackexchange.com/q/20803/971 and http://security.stackexchange.com/q/33752/971 and https://en.wikipedia.org/wiki/Initialization_vector. [This site is for IT Security professionals](http://security.stackexchange.com/help/on-topic) -- I encourage you to take advantage of existing resources. Also, I recommend that you cite your sources. You claim there's a famous theorem in cryptography that repetition is bad, but there is no such theorem. You're probably misremembering something (ECB mode?). – D.W. Feb 27 '15 at 22:52

2 Answers

7

First, a couple of points.

  1. In any cryptographic system that does not provide information-theoretic perfect secrecy, the key can always be brute-forced. One of the key components of determining whether a cipher is secure is whether or not it is feasible to brute-force keys, and if it is, the cipher is not secure. So, for a secure cipher, brute-forcing the key isn't a practical concern.

  2. Leaked information about the plaintext does not necessarily compromise the key. It compromises security for sure, but knowing that the ciphertexts of two messages are identical does not mean that you learn anything about the key. So, while brute-forcing the key is still always possible, this doesn't give you any advantage. It's still just as hard to determine the correct key as if there were no duplicate ciphertexts.

The canonical example of this flaw can be found in using a block cipher in ECB (electronic code book) mode. In this mode, each block of plaintext is directly encrypted using the key, so if there are multiple blocks of plaintext that contain the same data, they will result in identical blocks of ciphertext. It's easy to tell that they're duplicates, which is exactly what you're concerned about.
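
To make the ECB problem concrete, here is a minimal sketch. The Python standard library has no AES, so `toy_encrypt_block` is a hypothetical four-round Feistel network built on HMAC-SHA256, used purely as a stand-in for a real block cipher; it is not secure and is not what TLS uses. The point is only that ECB maps equal plaintext blocks to equal ciphertext blocks.

```python
import hashlib
import hmac

BLOCK = 16  # bytes per block, the same size as an AES block


def _round(key: bytes, i: int, half: bytes) -> bytes:
    # Round function: keyed hash of one half-block, truncated to 8 bytes.
    return hmac.new(key, bytes([i]) + half, hashlib.sha256).digest()[:BLOCK // 2]


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    # Toy four-round Feistel network (illustration only, NOT a secure cipher).
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for i in range(4):
        mixed = bytes(a ^ b for a, b in zip(left, _round(key, i, right)))
        left, right = right, mixed
    return left + right


def ecb_encrypt(key: bytes, plaintext: bytes) -> list[bytes]:
    # ECB: every block is encrypted independently with the same key.
    assert len(plaintext) % BLOCK == 0
    return [toy_encrypt_block(key, plaintext[i:i + BLOCK])
            for i in range(0, len(plaintext), BLOCK)]


key = b"k" * 16
# Blocks 0 and 2 carry the same 16 bytes of plaintext.
plaintext = b"GET / HTTP/1.1\r\n" + b"some other data!" + b"GET / HTTP/1.1\r\n"
blocks = ecb_encrypt(key, plaintext)

repeats_leak = blocks[0] == blocks[2]  # True: the repetition is visible
differs = blocks[0] != blocks[1]       # True: different plaintext, different block
```

An eavesdropper who sees `blocks` learns that the first and third blocks are equal without knowing the key, which is exactly the pattern the question worries about.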

The way that we fix this (and the mechanism that is also implemented for block ciphers in TLS, and therefore HTTPS) is to add an additional component of randomness, called an initialization vector (IV). This is a non-secret random value that is introduced into the encryption process, and is different for every encrypted message. Because the combination of key and IV is now unique for each message, the ciphertext will also be unique, even in the case of two identical messages. As long as you never reuse an IV with the same key, there will be no pattern to detect, and an attacker will never be able to determine that two messages are the same.
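
The effect of a per-message IV can be sketched with the standard library alone. The construction below (HMAC-SHA256 run in counter mode as a keystream generator) is an assumption chosen for illustration; it is not the cipher TLS uses, and the function names are hypothetical. It shows that the same key and the same plaintext produce different ciphertexts when the IV differs, while the receiver can still decrypt because the IV is sent in the clear.

```python
import hashlib
import hmac
import os


def keystream(key: bytes, iv: bytes, length: int) -> bytes:
    # Derive a keystream from (key, IV, counter); illustration only.
    out = b""
    counter = 0
    while len(out) < length:
        out += hmac.new(key, iv + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:length]


def encrypt(key: bytes, plaintext: bytes) -> tuple[bytes, bytes]:
    # A fresh random IV per message; the IV is not secret and travels with the ciphertext.
    iv = os.urandom(16)
    stream = keystream(key, iv, len(plaintext))
    return iv, bytes(p ^ s for p, s in zip(plaintext, stream))


def decrypt(key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    # XOR with the same keystream undoes the encryption.
    stream = keystream(key, iv, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))


key = os.urandom(16)
message = b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n"

iv1, ct1 = encrypt(key, message)
iv2, ct2 = encrypt(key, message)

ciphertexts_differ = ct1 != ct2  # same key, same message, different IVs
round_trip_ok = decrypt(key, iv1, ct1) == message and decrypt(key, iv2, ct2) == message
```

Reusing an IV with the same key in this sketch would reproduce the same keystream and bring the repetition problem straight back, which is why the "never reuse an IV with the same key" rule matters.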

Xander
  • Who chose the IV? Is it the server, the client or both at same time? – Gudradain Feb 27 '15 at 16:37
  • @Gudradain It's a combination, but the exact process [depends on the TLS version](http://crypto.stackexchange.com/a/9868/4123). – Xander Feb 27 '15 at 16:41
5

The asymmetric cryptography establishes a shared secret, which is called, in TLS terminology, the master secret. The master secret is fixed throughout the session; a TLS session consists of one or several connections (opening a new connection while reusing the master secret is called session resumption and uses the "abbreviated handshake").

For each connection, several keys are derived (through a sort-of hash function called the PRF) from the master secret; these keys include the encryption and MAC keys for processing both directions of the traffic. The derivation uses as inputs the master secret and the "client random" and "server random"; these "randoms" are values that the client and server send each other at the beginning of the handshake (in the ClientHello and ServerHello messages). Thus, even though the master secret is reused, each new connection will have its own set of encryption keys.
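
As a rough sketch of that derivation, the code below follows the TLS 1.2 PRF (P_SHA256 with the "key expansion" label) as I understand it from RFC 5246. Other protocol versions derive keys differently, and the 96-byte key block layout assumed here (two 32-byte MAC keys followed by two 16-byte encryption keys) corresponds to an AES-128-CBC with HMAC-SHA256 cipher suite. The secrets and randoms are generated locally just for the demonstration.

```python
import hashlib
import hmac
import os


def p_sha256(secret: bytes, seed: bytes, length: int) -> bytes:
    # P_hash from TLS 1.2: A(0) = seed, A(i) = HMAC(secret, A(i-1)),
    # output = HMAC(secret, A(1) + seed) + HMAC(secret, A(2) + seed) + ...
    out = b""
    a = seed
    while len(out) < length:
        a = hmac.new(secret, a, hashlib.sha256).digest()
        out += hmac.new(secret, a + seed, hashlib.sha256).digest()
    return out[:length]


def derive_key_block(master_secret: bytes, client_random: bytes,
                     server_random: bytes, length: int = 96) -> bytes:
    # key_block = PRF(master_secret, "key expansion", server_random + client_random)
    return p_sha256(master_secret, b"key expansion" + server_random + client_random, length)


def split_keys(key_block: bytes) -> dict[str, bytes]:
    # Assumed layout: 2 x 32-byte MAC keys, then 2 x 16-byte encryption keys.
    return {
        "client_write_MAC_key": key_block[0:32],
        "server_write_MAC_key": key_block[32:64],
        "client_write_key": key_block[64:80],
        "server_write_key": key_block[80:96],
    }


master_secret = os.urandom(48)  # fixed for the whole session

# Two connections in the same session: same master secret, fresh randoms.
block_a = derive_key_block(master_secret, os.urandom(32), os.urandom(32))
block_b = derive_key_block(master_secret, os.urandom(32), os.urandom(32))

keys_a = split_keys(block_a)
keys_b = split_keys(block_b)
fresh_keys_per_connection = keys_a["client_write_key"] != keys_b["client_write_key"]
```

Because the randoms feed into the PRF seed, a resumed session gets a new set of encryption and MAC keys even though `master_secret` never changes.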

Within each connection, data is sent as individual packets called records. All the records in a given direction (client to server, or server to client) will be encrypted with the same secret key; however, the encryption mechanism uses a per-record state (an implicit or explicit Initialization Vector -- depending on protocol version -- in case CBC or GCM encryption is used; a running state for RC4 encryption) that ensures that even if two successive records contain the exact same clear data, the encrypted versions will be distinct, and eavesdroppers won't be able to detect that repetition. Moreover, the MAC added in each record is computed over a combination of the clear data and the record sequence number, so that any attempt at duplicating, dropping or reordering records will be reliably detected.
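
The sequence-number part of this can be sketched as follows. This is a simplification: the real TLS record MAC also covers the record type, protocol version, and length, and `record_mac` and `verify_stream` are hypothetical helper names. The sequence number is never transmitted; each side keeps its own counter, so a record only verifies at the position where it was originally sent.

```python
import hashlib
import hmac


def record_mac(mac_key: bytes, seq_num: int, data: bytes) -> bytes:
    # MAC over the 8-byte sequence number followed by the record data (simplified).
    return hmac.new(mac_key, seq_num.to_bytes(8, "big") + data, hashlib.sha256).digest()


def verify_stream(mac_key: bytes, records: list[tuple[bytes, bytes]]) -> bool:
    # The receiver counts records itself and checks each tag against its own counter.
    for expected_seq, (data, tag) in enumerate(records):
        if not hmac.compare_digest(tag, record_mac(mac_key, expected_seq, data)):
            return False
    return True


mac_key = b"m" * 32
payloads = [b"first", b"second", b"third"]
sent = [(data, record_mac(mac_key, seq, data)) for seq, data in enumerate(payloads)]

in_order_ok = verify_stream(mac_key, sent)                            # True
reordered_ok = verify_stream(mac_key, [sent[1], sent[0], sent[2]])    # False
dropped_ok = verify_stream(mac_key, [sent[0], sent[2]])               # False
duplicated_ok = verify_stream(mac_key, [sent[0], sent[0], sent[1]])   # False

# Identical data at different positions still yields different MACs.
same_data_distinct_tags = record_mac(mac_key, 0, b"same") != record_mac(mac_key, 1, b"same")
```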


As for brute force, it is not noticeably impacted by any repetition. In fact, for any brute force attack analysis, we already assume that the attacker knows a substantial amount of cleartext data and the corresponding ciphertext. The symmetric key size is such that even under these conditions, brute force remains infeasible (there are so many possible 128-bit keys that chances of finding the right one are abysmally low, regardless of how many plaintext/ciphertext pairs are available).
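
A back-of-the-envelope calculation shows the scale involved. The guessing rate below is an assumption picked for illustration (10^12 keys per second), not a measurement of any real attacker; known plaintext such as `HTTP/1.x` only helps the attacker recognise a correct guess, it does not reduce the number of guesses.

```python
KEY_BITS = 128
GUESSES_PER_SECOND = 10**12              # assumed attacker speed, for illustration
SECONDS_PER_YEAR = 365.25 * 24 * 3600

keyspace = 2**KEY_BITS                   # number of possible 128-bit keys
expected_guesses = keyspace // 2         # on average, half the keyspace is searched

expected_years = expected_guesses / GUESSES_PER_SECOND / SECONDS_PER_YEAR
print(f"keyspace: {keyspace:.3e} keys")
print(f"expected search time: {expected_years:.3e} years")
```

Under this assumed rate the expected search time comes out on the order of 10^18 years, and making the attacker a million times faster still leaves it on the order of 10^12 years.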

Thomas Pornin