
I know "How TLS works" has been discussed numerous times here and on crypto, but I am still somewhat confused and would like to summarize what I know so far [1] in this giant blob of text, in the hope that one day it becomes helpful.

There are two popular TLS key-exchange methods: RSA and DH. In either case, the typical TLS Handshake looks like this:

  1. Client sends a ClientHello message which contains the maximum TLS version that it supports and a list of cipher suites in order of preference. In addition, a random 28-byte value called ClientHello.random is also transferred.

  2. The server replies with a ServerHello message with the best cipher suite and version it can support, along with its own random 28-byte value called ServerHello.random and a digital certificate.

  3. The client verifies the server's digital certificate against its trusted CA store. Then the client creates a pre_master_secret, encrypts it with the server's public key extracted from the server's digital certificate, and sends that back to the server. This is known as the ClientKeyExchange.

  4. The server decrypts the message using its private key, and then generates a master secret.

The way master_secret is generated in TLS 1.2 is as follows [2]:

  master_secret = PRF(pre_master_secret, "master secret",
                      ClientHello.random + ServerHello.random)
                      [0..47];
  5. Afterward, the client also sends a ChangeCipherSpec record (6 bytes) to the server, indicating it wants to use symmetric encryption, as well as a Finished message.

  6. The server responds with a ChangeCipherSpec as well as a Finished message.

  7. From this point onward, all traffic will be communicated over TLS and is encrypted.

Question 1: What is "master secret" in the derivation? In other words, what is the actual value?

Question 2: How does client encrypt its message? What key/secret is used? I am not sure the role of master_secret?

Question 3: In [3] (RFC section 7.3), it says the following, what are those "random values" and what are their purpose?

  • Generate a master secret from the premaster secret and exchanged random values.

Question 4: I often read the term "session key". What is it? Is it the master_secret?


RSA

I know that RSA can be used in step 3 above for integrity control, meaning using public-private key asymmetric encryption so pre_master_secret is not readable in plaintext.

DH

The major weakness of using RSA is that it relies on the server's private key. An attacker can record all traffic and decrypt it later if the server's private key is compromised. So to provide forward secrecy [4], DH can be used.

DH works on the principle of discrete algorithm. The mathematical properties allow both sides (the client and the server) to generate their own secrets (a and b respectively), and derive the same shared secret given p, g and g^X mod p (where X is a and b respectively) exchanged over the public channel (the world can read them) [5].

Question 4: I believe all subsequent traffic will be encrypted using the shared secret, correct?


Perfect Forward Secrecy (PFS)

To ensure no one can read prior logged traffic, PFS was introduced. Basically, instead of using a long-lived shared key, client and server generate short-lived session keys which are discarded from memory.

Question 5: What are the short-lived keys? X (a, and b respectively) of client and server's?


PSF and RSA

My understanding is RSA is used for authentication (what 'server' sends is coming from 'server', not MiTM). There's HMAC for integrity checks, which is generated during key exchange.

Question 6: Is that right?

I might be omitting other relevant questions/details. But any response is appreciated.

CppLearner
  • You might find some answers here: https://tls.ulfheim.net/ "The Illustrated TLS Connection, Every byte of a TLS connection explained and reproduced." – A. Hersean Mar 18 '19 at 15:15
  • Thanks! It does help a bit, but I am still unsure how to apply them to different wording. :/ I will read it again a few times, but I hope someone could enlighten me. – CppLearner Mar 18 '19 at 16:53

1 Answer


Frankly I don't expect this to be terribly useful. Be that as it may ....

As a preliminary matter almost everything you say is for TLS up to version 1.2 only. TLS version 1.3, which makes fairly major changes in the protocol, was released last year (after a long delay) and is now in the process of spreading; based on historical experience it is likely that TLS<=1.2 will be pretty much gone in something like 3 years. To be fair, most of the resources you can easily find online, notably including the Ursine Epics at #20803, pre-date 1.3.

In either case, the typical TLS Handshake looks like this:

Not either, what you describe covers only RSA; DH is different. More below.

  1. ... In addition, a random 28-byte value called ClientHello.random ....
  2. ... ServerHello message ... along with its own random 28-byte value called ServerHello.random and a digital certificate.

The fields named .random are actually 32 bytes, split into a 4-byte timestamp (which is not random if your computer's clock is even vaguely correct, as it should be) and 28 bytes of actual random data. The value used in key derivation etc. is the 32-byte value.
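As a rough illustration of that layout, here is a minimal Python sketch that builds a 32-byte value shaped like the Random struct in RFC 5246 (gmt_unix_time followed by random_bytes). The function name make_hello_random is made up for this example and is not part of any TLS library.

```python
import os
import struct
import time


def make_hello_random() -> bytes:
    """Build a 32-byte Hello.random: 4-byte timestamp + 28 random bytes."""
    # gmt_unix_time: seconds since the Unix epoch as an unsigned 32-bit
    # big-endian integer (network byte order).
    gmt_unix_time = int(time.time()) & 0xFFFFFFFF
    # random_bytes: 28 bytes from the operating system's CSPRNG.
    random_bytes = os.urandom(28)
    return struct.pack("!I", gmt_unix_time) + random_bytes


client_random = make_hello_random()
print(len(client_random))                         # total size in bytes
print(struct.unpack("!I", client_random[:4])[0])  # the timestamp part
```

The whole 32-byte string, timestamp included, is what goes into the key derivation.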

Strictly speaking the server certificate is not in the ServerHello message, it is in a separate message. However, both these messages, plus ServerKeyExchange when applicable and ServerHelloDone always, can be part of one record and are usually part of one TCP-level transmission. More substantively, if the server cert requires one or more intermediate or 'chain' cert(s) to be verified, which is almost always the case nowadays, that(those) chain cert(s) should be included as well; there are many Qs on several Stacks about "browsers consider my server connection secure but $other_sw gives $some_error" and this is often due to not correctly configuring a chain cert. (The A often varies depending on the server software involved.)

  1. The client verifies the server's digital certificate against its trusted CA store. Then the client creates a pre_master_secret, encrypts it with the server's public key extracted from the server's digital certificate, and sends that back to the server. This is known as the ClientKeyExchange
  2. The server decrypts [premaster] using its private key, and then generates a master secret

The client verifies the server cert, (usually) via its chain, against the client's truststore, AND verifies that the server cert matches the name (or possibly address) of the server the client wants to connect to. (If we want to connect to HonestBank.com, and trying to connect gets us a cert that was issued by a trusted CA to WeAreCrooks.com, we don't want to send our bank info on that connection.)
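To make the two checks concrete, here is a small sketch using Python's standard ssl module, where the default client context enables both chain validation and name matching. The helper connect_verified is a hypothetical name for this example, and calling it needs network access.

```python
import socket
import ssl

# The default client context loads the system trust store and turns on
# both checks described above.
context = ssl.create_default_context()
print(context.verify_mode == ssl.CERT_REQUIRED)  # chain must validate
print(context.check_hostname)                    # cert must match the name


def connect_verified(hostname: str, port: int = 443) -> str:
    """Open a TLS connection; raises ssl.SSLError if either check fails."""
    with socket.create_connection((hostname, port)) as sock:
        # server_hostname is the name the certificate is matched against
        # (and is also sent as SNI).
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.version()
```

With this context, a certificate validly issued to a different name than the one passed as server_hostname should fail the handshake with ssl.SSLCertVerificationError.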

Both the server and the client derive master_secret from premaster and the 2 random's.

If the server requests client authentication, also called client certificate or 'two-way' or 'mutual' authentication, the client actually sends Certificate before ClientKeyExchange and CertificateVerify after. This is all explained in RFC 5246, but is rarely used.

  1. Afterward, the client also sends ChangeCipherSpec record (6 bytes) to the server, indicating it wants to use symmetric encryption ....
  2. From this point onward, all traffic will be communicated over TLS and is encrypted.

After CCS all traffic is encrypted and authenticated; both are important. The methods vary: older ciphersuites use a (pure) cipher to encrypt and a (separate) HMAC to authenticate (HMAC = Hash-based Message Authentication Code); 1.2 also had new (in 2008) authenticated ciphers, officially called AEAD = Authenticated Encryption with Additional Data, which do both encryption and authentication in one combined operation; compare section 6.2.3.3 to the immediately preceding sections.
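For the older (non-AEAD) style, the per-record MAC in section 6.2.3.1 is an HMAC over the sequence number, the record header fields and the fragment. Below is a minimal sketch assuming HMAC-SHA256 (as in the *_SHA256 CBC suites); the function name record_mac and the sample key are made up for illustration, and the encryption step is left out entirely.

```python
import hashlib
import hmac
import struct


def record_mac(mac_key: bytes, seq_num: int, content_type: int,
               fragment: bytes, version=(3, 3)) -> bytes:
    """HMAC over seq_num + type + version + length + fragment."""
    # seq_num is a 64-bit counter, type is 1 byte, version is 2 bytes
    # ((3, 3) means TLS 1.2), length is a 16-bit integer.
    header = struct.pack("!QBBBH", seq_num, content_type,
                         version[0], version[1], len(fragment))
    return hmac.new(mac_key, header + fragment, hashlib.sha256).digest()


mac_key = b"\x01" * 32        # placeholder; really client_write_MAC_key
APPLICATION_DATA = 23         # ContentType.application_data
tag = record_mac(mac_key, 0, APPLICATION_DATA, b"GET / HTTP/1.1\r\n")

# The receiver recomputes the MAC and compares in constant time.
ok = hmac.compare_digest(
    tag, record_mac(mac_key, 0, APPLICATION_DATA, b"GET / HTTP/1.1\r\n"))
print(ok)
```

Because the sequence number is part of the MAC input, replaying or reordering a record changes the expected tag even if the fragment is identical.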

Question 1: What is "master secret" in the derivation? In other words, what is the actual value?

It's different for every session, and nobody except the two endpoints (client and server) should know it, hence 'secret'. (Although sometimes debugging features let you extract it; there are several Qs on those.) Its value is computed using the formula you posted, from 8.1. In case the 'search' function on your browser is broken and some flaw in your display makes the table of contents invisible, PRF abbreviates Pseudo-Random Function and is explained in section 5.
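For a concrete picture of that computation, here is a minimal Python sketch of the section 5 PRF and the section 8.1 formula. It assumes the SHA-256 based PRF, which is the TLS 1.2 default; some ciphersuites specify a different hash. The helper names p_sha256 and prf are just for this example, and the inputs are random placeholders rather than values from a real handshake.

```python
import hashlib
import hmac
import os


def p_sha256(secret: bytes, seed: bytes, length: int) -> bytes:
    """P_hash from RFC 5246 section 5, with SHA-256 as the hash."""
    out = b""
    a = seed  # A(0) = seed
    while len(out) < length:
        a = hmac.new(secret, a, hashlib.sha256).digest()  # A(i)
        out += hmac.new(secret, a + seed, hashlib.sha256).digest()
    return out[:length]


def prf(secret: bytes, label: bytes, seed: bytes, length: int) -> bytes:
    """PRF(secret, label, seed) = P_hash(secret, label + seed)."""
    return p_sha256(secret, label + seed, length)


# Placeholder inputs standing in for one handshake.
client_random = os.urandom(32)
server_random = os.urandom(32)
pre_master_secret = os.urandom(48)

# The label is the literal ASCII string "master secret".
master_secret = prf(pre_master_secret, b"master secret",
                    client_random + server_random, 48)
print(len(master_secret))  # always 48 bytes, i.e. [0..47]
```

Both endpoints run the same computation on the same three inputs, which is how they arrive at the same master_secret without ever sending it.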

Question 2: How does client encrypt its message? What key/secret is used? I am not sure the role of master_secret?

The master_secret is used to derive multiple working keys, or more exactly secrets; see section 6.3. The client uses the 'client_write_key' to encrypt, and the server uses it to decrypt. For ciphersuites that use IVs, which in 1.2 is only some AEAD ones, they also use the client_write_IV. For ciphersuites that use HMAC, which is the non-AEAD ones, the client uses client_write_MAC_key to generate the HMAC, and the server uses it to verify. See Are session keys just the symmetric keys? or cross https://crypto.stackexchange.com/questions/1139/what-is-the-purpose-of-four-different-secrets-shared-by-client-and-server-in-ssl .
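The key expansion in section 6.3 can be sketched as follows: generate a key_block with the PRF (note that the randoms are in the opposite order from the master secret formula) and slice it in the order the RFC lists. The function derive_working_keys is a name made up for this example; the sizes in the demo (no MAC key, 16-byte key, 4-byte IV) are assumed to correspond to an AES-128-GCM suite, and the PRF is the SHA-256 default.

```python
import hashlib
import hmac
import os


def prf(secret: bytes, label: bytes, seed: bytes, length: int) -> bytes:
    """TLS 1.2 PRF (RFC 5246 section 5), SHA-256 variant."""
    seed = label + seed
    out, a = b"", seed
    while len(out) < length:
        a = hmac.new(secret, a, hashlib.sha256).digest()
        out += hmac.new(secret, a + seed, hashlib.sha256).digest()
    return out[:length]


def derive_working_keys(master_secret, client_random, server_random,
                        mac_len, key_len, iv_len):
    """Split the key_block into the six values named in section 6.3."""
    total = 2 * (mac_len + key_len + iv_len)
    # Note: server_random comes first here.
    key_block = prf(master_secret, b"key expansion",
                    server_random + client_random, total)
    layout = [("client_write_MAC_key", mac_len),
              ("server_write_MAC_key", mac_len),
              ("client_write_key", key_len),
              ("server_write_key", key_len),
              ("client_write_IV", iv_len),
              ("server_write_IV", iv_len)]
    keys, offset = {}, 0
    for name, size in layout:
        keys[name] = key_block[offset:offset + size]
        offset += size
    return keys


# Placeholder inputs; sizes assumed for an AES-128-GCM style suite.
keys = derive_working_keys(os.urandom(48), os.urandom(32), os.urandom(32),
                           mac_len=0, key_len=16, iv_len=4)
for name, value in keys.items():
    print(name, len(value))
```

Each direction gets its own key and IV, so the client's encryption key is never the same as the server's.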

Question 3: In [3] (RFC section 7.3), it says the following, what are those "random values" and what are their purpose?

  • Generate a master secret from the premaster secret and exchanged random values.

This is exactly the formula you posted in your 3 from section 8.1. The ClientHello.random sent to the server, and ServerHello.random sent to the client, are exchanged random values, and are combined with the (shared) premaster_secret to generate the (also-shared) master secret.

Question 4: I often read the term "session key". What is it? Is it the master_secret?

It can be either the master_secret, or the derived working keys/secrets (plural), or both. In particular, session resumption (aka re-use) in TLS<=1.2 is done by saving the session-id (in ServerHello) and the corresponding security parameters including the master-secret, and then using them on a subsequent or even concurrent connection.

RSA

I know that RSA can be used in step 3 above for integrity control, meaning using public-private key asymmetric encryption so pre_master_secret is not readable in plaintext.

'Plain' RSA keyexchange does use RSA encryption, which is a type of asymmetric encryption aka public-key encryption, so that pre_master_secret is not readable. Although asymmetric or public-key cryptography does use public and private keys, we don't normally say 'public-private key'. I have no clue what you mean by 'integrity control'; RSA encryption does not much resist an adversary manipulating the ciphertext, which allowed an attack by Bleichenbacher that remains an issue. The (only) protection on a plain-RSA handshake is the PRF values in the Finished messages, which function as a kind of MAC (as long as at least one endpoint is honest and correct).
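To show why Finished behaves like a MAC over the handshake, here is a sketch of the verify_data computation from section 7.4.9: the PRF keyed with master_secret, applied to a hash of all handshake messages so far, truncated to 12 bytes. The function name finished_verify_data and the message bytes are placeholders for illustration; real handshake messages are binary structures, and SHA-256 is assumed as the PRF hash.

```python
import hashlib
import hmac
import os


def prf(secret: bytes, label: bytes, seed: bytes, length: int) -> bytes:
    """TLS 1.2 PRF (RFC 5246 section 5), SHA-256 variant."""
    seed = label + seed
    out, a = b"", seed
    while len(out) < length:
        a = hmac.new(secret, a, hashlib.sha256).digest()
        out += hmac.new(secret, a + seed, hashlib.sha256).digest()
    return out[:length]


def finished_verify_data(master_secret: bytes, handshake_messages: bytes,
                         label: bytes = b"client finished") -> bytes:
    """verify_data = PRF(master_secret, label, Hash(handshake_messages))[0..11]"""
    transcript_hash = hashlib.sha256(handshake_messages).digest()
    return prf(master_secret, label, transcript_hash, 12)


master_secret = os.urandom(48)  # placeholder
transcript = b"ClientHello|ServerHello|Certificate|ClientKeyExchange"

expected = finished_verify_data(master_secret, transcript)
# An attacker who altered any handshake message ends up with a mismatch.
tampered = finished_verify_data(
    master_secret, transcript.replace(b"ServerHello", b"ServerHellO"))
print(hmac.compare_digest(expected, tampered))
```

Someone who does not know master_secret cannot produce a matching verify_data for a modified transcript, which is the MAC-like property described above.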

DH
The major weakness of using RSA is that it relies on the server's private key. An attacker can record all traffic and decrypt it later if the server's private key is compromised. So to provide forward secrecy [4], DH can be used.
DH works on the principle of discrete algorithm. The mathematical properties allow both sides (the client and the server) to generate their own secrets (a and b respectively), and derive the same shared secret given p, g and g^X mod p (where X is a and b respectively) exchanged over the public channel (the world can read them) [5].

That's discrete logarithm. More exactly, Diffie-Hellman ephemeral provides forward secrecy; it is the 'ephemeral' that is critical. 1.2 (and earlier) also defines static (non-ephemeral) DH keyexchanges, but these are practically never used and serve mainly to cause confusion. (They are deleted entirely in 1.3.) There are technically two variants: the original DH-ephemeral using integers, designated DHE in TLS; and the elliptic-curve version, designated ECDHE. Although the same principles apply to both, the actual code (and data) to implement them is quite different.

Question 4: I believe all subsequent traffic will be encrypted using the shared secret, correct?

Not directly. The secret generated by [EC]DHE agreement is used as the premaster-secret, in the same fashion as above: first derived to the master secret, then to the working keys/secrets. Compare sections 8.1.1 and 8.1.2, immediately after the excerpt you posted.
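As a toy illustration of integer DH agreement feeding into the premaster, here is a sketch with deliberately tiny, insecure numbers (p = 23, g = 5, and fixed private values chosen only so the arithmetic is easy to follow). Real DHE uses large primes and fresh random private values for every handshake. The variable names loosely follow the dh_Ys/dh_Yc notation of RFC 5246; everything else is made up for the example.

```python
# Toy parameters: far too small for real use.
p, g = 23, 5

# Private values; with DHE these are generated fresh per handshake and
# discarded afterwards, which is what gives forward secrecy.
x_client, x_server = 4, 3

# Public values, sent in the clear (ClientKeyExchange / ServerKeyExchange).
dh_yc = pow(g, x_client, p)
dh_ys = pow(g, x_server, p)

# Each side combines its own private value with the peer's public value.
z_client = pow(dh_ys, x_client, p)
z_server = pow(dh_yc, x_server, p)
assert z_client == z_server

# Section 8.1.2: the negotiated value Z, with leading zero bytes stripped,
# is used as pre_master_secret and then goes through the PRF as usual.
pre_master_secret = z_client.to_bytes((z_client.bit_length() + 7) // 8 or 1, "big")
print(dh_yc, dh_ys, z_client, pre_master_secret.hex())
```

An eavesdropper sees p, g, dh_yc and dh_ys but needs one of the private values to compute Z.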

Perfect Forward Secrecy (PFS)
To ensure no one can read prior logged traffic, PFS was introduced. Basically, instead of using a long-lived shared key, client and server generate short-lived session keys which are discarded from memory.
Question 5: What are the short-lived keys? X (a, and b respectively) of client and server's?

Yes. Except that although generic DH is often described in terms of a/A and b/B (canonically, Alice and Bob), the TLS specs use different notation. For integer-DHE in 5246, the public keys for server and client respectively are dh_Ys and dh_Yc; (corrected!) the corresponding private keys presumably are Xs and Xc, but are not shown. For ECDHE in 4492, the server public key is simply named public while the client one is named ecdh_Yc (even though in ECC generally we use X,Y for coordinates of a point, and call the privatekey (integer) d and the publickey (point) Q) and again the private keys are not shown.

PSF and RSA
My understanding is RSA is used for authentication (what 'server' sends is coming from 'server', not MiTM). There's HMAC for integrity checks, which is generated during key exchange.
Question 6: Is that right?

(That's PFS. Or just FS.) I'm not at all sure what you're saying, so to mostly repeat what I said before:

  • for RSA keyexchange, the premaster is encrypted by RSA with the server's publickey in its certificate, and the only integrity check on the handshake is Finished, which uses PRF, which is based on but different from HMAC

  • for [EC]DHE keyexchange, the keyexchange parameters are signed with the server's privatekey, matching the publickey in its certificate. That key and thus the signature may be RSA (in either case), or it may be DSA (also called DSS for historical reasons) or ECDSA depending on the keyexchange

  • regardless of the keyexchange, depending on the cipher either HMAC or AEAD (but not both) is used to authenticate the data traffic

dave_thompson_085
  • Hi Dave, thank you for taking the time to answer all of my questions. I have two follow-ups. First, question 1: looking at section 8.1 (https://tools.ietf.org/html/rfc5246#section-8.1), I was asking what the string "master secrets" is in the concatenation. Is it really just the string "master secret"? And second, so to confirm, all subsequent en/decryptions are done using the derived working keys, correct? – CppLearner Mar 19 '19 at 17:54
  • Yes 8.1 is exactly "master secret" and 6.3 "key expansion" (both in ASCII) and 7.4.9 similarly for finished. Yes, encryption/decryption _and_ authentication/verification use the working keys. The only exception, sort of, is if you do renegotiation -- that creates a _new_ set of premaster, master, and working keys and subsequently _those_ are used. But since the 2009 Apache attack renegotiation is often prohibited, even though rfc5746 fixed most of it and rfc7627 belatedly the rest. – dave_thompson_085 Mar 21 '19 at 18:57