24

Looking into the details of Pretty Good Privacy, I'm confused as to the reasoning behind encrypting the message with a session key and the key with the recipient's public key via RSA. I fail to see how this is any safer than straightforwardly RSA-encrypting the actual message.

It may very likely be that I am simply too naive in terms of security and encryption, but if I can obtain your RSA private key, why does it matter whether that gives me access to the message body or the other key that will unlock the message body?

Adi
  • 43,953
  • 16
  • 137
  • 168
Ken Bellows
  • 343
  • 1
  • 2
  • 6

4 Answers4

35

RSA isn't really built to encrypt large pieces of plaintext. Each RSA "round" can encrypt 117 bytes of data, and to encrypt more, you'd have to use some chaining mode. Currently, this means extra overhead, slowness (remember, RSA is pretty slow), and security uncertainty (RSA-chaningMode hasn't been scrutinized as other types of encryption schemes). <- BAD

By using hybrid encryption (symmetric + asymmetric for the symmetric key) you get the benefit of asymmetric, namely, not having to worry about exchanging keys; and symmetric, very fast and has been well-vetted. <- GOOD

For more information please check: How can I use asymmetric encryption, such as RSA, to encrypt an arbitrary length of plaintext?

Adi
  • 43,953
  • 16
  • 137
  • 168
  • RSA is not a block cipher. There's no need to do any sort of sophisticated block cipher chaining mode with RSA. Granted if you had a message too large for RSA and wanted to avoid hybrid encryption for some bizarre reason, you could in principle do something similar to ECB (which is weak for any deterministic block cipher) by splitting up the plaintext, but it would be slow. – dr jimbob Jun 17 '13 at 02:54
  • 1
    @drjimbob I don't see how your comment offers anything. It's just repeating _almost exactly_ what I already have in the answer. I explained both situations for large plaintext (hybrid: good. RSA-only: bad and slow). – Adi Jun 17 '13 at 02:58
  • Sophisticated chaining modes only need to be introduced for block ciphers. A block cipher means it is both deterministic and symmetric. RSA is not deterministic (in practice as randomized padding needs to be used) or symmetric (all the block cipher diagrams assume the same encryption/decryption key). All of the analysis of block ciphers is meaningless when attempted on RSA. – dr jimbob Jun 17 '13 at 03:03
  • Or see [for example Thomas Pornin's SO answer: "Chaining mode such as ECB makes no sense for RSA, unless you are doing it wrong."](http://stackoverflow.com/a/2856628/457571) – dr jimbob Jun 17 '13 at 03:06
  • 1
    @drjimbob wow! You _really_ want someone to be wrong. Buddy, I said that a chaining mode is needed if you have a large plaintext and you're not using hybrid encryption. I explicitly said that using a chaining mode with RSA hasn't been well-vetted and thus, implicitly saying, it's not the right way. I even linked to question where the answers say pretty much the same. I really don't see our disagreement here. We seem to be saying the exact thing and agreeing on all points. What's wrong? But, yeah, I'll baby-proof it for you. – Adi Jun 17 '13 at 03:11
  • @drjimbob Done. – Adi Jun 17 '13 at 03:12
  • Let's try and stay civil and my repeated part was me agreeing with that part of the argument. I never downvoted you, but disagreed with bringing up chaining modes (and linking to block cipher modes of operation). Disagreement of ideas is good; how else do we learn? My point was that its weird to discuss chaining modes for RSA. E.g., applying CTR mode to RSA won't work as normally you XOR the N-block plaintext with something like `E(k,0)++E(k,1)++...E(k,N-1)` (++ being concatenation). This simply won't work in a secure manner with asymmetric keys (similarly OFB and CFB make no sense). – dr jimbob Jun 17 '13 at 04:54
  • (If you use don't use random padding in RSA, any attacker with your public key can recreate the stream and decrypt. If you do random-pad there's no way for someone with the private key to recreate the same stream.) I'm not trying to prove you wrong. The more I think of it, I was wrong with my first comment. ECB should still be avoided as eavesdroppers could reorder/drop parts of the longer message if it aligned properly. On further thought, you could probably do something like CBC using RSA (if you don't care about being slow) and wasting space if you properly pad (e.g., OAEP) each block. – dr jimbob Jun 17 '13 at 04:55
11

RSA is a slow algorithm, especially for decryption. You have to compute m = c^d mod N where d is a roughly the size of the modulus. For RSA-1024/2048/4096 (where 1024/2048/4096 is the number of bytes of the modulus N), even with fast modular exponentiation by repeated squaring it requires roughly 1.5 lg N (~1500/~3000/~6000) multiplications of N-bit integers. Multiplication of an N-bit integer is not a primitive operation but scales as O(N^1.6) for an n-bit integer (hence each multiplication is roughly 84/256/776 times slower than a 64-bit multiplication primitive).

Note: RSA encryption is significantly faster than decryption as with proper padding there's no weakness in choosing a small public key exponent like e=65537 where your exponentiation step only requires ~17 multiplications; in fact e=3 requiring only two multiplications would work perfectly fine with proper padding.

Furthermore, one application of b-bit RSA meaning you had to generate two primes of roughly have that length each) can only decrypt a message of up to about b-128 bits (using random pad of 128 bits) or roughly 112/240/496 bytes as you only uniquely decode a number between 0 and N. This answer with 8-bit ASCII encoding would need 7449 bytes.

Therefore what customarily happens is you just encrypt a symmetric key as symmetric key will be much faster and using an encryption mode (e.g., CBC, CTR) can encrypt messages of arbitrary size.


Timing information to justify above analysis

I randomly created a 4096-bit RSA keypair (N,e) and (N,d) (using import rsa; rsa.newkeys(nbits=4096)) and then did timings to compare "textbook" (weak - without random bytes padded) RSA encryption / decryption to AES encryption.

The setup for timing RSA is given below encryption/decryption:

>>> encryption = "pow(m,e,N)"
>>> decryption = "pow(c,d,N)"
>>> setup =  """N = 650022002033202164791638561174816123258916492020045683486079596172818033518252144860009135975860423604411399274822133841546708494765449009472683563317182198759309684914356008659922538583279515419473227210977474088638850782595732910857797571156318425669817244314450827318145758475770172116229990603884173470104902799641093008914867436680133631971559712828465243806241512864086546090728020169682901285441149037545107765998640217114788723715575098893307499744794060936075307573516246976128444578474543743151813381476261995022381469996299316134038828450334308951123635607639233510908350391842701316709834010335732144351192249236953330927673972067894385163240378028762678952835544528074929660347162229394149982403274141710996800529609644247153732931740566379474447028470447165473543764460712691302632667942726305345440613110797535733482536487121364569032950159436896042439632454394173272272373593467791776848612054118558722637286863035388661691125609333937871344046274188817824229897604542696165894703693345718292207375417809743565484488696015562272970275127353631067240331072960179800283770558952793368549384848991363395614092594834723746484679602433778811932755737361396956451757817273043140309312219744371338360280947729616063853738641

e = 65537

d = 163256819101675505935126275492278764497028632049833711493978518287449606050176698725845711249563797130302144316394896364372168726426893063398086141449880510117616574052677112204439095245140620165777031598832556014144612720776443287192263118867708337069520909431555619232749121633751575949969416441703671137188560661642925231956585104715733090960096935672315454064890600755952584778878850298197667808388563912873529057301030226798746088352508752731797937742028323588321094384242144517250929974849184277771012531228150089843422784017258750683831715157735366668225506845014904307329056055723208632277201486994123654288753880392020556964543443597828229493325467616467308118864809565200248599428162198788308364968380312577988297310500286974136209331121821893719017982045957945241683426015912952303440579610553088057331105677592306795472995839704525934407019706992740653789223747419222232691256397983243926154436341276701812083801394467756814989897357722664273229362826815661611670740504736110949984419908621950792127278167263015929591331285519955311164883226113070661776218890597216617564529563317568347817244502395174702566462626236588608296425638109493962233171124725237010153412040065506840529586822535766472003407847003802265141256193

m= int("This is a secret message provided as a test for RSA-4096 and AES encryption.  To be fair it is 496 bytes long.  Keyboard mashing follows: asdbklajsdgl;sdjgl;kasjdgiowepgj opijg aslekgjase;lgk jasel; elask;gj lask;lgjl;ske gjasle;k asel;k gjl;asekgj asljsel;k jseal; gkjasel; kjsael;kjg aal;se asegkl;j asel;kjasegl;k jasegl;k jasel;gk jasegl;k jasel gkjsael;gkjasel;gkj sea;lekgjseal;kjasegl;asekgjasel;kgjsael;jkasel;gjkgasel;asfasfl;kjasf;lkjsadfkljdl;kfaskl;jfasldk;fkfkjl;akasdfasdfasdfkjkjga".encode('hex'),16)
c = pow(m, e, N)
"""
>>> import timeit
>>> timeit.repeat(encryption, setup=setup, repeat=3, number=1)
[0.0044488906860351562, 0.0045480728149414062, 0.0044538974761962891]

>>> timeit.timeit(decryption, setup=setup, repeat=3, number=1)  
[1.9231810569763184, 1.9048171043395996, 1.9280149936676025]

4096-bit RSA encryption takes ~4.5ms per run and decryption takes 1.9 seconds (~500 times slower than encryption as d is much larger than e). Encrypting this message with RSA would require 16 blocks and take ~30 seconds to decrypt. Repeating for other RSA sizes, I found 1024-bit RSA takes ~0.5 ms for encryption and 40ms for decryption. 2048-bit RSA takes ~1.6ms for encryption and ~270 ms for decryption.

Now let's look at AES-128, using pycrypto library.

setup = """
from Crypto.Cipher import AES 
from Crypto import Random
from hashlib import sha256

def aes_encrypt(plaintext, key):
    block_size = AES.block_size
    iv = Random.new().read(block_size)
    cipher = AES.new(key, AES.MODE_CBC, iv)
    pad_len = block_size - (len(plaintext) % block_size)
    padding = ''.join([chr(pad_len)]*pad_len)
    encrypted_msg = iv + cipher.encrypt(plaintext + padding)
    return encrypted_msg

def aes_decrypt(encrypted_msg, key):
    block_size = AES.block_size
    iv = encrypted_msg[:block_size]
    cipher = AES.new(key, AES.MODE_CBC, iv)
    padded_msg = cipher.decrypt(encrypted_msg[block_size:])
    pad_len = ord(padded_msg[-1])
    msg = padded_msg[:len(padded_msg)-pad_len]
    return msg

key = sha256("secret passphrase").digest()[0:16]
message = "This is a secret message provided as a test for RSA-4096 and AES encryption.  To be fair it is 496 bytes long.  Keyboard mashing follows: asdbklajsdgl;sdjgl;kasjdgiowepgj opijg aslekgjase;lgk jasel; elask;gj lask;lgjl;ske gjasle;k asel;k gjl;asekgj asljsel;k jseal; gkjasel; kjsael;kjg aal;se asegkl;j asel;kjasegl;k jasegl;k jasel;gk jasegl;k jasel gkjsael;gkjasel;gkj sea;lekgjseal;kjasegl;asekgjasel;kgjsael;jkasel;gjkgasel;asfasfl;kjasf;lkjsadfkljdl;kfaskl;jfasldk;fkfkjl;akasdfasdfasdfkjkjga"
cipher = aes_encrypt(message,key)
"""

>>> timeit.repeat("aes_encrypt(message,key)", setup=setup, repeat=3, number=1)
[6.198883056640625e-05, 6.198883056640625e-05, 6.008148193359375e-05]

>>> timeit.repeat("aes_decrypt(cipher,key)", setup=setup, repeat=3, number=1)
[1.0967254638671875e-05, 9.059906005859375e-06, 9.059906005859375e-06]

Bottomline, AES encryption is roughly 8/25/500 times faster than RSA and AES decryption is about 4000/25000/200000 times faster than RSA (at 1024/2048/4096 bit encryption respectively). Hence, hybrid encryption always makes sense whenever you need more than one block.

dr jimbob
  • 38,936
  • 8
  • 92
  • 162
3

From something I wrote 3 years ago: "There are 2 basic types of encryption that we use every single day of our lives. Everyone uses them. Even my Grandma.

Type 1 is Symmetric, aka Single Key, aka Secret key, aka super-fast... Note how all the S words go together :)

Type 2 is Asymmetric, aka Public Key/Private Key (or just public key for short), aka Two Key, aka really-fricking-slow encryption. This is what people generally THINK they are using when they use PGP. (they are, but they are also using Symmetric encryption to do the bulk of the work)" http://www.infosecisland.com/blogview/4497-Public-Key-Private-Key-Secret-Key-Everyday-Encryption-.html

Basically, as dr jimbob showed above, public key is too slow to use for the bulk encryption. Mixing it with Symmetric fixes that, and the reason we don't use symmetric alone is there is the key exchange problem. How else do we ensure that our intended recipient gets the key and they are the only one who gets the key?

Rod MacPherson
  • 1,067
  • 7
  • 11
1

Another reason why this is useful is because it increases flexibility. For example, encrypting only the session key allows you use multiple key pairs with relavitely low overhead. You just add an encrypted session key per keypair.

Tim Lamballais
  • 282
  • 1
  • 4