9

I am encrypting chat messages with RSA thus speed is no issue.

It was working fine before I tried transmitting a long message. Now I need to cope with the fact RSA is block, instead of stream, based. Easy: chop the bytes into smaller packets.

But no:

[...]splitting the input message into 245-byte chunks and encrypting each of them more or less separately. This is a bad idea because:

  • There can be substantial weaknesses in how the data is split and then rebuilt. There is no well-studied standard for that.[...]

-- Thomas Pornin

The project is only a hobby learning experience stuff. Still, I don't want to learn wrong.

How dangerous is it? What are some simple alternatives?

Vorac
  • 1,907
  • 3
  • 20
  • 29

3 Answers3

7

You would have a problem if you cut your 245 byte message into 16 bytes chunks and the last one gets encrypted as 5 bytes payload + 11 bytes of zeroes. Since there is only 40 bits of payload, these last five bytes might be decrypted using brute force. And that might be enough to cause damage.

You get around this by not adding 11 bytes of zeroes, but 11 random bytes.

Or you might have some repeated text or in general dependencies between your 16 byte chunks. So you don't want to just split your text up and encrypt each single chunk on its own, but encrypt the first chunk, then a combination of first and second chunk, and so on.

Of course even though speed is not a problem, it's always best to use a standardised solution, and that is to create a stream, create a key for it, and encrypt only the key with RSA, which is what everyone else does. It won't save you time, but it's safer because millions of users would have run into problems before you if there were any problems.

gnasher729
  • 2,107
  • 11
  • 16
  • 1
    This doesn't make sense. If the “RSA” encryption is some home-made non-randomized public key operation like textbook RSA, then all messages are subject to brute force guessing, not just the last one. If it's some standard RSA encryption like PKCS#1, you can't verify guesses without knowing the private key, no matter how long the plaintext is. – Gilles 'SO- stop being evil' Dec 29 '22 at 15:31
  • 2
    @Gilles'SO-stopbeingevil' If only 40 payload bits make it into the last message, and no randomness, then it can be bruteforced in 2^40 time instead of 2^128 – user253751 Dec 29 '22 at 16:02
  • 1
    @user253751 No, that's not how asymmetric encryption works. It's impossible to validate a guess. (For PKCS#1v1.5 and RSA-OAEP, the way it works internally is that each plaintext is encoded with a random nonce. To validate a guess, you need either the public key *and the nonce*, or the private key. To find the nonce from the ciphertext, you need the private key.) – Gilles 'SO- stop being evil' Dec 29 '22 at 17:56
  • 1
    @user253751 Also, a 128-bit message can be guessed in far less than 2^128 attempts. The average number of required guesses is the message's entropy. English text has about 2 bits of entropy per letter, and context can make it much less. – Gilles 'SO- stop being evil' Dec 29 '22 at 18:01
4

Gnasher729 points out the biggest problem with splitting everything into small packets... There is a 2nd problem with this technique, which is a bit more esoteric.

Encryption techniques can be put into 2 broad categories, substitution block ciphers, and mutating substitution block ciphers. A substitution block cipher converts a meta character (1 to N bits of information where N is the block size of the cipher) to the same meta character A -> B every single time, which is nice be cause it makes decrypting/encrypting the message fast, and parallelizable. But, it allows your message to be analyzed and broken using statistics + apriori knowledge about the message.

A mutating substitution block cipher is a block cipher that encrypts things differently as it goes thru the message by using some, hard to reverse engineer, saved state (a CPRNG, previous cyphertext etc...) so that your meta characters map differently in the throughout the message A->B then A->C. This makes your message much more secure because statistical analysis is far more difficult. The downside of course is that your message can only be decrypted/encrypted as a whole message. And poorly chosen mutating saved state can make the message less secure, ENIGMA is a good example of a poorly chosen mutating cipher... Which is why you shouldn't roll your own Crypto.

You could get around this particular weakness by generating a new RSA key for every single payload. But I don't recommend that.

Edit: Realized you wanted a recomendation, which this fails to provide... This is my recomendation.

Use RSA to establish a communication channel and then send a randomly generated key using RSA.

Use that randomly generated key to encrypt the message you want to send using a symmetric encryption algorithm, like AES GCM (do not use AES CBC, it is flawed).

This is relatively easy to implement.

Questor
  • 141
  • 3
  • 1
    Double Ratchet & Co are far beyond the scope of the project. But thank You for teaching me what that class of algorithms is named! – Vorac Dec 29 '22 at 06:07
  • 1
    I am not certain if those are the "proper names" but they are how I think of these 2 useful generalizations. – Questor Dec 29 '22 at 20:59
1

How dangerous is it?

If the messages are too small, and you apply just RSA over them (without any padding schemes for RSA) then the message itself can be recovered (if it is small enough or the key is big enough) by applying lattice reduction algorithms. The theory is quite complex, you can read more about it here, but the just of it is that RSA shouldn't be used all by itself and without any padding schemes such as OAEP.

What are some simple alternatives?

One simple scheme I see would be to use Diffie-Hellman (with or without Elliptic Curves, also, bear in mind that the key exchange needs to be resistant to man in the middle attacks) to generate shared symmetric keys and then use AES with a modern mode of operation, such as AES-EAX.

But since you said this is a hobby project for learning new things, I'd recommend using the above method, but substituting Diffe-Hellman with Kyber, it has been selected last year to become a NIST standard, meaning that in a few years it might become widely used.

Zicar
  • 56
  • 1