
I am tempted to write my own which covers:

  • Checksum to ensure data is not tampered with.
  • Long and multiple rotating keys so that it is (practically) impossible to decrypt.
  • Will use /dev/random for the initial seed

Other more elusive items that likely won't be addressed:

  • Based on the size chunks of data and connection pattern, an interceptor can guess things about the data without seeing the data; what protocol, how is it being used.
  • Source and destination IPs give an identity clue.
  • TCP and IP headers are liable to contain OS information.

Although I am very confident, there is a risk that I am missing something I may never realize: Lessons learned and misconceptions regarding encryption and cryptology

So what would I use if I was to avoid writing my own? The important factors are

  • Speed
  • Checksum
  • I would like to know how much data it takes before the key runs out and is going to be re-used.

Or is writing my own encryption a good idea in this case?

700 Software

2 Answers


Writing your own encryption is never a good idea. Even trained cryptographers (I mean people who have studied the subject for years, have a big shining diploma [a PhD] to say it, and, more importantly, have done and published actual research) do not think about using an algorithm or protocol that they have designed before having submitted it for inspection by their peers for several years.

On the more anecdotal:

  • "Checksum" is a broad term; cryptographers prefer Message Authentication Code, which is much more precise.
  • "Long and multiple rotating keys": this sounds like a Dan Brown novel, and, security-wise, that's not a good thing.
  • Do not use /dev/random, use /dev/urandom.
  • "How much data it takes before the key runs out and is going to be re-used": this sentence makes sense only if you consider the kind of encryption systems that were used during World War I, that is, before the invention of the computer.
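As a minimal sketch of the /dev/urandom advice above: in Python, key material can be drawn from the operating system's cryptographic random source through the standard `secrets` and `os` modules. The helper name `new_key` and the 32-byte default are assumptions made for illustration only.

```python
import os
import secrets


def new_key(num_bytes: int = 32) -> bytes:
    """Return num_bytes of key material from the OS random source."""
    # secrets.token_bytes draws from the same OS source as os.urandom,
    # which does not block once the system pool has been initialised.
    return secrets.token_bytes(num_bytes)


key = new_key()                # 32 random bytes (hypothetical key size)
also_random = os.urandom(32)   # equivalent lower-level call
```

Neither call requires seeding by the application; the operating system handles that.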

I do not know of any kind way of stating this: you feel confident, but you really should not.

Use TLS (the new, standard name for SSL).
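A rough sketch of what "use TLS" can look like in practice, using Python's standard `ssl` module. The function names `make_client_context` and `send_over_tls` are hypothetical, and the TLS 1.2 floor is an assumption about what the peers support, not something stated in this thread.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Build a client-side TLS context with certificate checks enabled."""
    # create_default_context turns on certificate verification
    # and hostname checking by default.
    context = ssl.create_default_context()
    # Assumption: every peer supports TLS 1.2 or newer.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def send_over_tls(host: str, port: int, payload: bytes) -> None:
    """Open a TCP connection, wrap it in TLS, and send payload."""
    context = make_client_context()
    with socket.create_connection((host, port)) as raw_sock:
        # server_hostname is used for SNI and for certificate matching.
        with context.wrap_socket(raw_sock, server_hostname=host) as tls_sock:
            tls_sock.sendall(payload)
```

The library negotiates keys, encrypts, and authenticates each record, which covers the encryption and integrity requirements from the question without any custom cryptography.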

Thomas Pornin
  • It seems obvious that the OP meant MAC, but I think checksum just refers to a value calculated over a chunk of data that is used to check if the data has any errors. i.e. CRCs and even parity (one-bit). – this.josh Jul 11 '11 at 17:18

is writing my own encryption a good idea in this case?

No. Take advantage of the knowledge, experience, and work of the professionals who have spent decades designing encryption algorithms.

impossible to decrypt.

Any encryption can be decoded just by trying every possible key with an algorithm and checking to see if the output makes sense. So we typically refer to how much work an attacker would be expected to perform before decrypting a message.

An algorithm is considered secure against a certain class of attacker when it is estimated that the attacker could not be expected to decrypt the message within a relevant time period.

If you are protecting against a single person with a small amount of retail equipment, the expected work factor is small. If you are protecting against the computing resources of an industrialized country, the expected work factor is huge.
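To make the idea of a work factor concrete, here is a back-of-the-envelope sketch. The guessing rate of 10^12 keys per second is purely an assumed figure for illustration, and the function name is hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def expected_years_to_brute_force(key_bits: int, guesses_per_second: float) -> float:
    """Estimate the average time to find a key by exhaustive search."""
    # On average the right key turns up after searching half the key space.
    expected_guesses = 2 ** (key_bits - 1)
    return expected_guesses / guesses_per_second / SECONDS_PER_YEAR


for bits in (56, 128, 256):
    # Assumed attacker speed: 1e12 guesses per second.
    years = expected_years_to_brute_force(bits, guesses_per_second=1e12)
    print(f"{bits}-bit key: about {years:.3g} years")
```

Under that assumed rate, a 56-bit key falls in a matter of hours, while a 128-bit key takes vastly longer than the age of the universe. Each extra key bit doubles the attacker's expected work.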

Speed

Speed depends on the key size. Several modern algorithms support multiple key sizes. Based on Performance Evaluation of Symmetric Encryption Algorithms it looks like Blowfish is fast, but Blowfish has not been as thoroughly vetted as AES.

ensure data is not tampered with.

Checksum

Checksums, hashes, and Message Authentication Codes (MACs) do not necessarily relate to the encryption algorithm you choose. However, some algorithms have modes that perform both encryption and authentication. Galois/Counter Mode is such a mode, and it is used in a number of standards. AES-GCM is an example of an algorithm providing encryption and authentication.
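The difference between a plain checksum and a MAC can be sketched with Python's standard library alone. This example uses HMAC-SHA256 rather than AES-GCM, since AES-GCM would need a third-party package; the helper names and sample messages are made up for illustration.

```python
import hashlib
import hmac
import secrets
import zlib


def make_tag(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_tag(key: bytes, message: bytes, tag: bytes) -> bool:
    """Check a tag using a constant-time comparison."""
    return hmac.compare_digest(make_tag(key, message), tag)


key = secrets.token_bytes(32)          # shared secret (hypothetical)
message = b"transfer 10 to alice"
tag = make_tag(key, message)

tampered = b"transfer 99 to alice"
# Anyone can recompute a CRC for a modified message, so a CRC only
# detects accidental corruption, not deliberate tampering.
forged_crc = zlib.crc32(tampered)
# Without the key, an attacker cannot produce a tag that verifies.
print(verify_tag(key, message, tag), verify_tag(key, tampered, tag))
```

An authenticated mode such as AES-GCM combines this kind of tag with encryption in one operation.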

how much data it takes before the key runs out and is going to be re-used.

As @Thomas-Pornin said, this is not applicable to modern encryption algorithms. Common practice is to use random data to generate keys, and random data doesn't really run out. You might be referring to the issue of not repeating an initialization vector (IV, also known as a nonce), but I'm not sure.
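A toy illustration of why an IV or nonce must not repeat: if two messages end up encrypted with the same keystream, XORing the two ciphertexts cancels the keystream and leaks the XOR of the plaintexts. The XOR "cipher" below is a deliberately simplified stand-in, not a real algorithm, and the messages are invented.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings, truncating to the shorter one."""
    return bytes(x ^ y for x, y in zip(a, b))


keystream = secrets.token_bytes(16)    # stands in for key + repeated nonce
p1 = b"attack at dawn!!"
p2 = b"retreat at noon!"

# Mistake: the same keystream is used for both messages.
c1 = xor_bytes(p1, keystream)
c2 = xor_bytes(p2, keystream)

# The eavesdropper never sees the keystream, yet it cancels out.
leak = xor_bytes(c1, c2)
assert leak == xor_bytes(p1, p2)

# If one plaintext is known or guessed, the other follows immediately.
recovered_p2 = xor_bytes(leak, p1)
```

With a fresh nonce per message, the two keystreams differ and this cancellation does not happen.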

Based on the size chunks of data and connection pattern, an interceptor can guess things about the data without seeing the data; what protocol, how is it being used.

Yes, these are generally known as side-channel attacks (this particular kind is often called traffic analysis). There are measures you can take to mitigate some side-channel attacks, and they mostly involve protecting the transmission.

Source and destination IPs give an identity clue.

Yes, but attempting to conceal your presence is likely more effort than it is worth.

TCP and IP headers are liable to contain OS information.

Network packets may assist an attacker in fingerprinting your system, but I think it is more valuable to focus on good operational security, especially the policies and procedures that impact your security.

So what would I use if I was to avoid writing my own?

To answer that we should model the work factor of the expected class of attacker, the desired latency and throughput, and available computing resources. However, my guess is that some mode of AES is suitable for you.

this.josh
  • To clarify what I meant by "how much data it takes before the key runs out": I did not really know where I was going with it, but I understand that if I have 1KB of data and a 1KB key, then it is impossible to decrypt. Only by re-use of the key is it possible to decrypt. However, I missed the point. If I used up a 1KB key, there is no way to pass a new one except by re-using the last one. A way to resolve this might be to do some weird stuff with key rotation, but then again, it may not be worthwhile. Also, I don't yet understand what IV and Nonce are. – 700 Software Jul 19 '11 at 15:54
  • By impossible to decrypt I mean, by brute force you could find a possible decrypted copy, but with a little more brute force you would find another possible copy, repeatedly. The only clue that you would have found the right one is the length of the file. And knowing the content is, say, 1019 bytes long is not going to give you much of a clue as to which possibility is correct. – 700 Software Jul 19 '11 at 16:02
  • Your first comment describes a one-time pad. A one-time pad has no cryptographic weakness and is often described as unbreakable. For a one-time pad to work, the key must be at least as long as the data. Additionally, the key, or any significant piece of it, must never be reused. To use a one-time pad, both sender and receiver must have identical copies of a set of keys. If they run out of keys, they must arrange to meet and create a new set. The requirement to generate a large number of keys and then meet securely to synchronize them means one-time pads are infrequently used. – this.josh Jul 19 '11 at 16:26
  • Most encryption systems use small (when compared to the data) keys. In a one-time pad the algorithm is trivial; in more common encryption the algorithms are complex. The algorithm allows us to take a fixed-size key and use it to encrypt an arbitrarily long piece of data. So, in more common encryption, the length of the data doesn't prevent encryption. Also, "more brute force" is not meaningful. Brute force means the process of trying every possible key. There are no more keys than every possible key. – this.josh Jul 19 '11 at 16:34
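The one-time pad discussed in these comments can be sketched in a few lines. The example also shows the property raised earlier: for any candidate plaintext of the same length, some pad "explains" the ciphertext, so brute force alone cannot single out the real message. The function names and messages are invented for illustration.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings, truncating to the shorter one."""
    return bytes(x ^ y for x, y in zip(a, b))


def otp_apply(data: bytes, pad: bytes) -> bytes:
    """Encrypt or decrypt with a one-time pad (XOR is its own inverse)."""
    if len(pad) < len(data):
        raise ValueError("pad must be at least as long as the data")
    return xor_bytes(data, pad)


message = b"meet at nine"
pad = secrets.token_bytes(len(message))   # used once, never again
ciphertext = otp_apply(message, pad)
assert otp_apply(ciphertext, pad) == message

# Any same-length candidate is consistent with the ciphertext,
# because a pad exists that decrypts to it.
candidate = b"meet at noon"
fake_pad = xor_bytes(ciphertext, candidate)
assert otp_apply(ciphertext, fake_pad) == candidate
```

This guarantee holds only while the pad is truly random, as long as the data, and never reused, which is exactly the key-distribution burden described above.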