15

I am using zip 3.0.0 on macOS High Sierra and Ubuntu. Here is my zip version on macOS:

$ zip --version | head
Copyright (c) 1990-2008 Info-ZIP - Type 'zip "-L"' for software license.
This is Zip 3.0 (July 5th 2008), by Info-ZIP.
Currently maintained by E. Gordon.  Please send bug reports to
the authors using the web page at www.info-zip.org; see README for details.

Latest sources and executables are at ftp://ftp.info-zip.org/pub/infozip,
as of above date; see http://www.info-zip.org/ for other sites.

Compiled with gcc 4.2.1 Compatible Apple LLVM 10.0.1 (clang-1001.0.37.14) for Unix (Mac OS X) on Feb 22 2019.

Here's the one on Ubuntu:

$ zip --version | head
Copyright (c) 1990-2008 Info-ZIP - Type 'zip "-L"' for software license.
This is Zip 3.0 (July 5th 2008), by Info-ZIP.
Currently maintained by E. Gordon.  Please send bug reports to
the authors using the web page at www.info-zip.org; see README for details.

Latest sources and executables are at ftp://ftp.info-zip.org/pub/infozip,
as of above date; see http://www.info-zip.org/ for other sites.

Compiled with gcc 6.3.0 20170221 for Unix (Linux ELF).

I have read this answer at https://security.stackexchange.com/a/186132/108239 which recommends against using zip for encryption.

However, in the environment I am in, I need to send a file securely to non-technical users. Here are my constraints:

  • I am allowed to send my recipients an arbitrarily long password.
  • I am allowed to send them zip files (encrypted or unencrypted).
  • I am not allowed to ask my recipients to install additional software.
  • I only care about confidentiality of the content of the file.
  • I do not care about the confidentiality of the filename or file metadata.
  • I do not care about integrity or non-repudiation.

Given these constraints, so far I have been sending files this way:

zip -e secret.zip secret.txt

I use an 80-character, randomly generated alphanumeric (A-Za-z0-9) password to encrypt the secret file. The zip utility does not accept longer passwords; trying one results in the (line too long--try again) error.

This uses the following crypto method:

$ 7z l -slt secret.zip | grep Method
Method = ZipCrypto Deflate
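
As a rough alternative to the 7z check above, the same information can be read from the archive's per-entry headers. This is a minimal sketch, assuming Python's standard zipfile module is available; the function name encryption_methods is made up for illustration. It relies on general-purpose flag bit 0 marking an encrypted entry, bit 6 marking PKWARE strong encryption, and compression method 99 marking WinZip-style AES. It only reads the headers and does not decrypt anything.

```python
import io
import zipfile

AES_METHOD = 99          # compression method id used by WinZip AES (AE-x) entries
FLAG_ENCRYPTED = 0x01    # general-purpose bit 0: entry is encrypted
FLAG_STRONG = 0x40       # general-purpose bit 6: PKWARE strong encryption


def encryption_methods(zip_source):
    """Map each entry name to a best-guess description of its encryption.

    zip_source may be a path or a file-like object (hypothetical helper).
    """
    result = {}
    with zipfile.ZipFile(zip_source) as zf:
        for info in zf.infolist():
            if not info.flag_bits & FLAG_ENCRYPTED:
                result[info.filename] = "none"
            elif info.compress_type == AES_METHOD:
                result[info.filename] = "AES (WinZip AE-x)"
            elif info.flag_bits & FLAG_STRONG:
                result[info.filename] = "PKWARE strong encryption"
            else:
                # Encrypted, but neither AES marker is present.
                result[info.filename] = "ZipCrypto (traditional)"
    return result


# Usage with an unencrypted in-memory archive (zipfile cannot create encrypted ones).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("secret.txt", "not actually secret")
buffer.seek(0)
print(encryption_methods(buffer))
```

An archive produced by zip -e would be expected to show the "ZipCrypto (traditional)" label for its entries.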

My questions:

  • Is an 80-character randomly generated alphanumeric password strong enough to compensate for the weak cipher technology of the zip utility?
  • What is the minimum entropy that a password should have to make it secure enough to be used with the zip utility? To define "secure enough", say, cracking the zip file should take 10 or so years with the current computing power (ignore an increase in computing power for now for the sake of simplicity).
Lone Learner
  • You can compare various official recommendations for the length of your password (measured in entropy, not in number of characters) : keylength.com/en/compare (you are looking for the column "symmetric"). In addition, sogis.eu recommends at least 125 bits. – A. Hersean Aug 14 '19 at 09:44
  • I know the bounty says "drawing from credible and/or official sources", but I'm not sure that is needed for this answer. Most of it can be derived from logic (a little bit of math and knowledge of current password cracking speeds) and the generally known vulnerabilities present in zip's encryption (which you already know of). – Luc Aug 14 '19 at 10:23
  • @Luc The reason I mentioned "drawing from credible and/or official sources" is that I think to reliably answer this question, we need to know how weak the cipher technology of ZIP encryption is and how many bits of entropy in a password would be sufficient to compensate for such weakness. Your answer provides a very fine analysis assuming 128 bits of entropy is sufficient. However, I think we need some kind of credible source to claim that 128 bits of entropy is sufficient for ZIP encryption. What if the ZIP encryption is so weak that we need, say, 160 bits of entropy instead of 128? – Lone Learner Aug 15 '19 at 06:51
  • Are you using ZipCrypto or AES? – forest Aug 15 '19 at 07:01
  • @forest I am using the `zip -e secret.zip secret.txt` command mentioned in my question. What does that do by default? Is there a way I can figure out what cipher it uses to encrypt the content of the ZIP? – Lone Learner Aug 15 '19 at 07:45
  • @LoneLearner I have no idea. Perhaps 7zip will show what encryption it's using? If it is using ZipCrypto, then the encryption is essentially useless and it won't last for 15 minutes, much less 10 years. – forest Aug 15 '19 at 07:46
  • @forest I am indeed using `ZipCrypto Deflate`. Here is the output of `7z l -slt secret.zip | grep Method`: `Method = ZipCrypto Store`. Are you saying that no matter how long the password is, it can never be made to last 10 years? If that's true and if you can explain why it is so, I think that would be a good answer to this question. – Lone Learner Aug 15 '19 at 07:49

3 Answers

11

If you are using ZipCrypto for encryption, then any password length is insecure.

You are using the ZipCrypto cipher, rather than the more secure AES. ZipCrypto is extremely weak, as it is based internally on a non-cryptographic construction called a CRC. It is highly vulnerable to a known-plaintext attack, which in practice does not require knowing exactly what your Zip file contains, only a minuscule portion. This can typically be satisfied by using the header of a file, which is typically known for the vast majority of commonly compressed files.

The attack, discovered in the 90s, is detailed in "A Known Plaintext Attack on the PKZIP Stream Cipher". With 13 bytes of known plaintext, the full key can be recovered with a complexity of 2^38, which is not much. In fact, it took only a few hours on a personal computer, and that was with 90s hardware! Attacks only get better over time, and the attack against ZipCrypto is no exception to this trend: "An Improved Known Plaintext Attack on PKZIP Encryption Algorithm".
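
To make it concrete why a longer password cannot help, here is a simplified sketch of the traditional ZipCrypto key schedule as described in section 6.1 of the zip format specification. The class name ZipCryptoState is made up for illustration, and the sketch omits the 12-byte encryption header that real archives prepend. The point to notice is that any password, however long, is folded into just three 32-bit values (96 bits of internal state), and that state is what the known-plaintext attack recovers.

```python
# Standard CRC-32 lookup table (reflected polynomial 0xEDB88320).
CRC_TABLE = []
for i in range(256):
    c = i
    for _ in range(8):
        c = (c >> 1) ^ 0xEDB88320 if c & 1 else c >> 1
    CRC_TABLE.append(c)


def crc32_update(crc, byte):
    """Advance a raw CRC-32 register by one byte."""
    return CRC_TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)


class ZipCryptoState:
    """Illustrative model of the traditional PKZIP stream cipher state."""

    def __init__(self, password: bytes):
        # Fixed initial keys from the specification.
        self.k0, self.k1, self.k2 = 0x12345678, 0x23456789, 0x34567890
        for b in password:
            self._update(b)

    def _update(self, b):
        self.k0 = crc32_update(self.k0, b)
        self.k1 = (self.k1 + (self.k0 & 0xFF)) & 0xFFFFFFFF
        self.k1 = (self.k1 * 134775813 + 1) & 0xFFFFFFFF
        self.k2 = crc32_update(self.k2, self.k1 >> 24)

    def _keystream_byte(self):
        t = (self.k2 | 2) & 0xFFFF
        return ((t * (t ^ 1)) >> 8) & 0xFF

    def encrypt(self, data: bytes) -> bytes:
        out = bytearray()
        for p in data:
            out.append(p ^ self._keystream_byte())
            self._update(p)  # the state is always updated with the plaintext byte
        return bytes(out)

    def decrypt(self, data: bytes) -> bytes:
        out = bytearray()
        for c in data:
            p = c ^ self._keystream_byte()
            out.append(p)
            self._update(p)
        return bytes(out)

    def state(self):
        return (self.k0, self.k1, self.k2)


short = ZipCryptoState(b"abc")
long_ = ZipCryptoState(b"A" * 80)
# Both passwords collapse to three 32-bit words.
print(short.state(), long_.state())
```

Because encryption and decryption both start from that small state, an attacker who recovers the three words can decrypt the data without ever learning the password.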


If you switch to the newer Zip encryption format which supports AES, then you can securely encrypt files and trust that they will stay confidential indefinitely. Assuming you want a minimum of 128 bits of security and are choosing a password from a set of 62 symbols (an alphanumeric password), you would want to use at least 22 characters, as log2(62^22) ≈ 131 bits and, obviously, 131 ≥ 128. Zip encryption of any kind does not provide integrity, but you specified that you only need confidentiality.
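
The length calculation can be checked with a few lines of Python. This is only a sketch of the arithmetic; the helper names min_length and entropy_bits are made up for illustration, and it assumes every character is chosen uniformly and independently.

```python
import math


def entropy_bits(alphabet_size, length):
    """Entropy in bits of a uniformly random password."""
    return length * math.log2(alphabet_size)


def min_length(alphabet_size, target_bits=128):
    """Smallest length whose entropy reaches target_bits."""
    return math.ceil(target_bits / math.log2(alphabet_size))


print(min_length(62))                  # characters needed for 128 bits
print(round(entropy_bits(62, 22), 1))  # bits provided by 22 characters
print(round(entropy_bits(62, 80)))     # bits provided by the 80-character password
```

The 80-character password from the question carries far more entropy than AES-128 can make use of, so the extra length buys nothing there.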

forest
  • Is this true for arbitrarily long password too? Is it impossible to make a very long password that would increase the complexity of recovery to 2^80 or 2^128? – Lone Learner Aug 15 '19 at 08:05
  • 3
    @LoneLearner Correct. You cannot use a longer password and achieve that level of security. This attack breaks the raw internal key, not the password you give it which is used to derive the key. – forest Aug 15 '19 at 08:05
  • What if there is no known plaintext whatsoever in the file? What if the file contains random bytes in it? Is ZipCrypto safe then? – Lone Learner Aug 15 '19 at 08:08
  • 1
    @LoneLearner Completely random, as in you're just sending a raw, random file? I guess it would be harder to break, but likely not impossible. ZipCrypto uses CRC32 for encryption, which was never designed for security and is very easy to break. If you need more than a few seconds of security, don't use ZipCrypto. – forest Aug 15 '19 at 08:09
  • Yes, a raw random file. Say, something like `head -c100 /dev/urandom > secret.txt`. Do you know how it would still be easy to break the encryption? The attack you have mentioned is a known-plaintext attack and in this case there would be no known plaintext. – Lone Learner Aug 15 '19 at 08:11
  • 1
    @LoneLearner In the case of true random data, then there's no way to break _any_ cipher, as it would be impossible to tell if a decryption attempt with a given key succeeded or not. That's not what ciphers are for. Don't try to hold on to ZipCrypto. It's completely dead and has been for more than two decades. – forest Aug 15 '19 at 08:12
  • "indefinitely"? That's an interesting idea. Do you think we will never get a working quantum computer? – Luc Aug 15 '19 at 08:51
  • 1
    @Luc My personal opinion on the feasibility of high-speed cryptanalytic quantum computers aside, it turns out that Grover's algorithm is really, really bad at parallelization, so you'd need to search a 64-bit keyspace with only one quantum computer. Even today for the fastest classical computers and then some, there's no way to brute force a 64-bit key with a single serial processor. That means that, even if a quantum operation was a billion times faster than the fastest classical operations (which seems unlikely even in optimistic scenarios), AES128 would be quite secure from it. – forest Aug 15 '19 at 08:53
  • @Luc Quantum computers mostly break _asymmetric_ (public-key) crypto, because those tend to depend on factorization being hard, and with Shor's algorithm factorization becomes easy. AES with decent keysizes (128+) remains unbroken, even under quantum computing, since we have no equivalent for that. That's not to say one is impossible, but quantum computing _as it currently stands_ poses little to no threat to modern symmetric algorithms. – Nic Aug 16 '19 at 01:23
  • @NicHartley Well Grover's algorithm reduces the keyspace from 2^128 to 2^64 which would render it insecure assuming a quantum operation is as cheap as a classical operation, but it's iffy whether or not that is the case due to the parallelization issues I mentioned above (and a few other reasons). – forest Aug 16 '19 at 02:43
  • @forest Right, that's what I mean by "as it currently stands" -- if you encrypt data with a symmetric algorithm considered secure today, the introduction of quantum computing won't significantly impact the security of the data, _as we currently understand_ "quantum computing" and its impacts on crypto. The only quantum attacks on e.g. AES don't actually make it take any less _time_, only fewer _iterations_, under our current understanding of how that quantum computing would work. That could change, obviously, which is why I hedged it like that. – Nic Aug 16 '19 at 02:59
1

This is two questions in one. One is easy, one is hard. Let's start with the easy one:

How long should my encryption password be for it to remain safe for 10 years?

It is well known that a random 128-bit secret has plenty of entropy to resist cracking, even if it is protected by the fastest possible (but not vulnerable) hashing algorithm. The reason I bring hashing up is that there is a lot of research into, good software for, and a lot of benchmarks for password hash cracking. Cracking a zip file basically means running the decrypt operation and checking whether a byte whose value you already know decrypts correctly. (See section 6.1 of the zip format specification, or the Python implementation.) Hashing is similar in that hashing algorithms are designed to be very fast, and you also apply some operation and check the result. The only speed indication regarding the cracking of zip files on a GPU is a marketing claim of 10 billion attempts per second. This is mentioned in some small, grey text with no reference or any more information about their setup. It's a data point to consider, but not a very reliable one.

For example, MD5 crackers can reach hundreds of billions of attempts per second on moderately expensive hardware. Let's use this as our basis rather than the 10 billion marketing claim, since we don't know what year that was or what setup they used.

Even if we assume that Moore's law holds (it doesn't) and can be interpreted as "computer speed doubles every 18 months" (it can't), then in 10 years, we would have had 6.7 doublings. For convenience's sake, let's say we run at that speed the whole time, so we run at a few hundred billion times 2^6.7, so tens of trillions of attempts per second (let's say 50 trillion). In 10 years, there are 3600*24*365.25*10 = 316 million seconds. That's 50e12 * 316e6 = 16 sextillion (16e21) guesses... so we will have checked about 74 bits of entropy, or 0.000000000000005% of the possible 128-bit keys (because log(16e21)/log(2) is ~73.7 and 16e21/2^128*100 is the aforementioned percentage).
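
The arithmetic in this estimate can be reproduced with a short script. It is a sketch that simply plugs in the figures used above (18-month doublings, a rounded 50 trillion guesses per second, 10 years); the variable names are made up for illustration, and none of the inputs are measurements.

```python
import math

YEARS = 10
DOUBLING_PERIOD_YEARS = 1.5   # the "speed doubles every 18 months" assumption
ASSUMED_RATE = 50e12          # guesses per second, the rounded figure from the text

doublings = YEARS / DOUBLING_PERIOD_YEARS
speedup = 2 ** doublings

seconds = 3600 * 24 * 365.25 * YEARS
total_guesses = ASSUMED_RATE * seconds

bits_covered = math.log2(total_guesses)
percent_of_128_bit = total_guesses / 2**128 * 100

print(f"doublings in {YEARS} years: {doublings:.1f} (speedup x{speedup:.0f})")
print(f"seconds in {YEARS} years:   {seconds:.3g}")
print(f"total guesses:          {total_guesses:.3g}")
print(f"keyspace covered:       about 2^{bits_covered:.1f}")
print(f"share of 2^128 keys:    {percent_of_128_bit:.1e} %")
```

Changing ASSUMED_RATE by a factor of a thousand shifts the result by only about 10 bits, which is why the exact cracking speed matters little once the password has 128 bits of entropy.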

If you want to play it risky, you could go for a password containing about 80 bits of entropy, and an attacker would have a small (but non-negligible) chance of cracking it. More realistically, you would want to have some margin of error, also in case the encryption algorithm is weakened. I would advise sticking with 128 bits of entropy. This also matches what a research project by the European Union concludes: "For general long-term protection (30 years), 128 bit keys are recommended (level 7)." (Source: me on Wikipedia)

If we use a-z, A-Z, 0-9, then each position of your password can have 26+26+10 = 62 possible values. To compute how many characters we need for 128 bits, we can use: log(2^128)/log(62) = 128*log(2)/log(62) = 21.5 characters (because log(62^21.5)/log(2) = 128).

Answer: 22 randomly generated characters, consisting of the characters A through Z, a through z, and 0 through 9. Example: M89bqltyuPQ0g34Uv2CR6b. No need for an 80-character password like you suggested!
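
One way to produce such a password is sketched below, assuming Python 3.6 or later is available on the sender's machine. It uses the standard secrets module, which draws from the operating system's cryptographic random source; the function name generate_password is made up for illustration.

```python
import secrets
import string

# A-Z, a-z and 0-9: 62 symbols, matching the alphabet discussed above.
ALPHABET = string.ascii_letters + string.digits


def generate_password(length=22):
    """Return a password whose characters are drawn uniformly from ALPHABET."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


print(generate_password())
```

The general-purpose random module is deliberately avoided here because it is not designed for security-sensitive use.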

Now the difficult question:

Will zip encryption be safe for 10 years?

There are different kinds of encryption supported by the zip format. I wrote this section under the assumption that zip software uses AES by default these days, not the ancient and broken CRC-based crypto, which forest points out it actually does use by default.

So if you configure the software to use AES:

That AES will stay secure is an assumption, but it has withstood quite some time and scrutiny by now. Although there are some attacks known, it appears unlikely that it will be weakened very significantly in the next decade. A quantum computer changes things somewhat, but nobody can say what the timeline for that will be. A decent quantum computer in the next decade, especially available to those who do not have billions of euros to invest in one, sounds very optimistic.

If you are worried about quantum computers, use software that only uses symmetric encryption algorithms and hashing algorithms of at least 256 bits (so your key should contain that much entropy, and the software must use it), such as AES-256 and SHA-256. Depending on which implementation you use, zip might qualify: according to the specification, format version 5.1 adds support for AES-256. The original encryption is symmetric encryption and will not break as catastrophically as asymmetric encryption would, but it uses too short keys. While a quantum computer will weaken symmetric encryption and hashing algorithms, Grover's algorithm tells us that it is basically in the order of halving the number of bits, so a password containing 256 bits of entropy today will, for a quantum computer, be as strong as a password containing 128 bits of entropy.

(Note that you should also apply good password management: don't reuse the password elsewhere or other things that people commonly do wrong with passwords. But how to manage your passwords is out of scope for this answer.)

Answer: If you use zip with AES, then probably yes, but it is impossible to say for sure.

Luc
  • Quantum computers are atm a scam, nothing more than marketing. And even if properly implemented they will have extremely limited use. – Overmind Aug 14 '19 at 12:55
  • @Overmind At the moment, definitely yes. Doesn't mean they're not worth considering though, there are some prototypes that have some extremely limited capacity and stability (much less than an ordinary computer), but they're getting there... maybe not in 10 years, but almost certainly before the end of the century (assuming we manage to not gas our planet to death). – Luc Aug 14 '19 at 14:02
  • Thank you for the detailed answer. I have a few questions about your answer: Why do you start with the rate of attempts in cracking an MD5 hash? How does it relate to the rate of attempts in cracking an encrypted-ZIP password? I believe an assumption your answer is making is that the rate of attempts for cracking an encrypted-ZIP password is either about the same or less than the rate of attempts for cracking an MD5 hash. Is this assumption valid? What if we can make a significantly higher number of password attempts on a ZIP? – Lone Learner Aug 15 '19 at 06:46
  • I guess what I am trying to enquire is that while 128 bits of entropy is sufficient for MD5 hash (ignoring other weaknesses of MD5) and AES, is it also sufficient for the password of an encrypted-ZIP file? Are there any known weaknesses in an encrypted-ZIP file that would deem 128 bits of entropy insufficient? – Lone Learner Aug 15 '19 at 06:47
  • 2
    @Luc If OP is using ZipCrypto as he says in the comments, then this answer is wrong as attacks exist which are more efficient than brute force. It uses CRC32 for encryption, after all! – forest Aug 15 '19 at 07:57
  • 1
    _Shor's algorithm tells us that it is basically in the order of halving the number of bits_ – Do you mean Grover's algorithm? – forest Aug 15 '19 at 08:14
  • @forest Correct on both counts! Updated the answer, thanks. – Luc Aug 15 '19 at 08:47
-1

Is an 80-character randomly generated alphanumeric password strong enough to compensate for the weak cipher technology of the zip utility?

Firstly, you should be encrypting with a minimum of AES-128; given the 10 years, you may want to consider 256 (I’m going to leave out quantum possibilities for a decade).

I’m not familiar with your zip package, sorry, but if it can’t support that because it’s too old, you might want to check for a newer package, as older apps are likely to contain vulnerabilities that leave you open to being exploited at some point.

Choosing a long, pseudorandom key is indeed important. Go as long and random as you can. Safe storage of your key over the 10 years is just as important.

What is the minimum entropy that a password should have to make it secure enough to be used with the zip utility? To define "secure enough", say, the zip file should remain uncracked for 10 or so years.

Difficult to say, as 10 years could uncover all sorts of extra computing power and vulnerabilities within the encryption algorithms. Go long again, and use well-trusted pseudorandom generators.

ISMSDEV
  • 2
    Why do you think AES-128 is not enough for 10 years and recommend AES-256 instead? As per https://crypto.stackexchange.com/a/48669 AES-128 takes more than the age of the universe to crack. – Lone Learner Aug 09 '19 at 06:20
  • 2
    I’m not saying it isn’t good enough. Just given the choice I would go 256. Depends what type of threat actor you are wanting now, and in the future, to protect against. – ISMSDEV Aug 09 '19 at 07:21
  • @LoneLearner While indeed AES-128 may be sufficient (and to be realistic, it probably will be), the downside of AES-256 is that it will take slightly longer to encrypt and decrypt. If your zip file is not abysmally huge, or has to be encrypted/decrypted thousands of times per day, then AES-256 offers "slightly better defense against an already tiny attack surface" at the cost of taking slightly longer. –  Aug 14 '19 at 09:40