7

I often use GnuPG to encrypt files with a passphrase. I'm very paranoid, and afraid that the key derivation used by GnuPG isn't slow enough.

For this reason, I've decided to do my own key derivation before feeding the passphrase into GnuPG.

I have a Python script that runs 10 million rounds of PBKDF2-SHA512 with a static salt and outputs the result hex encoded. I call this 'pre derivation'.

I first run that on the passphrase, and feed its output to GnuPG as an intermediate passphrase.
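
A minimal sketch of what such a pre-derivation script might look like, using only the Python standard library. The salt value, the function name `pre_derive`, and the UTF-8 encoding of the passphrase are assumptions for illustration; the question does not show the real script.

```python
import hashlib

# Hypothetical values for illustration; the question does not give the real ones.
STATIC_SALT = b"example-static-salt"
ROUNDS = 10_000_000  # the round count mentioned in the question


def pre_derive(passphrase: str, salt: bytes = STATIC_SALT, rounds: int = ROUNDS) -> str:
    """Return the hex-encoded PBKDF2-HMAC-SHA512 output for the passphrase."""
    derived = hashlib.pbkdf2_hmac("sha512", passphrase.encode("utf-8"), salt, rounds)
    return derived.hex()
```

The returned hex string (128 characters for a 64-byte SHA-512 output) is what would then be given to GnuPG as the intermediate passphrase.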

Am I on the right track, or am I wasting my time doing this, or even worse, weakening the encryption security?

Jens Erat
  • 14
    *"I don't trust a well-known security tool, so I write my own"* - Usually this is a *really* bad idea. – SEJPM Mar 13 '16 at 22:09

2 Answers

12

Apart from the fact that you shouldn't deploy custom crypto code anyway, you're reinventing the wheel. OpenPGP's string-to-key functionality is configurable and can be adjusted to your needs without losing compatibility. I'm not discussing your choice of the number of cycles here, although it seems a little harsh. I'd recommend reading At what point does adding more iterations to PBKDF2 provide no extra security? on this topic.

From man gpg:

--s2k-cipher-algo name

Use name as the cipher algorithm for symmetric encryption with a passphrase if --personal-cipher-preferences and --cipher-algo are not given. The default is AES-128.

--s2k-digest-algo name

Use name as the digest algorithm used to mangle the passphrases for symmetric encryption. The default is SHA-1.

--s2k-mode n

Selects how passphrases for symmetric encryption are mangled. If n is 0 a plain passphrase (which is in general not recommended) will be used, a 1 adds a salt (which should not be used) to the passphrase and a 3 (the default) iterates the whole process a number of times (see --s2k-count).

--s2k-count n

Specify how many times the passphrases mangling for symmetric encryption is repeated. This value may range between 1024 and 65011712 inclusive. The default is inquired from gpg-agent. Note that not all values in the 1024-65011712 range are legal and if an illegal value is selected, GnuPG will round up to the nearest legal value. This option is only meaningful if --s2k-mode is set to the default of 3.
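
The "legal values" come from the way OpenPGP stores the count in a single octet (RFC 4880, section 3.7.1.3). The sketch below decodes all 256 possible octets and finds the value a requested count would be rounded up to, assuming GnuPG rounds as the man page excerpt describes. The function names are hypothetical, not part of GnuPG.

```python
def s2k_decode(c: int) -> int:
    """Decode a one-octet S2K count into the number of octets hashed."""
    return (16 + (c & 15)) << ((c >> 4) + 6)


# All counts that can be encoded in one octet, in ascending order.
LEGAL_COUNTS = sorted(s2k_decode(c) for c in range(256))


def round_up_s2k_count(requested: int) -> int:
    """Return the smallest encodable count >= requested (clamped to the maximum)."""
    for count in LEGAL_COUNTS:
        if count >= requested:
            return count
    return LEGAL_COUNTS[-1]


print(LEGAL_COUNTS[0], LEGAL_COUNTS[-1])  # 1024 65011712
print(round_up_s2k_count(10_000_000))     # 10485760
```

Note that in the OpenPGP specification this count is the number of octets fed to the hash, not a number of hash iterations, so a count of 10 million means roughly 10 MB of hash input. That is far less work than 10 million rounds of PBKDF2.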

To wrap up, the following command has a comparable effect:

gpg --s2k-mode 3 --s2k-digest-algo SHA512 --s2k-count 10000000 --symmetric

--s2k-mode 3 is GnuPG's default (and only reasonable setting for this option); I did not include --s2k-cipher-algo as this is not relevant for key derivation (and not handled by the "pre-derivation" you described, anyway). Alternatively, you can set this as default in your gpg.conf:

s2k-mode 3
s2k-digest-algo SHA512
s2k-count 10000000

These options apply not only to symmetric encryption of messages and files, but also to passphrase protection of private keys.

Jens Erat
  • I tried modifying the s2k number, but even with 10 million it still completes almost instantly. pbkdf takes multiple seconds on my machine. – Paranoid GPG Mar 13 '16 at 22:43
  • 4
    @ParanoidGPG, this may be because Python isn't exactly the optimal choice for performance-critical code (such as this) – SEJPM Mar 14 '16 at 00:01
  • 8
    To add to what @SEJPM wrote, an attacker who's sufficiently funded to brute force your passphrase will not be stupid enough to run your slow Python implementation of the KDF but will instead run an optimized version. A KDF that's *slow for you* is not helpful; it's only helpful to be slow when it's *impossible to make fast* in some sense. – R.. GitHub STOP HELPING ICE Mar 14 '16 at 01:33
2

While protecting your passphrase is the most critical part of maintaining security with GPG, the odds are that you are weakening existing security rather than strengthening it.

It is critical to minimize the time your passphrase spends in your computer's memory. I'm certain that GPG appropriately zeroes out memory to minimize exposure, but memory-managed languages like Python don't really facilitate secure string handling. This means that your initial passphrase, as well as the derived one that you pass to GPG, will likely linger in memory and may be swapped out to your computer's virtual memory (read: written to disk). How likely it is to be extracted is up for debate, but the exposure is certainly much greater than what GPG alone would leave lying around.

You also don't mention how you feed the SHA512 hash into GPG. If you are printing it out and maybe even copying-and-pasting it into GPG, that would be terrible. (Copy-buffers should never be used for secret data.)

So I recommend that you use the existing GPG options (see @JensErat's answer for an excellent overview) and remember that you shouldn't roll your own crypto.

Neil Smithline