I am trying to encrypt one of my primary Linux partitions (5GB) by following this example.

I want to confirm some of my understandings:

  1. Is the salt automatically generated by cryptsetup with LUKS?
  2. Do --iter-time and --hash only affect the time to "open" the encrypted partition, so that they don't affect read latency once the encrypted partition is open?
  3. What other parameters in cryptsetup...luksFormat can affect the decryption speed besides --cipher? Basically, I want to increase the security without adding a penalty to read I/O.

Another question: what is the "volume key"? The manpage states that --use-random and --use-urandom control how the "volume key" is generated. How is it used?

update:

# cryptsetup benchmark
# Tests are approximate using memory only (no storage IO).
PBKDF2-sha1       292898 iterations per second for 256-bit key
PBKDF2-sha256     342224 iterations per second for 256-bit key
PBKDF2-sha512     239619 iterations per second for 256-bit key
PBKDF2-ripemd160  213472 iterations per second for 256-bit key
PBKDF2-whirlpool  260580 iterations per second for 256-bit key
argon2i       6 iterations, 1048576 memory, 4 parallel threads (CPUs) for 256-bit key (requested 2000 ms time)
argon2id      6 iterations, 1048576 memory, 4 parallel threads (CPUs) for 256-bit key (requested 2000 ms time)
#     Algorithm |       Key |      Encryption |      Decryption
        aes-cbc        128b       887.6 MiB/s      2616.0 MiB/s
    serpent-cbc        128b        92.4 MiB/s       560.2 MiB/s
    twofish-cbc        128b       173.1 MiB/s       164.0 MiB/s
        aes-cbc        256b       744.4 MiB/s      2671.6 MiB/s
    serpent-cbc        256b        93.0 MiB/s       560.2 MiB/s
    twofish-cbc        256b       173.1 MiB/s       164.0 MiB/s
        aes-xts        256b      1568.8 MiB/s      1568.1 MiB/s
    serpent-xts        256b       561.1 MiB/s       550.7 MiB/s
    twofish-xts        256b       334.1 MiB/s       163.0 MiB/s
        aes-xts        512b      1379.8 MiB/s      1377.8 MiB/s
    serpent-xts        512b       560.9 MiB/s       550.5 MiB/s
    twofish-xts        512b       333.9 MiB/s       163.0 MiB/s
HCSF
  • please indicate the version of LUKS you are using (ie. v1 / v2) as well as the version for `cryptsetup` - AFAIK grub only supports LUKS1 at this point in time – brynk Feb 20 '21 at 23:23
  • ps. the encrypt/ decrypt speed has nothing to do with the 'unlock the volume' speed - while both are dependent on your hardware, the 'unlock' speed is 'tunable' and should be as long as your patience can bear – brynk Feb 20 '21 at 23:24
  • @brynk I just added the version of `cryptsetup` to the title. According to [this](https://www.phoronix.com/scan.php?page=news_item&px=Cryptsetup-2.0-Released), `cryptsetup` supports LUKS2. And by default I think `cryptsetup` uses LUKS2? Yes, grub only supports LUKS1. But the partition I am going to encrypt doesn't contain OS stuffs. Purely data. I don't mind unlocking manually after reboot. So grub's support for LUKS2 shouldn't matter in my case? Thanks! – HCSF Feb 21 '21 at 03:05

1 Answer

First of all, your runtime encrypt/decrypt speed depends on how fast your CPU/core can handle AES, and on whether AES is implemented in software or in hardware - the method by which the volume key is unlocked has almost no bearing on runtime operations.

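I'll propose LUKS2 with the cipher aes-xts-plain64 (or aes-xts-plain, depending on your kernel). With a 512-bit key (--key-size=512) this is AES-256 plus a second 256-bit key for the 'tweak' that XTS mode uses to derive the IV from the on-disc sector number; with a 256-bit key, XTS splits it into two 128-bit halves, which gives you AES-128. (TODO REF.)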

To unlock the volume key I propose Argon2id, i.e. cryptsetup --pbkdf=argon2id, tuned as hard as your patience can tolerate.

Your 5GB volume doesn't introduce any problems: it's large enough for the standard key-slot area, and small enough that you don't strictly need aes-xts-plain64 (the "plain" IV only becomes a problem on volumes larger than 2TiB, see the FAQ excerpt below). You can still use plain64 if your kernel supports it. From the manpage: XTS mode requires kernel 2.6.24 or later and plain64 requires kernel 2.6.33 or later.

You should leave the source of random bits as the default /dev/urandom unless you have a good reason to change it. I don't believe cryptsetup allows you to provide other files as a source of random bits. You can, however, provide the master key file - this is the "volume key" that you ask about. Left alone, the new volume master key is created from random sources. If you do specify the volume key via cryptsetup --master-key-file then be sure it is high-entropy. A circumstance where this might be useful is when you wish to be able to (re)produce the volume key by some other technique.
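If you do go the --master-key-file route, a minimal sketch of producing a suitable key file might look like the following. It assumes a 512-bit key (so the file must be exactly 64 bytes), and the temporary path returned by mktemp is only a placeholder for wherever you actually keep the file. The luksFormat line is left commented out because it needs root and a real device.

```shell
# Make new files readable by the owner only.
umask 077

# Placeholder location (assumption): keep the real file off the encrypted volume.
keyfile=$(mktemp)

# 64 random bytes = 512 bits, matching --key-size=512 below.
head -c 64 /dev/urandom > "$keyfile"

# Print the size in bytes as a sanity check.
wc -c < "$keyfile"

# Not run here (needs root and a real device):
# cryptsetup --type=luks2 --cipher=aes-xts-plain64 --key-size=512 \
#   --master-key-file="$keyfile" luksFormat /dev/new_luks_partition1
```

Anyone who obtains this file can decrypt the volume without a passphrase, so treat it at least as carefully as the header backup.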

I propose the use of at least SHA2-256 as the digest hash. I use SHA2-384 as this is a truncated SHA2-512 hash, which is slightly more efficient on 64-bit architecture, among other reasons.

cryptsetup --type=luks2 --pbkdf=argon2id --pbkdf-memory=1048576 --pbkdf-parallel=4 --pbkdf-force-iterations=8 --hash=sha384 luksFormat /dev/new_luks_partition1

Memory is in KiB, so this is 1 GiB, and the number of iterations is forced to 8 rather than allowing the software to auto-tune. (If you want auto-tuning, don't set --pbkdf-force-iterations.)
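As a rough illustration of the units, and of one way to see how long a given tuning actually takes, something like the sketch below could be used. The helper name gib_to_kib is made up for this example; the timing command is commented out because it needs root and the real device, and it relies on --test-passphrase, which checks the passphrase without mapping the volume.

```shell
# gib_to_kib N: convert N GiB to the KiB value that --pbkdf-memory expects.
# (Hypothetical helper, not part of cryptsetup.)
gib_to_kib() {
  echo $(( $1 * 1024 * 1024 ))
}

echo "--pbkdf-memory=$(gib_to_kib 1)"   # prints --pbkdf-memory=1048576

# Not run here (needs root and the real device): time a single unlock
# attempt to see whether the chosen memory/iterations are tolerable.
# time cryptsetup open --test-passphrase /dev/new_luks_partition1
```

If the measured time is too short or too long, adjust --pbkdf-force-iterations (or drop it and use --iter-time, given in milliseconds) and re-add the key slot.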

Also, be sure to add at least a second key-slot for recovery purposes (you can use the same password if you wish, as a new salt is created for each key slot, but keep the hardness tuning the same). LUKS1 provides eight key slots; LUKS2 allows up to 32. Each key slot basically holds an encrypted copy of the volume key. This is in addition to taking a header backup, described later.

cryptsetup --batch-mode --type=luks2 --pbkdf=argon2id --pbkdf-memory=1048576 --pbkdf-parallel=4 --pbkdf-force-iterations=8 --hash=sha384 --priority=ignore luksAddKey /dev/new_luks_partition1

Now that the volume is created and one or more additional key-slots are populated, take a backup of the volume header. Note that this backup file is still encrypted.

cryptsetup luksHeaderBackup --header-backup-file ~/luks-202102 /dev/new_luks_partition1

diff -sq <(cryptsetup luksDump ~/luks-202102) <(cryptsetup luksDump /dev/new_luks_partition1)

To produce a plain-text copy of the volume key, use the following (you'll need to re-type your password):

cryptsetup luksDump --dump-master-key ~/luks-202102

Good discussion on the LUKS header (2019): What does LUKS header contain?

More on aes-xts-plain vs aes-xts-plain64, viz. cryptsetup FAQ s5.15

First, "plain" and "plain64" are both not secure to use with CBC, see previous FAQ item. However there are modes, like XTS, that are secure with "plain" IV. The next limit is that "plain" is 64 bit, with the upper 32 bit set to zero. This means that on volumes larger than 2TiB, the IV repeats, creating a vulnerability that potentially leaks some data. To avoid this, use "plain64", which uses the full sector number up to 64 bit. Note that "plain64" requires a kernel 2.6.33 or more recent. Also note that "plain64" is backwards compatible for volume sizes of maximum size 2TiB, but not for those > 2TiB. Finally, "plain64" does not cause any performance penalty compared to "plain".

More on GRUB: There is support for LUKS2 in GRUB, however, I found I still had to use LUKS1 in the past when I was following this: "Ubuntu 20.04 with btrfs-luks full disk encryption including /boot and auto-APT snapshots with Timeshift" Mutschler 2020.

Something to note is that when using LUKS1 with GRUB, PBKDF2 runs much more slowly at boot. For example, I find that I can stomach cryptsetup --type luks1 --pbkdf pbkdf2 --pbkdf-force-iterations 2000000 without GRUB, but only about --pbkdf-force-iterations 400000 if using GRUB.
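To get a feel for what a forced PBKDF2 iteration count means in wall-clock terms, you can divide it by the iterations-per-second figure that cryptsetup benchmark reports. The sketch below assumes the PBKDF2-sha256 rate from the benchmark in the question (342224 per second); the helper name pbkdf2_ms is made up, and the result is only an estimate for the machine that produced the benchmark, not for GRUB's slower implementation.

```shell
# pbkdf2_ms ITERATIONS RATE: estimated milliseconds to run ITERATIONS
# at RATE iterations per second (integer arithmetic, rounded down).
pbkdf2_ms() {
  echo $(( $1 * 1000 / $2 ))
}

# Assumed rate: PBKDF2-sha256 line of the question's benchmark output.
rate=342224

pbkdf2_ms 2000000 "$rate"   # prints 5844 (about 5.8 s)
pbkdf2_ms 400000 "$rate"    # prints 1168 (about 1.2 s)
```

Working backwards is the same arithmetic: multiply the rate by the number of seconds you are willing to wait to get a candidate iteration count.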

brynk
  • Thanks for your detailed explanation. I am on 3.10. I am probably going to install elrepo's kernel-ml 5.11 because there is a patch to increase the performance in dm-crypt by dropping the workqueue for encryption and decryption. – HCSF Feb 21 '21 at 08:27
  • Few questions: 1. master key = volume key? 2. you said `a new salt is created for each key slot`, so salt is auto-generated? But according to [section 5.8 in this article](https://gitlab.com/cryptsetup/cryptsetup/-/wikis/FrequentlyAskedQuestions), it said "a reasonably-sized salt value (256 bit, e.g.) this is quite infeasible", which implies we can specify the size of the salt? but how? – HCSF Feb 21 '21 at 08:27
  • 3. On my machine, aes-cbc-plain64 seems to be significantly faster than aes-xts-plain64 for decrypting (which is my main concern). Are there any concerns you would raise with cbc besides [malleability](https://crypto.stackexchange.com/questions/5587/what-is-the-advantage-of-xts-over-cbc-mode-with-diffuser)? – HCSF Feb 21 '21 at 08:27
  • 4. I noticed that you specified `--pbkdf-parallel=4 --pbkdf-force-iterations=8` instead of `--iter-time`. Then, how do I quantify a good value for `--pbkdf-force-iterations`? And let's say to simplify, we can just look at how long for each iteration takes, and I am willing to wait for a total of 3 mins to unlock the volume. How can I test this? – HCSF Feb 21 '21 at 08:32
  • I'll edit my answer for clarity and add'l detail once I have a chance, but in the first instance - 1) yes, master == volume key; 2) yes a salt is created for each key slot, as well as the volume master key; 3) that is no surprise as my understanding is that xts is two encryptions - *I need to add my reference*; – brynk Feb 21 '21 at 09:40
  • No worries and thanks for the lead on your kernel. 4) three mins is an awfully long time to wait to find out you mistyped! What I specified is ~2secs on *ol'grindy*, my gen6 i5. I aim for about 3 seconds on less critical applications, and ~9 secs with an additional keyfile for my higher-security concerns. My passwords are (almost) exclusively higher-entropy ones that come out of a password database (*KeepassXC* using `.kdbx4`, argon2id and a key file). If you're on a desktop/ laptop machine, you can afford to bump the memory usage up to four or eight GiB. Iterations can come up to 10 or 15. – brynk Feb 21 '21 at 09:50
  • There's another performance optimisation that may have made it into the kernel by now, whereby the thread or core that performs the de-/ encryption handles the read/ write, with less ipc/ queueing ... ["Speeding up Linux disk encryption"](https://blog.cloudflare.com/speeding-up-linux-disk-encryption/) *Korchagin 2020* – brynk Feb 21 '21 at 09:56
  • My machine is a server-grade machine with 128GB RAM (I will probably add 256GB later) in a datacenter. And during certain holidays, the datacenter will block all external access for up to 10 days. Hence, I would rather increase the security by increasing the protection on the key(s). So you think bumping up the memory to something like 64GB is better in terms of security? – HCSF Feb 21 '21 at 09:59
  • yes, the link in your comment just now is the one I read. that's why I tried to use elrepo's kernel-ml, which is 5.11 now that includes the patch mentioned. – HCSF Feb 21 '21 at 10:00
  • since the iteration you suggested isn't measured in time, is there a way to find out before I actually pick the # of iterations? Just trying to pick one that is close to 3 mins. – HCSF Feb 21 '21 at 10:02
  • More RAM requirements for a2id will slow things considerably, maybe to the point of frustration. You want to be able to keep your iterations above three, and there are some thoughts out there that say one thread, 10+ iterations. One more link for XTS mode: ["Who Needs a Tweak? Meet Full Disk Encryption"](https://medium.com/asecuritysite-when-bob-met-alice/who-needs-a-tweak-meet-full-disk-encryption-437e720879ac) *Buchanan 2019* – brynk Feb 21 '21 at 10:06
  • One more thought - you're working remotely, so you can increase the entropy in your password considerably: 15-chars lowercase+numeric, with algebra thrown in randomly, is a computational cliff of hardness that can still be typed accurately by mere mortals .. some may even be able to train themselves to remember it – brynk Feb 21 '21 at 10:16
  • I am not a crypto expert. Based on your link to Buchanan's article, it seems like he is saying CBC is more secure than ECB. But CBC relies on the previous block to encrypt the next block, hence it creates a performance issue. XTS solves this performance issue, so XTS should be faster. But for some reason, my `cryptsetup benchmark` output says otherwise. And Buchanan doesn't seem to suggest XTS is more secure? – HCSF Feb 21 '21 at 10:21
  • You mentioned you use a key file. Do you think it is safer to scp a key file each time I need to unlock than to just enter a passphrase that I believe is high-entropy? – HCSF Feb 21 '21 at 10:27
  • you said, `there are some thoughts out there that say one thread, 10+ iterations`. Any link to share? Want to have a read. Thanks. – HCSF Feb 21 '21 at 10:30
  • Yes, your interpretation is correct, barring XTS being faster - the 'tweak', as it's called, is a preliminary encryption of the sector number to produce the IV, then the 'actual' de-/encryption occurs to handle the blocks in the sector - I had it in my head that the preliminary encryption had to occur for every block, but it seems to only occur per-sector, so a minimum overhead of one additional encryption per sector. KEYFILE - if you can place it in RAM for the duration it is needed, but there are other ways. – brynk Feb 21 '21 at 10:38
  • ARGON2 1 thread + more iters: no ref at this stage. In the *Argon2id* spec, [s9.2 Security against time-space tradeoff attacks](https://tools.ietf.org/id/draft-irtf-cfrg-argon2-05.html#rfc.section.9.2) - **in fact now that I re-read this, if you up the RAM you must also up the iterations**: *to eliminate time-memory trade off attacks, the binary logarithm of memory minus 26*, which informed my minimum 4 iterations for 1GiB, so, `log2( RAM-in-bytes ) - 26` – brynk Feb 21 '21 at 10:42
  • thanks for sharing. I will test out `fio` with `cryptsetup --type=luks2 --pbkdf=argon2id --pbkdf-memory=16G --pbkdf-parallel=4 --pbkdf-force-iterations=8 --hash=sha384` with `--cipher=aes-cbc-essiv` and `--cipher=aes-xts-plain64` to see which one gives a better decryption performance. And also repeat the test with kernel-ml 5.11. – HCSF Feb 21 '21 at 14:32
  • Just FYI, the max value for pbkdf-memory in `cryptsetup` 2.0.3 is 4GB. And to get the synchronous feature in the link you posted, it requires `cryptsetup` 2.3.4, which is quite painful to build on CentOS 7. – HCSF Feb 23 '21 at 08:28
  • @brynk Actually, that refers to Argon2i specifically if I'm not mistaken. The advice given for **Argon2id** is either **2GiB + 1** iteration / t / pass / whatever you want to call it (the different terminologies from the rfc to the paper to even within the paper were a bit confusing, I hope I'm reading this right) or **64MiB + 3** iterations/passes/t-value. This is from the short recommendations section in the latest RFC, two paragraphs down (section 7.4; note that section 9.2 from your draft version got renamed to section 7.2): https://tools.ietf.org/html/draft-irtf-cfrg-argon2-13 – Luc Apr 05 '21 at 23:08
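
For reference, the `log2( RAM-in-bytes ) - 26` rule of thumb quoted in the comments can be worked through as below. This is only a sketch of that arithmetic: the helper name min_passes is made up, the logarithm is rounded down for sizes that are not a power of two, and, as the last comment points out, the rule may apply to Argon2i rather than Argon2id.

```shell
# min_passes MEM_KIB: evaluate log2(memory in bytes) - 26 for a
# --pbkdf-memory value given in KiB. Runs in a subshell so its
# variables do not leak. (Hypothetical helper, not part of cryptsetup.)
min_passes() (
  bytes=$(( $1 * 1024 ))
  log2=0
  while [ "$bytes" -gt 1 ]; do
    bytes=$(( bytes / 2 ))
    log2=$(( log2 + 1 ))
  done
  echo $(( log2 - 26 ))
)

min_passes 1048576   # 1 GiB: prints 4
min_passes 4194304   # 4 GiB: prints 6
```

The 1 GiB case reproduces the "minimum 4 iterations" figure mentioned in the comment that introduced the formula.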