53

When reading some documentation about the security of a product, I found that the vendor uses the SHA-2 of a password to encrypt data (AES-256), instead of using this password directly.

Are there any advantages of doing so?

An attacker is not going to crack the encrypted data by attacking the SHA-2-derived key directly; they will instead exhaust the password keyspace (if feasible) and try the hash of each candidate. The only reason I can think of, therefore, is that hashing adds an extra computational step. If the point is to make the attack computationally harder, I would rather have increased the password entropy.

WoJ
  • 12
    Using a password as the key without any intermediate transformation is extremely bad practice. What happens if your password is longer than the key? – Richie Frame Feb 15 '18 at 12:32
  • 13
    AES-256 expects a (random) 256-bit input as its key. Do you know anyone with a password uniformly distributed across 256 bits? – Stephen Touset Feb 15 '18 at 21:45
  • Is this a password you enter every time you want to encrypt a resource or is this service using a stored password? If the latter this may simply be a consequence of their not storing passwords in plaintext. – J.R. Feb 15 '18 at 14:03
  • So instead they store the encryption key in plaintext? – AndrolGenhald Feb 15 '18 at 14:24

4 Answers

98

It sounds like a primitive version of a key derivation function (KDF). They could probably have avoided reinventing the wheel by using PBKDF2.

There are several reasons why you don't want to use the password directly as an AES key.

  1. To distribute the bits. The main property here is that a hash function's output is, statistically speaking, uniformly distributed. People tend to pick passwords that aren't fully random; in particular, most passwords contain only characters you can type on a keyboard. When used as an encryption key, a key that is not statistically random may expose weaknesses in the encryption function.

  2. To fit the key to the cipher's key length. Most passwords are going to be either longer or shorter than the key size of the encryption function. By hashing your password, the derived key is exactly the size of the key your encryption function expects. While the entropy of the derived key doesn't increase, this avoids the risk of exposing weaknesses in the encryption function that comes with simply zero-padding the password or, worse, truncating it.

  3. To slow down key derivation. Per your description, the software uses only a single SHA-256 invocation, which is not much. But a proper password-based KDF, like PBKDF2, usually runs tens of thousands or hundreds of thousands of rounds of the underlying hash function. This slows down computing the key, increasing the effective strength of a password without increasing its length.

  4. To keep the user's plaintext password out of memory, thus preventing it from being accidentally dumped to disk during hibernation or in a crash dump. While this wouldn't protect the hash from being used to decrypt the data you're encrypting, it will prevent the password from being reused to decrypt other files (which presumably use a different salt) or being tried on your online accounts or other devices that you use.
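
As a rough illustration of points 2 and 3, here is a minimal Python sketch that contrasts a single SHA-256 pass with PBKDF2-HMAC-SHA256 from the standard library. The function names, the 16-byte salt, and the iteration count of 200,000 are assumptions chosen for the example, not anything the vendor's documentation specifies.

```python
import hashlib
import os
import time


def single_sha256_key(password: str) -> bytes:
    """One unsalted SHA-256 pass, as the question describes."""
    return hashlib.sha256(password.encode("utf-8")).digest()


def pbkdf2_key(password: str, salt: bytes, iterations: int = 200_000) -> bytes:
    """PBKDF2-HMAC-SHA256 producing a 32-byte key suitable for AES-256.

    The iteration count is an illustrative assumption; tune it for your hardware.
    """
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=32
    )


if __name__ == "__main__":
    salt = os.urandom(16)  # not secret; stored alongside the ciphertext
    schemes = (
        ("single SHA-256", single_sha256_key),
        ("PBKDF2", lambda pw: pbkdf2_key(pw, salt)),
    )
    for name, derive in schemes:
        start = time.perf_counter()
        key = derive("Pa55w0rd")
        elapsed = time.perf_counter() - start
        print(f"{name}: {len(key)}-byte key in {elapsed:.6f} s")
```

Both functions return a key of the right size, but every guess an attacker makes against the PBKDF2 variant costs the full iteration count, and the salt and iteration count have to be stored with the ciphertext.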

Tom K.
Lie Ryan
  • 9
    I can't think of any modern cipher where a non-statistically random/uniform key would expose weaknesses in the cipher. AES256's key schedule is slightly weak to similar keys, but it is never really an issue the way it is used. If a cipher _needs_ a uniform key, it is pretty badly broken. – forest Feb 15 '18 at 12:51
  • The product could have been designed before PBKDF2 became widely known. – user253751 Feb 15 '18 at 21:49
  • @immibis It's also extra complexity: you have to store a salt and an iteration count (which must be variable, to defeat precomputation attacks) together with the ciphertext. – Henno Brandsma Feb 15 '18 at 22:31
  • @HennoBrandsma Doesn't `bcrypt` also store the salt and iteration count (log 2)? I don't think that's a significant barrier. – corsiKa Feb 15 '18 at 23:01
  • @corsiKa bcrypt also has to store its parameters, sure. I was just hypothesising that the designers wanted no complexity, then hashing is simplest. – Henno Brandsma Feb 15 '18 at 23:05
  • @forest, you are correct that the structure of a key shouldn't matter, but consider cases like RSA where you definitely need some preprocessing (although that isn't relevant to this case). Also note that while AES ideally should be immune to related key attacks, AES-256 has deficiencies in its key schedule. – Jeffrey Goldberg Feb 16 '18 at 06:20
  • @JeffreyGoldberg I was talking only about a modern symmetric cipher. Asymmetric cryptography is much more complex and does require preprocessing. – forest Feb 16 '18 at 06:29
  • Yeah, my RSA thing doesn't really apply. But the AES-256 key schedule weakness certainly does. – Jeffrey Goldberg Feb 16 '18 at 07:19
  • 4
    @xDaizu: SHA2 output isn't hex, but a 256-bit string. The hex format you often see is only a representation/encoding of that bit string. When using the output of SHA256 as the key for AES256, you wouldn't use the hex representation of the SHA256 output, but the bit string directly. – Lie Ryan Feb 16 '18 at 09:06
  • @xDaizu All notable modern encryption and hashing algorithms deal directly in binary data... Not hex strings... Your point makes no sense. – Luke Park Feb 16 '18 at 09:55
  • The idea about adding computational cost to brute forcing simpler passwords/keys is insightful. Nobody really wants to remember a 32 character password. – trognanders Feb 17 '18 at 08:13
14

SHA-256 will generate a 256-bit hash from arbitrary length passwords. This hash can technically (as in it's the right length) be used as a key for AES-256.
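
A small Python sketch of that length property, using only `hashlib`. The helper name `sha256_key` is an assumption made for the example.

```python
import hashlib


def sha256_key(password: str) -> bytes:
    """Return the raw 32-byte SHA-256 digest of the UTF-8 encoded password."""
    return hashlib.sha256(password.encode("utf-8")).digest()


for password in ("", "abc", "x" * 1000):
    key = sha256_key(password)
    # The raw digest is always 32 bytes (256 bits), whatever the input length.
    # Its hex encoding is 64 characters, but that is only a representation.
    print(len(password), len(key), len(key.hex()))
```

Whatever the password length, the raw digest is 32 bytes, which is what AES-256 takes as a key; the 64-character hex string is only an encoding of those bytes.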

Without more context, I'm guessing that they went for the simplest way to generate a 256-bit key.

As you mentioned, the weak point here is the password, and a single SHA-256 of the password is too cheap to prevent brute-force attacks on the password.
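
To make the cost argument concrete, here is a hedged Python sketch of a dictionary attack against a key derived with a single SHA-256 pass. The wordlist and function names are hypothetical, and comparing derived keys stands in for what a real attacker would do, which is trial decryption of the ciphertext with each candidate key.

```python
import hashlib
from typing import Iterable, Optional


def sha256_key(password: str) -> bytes:
    """Derive a key with one unsalted SHA-256 pass."""
    return hashlib.sha256(password.encode("utf-8")).digest()


def guess_password(target_key: bytes, candidates: Iterable[str]) -> Optional[str]:
    """Return the first candidate whose derived key matches, or None.

    Assumption: key comparison is a stand-in for trial decryption.
    """
    for candidate in candidates:
        if sha256_key(candidate) == target_key:
            return candidate
    return None


# Hypothetical wordlist; real ones contain millions of entries.
wordlist = ["123456", "letmein", "Pa55w0rd", "qwerty"]
target = sha256_key("Pa55w0rd")
print(guess_password(target, wordlist))
```

Each guess costs one hash computation, so the attacker's work is bounded by the size of the password space rather than by the 256-bit key space.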

Instead, one should use a password-based key derivation function (PBKDF). One also shouldn't use the derived key to encrypt the data directly, but instead use it to encrypt a data key generated by a CSPRNG.

You can find a very good discussion of this topic in https://crypto.stackexchange.com/questions/22678/how-secure-is-it-to-use-password-as-aes-key.

Marc
5

Without any context, it's hard to answer. This could simply be a naive password-expansion mechanism or it could be something else. For instance, it could be that another party will need to decrypt the data and must therefore store the necessary key.

Using a hash would then provide some level of protection for the original user's password. Not much, mind you, but still far better than simply storing the password itself.

Stephane
-1

I think the main reason for applying SHA-256 to a password is quite simple: you don't want to know the password. If you know the password, you need to take extra precautions to protect it properly, since a breach could expose it and open the door to other attacks.

Also, the software usually does not use the password alone to generate the hash; some random junk is added to it to ensure that two users who picked "Pa55w0rd" don't end up with the same hash.

This added salt makes it very hard for the attacker to guess the password, even if he has access to the final hash. There are precomputed results for simple SHA-256 operations out there (called rainbow tables), and they already cover a great deal of the normal key space. These could be used to find the password related to an account. But these tables fail badly if you add a few bytes of random junk in front of the password, and they are even less useful if each user has their own "random junk".
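
A minimal Python sketch of the per-user "random junk" idea, assuming a 16-byte random salt prepended to the password before hashing. The function and variable names are made up for the example.

```python
import hashlib
import os


def salted_hash(password: str, salt: bytes) -> bytes:
    """SHA-256 over salt || password (salt placed in front of the password)."""
    return hashlib.sha256(salt + password.encode("utf-8")).digest()


# Two hypothetical users who picked the same password.
alice_salt = os.urandom(16)
bob_salt = os.urandom(16)

unsalted = hashlib.sha256(b"Pa55w0rd").hexdigest()
alice = salted_hash("Pa55w0rd", alice_salt).hex()
bob = salted_hash("Pa55w0rd", bob_salt).hex()

print("unsalted:", unsalted)  # identical for every user with this password
print("alice:   ", alice)
print("bob:     ", bob)       # differs from alice's because the salts differ
```

The salt defeats precomputed tables and hides the fact that two users share a password, but a single salted SHA-256 is still cheap to compute, so it does not by itself slow down guessing against one known salt.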

The only "drawback" is that you can't recover your password, only reset it. (I hate online services that can send me my password, since it shows they don't care.)

As others mentioned, this results in a convenient 256-bit blob of data, which could be used as an AES-256 key. It also means that a password reset would invalidate all encrypted data (which you might want, depending on the data).

  • I think you misunderstood the question. I am not asking why a password should be hashed when stored but why it is transformed to a SHA-2 hash before being used as a key to encrypt something (which the accepted answer addresses nicely) – WoJ Feb 20 '18 at 13:23