48

Nota bene: I'm aware that the good answer to secure password storage is either scrypt or bcrypt. This question isn't for implementation in actual software, it's for my own understanding.

Let's say Joe Programmer is tasked with securely storing end user passwords in a database for a web application; or storing passwords on disk for logins to a piece of software. He will most likely:

  1. Obtain $password from the end user.
  2. Create $nonce as a random value about 64 or 128 bits large.
  3. Create $hash = SHA256($nonce$password) and store $nonce together with $hash in the database.

Question one:

Why isn't the following substantially better than the above?

  1. Create $long_string once and only once. Store this as a constant in the application code. $long_string could f.x. be 2 kilobyte of random characters.
  2. Obtain $password from the end user.
  3. Create $mac = HMAC-SHA256($long_string)[$password] (i.e. create a MAC using the end user password as key) and store this $mac in the database.

I would imagine the HMAC has the following benefits?

  • Collisions are less frequent?
  • It is computationally somewhat more expensive than plain hashing? (But not anywhere near scrypt, of course.)
  • In order to succeed with a brute-force attack within a reasonable time, and attacker would need to gain access to two things: 1) the database, where $mac is stored, and 2) the application code, where the original $long_string is stored. That's one better than a hash function, where the attacker only needs access to the database?

But still, nobody seems to suggest using an HMAC, so I must be misunderstanding something?

Question two:

What would the implications of adding a salt value $nonce be?

  1. Create $long_string once and only once. Store this as a constant in the application code.
  2. Obtain $password from the end user.
  3. Create $nonce as a random value about 128 bits large.
  4. Create $mac = HMAC-SHA256($long_string)[$nonce$password] and store $nonce and $mac in the database.
Thomas Pornin
  • 322,884
  • 58
  • 787
  • 955
  • Related (but not the same): [Password Hashing add salt + pepper or is salt enough?](http://security.stackexchange.com/questions/3272/password-hashing-add-salt-pepper-or-is-salt-enough) – Jacco Jun 14 '11 at 10:23
  • BTW: PBkDF2 uses HMAC iterated with salt and is quite often used for password hashing – eckes Aug 09 '18 at 21:50

2 Answers2

36

The point of the salt is to prevent attack cost sharing: if an attacker wants to attack two passwords, then it should be twice as expensive than attacking one password.

With your proposal (your "question 1"), two users with the same password will end up using the same MAC. If an attacker has read access to your database, he can "try" passwords (by recomputing the MAC) and lookup the database for a match. He can then attack all the passwords in parallel, for the cost of attacking one. If your long_string is an hardcoded constant in the application source code, then all installed instances of the application share this constant, and it becomes worthwhile (for the attacker) to precompute a big dictionary of password-to-MAC pairs, also known as "a rainbow table".

With a nonce (your "question 2") you avoid the cost sharing. The nonce is usually known as a "salt". Your long_string and the use of HMAC does not buy you much here (and you are not using HMAC for what it was designed for, by the way, so you are on shaky foundations, cryptographically speaking). Using a salt is a very good idea (well, not using a salt is a very bad idea, at least) but it does only half of the job. You must also have a slow hashing procedure. The point here is that the salt prevents cost sharing, but does not prevent attacking a single password. Attacking a password means trying possible passwords until one matches (that's the "dictionary attack"), and, given the imagination of the average human user, dictionary attacks tend to work: people just love using passwords which can be guessed. The workaround is to use a hashing process which is inherently slow, usually by iterating the hash function a few thousand times. The idea is to make password verification more expensive: having the user wait 1ms instead of 1µs is no hardship (the user will not notice it), but it will also make the dictionary attack 1000 times more expensive. Your long_string may be used for that, provided that it is really long (not 2 kilobytes, rather 20 megabytes).

HMAC may be used instead of a raw hash function to strengthen a password-verification system, but in a different setup. Given a system which checks passwords with salts and iterated hash functions, you can replace the hash function with HMAC, using a secret key K. This prevents offline dictionary attacks as long as you can keep K secret. Keeping a value secret is not easy, but it is still easier to keep a 128-bit K secret than a complete database.

Thomas Pornin
  • 322,884
  • 58
  • 787
  • 955
  • Thank you! I'd like to check my understanding of your last paragraph. Fx in the case of an webapp. Is it: a) create secret key **$K** and store this as a constant in the webapp code. b) Create **$nonce**. c) Create **$mac** as HMAC-SHA256($nonce$password)[$K] d) Store **$nonce** and **$mac** in the database. Is that correctly understood? (And I understand you're not *recommending* this, only mentioning it as a possibility.) –  Apr 19 '11 at 13:19
  • 2
    "secret" and "store as a constant in the code" do not work well together. A secret key should be kept in RAM only; it may be tolerated that the secret key is stored in a private file somewhere (that's how SSH server keep the server key) but, depending on the operating system, this may or may not be easy. The whole point of using that key is that it is "more secret" than the database contents. – Thomas Pornin Apr 19 '11 at 14:11
  • 1
    @Jesper and of course, this brings you back 'round to key management, which is a sticky subject all on it's own. – AviD Apr 20 '11 at 18:36
  • 1
    @ThomasPornin upvoted but I would argue that the point of the key is to be "differently secret", rather than "more secret". Breaking into the database (e.g. with SQL injection) does not necessarily bring you any closer to the source code / binaries. I just don't see why not do it. – Ohad Schneider Aug 12 '17 at 11:16
3

HMAC doesn't buy you much here though I guess it acts as a replaceable hash function (replaceable by changing the key). Anyway it just seems like overkill here.

However, the first scheme you propose above fails catastrophically if $long_string becomes known to an attacker, which it probably will be since the code is probably not regarded as secret (and it's generally poor practice to rely on secrecy of the code anyhow).

You should be using the second scheme as $nonce is needed to protect against precomputed attacks.

frankodwyer
  • 1,907
  • 12
  • 13