Why does this example from the PHP manual give different results every time it runs?

echo password_hash("rasmuslerdorf", PASSWORD_DEFAULT);

And then password_verify() knows that ALL of those hashes match "rasmuslerdorf"! It seems like magic to me, even though the documentation states clearly:

Note that password_hash() returns the algorithm, cost and salt as part of the returned hash. Therefore, all information that's needed to verify the hash is included in it. This allows the verify function to verify the hash without needing separate storage for the salt or algorithm information.

This function is safe against timing attacks.

echo password_verify('rasmuslerdorf', '$2y$10$EMawXU7qNS4GzU2Do8bByeb7sSQZxecvmZ6mBrToxsOaY7RMAIGua'); // => true
echo password_verify('rasmuslerdorf', '$2y$10$0vMA2k7LxTBstI/J7clkkuZZ/XtuS1fklVuoM6sl4Fc/aj1avQa5u'); // => true
echo password_verify('rasmuslerdorf', '$2y$10$iuE2EzHMNONAWFKh/4Wyl.dcBxgFaNzAh32va0/gyE4ScqnNr/Uc.'); // => true

What is going on? How does password_verify() know that some crazy string matches 'rasmuslerdorf' when hackers don't?

Anders
Phung D. An
  • They're salted. I think if you read through [How to securely hash passwords?](https://security.stackexchange.com/a/31846/151903) you'll get a better idea of what's going on. – AndrolGenhald Jun 20 '18 at 16:37
  • @AndrolGenhald: I have read it many times; I guess I have to read it again. Thanks. – Phung D. An Jun 20 '18 at 17:05
  • [This answer](https://security.stackexchange.com/a/51983/151903) may have a more easily understood explanation of salts. – AndrolGenhald Jun 20 '18 at 17:15
  • "Hackers" do know this, which is why the database or location where your hashes are stored should be secured. – Shane Andrie Jun 20 '18 at 17:21

2 Answers

The password_hash function, internally, carries out these steps:

  1. It picks a fresh random salt each time you call it.
  2. It applies a costly hash function that takes as input the random salt, the password, and other algorithm parameters (e.g., cost factors).
  3. It combines the algorithm parameters, random salt and hash output into an output string that can be parsed to recover them individually.

The random choice in #1 is the reason why it produces different output each time even though you supply the same input. The formatted output from step #3 is what allows password_verify to know the random salt that was chosen by password_hash.
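
To make step #3 concrete, here is a rough sketch that pulls one of the hashes from the question apart. It assumes the bcrypt layout (`$2y$`, a two-digit cost, a 22-character salt, then a 31-character digest); other algorithms use a different layout, and the `$parts` array is just an illustrative name, not anything PHP provides.

```php
<?php
// One of the hashes from the question.
$stored = '$2y$10$EMawXU7qNS4GzU2Do8bByeb7sSQZxecvmZ6mBrToxsOaY7RMAIGua';

// Built-in helper: reports the algorithm name and its options (here, the cost).
$info = password_get_info($stored);

// Manual split. The offsets are an assumption that only holds for bcrypt hashes.
$parts = [
    'prefix' => substr($stored, 0, 4),        // "$2y$", the algorithm identifier
    'cost'   => (int) substr($stored, 4, 2),  // two-digit cost factor
    'salt'   => substr($stored, 7, 22),       // salt chosen by password_hash()
    'digest' => substr($stored, 29, 31),      // the actual hash output
];

print_r($info);
print_r($parts);
```

Everything password_verify needs except the password itself is sitting in that one string.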

Luis Casillas
  • That's why you often get hash results starting with the same first characters (`$2y$10$`): it indicates the hash type and cost (AFAIK). – Xenos Jun 21 '18 at 07:56

The key to creating a useful cryptographic hash is for the algorithm to be non-reversible, but consistently repeatable. That is:

  • Given just the output of password_hash('rasmuslerdorf'), you can't get back the string 'rasmuslerdorf'
  • Given the input 'rasmuslerdorf', you can generate the same output as a previous call to password_hash('rasmuslerdorf')

This allows you to check the user's password attempt against the hash, without ever being able to retrieve their password.

The simplest way to get a repeatable hash is for it to depend only on the input - so, password_hash('rasmuslerdorf') would always return the same value. But that means an attacker can calculate the hashes for a bunch of common passwords, and search a stolen database for a match.
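
A small sketch of that attack, using plain SHA-256 as a stand-in for "a hash that depends only on the input" (this is not what password_hash does, and the guess list and variable names are made up for illustration):

```php
<?php
// Pretend this unsalted hash came from a stolen database.
$leaked = hash('sha256', 'rasmuslerdorf');

// The attacker hashes a list of guesses once...
$guesses = ['123456', 'password', 'rasmuslerdorf'];
$table = [];
foreach ($guesses as $guess) {
    $table[hash('sha256', $guess)] = $guess;
}

// ...and can then look up every leaked hash in that same table.
$recovered = $table[$leaked] ?? null;
var_dump($recovered);
```

Because the table does not depend on any per-user value, the same precomputed work applies to every account in the database.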

So instead, a good hash algorithm adds a salt, which is just a random string added to the password, to make the hash different each time. In order to repeat the hash function later and get the same answer, you need to know which salt was used when it was stored.
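
You can see the effect of the salt with the functions from the question. This is only a quick demonstration; the variable names are arbitrary:

```php
<?php
// Same password, hashed twice. Each call picks its own random salt.
$first  = password_hash('rasmuslerdorf', PASSWORD_DEFAULT);
$second = password_hash('rasmuslerdorf', PASSWORD_DEFAULT);

// The two strings differ (barring an astronomically unlikely salt collision)...
var_dump($first === $second);

// ...yet each one still verifies, because each carries its own salt.
var_dump(password_verify('rasmuslerdorf', $first));
var_dump(password_verify('rasmuslerdorf', $second));
```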

This is what password_hash actually outputs, combined into one string:

  • The result of hashing the input with a particular salt
  • The random salt that was used
  • The hash algorithm that was used
  • Any other options that control the hash algorithm (e.g. the number of rounds used to deliberately slow down hashing)

This output will be different each time, so we need a different function for repeating the hash with a user's input to see if we get the same answer; this is what password_verify is for:

  • From the stored string, find the algorithm, options, and salt
  • Run the hash algorithm with those parameters, and the password the user tried to log in with
  • Check if the result matches the hash part of the stored string

If running the same algorithm with the same options and salt produces a different result, the user must have entered the wrong password.
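
As a rough sketch of those three steps, assuming a bcrypt hash: `verify_sketch` below is a hypothetical helper written for illustration, not the real implementation of password_verify. It leans on crypt(), which reads the algorithm, cost and salt from the start of the stored string when that string is passed as its second argument.

```php
<?php
// Hypothetical helper for illustration only; use password_verify() in real code.
function verify_sketch(string $password, string $stored): bool
{
    // Steps 1 and 2: crypt() takes the algorithm, cost and salt from $stored
    // and hashes the attempted password with those same parameters.
    $recomputed = crypt($password, $stored);

    // Step 3: compare the result with the stored string in constant time.
    return hash_equals($stored, $recomputed);
}

// PASSWORD_BCRYPT is requested explicitly because the sketch assumes bcrypt.
$stored = password_hash('rasmuslerdorf', PASSWORD_BCRYPT);

var_dump(verify_sketch('rasmuslerdorf', $stored));
var_dump(verify_sketch('wrong-guess', $stored));
```

Note that nothing here is hidden from an attacker who has the stored string; the salt only stops them from reusing one precomputed table, so each guess still costs a full run of the slow hash.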

IMSoP