Hashing multiple times (rounds) seems like a standard practice in password hashing to increase the work factor.

Increasing the work factor needed to compute a password hash is clearly a good thing, but I wonder whether we are also reducing the randomness of the hash at the same time.

My theory

Let's take a hash with an n-bit output.

Is it possible that some of the 2^n possible output values are never obtained, even when you input every possible n-bit value?

This would mean that some hash values are more likely than others, since they can be obtained from multiple input values. It also means that iterating the hash function creates short cycles, which could be very bad.

My test

I wanted to check whether my theory had any merit but, obviously, I didn't have the computing power necessary to analyze a real hash function, so I created a simplistic one.

My function outputs the first 8 bits of SHA-512.

I then fed every possible 8-bit value into this function and saved the results.

Definitions

  • Root : An 8-bit value that is never produced as an output
  • Path : All the values that you obtain from a single input if you hash repeatedly
  • Cycle : A set of values that repeat indefinitely when you hash repeatedly
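The experiment and the definitions above can be sketched roughly as follows. This is a minimal sketch, not the original test code: the names `h8` and `analyze` are hypothetical, and it assumes the 8-bit input is fed to SHA-512 as a single raw byte, so the exact numbers may differ from the ones reported below if the original used a different input encoding.

```python
import hashlib


def h8(value: int) -> int:
    """Toy hash: the first 8 bits (first byte) of SHA-512.

    Assumption: the 8-bit input is passed to SHA-512 as one raw byte.
    """
    return hashlib.sha512(bytes([value])).digest()[0]


def analyze(f, size: int = 256) -> dict:
    """Summarize roots and cycles of f over the values 0 .. size-1."""
    table = [f(v) for v in range(size)]

    # Roots: values that no input maps to.
    roots = set(range(size)) - set(table)

    # Apply f to the whole set until it stops shrinking. What remains is
    # exactly the set of cycle values, and the number of shrinking rounds
    # is the longest path counted in non-cycle values.
    current, rounds = set(range(size)), 0
    while True:
        nxt = {table[v] for v in current}
        if nxt == current:
            break
        current, rounds = nxt, rounds + 1

    # Split the cycle values into individual cycles.
    cycles, seen = [], set()
    for start in sorted(current):
        if start in seen:
            continue
        cycle, v = [start], table[start]
        while v != start:
            cycle.append(v)
            v = table[v]
        seen.update(cycle)
        cycles.append(cycle)

    return {
        "roots": len(roots),
        "outputs_after_one_round": size - len(roots),
        "cycles": len(cycles),
        "longest_cycle": max(len(c) for c in cycles),
        "cycle_values": len(current),
        "rounds_until_only_cycle_values": rounds,
    }


if __name__ == "__main__":
    for name, value in analyze(h8).items():
        print(f"{name}: {value}")
```

The repeated-image loop is used instead of walking every path individually because it finds the cycle values and the longest non-cycle path in one pass.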

The results

  • Number of Roots : 97
  • Longest Path : 36
  • Average Path Length : 17.21
  • Number of Cycles : 5
  • Longest Cycle : 6
  • Number of values in the Cycles : 13
  • Longest Path without cycle values : 30
  • Average Path Length without cycle values : 11.72

An alarming conclusion

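These results mean that if you hash once you have 159 possible outputs (256 values minus the 97 roots), but if you hash 30 times you only have 13 possible outputs left: the longest path outside the cycles has 30 values, so after 30 rounds every input has landed on one of the 13 cycle values.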

Other tests

I added a salt that I prepended to the value on every round. I thought that it might make a difference, but I obtained nearly identical results with different salt values.
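The shrinking set of possible outputs, with and without a salt, can be sketched like this. It is only an illustration under assumptions: the helper names and the salt values are made up, and the salt is assumed to be raw bytes prepended to a single input byte on every round.

```python
import hashlib


def h8(value: int, salt: bytes = b"") -> int:
    """First byte of SHA-512 over salt + one input byte (assumed encoding)."""
    return hashlib.sha512(salt + bytes([value])).digest()[0]


def reachable_counts(salt: bytes, rounds: int) -> list:
    """Number of distinct possible outputs after 1, 2, ..., rounds rounds."""
    current = set(range(256))
    counts = []
    for _ in range(rounds):
        # The same salt is prepended on every round.
        current = {h8(v, salt) for v in current}
        counts.append(len(current))
    return counts


if __name__ == "__main__":
    # Hypothetical salt values, chosen only for illustration.
    for salt in (b"", b"salt-a", b"salt-b"):
        counts = reachable_counts(salt, 40)
        print(salt, "after 1 round:", counts[0], "after 40 rounds:", counts[-1])
```

Each salt effectively selects a different 8-bit function, so the individual cycles change, but the overall shape of the shrinkage should look similar.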

My questions

I know that my experiment is oversimplified, but:

  • Is it possible that hashing multiple times reduces the security of a hash function if you hash too many times?
  • Are there any studies on this subject?
Gudradain

1 Answer

See this question for details. Roughly speaking, when iterating a hash function in a space of size N, then, after an average of roughly sqrt(N) steps, you enter a cycle whose length is also roughly sqrt(N). With a, say, 160-bit hash function (e.g. SHA-1), both sizes will be about 2^80, i.e. way too large for problems to actually occur.
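To put numbers on the sqrt(N) estimate, here is a small sketch. It assumes the hash behaves like a uniformly random function on N = 2^bits values, in which case the expected number of steps before entering a cycle and the expected cycle length are both sqrt(pi * N / 8), the standard result for random mappings. The function name is made up for illustration.

```python
import math


def log2_expected_tail_or_cycle(bits: int) -> float:
    """log2 of sqrt(pi * N / 8) for N = 2**bits.

    Under the random-function assumption this is the expected number of
    steps before entering a cycle, and also the expected cycle length.
    The computation stays in log2 so that large N does not overflow.
    """
    return 0.5 * (bits + math.log2(math.pi / 8))


if __name__ == "__main__":
    for bits in (8, 160):
        print(f"{bits}-bit output: about 2^{log2_expected_tail_or_cycle(bits):.2f}")
```

For an 8-bit output this gives roughly 10 steps, the same order of magnitude as the average path length measured in the question; for 160 bits it gives a little under 2^80.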

Thomas Pornin