7

It's established wisdom to hash password multiple times with a salt to increase the time it takes per brute force iteration. At the same time (unless the algorithm guarantees otherwise) there's a minuscule but non-zero chance of ending up with a fixed point or cycle, whereby (for example)

hash(hash(hash_so_far + salt) + salt) = hash(hash_so_far + salt)

or

hash(hash(hash(hash_so_far + salt) + salt) + salt) = hash(hash_so_far + salt)

etc.

Such identities or short* cycles would be security black holes - more iterations wouldn't mean harder to crack - and an algorithm where such cycles are common would be worse than useless, leading to a couple obvious questions:

  • Are the mean and median cycle lengths known for common algorithms such as MD* and SHA-*?
  • Are there useful* algorithms which guarantee that the smallest cycle is the entire output space?

* AFAICT, any useful hashing algorithm (finite output, deterministic) would have at least one "trivial" cycle, when the output space has been exhausted.

l0b0
  • 3,011
  • 21
  • 29
  • 1
    As a succinct answer-as-comment, if a hash algorithm is demonstrated to have this property, called "sticky-state", it is understood to be insecure. KDFs are built on top of secure hashes that have no known cyclical states, and are rigorously tested both theoretically and empirically to ensure that the derivation process itself does not produce cyclical state. – KeithS Dec 27 '12 at 16:13

1 Answers1

7

Suppose that you have a pseudorandom function with an output of n bits. A good hash function with a given salt ought to behave like a PRF.

The general, average structure of a PRF, with regards to repeated application, is to have one big "cycle": if you hash and rehash repeatedly, you will, at some point, enter a cycle and obtain a value which you already encountered. The average length of the cycle will be about 2n/2. The number of steps you will have to perform before reaching the cycle will be also about 2n/2. Almost all initial values will bring you, ultimately, to this inner cycle. There can be a few initial values which will not lead you to this cycle; there could even be some fixed points (there is a probability of 63.2% that there exists at least one fixed point); but the number of such potentially troublesome points will be no more, on average, than 2n/2.

To sum up, if n is large enough that a probability of 2-n/2 can be neglected, then things are fine. In particular, if n = 256 (you are using SHA-256), then you can stop worrying. Even with MD5 (n = 128), short cycles are sufficiently rare that they will not imply any actual security issue.

A full-space cycle would mean that your hash function is not a PRF, but a permutation -- and a permutation with only one cycle. Building a one-way permutation is possible, but quite harder than a one-way function for which it seems that you can just throw together some bitwise operations and rotations; for a permutation, you need mathematics (e.g. with exponentiation modulo a non-prime integer) and this will imply some extra trouble (the permutation will be on average one-way, but there may be values for which it will not be).

Thomas Pornin
  • 322,884
  • 58
  • 787
  • 955