5

Can I brute-force a password hash even if I don't know the underlying algorithm?

For example if I get hold of a database with password hashes and the used hash algorithm is unknown, like a random mix of SHA/MD5/Bcrypt/custom/Salt/Pepper.

Will a password cracking expert still be able to brute force the password hashes?

I also guess (for a public web site) that you can at least create your own "account" with a known password, so you have perhaps one known input->hash pair. Will that be of any use?

user316
  • 183
  • 1
  • 2
  • 5

4 Answers4

1

You question does not make a lot of sense to me. If you already know the hash and know what input generates it, what is the point of your bruteforce?

On the other hand bruteforcing something means to apply some procedure many times with different inputs and comparing the output with your hash. If you do not know what procedure to use, there is no way you will be able to apply it many times.

But if you know the input and the hash, you can bruteforce it in the beginning to find what algorithm is used. For example

sha1(i), sha1(md5(i)), md5(i) and so on trying many combinations of possible hash functions. You might guess partially what is used. So if you output is 160 bits long, you can guess that may be the last step was sha1.

guntbert
  • 1,855
  • 2
  • 18
  • 21
Salvador Dali
  • 1,745
  • 1
  • 19
  • 32
  • 1
    He is asking how hard/possible is it to break the algorithm, given that there are infinite possibilities e.g. `output = Gost*779317(Ecoh*441(Sha*33931(LOOP(274 times, Spectral*99228(Skein*991(Md5*77302(LOOP(931 times, Whirlpool*5503(Gost*7738(Sha*209(input)))))))))))` etc. The attacker knows the input and the output, With that, can the attacker guess the algorithm? – Pacerier May 14 '14 at 07:04
  • 1
    @Pacerier if there are infinite possibilities, the probability of guessing it is approaching to 0 :-) – Salvador Dali May 14 '14 at 07:08
  • 1
    Zz, don't twist and avoid the question. The input is still `x` bits, the output is still `y` bits. He is asking can the attacker find a function `F()` such that he can invent his own inputs and outputs which match the supposedly *secure* algorithm. Basically, the question is asking Does doing those mixture of hash functions adds to security? Is it vulnerable (exists a short-cut such that the attacker can grab `F()` without having to brute force)? – Pacerier May 14 '14 at 07:12
  • @Pacerier "Basically, the question is asking Does doing those mixture of hash functions adds to security?" are you sure this is what he is asking? I failed to understand that this is exactly what was asked. – Salvador Dali May 14 '14 at 07:18
  • Yes. Substitute "brute force" to "break" in the question and you can understand it. Its clear he wants to know whether the algorithm is breakable (secure). If there exist a short-cut such that the attacker can derive `F()` without having to brute force, then yes, it's breakable. – Pacerier May 14 '14 at 07:22
  • @Pacerier thanks. May be you are right. I asked OP if my answer is on the right track. If no, I would remove it. – Salvador Dali May 14 '14 at 07:26
  • You can leave the comments here so they can help the next answerer. Allow me to explain furthur. You have a function `output = input * 5 * 3 * 6 * 3 * 10 * 3 * 6....`. You can times and times *infinitely*. I don't know that your algorithm is `5 * 3 * 6 * 3 * 10 * 3 * 6....`, but I have your input (say `2983365892`) and output (say `268102931614472333845596`) and can simply derive a new algorithm `output = input * 89865923698263` and even though I didn't "brute-force" your algorithm, I have broken it because I can now create my own inputs and outputs which match your algorithm. – Pacerier May 14 '14 at 07:38
  • @Pacerier A secret hash algorithm is never secure by virtue of being secret: it makes no sense since, in order to use the system at all, you must provide a way to use it and, thus, make it non-secret. At best, you can restrict the use of that system so that limit the risks of leaks but as soon as it's leaked, you whole security system crumbles – Stephane May 14 '14 at 07:53
  • @Stephane, Yes when it's leaked it crumbles. But that's not answering the question. The question is asking with a set of inputs and a set of outputs, is the algorithm "hash function mixing" breakable? – Pacerier May 14 '14 at 08:16
  • @Pacerier I'm afraid you can't have it both way: you can't say we're not answering the original question when you change it yourself so that it makes (some) sense to you. It's still non-sensical. You've established (correctly) that the naive answer is "yes, given the conditions of the proposition, a simple algorithm can be found matching both input and output" but, in this forum, it makes a lot of sense to point out that the answer to the implied question is "no, it makes no security sense to design a system that way" – Stephane May 14 '14 at 08:27
  • @Stephane, I don't see why you think the question doesn't make sense. Neither have I established an answer to the question. I have stated infinitely doing the *times* operation is weak, but the question is not that. Do you equate a simple *times* operation with that of a hash function? – Pacerier May 14 '14 at 08:34
  • 1
    It doesn't make sense because, as you explained, a simple lookup table will answer the literal question (if you take abstraction of the incorrect use of "brute forcing" anyway). – Stephane May 14 '14 at 08:46
  • @Stephane, How would you obtain *the* lookup table in the first place then? The attacker obviously don't have all the inputs and outputs. He only have a portion of it. So with this set of inputs and outputs, how would you get the lookup table? – Pacerier May 14 '14 at 09:43
  • 1
    That is in the question: you have both the input and the output of the function. The question could be translated as "given two points on a plane, can you find a path between these two points". Of course, that question makes little sense so you have to extrapolate – Stephane May 14 '14 at 09:51
  • let us [continue this discussion in chat](http://chat.stackexchange.com/rooms/14510/discussion-between-stephane-and-pacerier) – Stephane May 14 '14 at 10:01
  • @Stephane. Yes, let's continue there. – Pacerier May 29 '14 at 15:39
1

It will be more difficult to brute-fore a hash without knowing the hash function used -- you'll have to guess both the hash function and the data to hash. However, there's no reason to keep the hash function secret.

First, it's usually quite apparent from the length of the hash (or from setting up a dummy account and trying many obvious possibilities) what algorithm was used to generate it. It's generally a bad idea to modify existing hash functions to "roll your own" hash function, as (1) its unnecessary, (2) will be a nightmare to maintain across various systems, (3) if done wrong can drastically weaken the security, and (4) how the hash function is calculated is going to be in the source code/executable and a determined attacker who compromises the list of hashes, may also be able to get their hands at the source code/executable and be able to reverse engineer the hash function from there.

If you really want to go down the route that the hash should not be brute-forceable (until the compromise the application source code or executable), you shouldn't use a hash but should use a MAC (Message Authentication Code). You can build a MAC out of a hash function using the HMAC construction: HMAC(password) = H(K1 ++ H(K2 ++ password)) where H is a hash function, K is a secret key (stored in the application computing the HMAC), K1 = K XOR 0x5c5c...5c and K2 = K XOR 0x3636...36 are two different secret keys derived from K (where 0x5c5c...5c and 0x3636...36 is a block of the byte 0x5c or 0x36 repeated with the same length as the secret key). So if you have a 256-bit secret key and use SHA-256 as your hash function, there are about 2256 potential HMAC keys (~1077 -- for comparison there's only been about 1026 nanoseconds since the big bang) an attacker would have to iterate through before they find the K used (if they in fact find the correct K and not an unlikely collision that works for one specific password but not others).

MACs are usually used to verify message integrity (in that only someone with the secret key can generate a valid MAC) and not for storing passwords/authenticating users, but there's no reason they couldn't be used for this purpose. The problem is the same -- you do not want someone who obtained a message and the associated hash (MAC), be able to create new hashes (MACs) for other messages (as part of a brute-force attack). MACs are believed to have security against this attack; while choosing some random complicated scheme like SHA1(SHA512(pw)++SHA256(MD5(pw)++MD4(pw))) is in principle much easier for an attacker to guess than a random 256-bit key (and more likely to be leaked from some sort of side-channel attack).

Granted using a MAC is probably overkill. If you use strong passwords and a modern key-strengthened salted hash (bcrypt, PBKDF, SHA512crypt) then brute-forcing is simply not feasible.

dr jimbob
  • 38,936
  • 8
  • 92
  • 162
0

Given that the attacker has a single known input and output he can reverse engineer the hashing algorithm with brute force. Varying the hashing algorithm for each password makes it a lot more difficult for him but it is still not impossible.

Daniel
  • 101
0

The hashing algorithm is guessable, but this is not the way how you can secure your passwords in the database. If somebody can steal your entire database, then I assume the problem is much wider in your system as you think, and he will got your source code as well... From there it is not a long step to find the hashing algorithm you use.

What you should really do is using a real password hashing algorithm, which is intentionally very slow. So forget sha1 or md5, and use bcrypt (if not possible, than use the previous ones with salt and with the currently recommended iteration count).

inf3rno
  • 479
  • 1
  • 7
  • 19