Initial idea
bcrypt(sha1(password));
Problem #1 - Null Termination Problem
The first reason you don't want to do that is that SHA-1, like any hashing algorithm, outputs raw bytes. Many programming languages do not have proper string types, and instead simulate strings as a series of characters followed by a null (i.e. \0) terminator. If your hash digest contains a null byte, the bcrypt implementation might see the \0 character and assume that's the end of the string:
- bcrypt(sha1("fsdf3hgfh2faff32f"))
- bcrypt(96 87 0f 9e 71 ff 62 57 55 00 b6 5c 91 07 64 6f b5 81 13 a9)
And with C and PHP, if you blindly treated the digest as a "string", then your "string" would look like:
- bcrypt("–‡žqÿbWU\0¶\‘doµ©")
causing some bcrypt implementations to cut the input off at the \0 null terminator, so only the first nine bytes (96 87 0f 9e 71 ff 62 57 55) are actually hashed.
This is known as the null termination problem.
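The truncation can be simulated in a few lines of Python. This is only a sketch: c_string_view is a hypothetical helper that mimics how code relying on a null terminator would see the buffer. It is not part of any bcrypt library.

```python
# The 20-byte SHA-1 digest from the example above.
digest = bytes.fromhex("96870f9e71ff62575500b65c9107646fb58113a9")

def c_string_view(data: bytes) -> bytes:
    """Hypothetical helper: return what a null-terminated string API would see."""
    return data.split(b"\x00", 1)[0]

seen = c_string_view(digest)
print(len(digest))    # 20
print(len(seen))      # 9
print(seen.hex(" "))  # 96 87 0f 9e 71 ff 62 57 55
```

Everything after the tenth byte is silently ignored, so any two digests that share those first nine bytes would be treated as the same input.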
Solution
Your implementation may be immune to this, or it may not. So let's not tempt fate. You can pre-hash the password, but be sure to Base64-encode the digest first:
- bcrypt(base64(sha1("fsdf3hgfh2faff32f")))
- bcrypt(base64(96 87 0f 9e 71 ff 62 57 55 00 b6 5c 91 07 64 6f b5 81 13 a9))
- bcrypt("locPnnH/YldVALZckQdkb7WBE6k=")
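As a quick check, the encoding step can be reproduced with Python's standard library. The sketch below starts from the raw digest bytes shown above and only demonstrates the encoding; the bcrypt call itself is left out.

```python
import base64

# The 20-byte digest from the example above, including its embedded null byte.
digest = bytes.fromhex("96870f9e71ff62575500b65c9107646fb58113a9")

encoded = base64.b64encode(digest)
print(encoded.decode("ascii"))  # locPnnH/YldVALZckQdkb7WBE6k=
print(b"\x00" in digest)        # True
print(b"\x00" in encoded)       # False
```

Base64 output only ever contains printable ASCII characters, so the value handed to bcrypt can never contain a null terminator.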
Problem #2 - Hash Shucking
The next issue has to do with dictionary attacks.
An attacker isn't going to bruteforce every possible password:
- aaaaaaaa
- aaaaaaab
- aaaaaaac
- ...
Instead they're going to use dictionaries, previous password breaches, and passwords that follow the rules that certain stupid corporations insist upon (e.g. password complexity policies).
- hunter2
- password
- Tr0ub4dor&3
- 12345
- qazxsw
- zxcvbn
The whole point of bcrypt is that it is still slow to test all these dictionary words. But the fact remains that these lists exist, and they can dramatically shorten the search space.
But imagine there was a password database breach at another web-site, and (fortunately for the attacker) that web-site used SHA-1 to store all their passwords, and one of the breached SHA-1 hashes was:
96 87 0f 9e 71 ff 62 57 55 00 b6 5c 91 07 64 6f b5 81 13 a9
They don't know what the original password is, but at least it's something they can add to their dictionary list. And if your web-site does pre-hash with SHA-1, then suddenly they can try:
- bcrypt(base64(96 87 0f 9e 71 ff 62 57 55 00 b6 5c 91 07 64 6f b5 81 13 a9))
If it matches, it means that they have the SHA-1 hash of someone's password on your site. And since SHA-1 is so cheap to compute in hardware, they can now brute-force that bare SHA-1 hash offline, without paying the cost of bcrypt for every guess.
This problem is known as 'Hash Shucking'.
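The two steps of the attack can be sketched in Python. To keep the example runnable with only the standard library, slow_hash uses PBKDF2 as a stand-in for bcrypt; that substitution, the helper names, and the tiny word lists are all assumptions made for illustration.

```python
import base64
import hashlib
import os

def slow_hash(data: bytes, salt: bytes) -> bytes:
    # Stand-in for bcrypt (assumption): any deliberately slow, salted hash shows the idea.
    return hashlib.pbkdf2_hmac("sha256", data, salt, 10_000)

def prehash(password: str) -> bytes:
    # The unsalted pre-hash under discussion: base64(sha1(password)).
    return base64.b64encode(hashlib.sha1(password.encode("utf-8")).digest())

# Your site stores slow_hash(base64(sha1(password))) together with the slow hash's own salt.
salt = os.urandom(16)
stored = slow_hash(prehash("hunter2"), salt)

# SHA-1 digests leaked from some other site; the attacker does not know the plaintexts.
leaked_sha1 = [hashlib.sha1(p.encode("utf-8")).digest() for p in ("qazxsw", "hunter2")]

# Step 1: try each leaked digest as a dictionary entry (one slow hash per candidate).
shucked = [d for d in leaked_sha1 if slow_hash(base64.b64encode(d), salt) == stored]

# Step 2: attack the matching bare SHA-1 digest, where every guess is cheap.
wordlist = ["password", "12345", "hunter2"]
recovered = [w for w in wordlist if hashlib.sha1(w.encode("utf-8")).digest() in shucked]
print(recovered)  # ['hunter2']
```

The slow hash is only paid once per leaked digest in step 1. The actual password guessing in step 2 runs at plain SHA-1 speed.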
Solution
What you want to do is be sure to salt the password before you pre-hash it:
- bcrypt(base64(sha1(password+salt)))
This way the "password hash" will never appear in any other global database.
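A minimal sketch of the salted pre-hash in Python follows. The function name salted_prehash and the 16-byte random per-user salt are assumptions for illustration; the result is the value that would then be passed to bcrypt.

```python
import base64
import hashlib
import os

def salted_prehash(password: str, salt: bytes) -> bytes:
    # base64(sha1(password + salt)), ready to be handed to bcrypt.
    digest = hashlib.sha1(password.encode("utf-8") + salt).digest()
    return base64.b64encode(digest)

salt = os.urandom(16)  # hypothetical per-user salt, stored alongside the final hash
unsalted = base64.b64encode(hashlib.sha1(b"hunter2").digest())
salted = salted_prehash("hunter2", salt)

print(salted != unsalted)  # True
print(len(salted))         # 28
```

Because the salt is mixed in before SHA-1 runs, a plain SHA-1 digest leaked from another site no longer corresponds to anything your bcrypt hashes were computed from.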