
This question is a fork from a previous question here: Is it safe/wise to store a salt in the same field as the hashed password?

Assume you run a web portal and store passwords as SHA1 hashes. How do you upgrade these to BCRYPT hashes instead?

Typically, you'd wait for users to log on, and re-hash their passwords (from the plaintext they entered) to BCRYPT. But it could be a while before all users log on to your platform, and there will always be inactive users who will never come back.

A separate proposal (which I first heard from Troy Hunt) was to BCRYPT the existing SHA1 hashes in your database. Effectively you'd be storing BCRYPT(SHA1(plaintext_password)). This way all users on the system get upgraded to BCRYPT at once, regardless of their activity.

This way, a breach of your database doesn't expose users who haven't logged in yet and are still on SHA1.

The question is:

  1. Is BCRYPT(SHA1(plaintext_password)) equivalent in security to BCRYPT(plaintext_password)?
  2. If not, why? And is the gap small enough to consider this option?

The question focuses on BCRYPT(SHA1) but could easily apply to any two hash algorithms with the stronger one being applied last.

keithRozario
  • Be careful when combining hashes - unexpected side effects like this can occur: https://blog.ircmaxell.com/2015/03/security-issue-combining-bcrypt-with.html – Royce Williams Apr 10 '18 at 04:37
  • Thanks @RoyceWilliams, good write-up. You've pointed out some real-world examples of how this can be bad, but I can't help wondering: is the danger of this happening high enough that keeping a subset of users on SHA1 is a worthwhile trade? I'm not arguing that BCRYPT(SHA1()) is better than BCRYPT(), but from a practical migration perspective, BCRYPT() is just not feasible. – keithRozario Apr 10 '18 at 05:09
  • https://blogs.dropbox.com/tech/2016/09/how-dropbox-securely-stores-your-passwords/ is a great example of password security – exussum Apr 10 '18 at 11:38
  • While I applaud your attitude and concern for security, no *system* (e.g. a web/mobile application with associated backend and persistence layer) is more secure than the weakest link in the chain. Be sure you don't have lower hanging fruit than how hard your hashes are to crack if your DB is stolen. – Jared Smith Apr 10 '18 at 12:54
  • Would it be possible to do a combined hash like BCRYPT(SHA1()) (or some combination like what the answers recommend) for all users, then change to BCRYPT() when they log in? Then, any shortcomings or flaws in the combination will be mitigated by logging in, but you have the ability to use a stronger combination in the meantime. – mbomb007 Apr 10 '18 at 14:42

2 Answers


First of all, thank you for taking the time to determine how to do this correctly and improve security for your users!

Migrating password storage while taking legacy hashes into account is relatively common.

For your migration scenario, bcrypt(base64(sha1(password))) would be a reasonable balance. It avoids the null problem (important - you definitely don't want to leave out the base64 stage!), sidesteps bcrypt's native 72-byte input limit, and is 100% compatible with your existing hashes.
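A minimal sketch of the pre-hash stage, using only the Python standard library. The names `prehash` and `find_password_with_nul_digest` are made up for illustration, and the final bcrypt call is left out because it depends on the bcrypt library you use. The point is only to show why the base64 step matters: raw SHA-1 digests can contain NUL bytes, while their base64 encoding cannot.

```python
import base64
import hashlib


def prehash(password: str) -> bytes:
    """Return base64(sha1(password)), the value that would be passed to bcrypt."""
    raw_digest = hashlib.sha1(password.encode("utf-8")).digest()  # 20 raw bytes
    return base64.b64encode(raw_digest)  # 28 ASCII bytes, never contains NUL


def find_password_with_nul_digest(limit: int = 100_000) -> str:
    """Search toy passwords for one whose raw SHA-1 digest contains a NUL byte."""
    for i in range(limit):
        candidate = f"password{i}"
        if b"\x00" in hashlib.sha1(candidate.encode("utf-8")).digest():
            return candidate
    raise RuntimeError("no candidate found within limit")


if __name__ == "__main__":
    risky = find_password_with_nul_digest()
    # The raw digest has a NUL byte, which some bcrypt implementations
    # treat as end-of-input; the base64 form does not have this problem.
    print(risky, prehash(risky))
```

Roughly one raw SHA-1 digest in thirteen contains at least one NUL byte, so this is not a rare corner case.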

In the basic case, you would simply hash all existing SHA1s with bcrypt(base64(sha1)), and then hash all new passwords with the full sequence. (You could also use SHA256 for new passwords instead, though this would increase your code complexity slightly, since you would need to check whether SHA1 or SHA256 was used, or just try them both and pass if either succeeds. Long term, SHA256 would be more resistant to collisions, so it would be a better choice.)
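As a rough sketch of why one verification path can serve both migrated and new users: the value fed to bcrypt during the bulk migration (derived from the stored hash) is identical to the value derived from the plaintext at login. This assumes the legacy column holds unsalted SHA-1 digests as 40-character hex strings and that passwords were encoded as UTF-8; neither is stated in the question, and both function names are hypothetical.

```python
import base64
import hashlib


def bcrypt_input_from_legacy_hex(sha1_hex: str) -> bytes:
    """Bulk migration: turn a stored hex SHA-1 into the bytes to feed bcrypt."""
    raw = bytes.fromhex(sha1_hex)  # raises ValueError on malformed hex
    if len(raw) != 20:
        raise ValueError("expected a 20-byte SHA-1 digest")
    return base64.b64encode(raw)


def bcrypt_input_from_password(password: str) -> bytes:
    """Login or new signup: derive the same bytes from the plaintext password."""
    return base64.b64encode(hashlib.sha1(password.encode("utf-8")).digest())


if __name__ == "__main__":
    legacy = hashlib.sha1("hunter2".encode("utf-8")).hexdigest()
    # Both routes produce the same bcrypt input for the same password.
    print(bcrypt_input_from_legacy_hex(legacy) == bcrypt_input_from_password("hunter2"))
```

If the legacy hashes were salted, the salt would have to be mixed in at login exactly as the old scheme did it.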

For resistance to bruteforce, this would not only be equivalent to bcrypt, but would be theoretically superior (though in practice, 72 bytes is so large for password storage that they are effectively the same).

Bonus advice:

  • Be sure to use a bcrypt work factor that is high enough to be resistant to offline attacks - the highest value that your users can tolerate, 100ms or higher (probably at least a work factor of 10). For speeds under 1 second, bcrypt may actually be slower for the attacker (better for the defender) than its modern replacements scrypt and Argon2 (YMMV).
  • Store the default value for the work factor as a system-wide configurable variable, so you can periodically increase it as your underlying hardware speeds get faster (or more distributed).
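One way to pick the work factor is to time a single hash at increasing costs on your production hardware and take the first cost that meets your target. The sketch below does that with a generic callable. `calibrate_cost` is a made-up helper, and `standin_slow_hash` uses PBKDF2 from the standard library purely as a stand-in so the example runs without third-party packages; it is not bcrypt, and its timings will not match bcrypt's.

```python
import hashlib
import time
from typing import Callable


def calibrate_cost(hash_at_cost: Callable[[int], object],
                   target_seconds: float = 0.1,
                   min_cost: int = 4,
                   max_cost: int = 20) -> int:
    """Return the lowest cost whose single-hash time meets target_seconds."""
    for cost in range(min_cost, max_cost + 1):
        start = time.perf_counter()
        hash_at_cost(cost)
        elapsed = time.perf_counter() - start
        if elapsed >= target_seconds:
            return cost
    return max_cost  # target not reached; fall back to the upper bound


def standin_slow_hash(cost: int) -> bytes:
    """Stand-in for bcrypt: PBKDF2 with 2**cost iterations (NOT bcrypt)."""
    return hashlib.pbkdf2_hmac("sha256", b"benchmark-password",
                               b"benchmark-salt", 2 ** cost)


if __name__ == "__main__":
    print("chosen cost:", calibrate_cost(standin_slow_hash))
```

With the third-party `bcrypt` package you could pass something like `lambda cost: bcrypt.hashpw(b"x", bcrypt.gensalt(rounds=cost))` instead of the stand-in. Store the result in configuration rather than hard-coding it, and re-run the measurement when hardware changes.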

UPDATE: Note that in general, wrapping a fast hash inside a slow hash is an anti-pattern (that should be reserved for migration or temporary purposes only). See my answer here for more information. In a nutshell, fast hashes leaked from other sources, even if not yet cracked themselves, can be tried as candidates against the slow hash; once one matches, the underlying fast hash can be attacked directly and much faster. This is true both for targeted attacks against high-value users and against larger sets of target hashes. This attack is called 'hash shucking' and is well-known to advanced password-cracking researchers. This is definitely non-intuitive, so please see my other answer for extended discussion, rather than attempting to discuss it here.
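To make the shucking idea concrete, here is a toy illustration. It uses PBKDF2 with a low iteration count as a stand-in for bcrypt so that it runs with the standard library alone; the function names, the salt, and the sample passwords are all invented for the example. The attacker never guesses the password: they only test SHA-1 hashes leaked elsewhere against the wrapped hash.

```python
import base64
import hashlib
from typing import List, Optional


def standin_slow_hash(data: bytes, salt: bytes) -> bytes:
    """Stand-in for bcrypt (PBKDF2, few iterations to keep the demo fast)."""
    return hashlib.pbkdf2_hmac("sha256", data, salt, 1_000)


def wrap(sha1_hex: str, salt: bytes) -> bytes:
    """slow(base64(sha1)) as the migrated site would store it."""
    return standin_slow_hash(base64.b64encode(bytes.fromhex(sha1_hex)), salt)


def shuck(stored: bytes, salt: bytes, leaked_sha1_hexes: List[str]) -> Optional[str]:
    """Return the leaked SHA-1 that sits inside the wrapped hash, if any."""
    for candidate in leaked_sha1_hexes:
        if wrap(candidate, salt) == stored:
            return candidate  # slow layer removed; attack this at SHA-1 speed
    return None


if __name__ == "__main__":
    salt = b"per-user-salt"
    victim_sha1 = hashlib.sha1(b"a long, not-yet-cracked passphrase").hexdigest()
    stored = wrap(victim_sha1, salt)
    leaked = [hashlib.sha1(b"unrelated").hexdigest(), victim_sha1]
    print(shuck(stored, salt, leaked))
```

Each candidate costs the attacker one slow hash, but the candidate list is only as long as the set of leaked hashes, and every hit can then be attacked at the fast hash's speed.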

Royce Williams
  • I think that perhaps the null problem ([Security Issue: Combining Bcrypt With Other Hash Functions](https://blog.ircmaxell.com/2015/03/security-issue-combining-bcrypt-with.html)) deserves more emphasis so that users looking for a quick answer won't be tempted to remove the base64 encoding stage. – Andrew Morton Apr 10 '18 at 08:02
  • I haven't read through the links, but this answer itself doesn't really answer the question(s). – xehpuk Apr 10 '18 at 09:39
  • are scrypt or argon2 considered mature enough for production use yet, or is that point about relative hardness mostly theoretical at present? – Dan Is Fiddling By Firelight Apr 10 '18 at 12:47
  • @DanNeely, maturity is subjective. :) scrypt came out 9 years ago. It does have limitations that Argon2 and yescrypt are intended to address. None of these are in *wide* use for password hashing, but experts agree that they're all very good - though for ~sub-1s speeds, bcrypt is more resistant to brute force (see https://twitter.com/AaronToponce/status/982250310910410752). As long as you use a mainstream implementation (don't roll it yourself) and use the highest work factors you and your users can stand, they are all fine choices. – Royce Williams Apr 10 '18 at 13:31
  • Note also that performance at sub-1s speeds varies by platform - and will vary over time. So always test for your environment, and be ready to upgrade the work factor over time (platforms like Dropbox already do this!) – Royce Williams Apr 10 '18 at 13:40
  • @RoyceWilliams Putting my question into a bit more context, I remember a few years ago the advice here was to use bcrypt (or if required for regulatory reasons pbkdf2) because scrypt hadn't been around long enough for cryptographers to feel confident that serious problems with either the algorithm or baseline implementations wouldn't turn up. – Dan Is Fiddling By Firelight Apr 10 '18 at 14:00
  • @DanNeely Totally understood - still good advice. I would expect large, sophisticated orgs to be exploring scrypt already, and Argon2 soon. Small orgs might want to wait and see - but they've both been heavily scrutinized, and have held up well so far. – Royce Williams Apr 10 '18 at 18:04

This will limit the keyspace to roughly 2^160, as SHA-1 outputs 20 byte digests, but it is otherwise fine. In fact, bcrypt has an input limit of 72 bytes, so it is not at all uncommon for people to hash a password first using a fast, cryptographically secure hash. Because the limit is larger than the digest size of SHA-1, you may instead want to use a hash with a larger digest. Note that the null problem requires the digest be converted into another encoding, such as base64.
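The sizes involved are easy to check. A 20-byte SHA-1 digest is 20 × 8 = 160 bits, hence the 2^160 figure. The sketch below, with made-up helper names, measures how long each digest becomes after base64 encoding and whether it still fits in bcrypt's 72-byte input.

```python
import base64
import hashlib

BCRYPT_INPUT_LIMIT = 72  # bytes


def encoded_length(algorithm: str) -> int:
    """Length in bytes of base64(digest) for the given hashlib algorithm."""
    digest = hashlib.new(algorithm, b"example").digest()
    return len(base64.b64encode(digest))


def fits_in_bcrypt(algorithm: str) -> bool:
    """True if the base64-encoded digest fits within bcrypt's input limit."""
    return encoded_length(algorithm) <= BCRYPT_INPUT_LIMIT


if __name__ == "__main__":
    for name in ("sha1", "sha256", "sha384", "sha512"):
        print(name, encoded_length(name), fits_in_bcrypt(name))
```

SHA-1, SHA-256 and SHA-384 encode to 28, 44 and 64 bytes respectively and all fit. A base64-encoded SHA-512 digest is 88 bytes, so it would exceed the 72-byte limit, which is worth keeping in mind when choosing a larger digest.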

If you already have a database full of SHA-1 hashes, it is perfectly fine to run them through bcrypt.

forest
  • This answer completely ignores the [null problem](https://blog.ircmaxell.com/2015/03/security-issue-combining-bcrypt-with.html) mentioned in other answers. TL;DR: bcrypt implementations may truncate the input if any zero bytes occur in the input stream. – Ben Apr 10 '18 at 15:19
  • @Ben That's an implementation issue. I was assuming that the input hash would be in base64 or another similar encoding. I will update my answer to clarify. – forest Apr 11 '18 at 01:15
  • If legacy SHA1 hashes are stored in the DB they are most likely ASCII encoded already, so not affected by the NULL problem. – eckes Apr 19 '18 at 02:08