55

Let's say I have a database with a bunch of users in it. This user database would typically have a hashed password per user. Would it be bad practice to prefix this hash with the hashing algorithm used?

For instance, instead of the hash aaf4c61ddcc5e8a2dabede0f3b482cd9aea9434d, I store sha1_aaf4c61ddcc5e8a2dabede0f3b482cd9aea9434d, since the method for making the hash is SHA1.

I noticed Argon2 does this, and I actually think it is quite convenient, because it makes it easier to partially migrate to newer hashing algorithms for newer users over time.

I don't use SHA1 in my production code for passwords. SHA1 was chosen randomly. This question is not about the use of the hashing method.

  • Comments are not for extended discussion; this conversation has been [moved to chat](https://chat.stackexchange.com/rooms/85699/discussion-on-question-by-mathias-lykkegaard-lorenzen-is-it-bad-practice-to-pref). – Rory Alsop Nov 13 '18 at 13:54

5 Answers

119

Many different password hashing systems do this. The type of hashing mechanism used should not be (and should not need to be) a secret. Kerckhoffs's principle says:

A cryptosystem should be secure even if everything about the system, except the key, is public knowledge.

So, if your system is properly secure, it should not matter if the hash mechanism is exposed.

schroeder
  • 13
    Why “should not be”? Should not _need_ to be secret is Kerckhoffs's principle, but as long as it doesn't cause you to compromise on the key safety it certainly can't cause harm either if the hash algorithm is secret too. – leftaroundabout Nov 08 '18 at 16:13
  • 9
    @leftaroundabout that's about Kerckhoffs's. Operationally and policy-wise, to classify the hashing algorithm as a secret is a problem. It attempts to provide security by obscurity, and it introduces control costs that outweigh the benefits. Not to mention the practical use case provided by Toby Speight. It's not about prepending the algo type, it's about the classification of the data about the algo type. – schroeder Nov 08 '18 at 16:35
  • 2
    _Classifying_ it as a secret is not necessary for it being secret. It may be secret incidentally, for performance reasons. – leftaroundabout Nov 08 '18 at 16:38
  • 1
    Difference between making it a secret and incidentally not disclosing it. In this case, the question is about the classification of that data, not about not disclosing it (although the context is about disclosing it). – schroeder Nov 08 '18 at 16:40
  • 19
    I was under the impression that adding "security by obscurity" is still a good thing. Systems *should* be secure, even if you know all of the details of the system, but adding obscurity adds security (to slow down attackers, to hide systems when 0-day bugs get found, etc). Is this fair? – Nathan Merrill Nov 08 '18 at 17:15
  • 15
    @NathanMerrill I would say that it should be taken on a case-by-case basis. In this case, I think the benefit of having the algorithm stored with the hash significantly outweighs the very slight security benefit you'd get by obscuring it. – Jeremy Nov 08 '18 at 19:04
  • I'll have to agree with @leftaroundabout. This interpretation of Kerckhoffs is, in my opinion, flawed in the same way as the common interpretation of Occam's Razor that says "the easiest solution is always right". Making an implementation detail a secret (or at least somewhat obscure) isn't bad as long as you do not rely on it as a security measure. It makes an inexperienced attacker's life a little bit harder, though. Many companies have their "secret sauce" a.k.a. trade secrets protected via legal means (which will work even if the secret is known) but _still_ they keep their secrets _secret_. – Damon Nov 09 '18 at 12:47
  • @Damon well, what I meant was that it's no good to bundle implementation details in plaintext with a cypher just to “adhere to Kerckhoffs's principle”. That doesn't buy you anything, unless it has independent advantages (like facilitating migration to another hash, but for that, prepending the hash's name in ASCII is not the only option and arguably a pretty hacky one). Deliberately publishing implementation details is only an _advantage_ if it's done in such a way that it gives you peer review (open source library etc.). If none of that applies, keep it simple and omit the algorithm name. – leftaroundabout Nov 09 '18 at 13:04
  • 1
    @NathanMerrill Kerckhoffs's principle is used to allow hardening systems. If everything but the key is open, it can be scrutinized comprehensively. If parts of the system aren't open, anyone trying to verify its security has to reverse-engineer those parts before they can evaluate them properly. – Deduplicator Nov 09 '18 at 14:05
  • 1
    @leftaroundabout: Exactly, that's what I am saying. Publishing a secret for the sake of "because Kerckhoffs!" is a wrong interpretation of that principle (in my opinion). – Damon Nov 09 '18 at 14:30
  • @Deduplicator that's true: You want it to be obscure to attackers, but not obscure to evaluators. There are definitely scenarios where this isn't possible. – Nathan Merrill Nov 09 '18 at 16:45
38

I agree with schroeder that this is OK to do. Even without a prefix, an attacker can probably figure out what algorithm you are using. And anyway, it is the strength of the hashing algorithm that protects the passwords, not the secrecy of the algorithm.

Note that many hashing algorithms and libraries already do this for you. They embed all the relevant information - algorithm, cost factors, salt, the actual hash - into one serialized string. See for instance the Modular Crypt Format or the PHP password_hash function. So don't go making up your own scheme. If you are using a decent hashing library, you already have one.
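To illustrate what such a serialized string carries, here is a minimal sketch in Python using only the standard library. The `pbkdf2_sha256` label, the `$` separator, the field order, and the iteration count are assumptions made for this example; it is not the Modular Crypt Format or the output of any particular library. In real code, prefer the encoded string your hashing library already produces.

```python
import base64
import hashlib
import hmac
import os

ALGORITHM = "pbkdf2_sha256"  # label invented for this sketch
ITERATIONS = 600_000         # illustrative cost factor; tune for your hardware


def hash_password(password: str, iterations: int = ITERATIONS) -> str:
    """Return one string holding algorithm, cost factor, salt and hash."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return "$".join([
        ALGORITHM,
        str(iterations),
        base64.b64encode(salt).decode("ascii"),
        base64.b64encode(digest).decode("ascii"),
    ])


def verify_password(password: str, stored: str) -> bool:
    """Re-derive the hash from the parameters embedded in the stored string."""
    algorithm, iterations, salt_b64, digest_b64 = stored.split("$")
    if algorithm != ALGORITHM:
        raise ValueError(f"unsupported algorithm: {algorithm}")
    salt = base64.b64decode(salt_b64)
    expected = base64.b64decode(digest_b64)
    actual = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, int(iterations))
    # Constant-time comparison avoids leaking how many bytes matched
    return hmac.compare_digest(actual, expected)
```

Because the cost factor and salt travel with the hash, verification needs no configuration beyond the stored string itself.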

Don't use SHA1 to hash passwords, though. That is not OK.

Anders
  • 1
    Yeah, I used SHA1 because the hash of "hello" was too long for an example in SHA512. – Mathias Lykkegaard Lorenzen Nov 08 '18 at 13:44
  • 20
    @MathiasLykkegaardLorenzen Don't use SHA512 either, unless it is just a component in a bigger scheme with more iterations. But of course for an example, anything goes! :-) – Anders Nov 08 '18 at 14:05
  • 6
    @MathiasLykkegaardLorenzen And don't just use multiple iterations either! Use a proven construct like PBKDF2, which uses HMAC (not just raw hash iterations). – forest Nov 09 '18 at 02:38
  • Is SHA512 considered insecure? I did not know that at this time. Can you reference an article stating why? – Mathias Lykkegaard Lorenzen Nov 09 '18 at 08:08
  • 15
    It's not insecure for what it was designed for (i.e. hashing data), but it was designed to be *fast*. You want your password hash function to be slow to stave off brute force attempts. – hlt Nov 09 '18 at 08:29
  • But even if it's fast, does it matter if the number of attempts that would be needed (given a salt and pepper are added) is still insanely high? – Mathias Lykkegaard Lorenzen Nov 09 '18 at 11:42
  • 4
    Yes, it still matters. Hashing functions are not meant for password storage due to their speed. With a single p3.16xlarge instance on Amazon, you can do 17.28 billion guesses per second on SHA512. That means that you can crack any 8-character password of letters and numbers in only 151.17 seconds. Key derivation functions are made for storing passwords as they are slow. With bcrypt on work factor 12 you can only do 424200 guesses per second. That would take 34.8 days per password. Additionally, bcrypt and argon2 have built-in salts, so you don't have to worry about salting the passwords. – FrederikNS Nov 09 '18 at 17:21
13

No, it's not bad practice, and arguably, you should keep this information.

As you observe, this allows you to change algorithm for new passwords whenever you like, without invalidating all users' existing passwords (doing that tends to make you unpopular, for some reason).
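A rough sketch of how a stored prefix enables that kind of gradual migration, in Python with the standard library only. The `sha1_` prefix is the one from the question; the `pbkdf2_sha256$iterations$salt$hash` layout, the function names, and the iteration count are made up for this example. The unsalted SHA-1 branch exists only to verify legacy entries, not as a recommendation.

```python
import hashlib
import hmac
import os

PREFERRED = "pbkdf2_sha256"  # label invented for this sketch
ITERATIONS = 600_000         # illustrative cost factor


def _pbkdf2(password: str, salt: bytes, iterations: int) -> str:
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations).hex()


def make_hash(password: str, iterations: int = ITERATIONS) -> str:
    salt = os.urandom(16)
    return f"{PREFERRED}${iterations}${salt.hex()}${_pbkdf2(password, salt, iterations)}"


def check(password: str, stored: str) -> bool:
    """Pick the verification routine from the prefix of the stored value."""
    if stored.startswith("sha1_"):
        # Legacy format from the question: kept only so old users can still log in
        candidate = hashlib.sha1(password.encode()).hexdigest()
        return hmac.compare_digest(candidate, stored[len("sha1_"):])
    if stored.startswith(PREFERRED + "$"):
        _, iterations, salt_hex, digest = stored.split("$")
        candidate = _pbkdf2(password, bytes.fromhex(salt_hex), int(iterations))
        return hmac.compare_digest(candidate, digest)
    raise ValueError("unknown hash prefix")


def login(password: str, stored: str, iterations: int = ITERATIONS):
    """Return (ok, new_stored); new_stored is a value to write back, or None."""
    if not check(password, stored):
        return False, None
    if not stored.startswith(PREFERRED + "$"):
        # The plaintext is available right now, so upgrade the entry
        return True, make_hash(password, iterations)
    return True, None
```

With this shape, a user whose entry still carries the old prefix is upgraded the first time they log in successfully, and nobody's existing password is invalidated.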

There's an argument that this allows an attacker to search out the older, weaker passwords to attack first, but you haven't reduced their security at all (assuming Kerckhoffs's principle, that the algorithm itself mustn't need to be a secret).

Toby Speight
  • 2
    If one can store a combination of algorithms and salts, there's no need to keep weaker passwords once stronger algorithms are available. If the current entry says that putting the password through weakAlgorithm with salt1 yields X, simply compute Y by putting X through a strongerAlgorithm with salt2 then change the entry to say that putting the password through the weakAlgorithm with salt1, and then putting the result of *that* through a stronger algorithm with salt2, will yield Y. No need to wait for the user to log in. – supercat Nov 08 '18 at 21:16
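The wrapping idea in the comment above could look roughly like this in Python. Salted SHA-1 and PBKDF2-HMAC-SHA256 are stand-ins chosen for this sketch for "weakAlgorithm" and "strongerAlgorithm"; the dictionary layout and the scheme names are likewise hypothetical.

```python
import hashlib
import hmac
import os


def weak_hash(password: str, salt1: bytes) -> bytes:
    # Stand-in for "weakAlgorithm": a single salted SHA-1
    return hashlib.sha1(salt1 + password.encode()).digest()


def strong_hash(data: bytes, salt2: bytes, iterations: int) -> bytes:
    # Stand-in for "strongerAlgorithm": PBKDF2-HMAC-SHA256
    return hashlib.pbkdf2_hmac("sha256", data, salt2, iterations)


def wrap_entry(entry: dict, iterations: int = 600_000) -> dict:
    """Upgrade a weak entry without the password: only the old hash X is needed."""
    salt2 = os.urandom(16)
    return {
        "scheme": "sha1+pbkdf2_sha256",
        "salt1": entry["salt1"],
        "salt2": salt2,
        "iterations": iterations,
        "hash": strong_hash(entry["hash"], salt2, iterations),  # this is Y
    }


def verify(password: str, entry: dict) -> bool:
    x = weak_hash(password, entry["salt1"])
    if entry["scheme"] == "sha1":
        return hmac.compare_digest(x, entry["hash"])
    if entry["scheme"] == "sha1+pbkdf2_sha256":
        y = strong_hash(x, entry["salt2"], entry["iterations"])
        return hmac.compare_digest(y, entry["hash"])
    raise ValueError("unknown scheme")
```

Since `wrap_entry` never sees the plaintext, it can be run as a batch job over the whole table; the recorded scheme tells `verify` which chain of steps to replay.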
9

It's good practice to store the hashing algorithm, but proper password hashing functions such as Argon2 and PBKDF2 already take care of this. It is however bad practice to use SHA1 or SHA256 alone for password hashing, because each guess costs an attacker only a fixed (and relatively small) amount of work, which makes dictionary and brute-force attacks cheap. Passwords hashed that way should be migrated to a secure hashing function by rehashing the password on the user's next login. See https://crypto.stackexchange.com/a/45405 for details.

leo v
1

It's not bad practice; it's actually a common thing to do. In fact, if you open the /etc/shadow file on a Linux machine you will see your hashed password prefixed with $x$salt$ where x is a number indicating which hashing algorithm was used.

Now a few things to consider:

  • do not use SHA1 directly as a password hashing algorithm. Use bcrypt, scrypt, or argon2 instead (which you did mention but this can't be repeated too much). These are built on simpler primitives (bcrypt on the Blowfish cipher, scrypt on PBKDF2-HMAC-SHA256, argon2 on BLAKE2b), but they run them in a special way that makes them resistant to brute-force attacks.
  • for practical reasons, since you're using a database and not a file, you might want to store the hashing method and salt in separate columns. However you can also choose to store the string starting with $x$salt$ in your database, and parse the salt and hashing method from that string at the time you check the password. It's probably fast enough that it makes no significant difference. And in both cases the security of your system will be the same.
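Parsing such a string at check time is cheap. Below is a small Python sketch that splits a `$x$salt$hash` value into its parts. It only handles the identifiers 1, 5 and 6 (MD5-crypt, SHA-256-crypt and SHA-512-crypt) plus the optional `rounds=` field used by the SHA variants; bcrypt strings use a different layout and are deliberately rejected. The `ShadowHash` type and the function name are invented for this example, and it does not compute or verify any hash.

```python
from typing import NamedTuple, Optional

# Identifiers handled by this sketch; other schemes are rejected
SCHEMES = {"1": "md5crypt", "5": "sha256crypt", "6": "sha512crypt"}


class ShadowHash(NamedTuple):
    scheme: str
    rounds: Optional[int]
    salt: str
    digest: str


def parse_shadow_hash(field: str) -> ShadowHash:
    parts = field.split("$")
    # The leading "$" produces an empty first element
    if len(parts) < 4 or parts[0] != "":
        raise ValueError("not a $id$salt$hash string")
    scheme_id, rest = parts[1], parts[2:]
    if scheme_id not in SCHEMES:
        raise ValueError(f"unsupported scheme id: {scheme_id}")
    rounds = None
    if rest[0].startswith("rounds="):
        rounds = int(rest[0][len("rounds="):])
        rest = rest[1:]
    if len(rest) != 2:
        raise ValueError("unexpected number of fields")
    salt, digest = rest
    return ShadowHash(SCHEMES[scheme_id], rounds, salt, digest)


# Made-up sample values, not real password hashes
print(parse_shadow_hash("$6$saltsalt$abcDEF./123"))
print(parse_shadow_hash("$5$rounds=5000$mysalt$xyz"))
```

Whether you keep one combined column or split the parts into separate columns, the information stored is the same; the combined string just keeps the schema unchanged when you add a new algorithm.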