13

So when it comes to security, when I have an idea that seems good, but no one else seems to be doing, I try to assume that I'm overlooking something obvious or otherwise significant. This is one such case...

The context of this question is an authentication system that I'm beginning work on. I've implemented similar systems in the past, and generally have my head around the "salt and hashing" thing, but I'm by no means an expert in cryptographic security.

As I was in the process of defining the storage for 16 random bytes of salt on my user records, it occurred to me that there may be a way to reduce the attack surface in the event that someone has stolen or otherwise accessed my user data. I'm operating off of the following assumptions:

  1. If a hacker steals my user data, they have access to a salted + (repeatedly) hashed representation of the user's password, and the salt value itself.
  2. If the hacker wants to determine the plain text value of that password, they have some bruteforce labor in front of them.
  3. However, given the salt, the search space of that brute force attack is greatly reduced.*

*I'm not 100% sure that point three is correct, but it makes sense to me that if a brute force algorithm can assume that the pre-hash value consists of <8-or-so-bytes-of-password> + <16-bytes-from-stolen-salt> (or similar combination of the password and salt values), then we've reduced the search space by algorithmically significant quantities.

<obligatory mid-question apology for the length of this question here>

So with the above being my current line of thinking, the solution that I'm considering essentially boils down to using known, constant values about a user to dynamically generate a salt for use during the hashing step.

Given that salt values don't really need to be cryptographically "anything", so long as they are unguessable to the point of requiring bruteforce discovery, is it sufficient to essentially take some transformation of, say, the provided username and password (e.g., apply some SHA-2 goodness and proprietary ciphers), and use that in the hashing step?

Of course if someone were to steal the code as well, the "secret" salt generation is pointless. The only added security I'm attempting to achieve is forcing the hacker to steal both my data and my code if they want the convenience of the reduced search space described above in point 3.

I hope I've adequately (and not too verbosely) described my question and reasoning. The specific question I'm hoping to have answered is whether or not this is a cryptographically (or otherwise) secure means of storing and comparing user passwords.


Edit: It appears the key points of my question may have gotten lost in the noise above. Let me try to be more explicit:
  1. Does having access to the salt used in generating a hashed value substantially aid a brute force attack on discovery the actual password?
  2. If so, does it therefore make sense trouble the hacker with having to steal my code in addition to the data itself?
  3. Is the general approach of dynamically generating the salt (using reproducible, non-random, but unguessable means) ill-advised?
AviD
  • 72,708
  • 22
  • 137
  • 218
jmar777
  • 233
  • 1
  • 5
  • I think you might want to look at: http://stackoverflow.com/questions/1753028/what-is-best-possible-way-of-salting-and-storing-salt and http://chargen.matasano.com/chargen/2007/9/7/enough-with-the-rainbow-tables-what-you-need-to-know-about-s.html – NotMe Sep 15 '11 at 15:33
  • @Chris - thanks, that's a good read. To the best of my knowledge I'm already applying most everything described in there (heavy-weight hashing, time complexity, stretching, etc.) with the exceptions of being dumb enough to think I'm smart enough to build my own system, and of course the variation on salt handling described above. Is there anything in that article that specifically addressed my question? –  Sep 15 '11 at 15:43
  • Two things: First, the idea that you don't need to take any additional steps to secure the hash as that's a waste of time. Second, that the hash should have zero relevance to the data it's protecting. – NotMe Sep 15 '11 at 15:46
  • My question doesn't involve securing the hash though (perhaps I didn't describe it well enough). The question revolves around whether or not it's advantageous to *not* store the salt at all in the database, on the reasoning that the salt could substantially aid a brute force attack against the hash itself. –  Sep 15 '11 at 15:50
  • Sorry.. meant to write "steps to secure the salt" – NotMe Sep 15 '11 at 16:46
  • Ahh, I see (trying to decide if I agree). Specifically, is my concern invalid that having the salt aides an attempt to brute force the password from the hash? –  Sep 15 '11 at 16:52
  • Besides @ThomasPornin's excellent answer, see my question on SO from a couple years ago: http://stackoverflow.com/questions/536584/non-random-salt-for-password-hashes – AviD Sep 25 '11 at 00:17

3 Answers3

16

You are somewhat mistaken about the role of the salt. It is not meant to be "unguessable by the attacker". If it was meant to be unguessable, it would be a piece of confidential data, and we would call it a "key", not a "salt".

The salt is there to make the password derivation process unique, in order to prevent an attacker from making economies of scale when attacking several passwords. If the attacker tries to guess three passwords, it must cost him three times as much as attacking one. One big fat example of "economy of scale" is a big table of precomputed hashes for common passwords (or its evil offspring, the rainbow table, which is just an enormous precomputed table with a trick to keep it merely huge). In the presence of a salt, there is not one hash function, but one per salt value; or, said otherwise, a precomputed table would be specific to a given salt (the one used during the computation) and worthless for attacking any password which does not use that salt.

Of course, if an attacker could get a copy of the hashed passwords but not the corresponding salts, his task becomes very hard indeed. But this is not a plausible scenario: if you have a storage area where you can put the salts out of reach of the attacker, then why did you not store the hashed passwords there too ? So the security analysis assumes that the salts are known to the attacker. Their usefulness stems from their uniqueness: no two distinct hashed passwords, on the same or different systems, at the same or different times, have the same salt value (not even two successive passwords for the same user: when you change the password, you also change the salt).

You can generate and store salts in any way that you see fit, even with a "dynamic generator", as long as the probability of reusing a salt value is negligible. This is so precisely because they are supposed to be public and need not be "unguessable" by the attacker -- unguessable things require heavy artillery.

There is a slightly related concept which has sometimes been called "pepper": a secret key which is also inserted into the hashing process (see this previous question). That key is shared system-wide. The idea would be that a unique key is easier to keep confidential than a whole database of hashed passwords, at least in some configurations (e.g. you keep it in a private configuration file, not the SQL database). The "pepper" concept implies a more complex management (one more thing to backup and not lose, or share between front-end servers) and is worth anything only in the fringe scenario where an attacker can dump the user database but not the complete host system. In any case, the "pepper" does not realize the uniqueness property that the salt provides, so the pepper must not replace the salt -- only possibly complement it.

Thomas Pornin
  • 322,884
  • 58
  • 787
  • 955
  • Good answer, but salts aren't just about economies of scale - they also prevent precomputation. – Nick Johnson Sep 16 '11 at 01:01
  • 3
    Precomputation _is_ an economy of scale. You do _one_ exhaustive search (for table building) and then attack _several_ passwords for negligible marginal cost. – Thomas Pornin Sep 16 '11 at 01:59
  • 1
    It's not entirely the same. Salting with the username (a terrible practice, yes) would prevent economies of scale (or at least reduce them) but still permit precomputation. Using a fixed salt (another terrible practice) would prevent precomputation but still permit economies of scale. – Nick Johnson Sep 16 '11 at 03:08
  • @Thomas - Thank you for the detail in this answer. I understand your point about economies of scale. What I'm struggling with is that it seems to me that having the salt readily available also lends itself to economies of scale (not in the form of precomputation, but in the same way that a fixed/static salt would (and as described by @Nick)). *cont...* – jmar777 Sep 16 '11 at 13:37
  • Specifically, let's say that the range of valid password characters is 80. If I know that all passwords use the same salt, my labor is reduced from 80^24 iterations (for an 8 byte password and 16 byte salt) to only 80^8. It just seems that having salt values readily available, even if they're unique per password, has the same effect. Is this not the case? Or, perhaps, is that indeed the case, but the iterations aren't as important as I'm making them? – jmar777 Sep 16 '11 at 13:39
  • I think everyone is saying that using a salt negates the use of 'ALREADY' compiled rainbow tables, and require attackers to generate new ones based on the salt. I.E. I can't go to bittorrent for the latest MD5 table of precomputed hashes, I have to build my own that takes into account a salt value. – MToecker Sep 16 '11 at 15:23
  • 1
    @MToecker: yeah but it's a waste to build a table in that case, because it will work only once (for _that_ specific salt value) and building the table is more expensive than a single exhaustive search of the same password space. Properly _unique_ salts utterly defeat precomputed tables, rainbow or not. – Thomas Pornin Sep 16 '11 at 16:20
  • @Thomas - agreed on the effectiveness of a unique salt in rendering precomputed tables useless. Should I or should I not be bothered, though, by the greatly reduced order of complexity that having the unique salt associated with a hash provides a brute force algorithm (i.e., the 80^24 --> 80^8 reduction mentioned above)? – jmar777 Sep 16 '11 at 17:47
  • @jmar777: if the salt is secret then it is not a salt. It is something else. You can have that too (the "pepper") but it is not the same functionality. I suggest you do not try to merge them: keep a salt stored with the hashed password for uniqueness, _and_ a secret "pepper" (shared between hashed password) stored elsewhere. "Dynamically generated secret salts" are another name for "I am doing the salt+pepper mix myself instead of letting my password hashing function do its job". – Thomas Pornin Sep 16 '11 at 18:10
  • @Thomas - I swear I plan on giving some upvotes and accepting like it's Christmas soon, so thank you for sticking with me. I still feel like my initial question hasn't been answered though. Trusting the experts and vetted utility classes for this stuff is sound advice, but that doesn't help me understand it any better. Providing a plain-text salt along with the hash is either; a) *not as useful to a hacker as I think it is*, or b) *a dismissible concern for other reasons that haven't yet been mentioned.* – jmar777 Sep 16 '11 at 18:20
  • 1
    @jmar777: it is: c) _not the right name for it_. If some part of the data is not stored along the hash then that part is not a salt. It has been called "pepper". Having part of the data not accessible to the attacker surely makes things harder for him, but it is not easy or free (keeping something secret always involves some extra management complexity). Most of my answer is about making a distinction between the "salt job" (uniqueness) and the "pepper job" (being unknown to the attacker). Secrecy is easier without uniqueness (one key vs 10000 keys), hence the distinction is useful. – Thomas Pornin Sep 16 '11 at 18:39
  • @ThomasPornin Thanks again. I'm starting to wonder how to actually get an answer to my question though. c) may indeed be true, but it doesn't really address 1), 2), or 3) in my question. This question is academic in nature and focuses on whether or not the proposed approach has negative implications for the cryptographic properties of the secured password. This has all been good input and I appreciate it - I just want to see 1), 2), and 3) explicitly addressed... – jmar777 Sep 21 '11 at 16:59
  • Just realized I never accepted this answer - sorry for the long overdue accept! Not sure I ever got my question resolved, but I definitely appreciate the time and thought you put into this. – jmar777 May 15 '12 at 16:08
3

This was a bit long for comments.

The salt does indeed give them a starting point.

However, it is typically a moot issue as the value of knowing a particular password is almost always only related to the total number of passwords they can reverse from your user table. In other words, if they have to spend literally years of computer time to pull out say 10 passwords then that data in effect has no value.

Lifting usernames and passwords from a site only has value because users tend to reuse the same password across multiple services. It's obviously not important to a hacker to log in as a user on the same site that they pulled the password from because if they were able to get the passwords from you then odds are great that they could pull any other data they wanted as well.

When selling those usernames and passwords they have value because those keys could open other locks such as facebook accounts and banking sites. The only other situation that they have value is in publicly shaming you by posting the usernames/passwords on a public forum, ala Sony.

So, if you are using a random salt per password then you are already increasing the time it takes to get anything of value to the point of silliness. Further the people with the resources necessary to crack all of those passwords in a reasonable amount of time are most likely governmental agencies... which is an altogether different problem.

Which leads me to my vote: find a different problem to solve.

NotMe
  • 696
  • 3
  • 11
  • Thank you for all the replies. I'd have to suggest though, that, depending on the site, a single password valid only on the targeted site can still be of great value to a hacker. With regards to your statement "if you are using a random salt per password then you are already increasing the time it takes to get anything of value to the point of silliness", that's actually part of the question itself. It certainly makes precomputation-based attacks pointless, but I'm hoping for some math proving that giving them the salt doesn't nullify the "silliness" of a brute force attack. – jmar777 Sep 16 '11 at 14:36
-1

An important Law of security is "stop trying to do your own crypto". Stop creating your own authentication systems and just use the standard systems. There's nothing particularly wrong with what you propose, but while asking it, you demonstrate that you don't fully understand the problem. It's like watching a carpenter hammer in a nail using a wrench. It's not so much that it wouldn't work as the fact you would distrust his competence.

If you want to make life harder for hackers, make as small a change as possible.

Let's simplify your question: What can I do to make cracking the passwords harder? Does forcing them to steal the source code help?

For example, using SHA-2 makes it harder for hackers. Their tools are optimized for SHA-1 and MD5, and a lot of tools don't work with SHA-2.

Or, if you really want a secret in the source code, the complex stuff you suggest wouldn't improve the situation more than simply adding "x" to front of the salt value. That way, the hacker could steal the salts and hashes, but wouldn't be able to do much with them without also knowing the source code.

Remember that some programmers are going to come after you and have to maintain your code. If it's some bizarre system, they are going to tear their hair out in frustration. If it's the normal authentication system, or a normal system with one minor change, then they won't curse your name.

Robert David Graham
  • 3,893
  • 1
  • 15
  • 14
  • Thanks for the comment. As I've stated already, though, I'm entirely comfortable with accepting that "trusting the experts and proven solutions" is sound advice. That's an answer to a different question though (e.g., "*should* I roll my own solution?"). I have an academic interest in this, however, and I've *very* explicitly outlined what my exact questions are, with regards to the proposed solution. (Also, apologies, as this isn't really targeted just as you, and I really do appreciate you taking the time to answer.) – jmar777 Sep 21 '11 at 16:54