40

I've read "Is my developer's home-brew password security right or wrong, and why?", but there's still a question in my mind:

Even if someone uses a bad, home-brewed security algorithm just like Dave, an attacker can't get the real password if the attacker only compromises the database, not the server. Mark Burnett's answer to Dave's question seems to prove my guess.

Again, take Dave's code as an example. Dave uses insecure/fast hash functions, md5 and sha1, so yes you can crack his password (get the plain text) from the database easily. But you still can't get the real password if his server is not compromised.

schroeder
Rick
  • @Anders Actually I feel quite confused when reading some highly voted password security questions. When they talk about "compromised" or "cracked", I don't know whether they mean that both the server and the database are compromised, or merely the database. – Rick Jul 01 '19 at 09:07
  • "yes you can crack his password (get the plain text) from the database easily. But you still can't get the real password if his server is not compromised". I don't understand this line. If you can crack his password, how do you not get the real password? – yeah_well Jul 01 '19 at 09:30
  • @VipulNair Because, like Dave, he's not storing the password hash directly, and you don't know what conversion he uses on the server. – Rick Jul 01 '19 at 09:32
  • I am disappointed in this thread. Everyone waxed eloquent about how homebrew is broken, but only 1 answer came close to explaining how a custom algorithm could possibly be reverse-engineered. And I still don't understand - I can make arbitrary transformations as part of my homebrew hashing. If cryptographers are so good at reverse-engineering homebrew transformations of data, they should help out data scientists dealing with badly managed data, figuring out what, say, a particular column actually was meant to be, in terms of the others. – Milind R Nov 23 '21 at 20:39

8 Answers

69

Yes, But..

To make it nice and clear... We're talking about a database-only compromise when an attacker has access to the database but not the application source code. In that case the attacker will get the password hashes but will be unable to crack them and get the original passwords because of Dave's custom algorithm. So in the case of a database-only breach, yes, Dave's password algorithm will protect passwords more than if he had used MD5 or SHA1.

However, that's only one possible avenue for system leaks. One key fact trashes the "math" that makes Dave's homebrew algorithm seem reasonable.

Half of all breaches start internally.

(sources 1 2 3) That is a very sobering fact once you let it sink in. Of the half of breaches caused by employees, half are accidental and half are intentional. Dave's algorithm can be helpful if all you are worried about is a database-only leak. If that is all you are worried about, though, then the threat model in your head is wrong.

To pick just one example, developers by definition have access to the application source code. Therefore if a developer gains read-only access to the production database they now have everything they need to easily crack the passwords. Dave's custom algorithm is now useless, because it relies on old and easy-to-crack hashes.

However, if Dave had used a modern password hashing algorithm and used both a salt and pepper, the developer who gained access to a database-only dump would have absolutely nothing useful at all.
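As a rough illustration of "modern hashing plus salt and pepper", here is a minimal Python sketch. It is not Dave's code and not a recommendation of exact parameters: `PEPPER`, `ITERATIONS`, and the function names are assumptions made up for this example, and PBKDF2 from the standard library stands in for bcrypt or Argon2 only because those need third-party packages.

```python
import hashlib
import hmac
import secrets

# Hypothetical pepper: in practice it would live in server configuration or a
# secrets manager, never in the database.
PEPPER = b"example-pepper-kept-outside-the-database"
ITERATIONS = 100_000  # illustrative work factor only; tune for real hardware


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest); both values are stored in the database."""
    salt = secrets.token_bytes(16)  # unique per user
    # Mix the pepper in with HMAC so the database alone is not enough to test guesses.
    peppered = hmac.new(PEPPER, password.encode(), hashlib.sha256).digest()
    digest = hashlib.pbkdf2_hmac("sha256", peppered, salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, digest: bytes) -> bool:
    """Recompute the digest for a login attempt and compare in constant time."""
    peppered = hmac.new(PEPPER, password.encode(), hashlib.sha256).digest()
    candidate = hashlib.pbkdf2_hmac("sha256", peppered, salt, ITERATIONS)
    return hmac.compare_digest(candidate, digest)
```

Someone holding only a database dump sees `salt` and `digest` but not `PEPPER`, so they cannot even check a guess; someone holding the pepper as well still has to pay the slow hash for every guess.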

That is just one random example but the overall point is simple: there are plenty of data leaks that happen in the real world where proper hashing would have stopped actual damage when Dave's algorithm could not.

In Summary

It's all about defense in depth. It's easy to create a security measure that can protect against one particular kind of attack (Dave's algorithm is a slight improvement over MD5 for protecting against database-only leaks). However, that doesn't make a system secure. Many real-world breaches are quite complicated, taking advantage of weaknesses at multiple points in a system in order to finally do some real damage. Any security measure that starts with the assumption "This is the only attack vector I have to worry about" (which is what Dave did) is going to get things dangerously wrong.

Conor Mancone
  • Comments are not for extended discussion; this conversation has been [moved to chat](https://chat.stackexchange.com/rooms/95782/discussion-on-answer-by-conor-mancone-isnt-daves-protocol-good-if-only-the-d). – Rory Alsop Jul 05 '19 at 18:51
33

This doesn't answer the question about Dave's protocol specifically, but I wanted to address the more general question, for the Daves around the world who are writing their own hashes. There are a few things, Daves, that you need to realize:

  1. You are not a cryptographer. That's not a slight against you; I'm not one, either. But even if you were a cryptographer, you'd have to be the best in the entire world to be certain that your algorithm had no flaws which could compromise security, because even the experts mess up a lot (all four words are separate links). Among other things, potential flaws in hashes include:
    • Accidental reversibility. Maybe you didn't mean it, but you put too much information into the "hash", and now it can be trivially reversed, even without brute-force. For an example of a "complex" algorithm which is nevertheless pretty easy to reverse, look at linear congruential generators.
    • Not enough complexity on CPUs, GPUs, ASICs, etc. This is surprisingly hard to do; there's a reason there's only, like, three libraries to do password hashing, and they're all based off the same ideas. Unless you're intimately familiar with how GPUs and ASICs work, you're most likely going to build something that can be run much quicker on GPUs than CPUs, instantly negating any other protections you have.
    • Too much complexity where you're actually running it, combined with the last point. It's very easy to point to your performance testing and say, "Look, it takes me 30 seconds to do 30 hashes, that's great!" Except you're, again, not a cryptographer or GPU dev, so you don't realize that your complex additions and multiplications can actually be replicated quite easily on GPUs, so they can crack 30 million hashes in 30 seconds, all the while DoSing your service by trying to log in more than once a second.
    • Insufficient uniformity. A theoretically perfect password hash function's output is indistinguishable from a true random number generator's, when fed varying input. In practice, we can't quite get there, but we can get incredibly close. Your algorithm might not. And no, "looking" random does not mean it's actually close enough; if you're inexperienced enough to be writing your own secret crypto for "better" security, you're inexperienced enough not to know how to spot true randomness.
    • Even if you build your algorithm entirely out of good, solid crypto primitives, you can still put them together wrong.
  2. You are not a cybersecurity programmer. There's probably a better word for that, but point is, you haven't specialized in writing code which correctly implements algorithms, even ones like your own. For a very brief list of possible issues which could be visible from the database alone, each of which is linked to the first Google result for "[item] attack":

And all that is just thinking exclusively about offline attacks on databases, where the thinking is done by a college student who isn't even majoring in cybersecurity. I guarantee you I've missed quite a few things. I've completely skipped over all the other attack vectors for MITMs, malicious clients, etc. I've also omitted mention of every error that could happen even if you used an off-the-shelf product; consider it an exercise for the reader to figure out how you could use even good crypto wrong. Finally, I've entirely omitted the common class of errors where the developer uses encryption where they should be using hashes, which I see occasionally.
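To make the "accidental reversibility" point from the list above concrete, here is a small Python sketch of a scrambler built from a linear congruential generator. The constants are the well-known "Numerical Recipes" LCG parameters, and the `scramble`/`unscramble` names are invented for this example; nothing here comes from Dave's actual code. The output looks random, yet every round can be undone exactly with a modular inverse.

```python
# Illustrative LCG parameters (the classic "Numerical Recipes" constants).
A = 1664525
C = 1013904223
M = 2**32

# The modular inverse exists because A is odd and M is a power of two.
A_INV = pow(A, -1, M)


def lcg_step(x: int) -> int:
    """One forward round: looks like it destroys the input."""
    return (A * x + C) % M


def lcg_unstep(y: int) -> int:
    """Exact inverse of lcg_step, no brute force needed."""
    return (A_INV * (y - C)) % M


def scramble(x: int, rounds: int = 1000) -> int:
    for _ in range(rounds):
        x = lcg_step(x)
    return x


def unscramble(y: int, rounds: int = 1000) -> int:
    for _ in range(rounds):
        y = lcg_unstep(y)
    return y
```

Adding more rounds does not help: `unscramble(scramble(x))` returns `x` for any 32-bit `x`, at the same cost as computing the "hash" in the first place.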

So, in sum, Dave, whenever you think you've got the best idea for a secret, internal hash to use for your production code and it isn't to use a standard, off-the-shelf, public, thoroughly tested product, remember this:

You don't.

Just use bcrypt. (Or Argon2)

(As a side note, if you're just building an algorithm for fun and/or self-education, feel free to ignore all of this. Building your own algorithm to protect passwords in production is dangerous because you'll build a weak algorithm that offers little to no protection. Building your own algorithm to see if you can break it is an excellent way to pass the time, stimulate your mind, and maybe even learn some crypto.)

Nic
  • I've got some horrors specifically designed to make an ASIC board designer worry, but I'm afraid to use them because allocating 5600k of RAM per password check opens up one easy DOS against my server. – Joshua Jul 02 '19 at 03:21
  • "Just use Bcrypt" - no, use Argon2. It has been the gold standard for quite some time now. – Polynomial Jul 03 '19 at 01:09
  • @Polynomial In the original post, Dave was trying to replace Bcrypt, which is why the original version of this post only referenced Bcrypt. Good point; I've added Argon2. – Nic Jul 03 '19 at 19:02
  • If you indent the line beginning "And all that is just thinking exclusively about offline attacks on databases..." with 4 spaces, it will be rendered as part of the numbered section above it, lining it up with that content. – jpmc26 Jul 04 '19 at 00:06
  • @jpmc26 It would, but I was trying to apply it to _both_ sections. All of the issues I mentioned in the first bit could leave the database, alone, crackable, directly or otherwise. If I expanded it to anything that could lead to an attack, this list would be _much_ longer... – Nic Jul 04 '19 at 00:40
  • @Polynomial The wikipedia article on Argon2 is pretty short and apart from the reference to the "Password Hashing Competition" which seems like it wasn't done by any big reputable organization. Not saying that makes it bad, but is there any detailed analysis by known experts I could look at? (Even a question on this very site [doesn't mention it prominently](https://security.stackexchange.com/questions/211/how-to-securely-hash-passwords). Hard to keep up to date with the ever changing algorithms and decide what's well tested enough to be used.. – Voo Jul 04 '19 at 14:19
  • @Voo That question is from 2010. Excluding the Community auto-edit to correct links, the last edit is from 2015, and that doesn't look to have been a content edit, only a grammar one. If you look at the edit history, you'll see that that information is really from 2013. A _lot_ can change in six years, especially in digital fields, especially in cybersecurity. I picked a newer question intentionally, because something from 2018 is less out-of-date than something from 2013, and _that_ question explicitly calls Argon2 the preferred choice. – Nic Jul 04 '19 at 15:29
  • @Nic Somehow missed your link in the answer, I blame my phone. – Voo Jul 04 '19 at 17:10
  • @Voo Argon2 has been heavily vetted by both academics and the security community. While a few cost-reduction attacks have been discovered in one of the modes, they in no way detract from the overall superiority of Argon2 over bcrypt. I would strongly recommend it for new designs (and design upgrades), unless you're in a situation where an scrypt implementation is natively available in your platform/framework of choice and Argon2 is not. I would not generally recommend bcrypt for new designs except for IoT devices with very little memory. – Polynomial Jul 04 '19 at 17:48
  • @Voo Just say it was compromised because it uses Dave's algorithm to store your password :) – Nic Jul 04 '19 at 19:15
  • @Polynomial I'm fairly sure that if you're trying to secure an IoT device with very little memory, you won't use a password hash database at all. You'll build the password functionality into the control app, using some key derivation function and then standard key-based crypto. Or at least, that's how I'd do it... – Nic Jul 15 '19 at 23:14
  • @NicHartley Not all devices interact with external systems. There are still common cases in firmware where you need to perform key derivation from a password, or store a password locally in a secure manner. – Polynomial Jul 16 '19 at 15:22
  • @Polynomial ...I'm not sure I understand what you mean. Can you give any examples of IoT devices which only ever, even indirectly, interact with other ultra-low-power devices? I ask because even industrial automation usually has some sort of central 'standard' computer, which could easily do the key-stretching in a much more brute force-resilient way than any embedded device. – Nic Jul 16 '19 at 16:09
  • @NicHartley They're not necessarily IoT, they can just be regular hardware. Standalone devices like hardware password managers come to mind. I've also seen IoT devices where a remote access password is stored locally as a hash and the incoming requests take that password and check it for validity - PBKDF2 is fine here, Argon2 or scrypt isn't feasible. – Polynomial Jul 16 '19 at 23:12
  • @Polynomial Ohh, I see what you mean -- right, that's a good point. Though in that case, hopefully whoever's making it will realize that no matter _what_ function they use, a PC can immediately crack it many times faster, just by virtue of not being a low-power device... As for "proper" IoT devices that take a password, I'm sure that's _done_, my point is only that it _shouldn't be_. I give you more credit than I give most IoT devs. Even in non-IoT embedded devices, an effort should be made to avoid using passwords. – Nic Jul 16 '19 at 23:16
10

In the case of a breach of the database and not the source code, Dave might have made things better compared to plain SHA1. But...

  1. The source code is likely to be leaked too, as Conor Mancone explains.

  2. The homebrew might screw up the hash, making it even less safe than just a plain SHA1. God knows how Dave's strange contraption interacts with the internals of the hashing algorithm. If nothing else, Dave has created a little maintenance hell for those coming after him, and big messes are never good for security.

  3. It gives a false sense of security. Had Dave not been so proud of his brilliant solution, he might have taken the time to read up on how to do password hashing properly. From the question, it is clear that Dave thinks what he has done is better than say bcrypt. It is not.

  4. The little extra protection given by the homebrew algorithm could have been achieved with a pepper instead. That is better than a homebrew algorithm in every possible way.

So yes, a homebrew might be better than SHA1 under some very specific circumstances. Whether it is better on average is an open question, but the answer doesn't really matter. The point is that it is terrible compared to a real password hashing algorithm, and that is exactly what the home brewing stopped Dave from implementing.

Long story short, Dave fucked this up.

Anders
  • I would argue that if adding a pepper means appending the pepper directly to the password and then hashing it, it's useless, and Dave's bad algo wins in this situation. – Rick Jul 01 '19 at 11:41
  • @Rick Why would it be useless? – Anders Jul 01 '19 at 11:44
  • Also, I don't understand why some highly voted answers are saying that appending a pepper directly to the password and hashing it can be more secure. You can simply increase the salt length if you merely concatenate the pepper and password, whether you append or prepend it. I think a pepper should be used as a secret key, for example, used along with `HMAC`. But like I commented under @Conor Mancone's answer, simply altering a bit would be enough. – Rick Jul 01 '19 at 11:49
  • @Rick Yes, a pepper should be secret, and not stored in the database. So it is protecting against exactly the situation you describe - database leaked, code not leaked. That is what a pepper is for, so you can't compare it to increasing the length of the salt. – Anders Jul 01 '19 at 11:51
  • @Rick I'm not taking any position on how to mix in the pepper - that is outside the scope of this question. – Anders Jul 01 '19 at 11:53
  • @Rick Question remains: Why would a pepper be useless? – Anders Jul 01 '19 at 11:53
  • Let us [continue this discussion in chat](https://chat.stackexchange.com/rooms/95578/discussion-between-rick-and-anders). – Rick Jul 01 '19 at 11:57
7

TLDR: Insecure crypto isn't even safe if you can't see the encrypted value.

The other posts are pretty good about explaining a lot of reasons why you shouldn't write your own crypto, in an environment where the attacker can see the encrypted value, but they miss something really important: you also shouldn't write your own crypto when the attacker can't see the encrypted message.

There's this thing called a "side channel." Side channels are (usually¹) unintended things that leak information about what your application is doing. For example, maybe it takes more CPU cycles - and therefore more time - to compare a password that is partially correct against the "encrypted"² value. This would let an attacker accelerate a brute force attack to slowly learn the correct password.

Let's take a naive example. Let's pretend it takes 1 second to test a single character of the submitted password against the value stored in the database. Pretend the correct password is 8 characters long, and an invalid password is rejected at the first incorrect result. The algorithm might look something like:

string encrypt_password(string password) {
    if (not isascii(password)) { throw error; } // ERROR!
    string result;
    foreach (char c : password) {
        result += daves_magic_that_takes_1s(c);
    }
    return result; // this is the value stored in the database
}

boolean is_correct_password(string input, string pw_from_db) {
    // A length mismatch returns immediately, which leaks the password length.
    if (input.length != pw_from_db.length) { return false; }
    foreach (char c_in, c_db : input, pw_from_db) {
        c_in = daves_magic_that_takes_1s(c_in);
        // Exiting on the first mismatch leaks how many leading characters are correct.
        if (c_in != c_db) { return false; }
    }
    return true; // valid password!
}

Now, let's imagine the valid password is "password" and the attacker tries the input "a". This fails, because the passwords are the wrong length. The attacker may randomly try various passwords. Every incorrect password longer or shorter than "password" takes less than a second to process. Let's say they soon try "12345678". "12345678" is the same length as "password" so it takes one second to process. The timing is different and the attacker notices. They try several more times to verify, and it's consistent.

The attacker now tries several 8 character passwords. They all take 1 second. The attacker has discovered a side channel that tells them that the valid password is probably 8 characters long. Now they need to determine which 8 character password is the right one.

The attacker starts randomly trying 8 character passwords. Eventually, they try "p2345678" and notice that this takes 2 seconds to complete. They test a bunch and discover that all attempts that start with "p" take 2 seconds to complete. The attacker guesses that the algorithm has a side channel that tells them how many characters they have correct.

Now, instead of having to attempt all 96^8 passwords to brute force the valid one, the attacker only has to try 96*8 passwords³. Depending upon how many passwords can be tested in parallel, they probably can successfully brute force the password in very reasonable time. This is great for the attacker! And it's terrible for the security of your system.⁴
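The attack just described can be simulated in a few lines of Python. This is a toy model under stated assumptions: `leaky_check` returns a step count that stands in for elapsed time, the per-character transform is left out because it does not change the leak, the length is assumed to be known already from the first side channel, and the alphabet is the 95 printable ASCII characters (close to the 96 used in the arithmetic above). All names are invented for the example.

```python
import string

# 95 printable ASCII characters; an assumption standing in for the answer's "96".
ALPHABET = string.ascii_letters + string.digits + string.punctuation + " "


def leaky_check(guess: str, secret: str) -> tuple[bool, int]:
    """Early-exit comparison; the second value stands in for elapsed time."""
    if len(guess) != len(secret):
        return False, 0
    steps = 0
    for g, s in zip(guess, secret):
        steps += 1  # one unit of "time" per character examined
        if g != s:
            return False, steps
    return True, steps


def recover(secret: str) -> tuple[str, int]:
    """Recover the secret one position at a time using only the timing signal."""
    known = ""
    attempts = 0
    n = len(secret)  # assumed already learned from the length side channel
    for i in range(n):
        for c in ALPHABET:
            # Pad with a filler character that is not in ALPHABET.
            guess = known + c + "\x00" * (n - i - 1)
            attempts += 1
            ok, steps = leaky_check(guess, secret)
            # Getting past position i (or succeeding) means c was correct.
            if ok or steps > i + 1:
                known += c
                break
    return known, attempts
```

Calling `recover("password")` needs at most `len(ALPHABET) * 8` guesses, which is the linear cost the answer contrasts with the exponential cost of a blind brute force.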

How do we protect against timing side channels? We guarantee that all operations where timing would leak sensitive information MUST always take the same amount of time to execute.
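In Python, for instance, the standard library already ships a comparison designed for this purpose, `hmac.compare_digest`. The sketch below shows it next to a hand-rolled version that only illustrates the idea (accumulate differences and never exit early); the function names are made up for this example, and in real code the library call is the safer choice because an interpreter gives no timing guarantees for hand-written loops.

```python
import hmac


def is_correct_hash(candidate_hash: bytes, stored_hash: bytes) -> bool:
    """Compare two digests without leaking where the first difference is."""
    return hmac.compare_digest(candidate_hash, stored_hash)


def constant_time_equal(a: bytes, b: bytes) -> bool:
    """Illustration only: look at every byte, regardless of mismatches."""
    if len(a) != len(b):
        return False  # fixed-length digests make this branch uninformative
    diff = 0
    for x, y in zip(a, b):
        diff |= x ^ y  # stays 0 only if every byte pair matches
    return diff == 0
```

Both functions compare fixed-length hash outputs rather than raw passwords, so the length check reveals nothing about the password itself.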

This may look like a very simple example. It has happened in the wild. Searching NVD for "timing side channel" will get you lots of real world vulnerabilities that all produce the same kind of results, allowing an attacker to learn secret information to which they are not authorized. By definition, if all operations take the same amount of time regardless of the input, then the amount of time something takes doesn't tell you anything about the input.

In the real world, side channels are incredibly easy to introduce. Dave probably hasn't even heard of them, and is probably a good engineer concerned with the performance of his system - which is actually an anti-pattern for protecting against side channels. Dave's algorithm may very well have both obvious and subtle side channels that he'll never discover, but that researchers and attackers know to look for, and can pretty easily write automated tests to detect.⁵

So, just because you can't see the crypto doesn't mean that you can't see the side effects of bad crypto, and use those side effects to learn the protected secret.


Endnotes

1: Well, if you're an intelligence agent or a good newspaper journalist, you probably intentionally set up side channels so that you can communicate with your agents/sources without "the enemy" knowing. Similarly, if you're devious, you might create a crypto protocol with a side channel in it with the intention of leaking secret information.

2: Since we can and should always assume custom crypto is insecure (for the reasons others have mentioned in this thread and more), we probably shouldn't call the use of custom crypto algorithms "encryption" or "decryption"... maybe "insecure encryption" or "broken decryption"...

3: I ignore brute force attacks succeeding, on average, when the attacker has tried 50% + 1 passwords, for the sake of simplicity of description. I'm focusing on side channels, not on brute force attacks. I also gloss over the math of a brute force attack, since that's also tangential to the main topic; some Google-Fu on behalf of the reader should locate plenty of resources that go deep into the details.

4: "1 second is way too slow", right? No real world system could be checked for real world timing side channels over the internet, right? Wrong. I don't have the references on hand, but there was research a number of years ago showing that you can statistically test for timing on the order of milliseconds over HTTP transactions.

5: In fact, I'd be willing to bet that there are either frameworks or existing tools (most likely both) that you can use to test your application for obvious side channels, if you were to exercise your Google-Fu.

Nic
atk
4

Known-plaintext / Chosen-plaintext attacks

Let's suppose you know a particular user's password, or better still, can sign up and create your own password as many times as you like.

And you have read access to the database.

Let's further assume that you suspect (or know) that it's a fairly simple homebrew algorithm.

At this point you can start brute forcing the algorithm and try to get the hash in the database. For instance, you might "guess" the algorithm is

$hash = md5($pass . $salt . 'some random string');

You would brute force the random string. This becomes harder as the string gets longer, but you may be able to exploit some md5 weakness by carefully picking passwords.

You might alternatively "guess" the algorithm is

$hash = sha1(md5($pass . $salt . 'abc') . 'def');

and then try brute forcing again.
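A sketch of that guessing loop in Python, assuming the attacker knows one password and its salt (their own account) plus the stored hash. The candidate constructions, the lowercase-only secret, and the three-character limit are all hypothetical choices for the example; a real search space would be far larger.

```python
import hashlib
import itertools
import string


def md5_hex(data: str) -> str:
    return hashlib.md5(data.encode()).hexdigest()


def sha1_hex(data: str) -> str:
    return hashlib.sha1(data.encode()).hexdigest()


# Hypothetical guesses at the construction; p = password, s = salt, k = secret string.
CANDIDATES = {
    "md5(pass.salt.secret)": lambda p, s, k: md5_hex(p + s + k),
    "md5(secret.pass.salt)": lambda p, s, k: md5_hex(k + p + s),
    "sha1(md5(pass.salt).secret)": lambda p, s, k: sha1_hex(md5_hex(p + s) + k),
}


def find_scheme(password: str, salt: str, stored_hash: str, max_len: int = 3):
    """Try every candidate construction with every short lowercase secret."""
    for name, scheme in CANDIDATES.items():
        for length in range(1, max_len + 1):
            for combo in itertools.product(string.ascii_lowercase, repeat=length):
                secret = "".join(combo)
                if scheme(password, salt, secret) == stored_hash:
                    return name, secret
    return None  # none of the guessed constructions matched
```

Once one known password confirms a construction and its secret string, the same recipe can be run against every other hash in the database.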

Something as complicated as Dave's algorithm would be extremely hard without some hints. If you knew that some reordering of characters was involved that would help.

Artelius
  • *This.* The attacker can determine the algorithm without that much effort! That is what necessitates Kerckhoffs's principle. Only thing I disagree with is that it's hard to figure out. I'd assume it's easy for an attacker to figure out, even though it might require a little more effort than a plain MD5. – jpmc26 Jul 03 '19 at 21:59
  • @jpmc26 Having the database makes it easy to _verify_ a potential algorithm but still hard to _guess_. You can pick off the low hanging fruit (like the programmer storing a salt in the DB but forgetting to use it!) but beyond that there are just too many permutations to explore. In one line of code you can multiply all the ASCII values by 3, concatenate them, interpret as hex, and then subtract 730 before hashing... how do you even write a brute-forcer to explore this crap? – Artelius Jul 05 '19 at 01:15
  • The one exception being, if you observe that the distribution of final values is non-uniform, and/or dependent on the input in some way, this can leak information about the algorithm. But if the last stage of the algorithm is a standard hash function (even md5) that won't happen. – Artelius Jul 05 '19 at 01:17
  • "How do you even write a brute-forcer to explore this crap?" One of the things I've learned is that attackers are vastly more clever than I can imagine. =) We can't prove that someone hasn't figured out a completely unexpected way to do it, and the fact that so many cryptographic schemes have fallen to some very clever attack no one imagined was possible is why we don't depend on secrecy. I'm simply not willing to believe no one has done it. – jpmc26 Jul 05 '19 at 03:53
  • @jpmc26 Of course. _Everything_ should be assumed susceptible to creative attacks; but we make assertions about things so that other security people can identify weaknesses. I'd be very interested to hear about known weaknesses in this sort of thing. Keep in mind the use of peppers is commonplace as an _additional_ measure, and they rely on secrecy. What we have here is (if done right) a different form of pepper, as I see it. – Artelius Jul 07 '19 at 07:20
4

I learnt a lot more than this question asked by reading @Conor Mancone's answer and discussing it with @Conor Mancone and @Anders, so I decided to write it down. Correct me if I get it wrong again.


md5 and sha1 are broken, but not in the way I thought

I wrongly believed that the hash output of md5 and sha1 can always be easily cracked (to get the original input text), no matter how large the input is.

No, it can't.

If I use a long enough pepper on the server, even if I use md5 or sha1 to hash a password, it would still be secure, given that the server is not compromised.

For example: store md5($128bits_long_pepper . $password) in the database.
Even without a salt, you can't crack it. Let's say your md5 hash speed is 100 billion hashes/s. Let's take it as 2^40 hashes/s for convenience, since 2^40 > 100 billion. To brute-force it, one still needs 2^128 / 2^40 = 2^88 seconds = 9.80719764 × 10^18 years. And of course I think no one would pre-compute such a rainbow table.
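The arithmetic above can be checked with a few lines of Python. The hash rate is the rounded-up assumption from the paragraph, and the year length (365.2422 days) is my own assumption for the conversion.

```python
GUESSES = 2**128                         # size of the 128-bit pepper space
RATE = 2**40                             # assumed hashes per second (> 100 billion)
SECONDS_PER_YEAR = 365.2422 * 24 * 3600  # assumed length of a year

seconds = GUESSES // RATE                # 2**88 seconds for the full search
years = seconds / SECONDS_PER_YEAR

print(f"{years:.3e} years")              # on the order of 10^18 years
```

Even if the attacker's hardware were a million times faster, the result would still be on the order of 10^12 years.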

But I'm not saying that I would choose md5 to hash my passwords. I wouldn't, because once your server also gets compromised, these hashed passwords are no different from clear text.

I am a newbie and I have read many highly voted posts talking about how bad md5 and sha1 are, but I see no answers talking about this ⬆️. I only noticed some mentions in the comments.


Salt, pepper, hash function

Finally I think I fully understand these 3 concepts and their use cases/combinations.
First, I sum up 3 kinds of attacks in my own words: 1. brute force attack (the "try everything" way) 2. rainbow table attack (the direct reverse lookup) 3. dictionary attack (brute-forcing $pepper . $common_password . $salt, which is a partial brute force compared to 1.)

So I think:

  1. Salt is aimed at defending against rainbow table attacks.
  2. A good hash function (a slow one) is aimed at defending against brute force attacks.
  3. Pepper is mainly aimed at defending against dictionary attacks, given that the server is not compromised.

To explain with an example:

(Salt) One should realize that a salt is not as big a deal as many people describe. It only defends against one thing: an attacker can't just do a reverse lookup in his rainbow table to get your unhashed password. If one only uses salt + fast hash function (no pepper on the server), then an attacker can brute-force each hashed password. Of course, another precondition is that you don't require your users to choose a 128-bit-long password at registration :).

So:

  • salt + fast hash function does not defend against brute force attacks. Whether it can defend against rainbow table attacks or dictionary attacks is not important now.
  • salt + slow hash function defends against brute force attacks. It also defends against rainbow table attacks. It doesn't defend against dictionary attacks.
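A small Python sketch of the salt point, with loud assumptions: a plain lookup table of three common passwords stands in for a real rainbow table, SHA-256 stands in for "a fast hash function", and all names are invented for the example. It shows that a salt defeats the precomputed lookup, but that a per-user guessing run still succeeds when the hash is fast and the salt is sitting in the database.

```python
import hashlib
import secrets


def fast_hash(password: str, salt: bytes = b"") -> str:
    """A fast hash (SHA-256 here), optionally salted."""
    return hashlib.sha256(salt + password.encode()).hexdigest()


# Tiny stand-in for a precomputed (rainbow-style) lookup table.
COMMON = ["123456", "password", "qwerty"]
TABLE = {fast_hash(p): p for p in COMMON}

unsalted = fast_hash("password")
salt = secrets.token_bytes(16)  # stored next to the hash in the database
salted = fast_hash("password", salt)

lookup_unsalted = TABLE.get(unsalted)  # found: the table works without a salt
lookup_salted = TABLE.get(salted)      # not found: the salt breaks precomputation


def crack(stored: str, known_salt: bytes):
    """With the salt known, guessing per user still works against a fast hash."""
    for guess in COMMON:
        if fast_hash(guess, known_salt) == stored:
            return guess
    return None
```

Here `crack(salted, salt)` still returns the password, which is why the salt needs a slow hash next to it.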

(Pepper) But for the salt + fast hash function combination, if you use a long enough pepper on the server, and only the database is compromised, not the server, the passwords will still be secure.

So:

  • salt + fast hash function + long pepper (e.g. 128 bits) + server not compromised can now defend against brute force attacks. It also defends against rainbow table attacks and dictionary attacks.

(hash function) But once the server gets compromised, the combination above is shit. The attacker would know the pepper. Cracking salt + fast hash function + long pepper (e.g. 128 bits) + server compromised is as easy as cracking a plain fast hash function.

So:

  • salt + fast hash function + long pepper (e.g. 128 bits) + server compromised does not defend against brute force attacks. Whether it can defend against rainbow table attacks or dictionary attacks is not important now.
  • But if you use a secure/slow hash function, things change.
    salt + slow hash function + pepper (it need not be very long; maybe 6 chars is good enough?) + server compromised defends against brute force attacks. It also defends against rainbow table attacks. It doesn't defend against dictionary attacks.

What about salt + slow hash function alone, isn't it enough? This combination does defend against brute force attacks and rainbow table attacks, but it does not defend against dictionary attacks. Adding a pepper on the server is easy, so why not?


Say something for Dave

As you can see from the above, a pepper is just something you use to do some conversion on the server side.
It's the "conversion on the server" that really matters, as long as the server is not compromised.
For example, if you take a random constant and concatenate it with the password, hash($pepper . $password, $salt), that's one kind of conversion. If you invent some shitty algorithm on the server like Dave does, you are also doing a conversion. In both situations, the attacker needs to compromise the server. In the former, the attacker can just grab your constant value. In the latter, the attacker needs to figure out how your shitty thing works and reverse it.

So my point is, I think Dave's idea is totally fine (do some conversion on the server), but it's just not necessary to add such obscurity/complexity, because maintenance could be hell if more and more complexity is added in such a way. A "conversion" like concatenating a pepper on the server is quite enough. Again, I think it's a trade-off problem after all. And Dave's idea is right from the beginning (he wants to employ some extra security on the server side).

Rick
  • "For the latter, the attacker needs to figure out how your shitty thing works and need to do a reverse." -- For most homebrewed cryptographic functions, attackers don't need access to your source code to figure out how it works. -- And nobody's saying that if you're not an expert, you should stop... We're saying that if you're not an expert, then welcome to the club, and if you want to become an expert, great! Just don't think you're expert enough to deploy your homebrewed crypto to secure actual secrets until you're well past the Dunning-Kruger effect. – Ghedipunk Jul 02 '19 at 15:49
  • I'd also like to be a bit more pedantic (because in this case, it matters, because PHP devs like my peers, and many other devs, read Stack Exchange and walk away thinking they're experts) -- "Slow hashing algorithms" should be "Key Derivation Functions," some of which do use secure hash algorithms as their primitives, and are non-reversible like hashes... but KDFs have settings that tune the amount of work needed, and are designed to resist GPUs and ASICs. SHA2 is slow. PBKDF2 using SHA2 is secure. – Ghedipunk Jul 02 '19 at 16:01
  • @Ghedipunk Yes, by slow I mean hash functions like `bcrypt` and `PBKDF2`. – Rick Jul 02 '19 at 16:02
  • @Ghedipunk Why won't an attacker need to access the source code for example I use some homebrewed algo on the server, plus `bcrypt` and `salt` in the database? – Rick Jul 02 '19 at 16:03
  • *If* you're also using a secure password stretching function, then you're as safe as you can reasonably be, with or without your homebrew hash/crypto algorithm. That's often not the case from people who would use homebrewed crypto in production, though, so while there are nuances (and infosec is FULL of nuance), general advice is given assuming that a naive developer will implement things in the most naive way. – Ghedipunk Jul 02 '19 at 16:13
  • 1
    To rephrase: Security in depth is good, but is only as good as the aggregate of each layer. A weak layer doesn't reduce security *as long as it doesn't reduce entropy* (that's a big gotcha), but it must be used with strong layers, which is a step that many forget. – Ghedipunk Jul 02 '19 at 16:17
  • The pepper isn't for defeating dictionary attacks, but to ensure that a leak of the database isn't sufficient to crack the passwords. It's basically a secure way to "customize" a hash algorithm. – forest Jul 03 '19 at 01:25
  • @forest "ensure that a leak of the database isn't sufficient to crack the passwords." In what way? By ensuring the attacker can't perform a dictionary attack. You can argue that `long pepper + salt + md5 hash function` does also defend against brute-force attacks. But that only matters for someone who uses a bad hash function like `md5`. – Rick Jul 03 '19 at 01:34
  • 1
    @Rick A pepper is like a global salt stored _outside_ the database. You could even store it in the source code so it'd be necessary to leak the code to even begin to crack the hashes. Think of a pepper as a custom version of a hash that doesn't require you to be a cryptographer with the decades of experience needed to _design_ a custom hash or modification. A pepper provides perfect protection against a database leak. – forest Jul 03 '19 at 01:52
  • You're missing a type of attack: cryptanalysis... Plenty of crypto algorithms designed by even well-established cryptographers are found to be weak under certain conditions, and this can often be determined without knowing the algorithm. Suppose I mount a "chosen plaintext" attack: I can specify arbitrary text, get the ciphertext back, and then use statistical methods to find weaknesses in the algorithm. – Blackhawk Jul 03 '19 at 22:30
  • @Blackhawk Those are pretty damn easy to avoid. Not to mention, for password hashing, even the very weakest (and one of the earliest) hash, MD4, is not vulnerable to any attacks that make password cracking possible. Of course, being a fast hash and not a KDF, it's still bad to use for passwords, but cryptanalysis is not an issue. Hell, that's even true with the ancient hash that came _before_ it, MD2! – forest Jul 09 '19 at 06:46
2

While there's value in using a "hidden" custom algorithm, there is already a trivial and established way to make custom secure hash algorithms: Pepper*

Because the very cheap, very quick, and very safe option of using a Pepper exists, people who instead write a crypto algorithm from scratch to be "more secure" are always novices. So not only is "Dave's protocol" a waste of time and money, it's also a (supposedly) secure protocol written by someone who doesn't know much about secure protocols. And that is generally considered an unwise choice, regardless of the actual security (or lack thereof) in Dave's protocol.

*In short, a pepper is a secret salt that's the same for all users.
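
To make the footnote concrete, here is a rough Python sketch of a dictionary attack against one leaked database row, first without and then with the pepper. Everything in it is hypothetical (the names `peppered_hash` and `dictionary_attack`, the three-word list, the pepper value), and the iteration count is kept unrealistically low so the demo runs quickly:

```python
import hashlib
import os

ITERATIONS = 1_000  # deliberately low for the demo; real systems use far more


def peppered_hash(password: str, salt: bytes, pepper: bytes) -> bytes:
    """Slow hash of pepper + password with a per-user salt."""
    data = pepper + password.encode("utf-8")
    return hashlib.pbkdf2_hmac("sha256", data, salt, ITERATIONS)


def dictionary_attack(salt, stored, wordlist, pepper=b""):
    """Return the first candidate whose hash matches the leaked one, else None."""
    for candidate in wordlist:
        if peppered_hash(candidate, salt, pepper) == stored:
            return candidate
    return None


# What the server knows: the pepper (kept outside the database).
server_pepper = b"hypothetical-server-side-secret"

# What a database leak exposes: the salt and the stored hash only.
salt = os.urandom(16)
stored = peppered_hash("letmein", salt, server_pepper)

wordlist = ["123456", "password", "letmein"]
db_only = dictionary_attack(salt, stored, wordlist)                     # no pepper
db_and_server = dictionary_attack(salt, stored, wordlist, server_pepper)
```

With only the database, the attacker's guesses never match even though the right password is in the word list; once the pepper leaks as well, the same attack succeeds immediately.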

Peter
  • 3,620
  • 3
  • 14
  • 24
0

Rule of thumb: Do not do home-brew security.

Because it often goes wrong, and the person who wrote it cannot test it properly: if they made a wrong assumption while writing it, they will still hold that assumption when testing it. An independent test is surprisingly expensive and time-consuming, much more so than writing the code in the first place.

You need to include all related changes too. Knowing "there can't be a problem, because..." is just not a valid approach, for the same reasons as above.

Software security is a difficult topic. It's so difficult that it's hard to even understand how difficult it is.