23

How unsafe would it be to publish the hashes of my passwords?

I have written a Python script to help me remember my basic passwords (computer password, encrypted backup password, AppleID password, and KeyChain password).

The script has the value of

SHA256(MD5(password) + password + MD5(password))

hardcoded for each password, and I periodically run it to keep my memory fresh.

I have a private repo on GitLab where I store all generic files and I would like to commit this script. I can't see any problem doing this since, as far as I know, it would be impossible to recover the original password, but I prefer to ask experts, to be sure.

EDIT: I'm adding an anonymized version of my script, so you can understand how it works:

from hashlib import md5, sha256
from getpass import getpass
from random import choice

def hash(pwd):
    pwd = pwd.encode()
    return sha256((md5(pwd).hexdigest()+str(pwd)+md5(pwd).hexdigest()).encode()).hexdigest()


dict = {'pass1': '6eaa49070c467d1edead2f6bc54cf42cdda11ae60d40aef2624a725871d3f452',
        'pass2': '240cbc4ba2661b333f9ad9ebec5969ca0b5cf7962a2f18a45c083acfd85dd062',
        'pass3': 'b018ed7bff94dbb0ed23e266a3c6ca9d1a1739737db49ec48ea1980b9db0ad46',
        'pass4': '7dd3a494aa6d5aa0759fc8ea0cd91711551c3e8d5fb5431a29cfce26ca4a2682'
       }

while True:
    tipologia, hash_result = choice(list(dict.items()))
    while True:
        pwd = getpass(f'Password {tipologia}: ')
        if hash(pwd) == hash_result:
            print('Correct!')
            break
        else:
            print('Wrong!')
L.A.
  • 38
    Why would you re-invent the wheel? Cryptography is hard. Use something built by people you trust to store your passwords in a master password store. For example https://www.schneier.com/academic/passsafe/ – Bae Mar 17 '21 at 05:32
  • 13
    @Bae they are not looking for a password store. They already use KeyChain (the other passwords they mention are not useful to store in KeyChain, I guess). What they are looking to do is to have their computer periodically query them for the passwords and verify the input using some hashes, to improve their memory of them. – Leif Willerts Mar 17 '21 at 06:11
  • 6
    I don't understand the rationale of your script: it stores the password hash(es). OK, but how does that help you remember the passwords? – Konrad Rudolph Mar 17 '21 at 10:00
  • I've added the script, so you can see how it works. Of course the hashes are not the original ones – L.A. Mar 17 '21 at 11:30
  • @KonradRudolph By periodically requiring you to enter all the passwords, your memory is refreshed. – Barmar Mar 17 '21 at 15:11
  • 40
    `SHA256(MD5(password) + password + MD5(password))`. This expression is a bit suspect. You shouldn't assume that mixing and matching and nesting different hashes will be more secure. It may actually make it less secure if you find a weak combination that opens up new statistical attacks. If you want a more secure hash, use a different algorithm that is proven to be effective. – Hymns For Disco Mar 17 '21 at 15:16
  • 1
    @Barmar Yeah, I see that now (the script was posted *after* my comment). Still, I feel the script comes a bit too late: it shows you that you forgot a password, but doesn’t help you recover it. Better to use a password manager for all but one (the master) password. And the latter shouldn’t be forgotten because it will be used regularly. – Konrad Rudolph Mar 17 '21 at 17:11
  • I'm having difficulty imagining a scenario where this would be likely to be practical. – Nat Mar 17 '21 at 17:15
  • @KonradRudolph I guess they don't trust password managers. If they have a password they normally use infrequently, they want to have to enter it every day to avoid forgetting it. I agree that it seems like a silly solution. – Barmar Mar 17 '21 at 19:13
  • 7
    @Barmar Two points (1) not trusting a password manager isn’t a valid, informed infosec opinion, and this assertion *must* be front and centre to a response to such questions on this site (2) OP *is already* using a password manager (Keychain.app). – Konrad Rudolph Mar 17 '21 at 19:23
  • 3
    @Barmar: I don't trust hosted password managers. I trust them fine on local disk. – Joshua Mar 18 '21 at 02:09
  • Personally, I use `gpg` for this purpose - I trust that the `gpg` guys have securely implemented "use AES256 to symmetrically encrypt or decrypt an empty file with a password" (more than I trust myself to roll my own implementation that achieves the same goal). I use two shell functions: `pwhash() { : | gpg --symmetric --cipher-algo aes256 >"$1"; }` and `pwcheck() { gpg --decrypt --quiet <"$1"; }`. So instead of a hash of each password I have an empty file encrypted with each password. (These `gpg` commands assume `gpg1` - I've never looked into equivalent `gpg2` commands.) – mtraceur Mar 18 '21 at 18:52
  • @KonradRudolph Password managers are great but do not cover all use cases. E.g. travelling without a laptop, phone battery is dead, need to login to a low value site using an untrusted machine. A good middleground solution is to have a subset of memorable (e.g. diceware) passwords used for low value logins so that you can access them away from the password manager. – Jon Bentley Mar 19 '21 at 12:26
  • 1
    @KonradRudolph As for, "not trusting a password manager isn’t a valid, informed infosec opinion" - it certainly can be in some contexts. E.g. I may trust my password manager but not trust the machine I'm forced to access it from. Then I have a dilemma: do I access it anyway (and expose *all* my passwords to risk of attack), or do I not access it all (and suffer from denial of service)? Having a memorised password can mitigate this as it limits the risk of attack to just that one login. – Jon Bentley Mar 19 '21 at 12:29
  • @JonBentley I (almost) fully agree with what you’ve said but I don’t think this is very relevant in the context of this specific discussion. I would *strongly* caution against relying on memorised passwords only (though I also do this to some extent): the human mind is fickle! I was once stranded at an airport in a foreign country without cash which I needed for a train ticket, and I suddenly *couldn’t remember my debit card PIN*, which I was (back then) using all the time. Not a pleasant experience. – Konrad Rudolph Mar 19 '21 at 12:34
  • @KonradRudolph The relevance was to your statement "better to use a password manager for all but one (the master) password", to show that this is not necessarily always better. Also just to clarify, I was *not* suggesting that you *only* rely on memorised passwords. I still store my memorable passwords inside a password manager. The point of memorising some of them is to have extra options. In particular I memorise my email password (and make sure it is particularly strong) since email can often (perhaps unfortunately) be used to gain access to other sites via password reset. – Jon Bentley Mar 19 '21 at 12:37
  • @JonBentley You’re missing the *context* of my assertion about password managers, which was as a reply to the comment “I guess they don't trust password managers”. This, I posit, is (in its generality) simply not a valid objection. – Konrad Rudolph Mar 19 '21 at 12:41
  • @KonradRudolph Sorry for being a pain, but the quote I responded to (at least in my first and 3rd comments) was *before* the one you are quoting now, so it can't have all been in reply to that. Anyway, it's not important. These types of considerations always have to balance the needs of security vs convenience, and it rarely makes sense to look at just one aspect in isolation. Or in other words, the context should usually be the wider context anyway, even if that means going beyond the initially raised consideration. – Jon Bentley Mar 19 '21 at 13:47

6 Answers

51

MD5 and SHA256 were both designed to be as fast as possible. Their purpose was to compute a hash for relatively big data volumes, e.g. for files. That's why they are very fast, which makes brute-forcing easier.

A single GPU can generate ~10^10 SHA256 hashes per second, which is ~10^15 per day. MD5 is much faster to compute, on some GPUs ~5 times faster, so let's ignore the MD5 part of your algorithm and consider SHA256 only.

Suppose you have a password that consists of 10 alphanumeric characters, i.e. drawn from a 62-character set. The number of different passwords is 62^10 ~= 10^18. This means that to brute-force such a password, an attacker would need 10^18 / 10^15 = 1000 days with a single GPU, or 100 days with 10 GPUs, or 10 days with 100 GPUs. Thus, an attacker who can afford 10 GPUs will break such a password in about 100 days.

If you use 16-character passwords, i.e. 6 characters longer, then the required computing power is 62^6 ~= 5.7*10^10 times higher.

Thus, the security depends essentially on how long your passwords are.
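
The arithmetic above can be reproduced with a few lines of Python. This is only a back-of-the-envelope sketch: the hash rate is the rough single-GPU figure assumed in this answer, not a measurement, and the function name `days_to_exhaust` is made up for illustration. It computes the time to try every candidate; on average an attacker would hit the right one after about half of that.

```python
# Rough brute-force estimate using the figures quoted in the answer.
HASHES_PER_SECOND = 10**10   # assumed single-GPU SHA256 rate (order of magnitude only)
SECONDS_PER_DAY = 86_400

def days_to_exhaust(alphabet_size: int, length: int, gpus: int = 1) -> float:
    """Days needed to try every password of the given length."""
    keyspace = alphabet_size ** length
    return keyspace / (HASHES_PER_SECOND * SECONDS_PER_DAY * gpus)

# Alphanumeric passwords (62 symbols) of various lengths, attacked with 10 GPUs.
for length in (8, 10, 12, 16):
    days = days_to_exhaust(62, length, gpus=10)
    print(f"{length} chars: {days:.3g} days with 10 GPUs")
```

Each extra character multiplies the work by 62, which is why length matters so much more than the choice between two fast hashes.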

Another important factor is the cost of brute-forcing versus the possible benefit for an attacker. For instance, if brute-forcing your passwords costs an attacker 900 000 USD and knowing your password is worth 1 000 000 USD to them, the attacker nets 100 000 USD, and brute-forcing makes sense. But if the password only gives access to a bank account with, let's say, 10 000 USD, then the attacker loses 890 000 USD. In that case brute-forcing makes no sense for the attacker. Only you can decide whether somebody would be interested in paying that much for the computing power needed to brute-force your passwords.

It also makes sense to use hashing algorithms that are designed to be slow, e.g. Argon2 or Lyra2. They have tunable parameters; choose values such that a single hash takes 0.1 - 1 s. This slows brute-forcing down substantially and reduces the risk even more.

mentallurg
  • 2
    TL;DR if anyone could be bothered then they could probably brute force any weak passwords – Qwerky Mar 17 '21 at 09:56
  • Thanks! I will definitely switch to one of the algorithms you suggested! I didn't know they existed, but it really makes sense! – L.A. Mar 17 '21 at 11:50
  • What's a 'GPU hat'? – JimmyJames Mar 17 '21 at 14:51
  • 12
    And then there is [this old good XKCD](https://xkcd.com/538) that always pops out when these discussions arise :P – frarugi87 Mar 17 '21 at 14:52
  • @JimmyJames: A typo. Should be "GPU can". Updated. Thanx. – mentallurg Mar 17 '21 at 14:52
  • What's a 'GPU can'? Just kidding. – JimmyJames Mar 17 '21 at 14:53
  • 2
    @frarugi87: No :) It was relevant 20 years ago, when one had a few passwords. Nowdays, if one has passwords for 100-200 sites or services, it is impossible to keep such things in mind. – mentallurg Mar 17 '21 at 14:55
  • @mentallurg well, you would still need to have a way to unlock, so instead of spending 900k on trying to brute force whatever you have locked it may be better to spend 1 or 2k on a henchman and get it "the wrenchy way" ;) – frarugi87 Mar 17 '21 at 15:03
  • 3
    @frarugi87 The problem with the wrench is that 1. you actually need to be present and not thousands of miles away where you are safe from extradition. 2. You hit someone with the wrench and they might not be able to tell you the password (or anything.) This is one of the dumbest plot points in "Die Hard". The big bad dude spends maybe a minute talking to the guy with the password and then shoots him dead. Not much of an evil genius if you ask me. – JimmyJames Mar 17 '21 at 16:36
  • 2
    @JimmyJames Christmas movies don't usually have a realistic plot – corsiKa Mar 18 '21 at 01:58
  • Another point; this is a bad password hash because he forgot the salt. – Joshua Mar 18 '21 at 02:09
  • @Joshua: Yes. "bk2204" has [mentioned it](https://security.stackexchange.com/a/246230/47524). – mentallurg Mar 18 '21 at 03:17
  • This answer assumes that the password has quite a bit of entropy, i.e. is randomly generated, which is probably not the case for passwords the user wants/need to remember. A dictionary attack may yield results much faster than brute force if OP didn't take appropriate measures. And of course didn't reuse the same passwords elsewhere. – jcaron Mar 18 '21 at 11:08
  • @corsiKa Wait? What? "Die Hard", not realistic? Come on, are you really serious? I mean it's *obviously* just like real life. – JimmyJames Mar 18 '21 at 14:33
46

There are a few problems with this approach.

First of all, you're using two plain cryptographic hash functions to hash your data. By themselves, cryptographic hash functions are designed to be fast. That means that it's extremely easy for an attacker to try to brute-force your password. The only time it's safe to use a plain cryptographic hash function to hash a secret is when that secret is a sufficiently long output of a CSPRNG (i.e., it has at least 128 bits of entropy). In order to even store this password securely on a system, you should be using a password hashing function like one of the crypt functions on your system, scrypt, or Argon2, which are designed to be iterated and expensive to prevent brute forcing.

Second, you have no salt for this password. As a result, anyone can just hash the output of a large password list as found in any of a number of breaches and create a giant table of passwords using this scheme. If you were using a secure password hashing system, it would require a reasonably long salt to randomize the password and prevent generation of so-called rainbow tables to make guessing a simple table lookup.
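
As a rough illustration of the first two points, here is a minimal sketch of salted, deliberately slow password hashing using `hashlib.scrypt` from the Python standard library (it needs a Python build linked against OpenSSL 1.1 or later). The cost parameters and the function names `hash_password` / `verify_password` are assumptions chosen for the example, not tuned recommendations; Argon2 would need a third-party package and is not shown.

```python
import hashlib
import hmac
import os
from typing import Tuple

# Illustrative cost parameters (assumptions for this sketch, not tuned advice).
SCRYPT_PARAMS = {"n": 2**14, "r": 8, "p": 1, "dklen": 32}

def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, digest); a fresh random salt is drawn for every password."""
    salt = os.urandom(16)
    digest = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return salt, digest

def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    digest = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return hmac.compare_digest(digest, expected)
```

The salt is stored next to the digest; it is not secret. Its job is to make precomputed tables useless and to give identical passwords different digests.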

Third, you are using MD5, which should not be used for anything anymore. MD5 has been known to be totally insecure for 17 years, and there is no longer a justifiable reason to use it at all. Carnegie Mellon University says it is “unsuitable for further use,” and responsible parties do not use it.

Fourth, it is strongly preferable not to disclose the password hashes at all. Passwords are securely hashed both to make guessing expensive and make it harder to guess even if the hashes are exposed, but if a person somehow gets your hashed password and it's guessable, they'll be able to guess it with enough effort. Your password must therefore be reasonably secure and contain sufficient entropy that it is computationally infeasible to guess even if the hash is exposed. It also needs to not be reused, because if it's ever exposed elsewhere, then you have to assume the attacker knows it (because usually, they can find it) and it then becomes just another entry in an easy word list to guess.
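
To make the guessing risk concrete, here is a small sketch of a dictionary attack against the scheme from the question. `scheme_hash` mirrors the script as posted (including its `str()` call on the encoded bytes); the word list and the "leaked" hash are made-up stand-ins, and a real attacker would use lists with millions of entries.

```python
from hashlib import md5, sha256

def scheme_hash(pwd: str) -> str:
    """Same computation as the hash() function in the question's script."""
    raw = pwd.encode()
    inner = md5(raw).hexdigest()
    return sha256((inner + str(raw) + inner).encode()).hexdigest()

def dictionary_attack(target_hash, candidates):
    """Return the first candidate whose hash matches, or None if none does."""
    for candidate in candidates:
        if scheme_hash(candidate) == target_hash:
            return candidate
    return None

# Hypothetical word list and a hypothetical published hash.
wordlist = ["123456", "password", "letmein", "correct horse"]
leaked = scheme_hash("letmein")
print(dictionary_attack(leaked, wordlist))
```

Because there is no salt, the same table of hashed guesses works against every hash published under this scheme.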

bk2204
  • 4
    Good point about salt. – mentallurg Mar 17 '21 at 02:08
  • Thanks for your reply! Well, I thought that the MD5 part could play the role of the salt, but I probably don't understand correctly why salt is used. I thought that it is used only to prevent reverse tables, so adding any string would work just as well, but I'm probably missing something... – L.A. Mar 17 '21 at 11:48
  • 11
    @L.A.: The point of salt is that it's extra randomness *not* associated with the password, so it's different even if 2 users chose the same password. The attacker also doesn't get their hands on the salt for a specific password-hash until they obtain that `/etc/shadow` or equivalent, so they can't do any generic pre-computation (e.g. rainbow tables). (The salt is effectively plaintext in the password hash, so once they do have the hash they have the salt and can start brute-forcing that specific password hash. If it's short & simple, it'll still be cracked quickly with a fast hash.) – Peter Cordes Mar 17 '21 at 12:09
  • 4
    I use MD5 for indexing binary columns. I would have used CRC64 if it was an available primitive. – Joshua Mar 18 '21 at 02:11
  • 2
    I think bk2204 means MD5 should not be used for any security applications anymore. Doing something else is fine. – Christian Mar 18 '21 at 21:29
  • No, I really do mean that MD5 shouldn't be used at all. If you don't need collision resistance, then MD5 is generally inefficient and you should use something else, like a CRC. In any event, BLAKE2b is both faster than MD5 and cryptographically secure, so there's no reason to use MD5. Using MD5 for anything encourages other people to copy it when "it's good enough" and then drive giant security holes through their software, and it's also been broken so long that using for anything is frankly embarrassing. Neither MD5 nor SHA-1 is acceptable any longer. – bk2204 Mar 18 '21 at 21:51
  • I think you SHOULD use MD5 for hashing of non-secure things. I say this because MD5 is KNOWN to be insecure. My fear would be that some early adopter would start using Blake in a (semi-)secure fashion before it was a standard: Blake gets ignored and something even better comes out. And software being what it is, since Blake isn't specifically called out as no longer be secure by NIST, it just sits in the code base forever because it's forgotten. On the other hand if someone sees MD5 in the security layer and they KNOW that there's a better solution that's a standard, it will likely be updated. – TruthOf42 Mar 19 '21 at 13:05
18

These are all excellent answers. I will try to address the immediate questions.

How unsafe would it be to publish the hashes of my passwords?

Very unsafe. Irrespective of the algorithm used, please never share passwords or their hashes on the open web. Why:

  1. This is a bad practice. A bad habit. You are encouraging yourself and others around you to lower your guard.
  2. Crypto (and the attacks on it) is very intricate. Very few people in the world can conclusively say whether what you are doing is actually safe or not. BUT there is a whole bunch of people who will NOT say anything and will silently smile and let you make the mistake and misguide people. These are not just hackers; they are also state actors and intelligence agencies who have real incentives to get the general population to lower its guard on privacy. Your AppleID password may not mean much to anyone, but the tiny bad habit that got into the air is worth something to them.
  3. What is safe (or innocent) now will probably not be the same in the future. Who knows where you will find yourself in 15 years. The Internet never forgets. Your past might come back to haunt you. Or it might not - it could all continue to be innocent. Why take the chance?

I have written a Python script to help me remember my basic passwords (computer password, encrypted backup password, AppleID password, and KeyChain password). The script has the value of SHA256(MD5(password) + password + MD5(password)) hardcoded for each password

Please never ever create a crypto algorithm of your own. Please see point #2 as to why. Others have tried to answer the exact technical reason why this particular algorithm is flawed. It is better to remind yourself that not just this one, but any algorithm that we come up with ourselves, is flawed. In the world of crypto the safest path is the path well trodden. Without exception.

I have a private repo on GitLab where I store all generic files and I would like to commit this script.

Committing the code is fine, committing the hash is not.

I can't see any problem doing this since, as far as I know, it would be impossible to recover the original password, but I prefer to ask experts, to be sure.

Thanks for asking. :) "Impossible" is a very strong word that no expert will use in its unbounded sense. Even though companies like Google (for example) have their passwords hashed so thoroughly that they are really-truly-unpossible to recover, you still do not see them publishing the hashes. Basically, please hold on to the "secret" stuff as if it really is secret. Hashing or frying does not decrease the sensitive nature of the data. Hashing is just one defence. Not an absolute one.

So, please

  1. do not invent your own crypto algo.
  2. do not hardcode hashes/passwords/keys in source code.
  3. definitely do not commit/push them to source control.
  4. do not publish the hashes.
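
One way to follow points 2 and 3 while still committing the script is to read the hashes from a private file that lives outside the repository. The sketch below assumes a JSON file of `{label: hash}` pairs; the path `~/.config/pwd-trainer/hashes.json` and the name `load_hashes` are hypothetical and only illustrate the idea.

```python
import json
from pathlib import Path

# Hypothetical location outside the repository; pick whatever suits your setup.
HASH_FILE = Path.home() / ".config" / "pwd-trainer" / "hashes.json"

def load_hashes(path: Path = HASH_FILE) -> dict:
    """Read {label: hash} pairs from a private file kept out of version control."""
    with path.open(encoding="utf-8") as fh:
        data = json.load(fh)
    if not isinstance(data, dict):
        raise ValueError(f"{path} must contain a JSON object of label: hash pairs")
    return data
```

The script would then call `load_hashes()` instead of defining the dictionary inline. The file itself still deserves restrictive permissions, since everything said about exposed hashes applies to it too.
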
inquisitive
  • 1
    Thanks for your reply. I see the point: probably nothing will happen now or ever, but there is no reason to take the risk, even if it is small. I think I will use your idea to commit the code and store the hashes in a different private file – L.A. Mar 17 '21 at 11:42
7

Back in the day when airbags were an expensive option on expensive cars, I used to ride with a colleague who had a BMW-635 with an airbag. He believed he didn't need to wear his seatbelt because the airbag would save him. Nobody believes this any more. Both safety mechanisms are necessary, and they have complementary purposes.

Passwords used to be stored in plain text. Then we started hashing them. Then hackers started building rainbow tables, so we started salting them. Now hackers are stealing password hash files and brute forcing them, so we stopped using MD5 and SHA1, and implemented 2048 rounds of hashing to slow down the brute force attacks. Too slowly, we are switching to slow hashing algorithms for passwords, to slow down brute force attacks even more.

Still, hackers are stealing hash files, and still they are brute forcing. Ten to fifteen percent of passwords are guessable in 24 hours of brute force, because too few people understand "long and random". Elsewhere, I suggest 7-word Diceware passphrases, and nobody responds, not even with "please explain". Now, those 10%-15% of passwords were cracked from a hash file which was stolen or leaked from some insignificant member-only closed forum. But 30%-40% of those users have the same username and password on more important logins - gmail, applemail, msmail - and they use those email accounts for banking.

The point is, don't be the guy who doesn't wear his seatbelt.

Use long and random passwords, and never expose the hashes.
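
For anyone who does want the "please explain": a Diceware passphrase is a handful of words picked uniformly at random from a fixed list. The sketch below uses Python's `secrets` module instead of physical dice, which is an assumption of this example rather than the classic procedure, and the tiny `demo_words` list is a placeholder. A real Diceware list has 7776 words (five dice rolls per word), so output from the demo list is not secure.

```python
import math
import secrets

def passphrase(words, count=7):
    """Pick `count` words uniformly at random using a CSPRNG."""
    return " ".join(secrets.choice(words) for _ in range(count))

def entropy_bits(list_size, count=7):
    """Entropy of a passphrase of `count` words from a list of `list_size` words."""
    return count * math.log2(list_size)

# Placeholder list for demonstration only; use a full 7776-word list in practice.
demo_words = ["apple", "river", "candle", "orbit", "velvet", "thunder", "maple", "quartz"]

print(passphrase(demo_words))
print(f"{entropy_bits(7776):.1f} bits for 7 words from a 7776-word list")
```

Seven words from a full list give roughly 90 bits of entropy, which is well out of reach of the brute-force rates discussed in the other answers, even against a fast hash.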

  • 1
    And they look at me funny when I use three word + number passwords and actually roll a die to encrypt files to give to customers. – Joshua Mar 18 '21 at 02:13
3

Quick notes:

  1. Typing your password more is bad.
    This scheme requires you to regularly type your password on internet-connected devices at times you didn't have to before. That opens up additional opportunities for key-loggers, videos of you typing, a corrupted variant of your script, etc., to steal your password.

  2. Fast hashes are easier to dictionary-attack.
    You're using fast hashes that could be easily dictionary-attacked. Other answers have explained why this is a bad thing.

  3. Secrets shouldn't be used multiple times.
    SHA256(MD5(password) + password + MD5(password)) inserts your password 3 times, which is bad. Generally, never reference a secret (like password) more than once.

  4. Consider blinding before calling an external function.
    You're passing your password into the MD5() and SHA256() functions plain-text-style. If either is backdoor'd, then they get your password. Consider blinding.

  5. This increases opportunities for mistakes.
    This scheme doesn't save you from having to save your passwords somewhere, so it's adding another moving-part to your security strategy. That's generally not a good thing, as it enables more opportunities for blunders to derail your security.

  6. This increases attack-surface-area.
    This scheme doesn't save you from having to save your passwords somewhere, so it's adding another moving-part to your security strategy. That's generally not a good thing, as it gives attackers more vectors to attack along.

  7. This isn't playing to your strengths.
    Information-security is about asymmetric computational ability, in whatever form it might take. Here, you're using scripting with weak crypto in an online format that relies on other folks' software playing nice for you. You're opening yourself up to all sorts of problems, apparently just to use old-fashioned crypto methods. And why? Old-fashioned crypto is notable for being what other people understand well, and apparently not you.

    If you're not a cryptographer, then security-through-obscurity for personal use may be the better way to go: just writing stuff down on a piece of paper that you store somewhere weird. Because a lot of folks may know all sorts of gaps in the software, systems, and algorithms used, but a relatively small number of people want to search your home for the one place you may've cryptically written down your passwords on a random thing. And generally speaking, there's little overlap between the population of people on Earth who love math/algorithms enough to become cryptographers and the population of people on Earth who're willing to rummage through your dirty laundry to see if there's a sticky note at the bottom of the hamper. Because it's probably not in that exact spot, and, ew.

    Satoshi Nakamoto scribbled their private-key in a dirty porto-potty, right on the wall where everyone could see it (if rather messily). It's likely that a lot of people have seen it by now but simply not recognized it. Ya know, hypothetically.

Nat
  • 1
    I've never heard the term "blinding" in this context before, and I doubt the OP has either, so point 4 is a bit meaningless. – IMSoP Mar 18 '21 at 17:54
  • It is a little too easy to misread point 3 as if you think that OP is actually pasting their password in three times, rather than what I think you meant: that inserting/reusing a secret multiple times inside a cryptographic hash is worse than just using the bytes once (I guess informally we might explain this as it creating more opportunity for each bit of the secret to leak out in some way). – mtraceur Mar 18 '21 at 19:24
0

Publishing hashes is unsafe not only because the attacker no longer needs to find a security hole to acquire the hash and can start attacking the password immediately, but also because MD5 itself has long been known to be vulnerable. Not to mention that sometimes capturing the hash is enough: some attacks, like pass-the-hash (PtH), exploit weaknesses in authentication protocols so that an unauthorized person can gain access without cracking the password at all. If such a vulnerability exists, the attacker can present the hash instead of the actual password and still gain access. So publishing password hashes is a terrible idea, even if they are relatively secure.

btzom