
I have a background in web app development and I'm trying to up my security game, but there are some things that I find confusing.

For example, how do memory-hard password hashes protect against brute-force attacks? Let's assume I have a web app which hashes passwords with Argon2i. To my understanding it's great because it takes a lot of computational power and it's "slow". This is the part where I get lost. How does it work under the hood? Who's doing the computational work?

When I try to imagine how it could work, this is what I come up with, which still leaves me with a lot of questions and is probably dead wrong:

  • User password is hashed and stored
  • Hacker decides to brute force an account
  • Server takes a long time to validate passwords due to the memory-hard hashing used, so it takes too much time for the attacker to gain access to an account.

Based on the assumptions listed above, I'm left to conclude that the computational work happens on the server (validating passwords). This conclusion doesn't seem right to me. Shouldn't the computational work be on the hacker's side of the equation? Isn't my server at risk of running out of resources if my algorithms are heavy and one of these attacks hits? So when I read things like "This memory-hard algorithm is great because it costs a lot of computational work", I don't understand why hackers would care, since they can just send requests without having to do any work... unless they have had access to the db where the sensitive data was stored.


3 Answers


Memory-hard hashing schemes are the ones that require a lot of memory to execute. If you need 100 MB to calculate a single hash, that limits the number of hashes you can compute concurrently. This protects the hashes from offline password cracking, not online brute forcing.

Online brute forcing can be countered with a couple of tools (a minimal sketch follows the list):

  • Rate limit: at each try, increase the response time

  • Captcha: after n errors, ask for a captcha

  • Soft locking: after n errors, return "authentication failed" from that IP for n minutes
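
Here's a minimal sketch of the soft-locking / rate-limiting idea in Python, just to make the mechanics concrete. The threshold, the lockout window, the in-memory dict, and the function names are all made up for illustration; a real service would keep this state somewhere shared (e.g. Redis) and combine it with a captcha.

```python
import time

# Hypothetical thresholds, chosen only for illustration.
MAX_FAILURES = 5            # failures before soft locking kicks in
LOCKOUT_SECONDS = 15 * 60   # how long the (username, ip) pair stays locked

# In-memory store: (username, ip) -> (failure_count, first_failure_time).
_failures = {}

def is_locked(username: str, ip: str) -> bool:
    """Return True if this (username, ip) pair is currently soft-locked."""
    count, since = _failures.get((username, ip), (0, 0.0))
    return count >= MAX_FAILURES and time.time() - since < LOCKOUT_SECONDS

def record_failure(username: str, ip: str) -> None:
    """Count a failed login; keep the timestamp of the first failure."""
    count, since = _failures.get((username, ip), (0, time.time()))
    _failures[(username, ip)] = (count + 1, since)

def record_success(username: str, ip: str) -> None:
    """Reset the counter after a successful login."""
    _failures.pop((username, ip), None)
```

The login handler would check is_locked() before even running the password hash, so locked-out attempts cost the server almost nothing.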

But the real issue is offline cracking, when the attacker gets a list of hashes (usually via SQL injection) and has all the time and equipment they want to brute force them. If the hashes are memory-hard, calculating each one requires a large amount of memory, and that limits the number of parallel threads attacking the hashes. A computer with 256 GB of memory can calculate at most 1024 hashes in parallel if each one takes 256 MB.
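
To make the numbers concrete, here is a sketch of setting Argon2's memory cost, assuming the argon2-cffi Python package (the parameter values are illustrative, not a recommendation). memory_cost is given in KiB, so 262144 KiB is roughly the 256 MB per hash from the example above: a cracking rig with 256 GB of RAM can then run at most about 1024 guesses in parallel.

```python
from argon2 import PasswordHasher  # assuming the argon2-cffi package

ph = PasswordHasher(
    time_cost=3,         # passes over the memory
    memory_cost=262144,  # KiB per hash, i.e. ~256 MB
    parallelism=4,       # lanes used by a single hash computation
)

stored = ph.hash("correct horse battery staple")   # encoded hash incl. salt and parameters
ph.verify(stored, "correct horse battery staple")  # raises on a wrong password
```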

This also makes it very difficult to develop an ASIC for breaking it. Memory is very expensive on those circuits, so increasing the memory cost can put the total attack cost out of the attacker's reach.

It likewise makes it difficult to brute force the hashes using GPUs, as they usually have lots of processing power but not lots of memory. It's easy to build a computer with 1 TB of RAM, but not a GPU with 1 TB of memory.


How does it work under the hood? Who's doing the computational work?

It depends. If the attack is online, your server is doing the work. If it's offline, it's the attacker's computer.

And why would you want to use a slow hash? To prevent the attacker from throwing hundreds of millions of passwords per second at the hash function. If the hash function takes 100 ms to calculate on your server, an attacker with similar hardware can only expect to try about 10 passwords per second.

A hash that takes more than 100 ms can lead to a server-side DoS if the attacker floods the server with login requests, while a hash that takes nanoseconds allows the attacker to try millions of passwords per second on similar hardware.
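
If you want to see which side of that trade-off your parameters land on, a rough timing sketch like the following (again assuming argon2-cffi, with whatever parameters you actually deploy) shows the guesses-per-second figure being discussed:

```python
import time
from argon2 import PasswordHasher  # assuming the argon2-cffi package

ph = PasswordHasher()  # substitute your production parameters here

start = time.perf_counter()
ph.hash("hunter2")
elapsed = time.perf_counter() - start

# ~0.1 s per hash means roughly 10 guesses per second per core on similar
# hardware; a nanosecond-scale hash like a bare SHA-256 would allow
# millions of guesses per second instead.
print(f"one hash took {elapsed:.3f} s -> ~{1 / elapsed:.0f} guesses/sec/core")
```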

I don't understand how hackers could care less since they can just send requests without having to do work

That's not the case. A hacker will not send a trillion passwords at your login page in an online attack. They will either use an already-cracked password or try a thousand common passwords. So if your users are using password as their password, a hash function that takes one minute to calculate and uses 1 TB of RAM will not protect them from online attacks.

unless they have had access to the db where the sensitive data was stored.

And that is the case. The type of hash you use determines how many passwords the attacker will recover. A database of unsalted MD5 hashes will be brute forced in hours, maybe days; some of the entries are a Google search away. But a database of Argon2 or scrypt hashes will take thousands of centuries if the parameters you use are strong enough.
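
A small sketch of that difference (the MD5 digest shown is the well-known hash of "password"; the Argon2 part again assumes the argon2-cffi package):

```python
import hashlib
from argon2 import PasswordHasher  # assuming the argon2-cffi package

# Unsalted MD5: every user with the same password gets the same digest,
# so one lookup table (or one Google search) cracks all of them at once.
print(hashlib.md5(b"password").hexdigest())  # 5f4dcc3b5aa765d61d8327deb882cf99
print(hashlib.md5(b"password").hexdigest())  # identical digest every time

# Argon2 generates a random salt per hash, so the same password produces a
# different string each time, and every guess costs the full memory-hard
# computation.
ph = PasswordHasher()
print(ph.hash("password"))  # different on every run
print(ph.hash("password"))
```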

ThoriumBR

Password hashing is concerned with brute-force attacks that take place offline using a dump of hashed passwords, not online against a running service. See this answer for how password hashes are usually obtained.

A running service does pay a cost when verifying passwords, since verification is not completely cheap. A strongly recommended step is to rate-limit login attempts, preventing the attacker from sending more than "a few" password guesses.

If there is no rate limiting, the likely result is the server's CPU being fully consumed by password verification, slowing things down, potentially to the point of crashing. This would amount to a denial of service, but it would not result in a successful brute-force attack.

Even if the servers can handle it, an online attack would still take entirely too long due to the time needed to verify each password (as well as the round trip to the server).

Finally, there are password hashing schemes where the load is mostly on the client (e.g. SCRAM), but this is more useful with heavier custom clients (e.g. database clients).

Marc

As a normal human user, you will submit a password every few seconds. That is easy for the server to handle. An attacker could submit thousands of guessed passwords. The server protects itself against this by rate limiting, for example allowing only one password attempt per username every three seconds.

So the number of guesses that an attacker can make through the server is quite limited.

PS. The purpose of "memory-hard" hashing is not to make the hashing expensive; you can do that easily by requesting lots of hashing rounds. The purpose is to have hashing where a million-dollar supercomputer doesn't have a big speed advantage compared to your humble little server.
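
A rough sketch of that distinction in Python, contrasting an iteration-only KDF (PBKDF2 from the standard library) with a memory-hard one (assuming the argon2-cffi package; parameter values are illustrative only):

```python
import hashlib
import os

from argon2 import PasswordHasher  # assuming the argon2-cffi package

salt = os.urandom(16)

# Expensive purely through iteration count: lots of CPU time, but each
# guess needs almost no memory, so GPUs and ASICs can parallelise it cheaply.
dk = hashlib.pbkdf2_hmac("sha256", b"hunter2", salt, 600_000)

# Memory-hard: each guess also has to touch ~256 MB of RAM, which is the
# resource a large cracking rig cannot scale nearly as cheaply.
ph = PasswordHasher(time_cost=3, memory_cost=262144, parallelism=4)
h = ph.hash("hunter2")
```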

gnasher729