4

Whenever a file or firmware image is available for download and the site provides a checksum to check it against, I always confirm that the checksum of the downloaded file matches the one posted online.

But it has often crossed my mind: if a malicious third party has FTP access and is able to swap out the file or firmware for a malicious build, they would surely also have the technical know-how and access to edit the HTML/PHP page and update the checksum to match their malicious build, rendering the checksum worthless.

Have I missed the point here?

sam
  • 1
    Possible duplicate of [Downloaded file checksums](https://security.stackexchange.com/questions/107814), [Why is it necessary to match the checksum of a download with another file provided by the same server?](https://security.stackexchange.com/questions/26678), [Does hashing a file from an unsigned website give a false sense of security?](https://security.stackexchange.com/questions/1687), [Is there any purpose for providing checksums on a non-HTTPS location?](https://security.stackexchange.com/questions/138531). – Steffen Ullrich Oct 29 '18 at 21:03

3 Answers

6

All that you've missed is where a hash is supposed to protect you. You are correct that if an attacker has access to the server itself, they can just modify everything.

Where a hash is supposed to help you is against a man-in-the-middle attack. For example:

  1. Download file
  2. Read webpage for plaintext md5 or sha1
  3. Hash downloaded file
  4. Compare values
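Steps 3 and 4 of the list above can be sketched in Python. This is only a minimal sketch, assuming the file has already been downloaded and the published digest was copied from the web page as a hex string. The helper names `file_digest` and `matches_published` are hypothetical, and SHA-256 is used as the default here rather than the MD5 or SHA-1 mentioned in the list.

```python
import hashlib


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Hash a file in chunks so large downloads need not fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def matches_published(path, published_hex, algorithm="sha256"):
    """Compare the local digest with the value copied from the web page."""
    local = file_digest(path, algorithm)
    # Normalise whitespace and case, since pages format digests differently.
    return local == published_hex.strip().lower()
```

A match only shows that the file agrees with whatever the page displayed; it says nothing about whether the page itself was tampered with.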

Someone sitting in the middle could theoretically change both the file and the published hash as well, but there are other technical measures that try to combat this: SSL/TLS (itself still vulnerable to MITM) and digital signatures.

EDIT: On some of the customer remediations I've been on, we've used download hashes to try to mitigate a MITM by downloading the bits and then verifying that the hash shown on the website is the same over multiple connections/computers. This significantly decreases the likelihood that an attacker owns all of the investigators' means of connection. If the hashes from the source site are the same across the different connections/computers, the download can be assumed to be relatively safe.
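The cross-check described in the edit can be expressed as a small comparison. This is a rough sketch, assuming the digests shown by the website were collected by hand over each connection; `hash_agrees_everywhere` is a hypothetical helper name, not part of any tool.

```python
def hash_agrees_everywhere(local_digest, observed_digests):
    """Return True only if every independently observed digest matches the local one."""
    normalised = {d.strip().lower() for d in observed_digests}
    # An empty collection means nothing was checked, so it counts as a failure.
    return normalised == {local_digest.strip().lower()}
```

For example, `observed_digests` might hold the value seen over a commercial line and over two cell phone connections. A single disagreement is enough to treat the download as suspect.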

thepip3r
  • 5
    Some projects use third parties to host large files in order to improve download speeds and offset the cost of maintaining the servers. With multiple third parties involved, it's possible that one of them may be compromised. If that's the case, comparing the hash from the official website would reveal the compromise. One big example of this is what SourceForge did, with multiple companies hosting files. These days many companies store their large files in the cloud using AWS, which may be separate from their web host. – Daisetsu Oct 29 '18 at 20:31
6

Checksums are provided to detect corruption during file transfer, not to detect man in the middle attacks.

Teun Vink
  • 3
    He did say checksum... but most often you are provided a hash, not a checksum in my experience. – thepip3r Oct 29 '18 at 18:19
  • @Teun Vink What can be done (if anything) to prevent me downloading a malicious build in the first place ? For instance there was an issue recently with a malicious build of the popular video compression software Handbrake being distributed from their site – sam Oct 29 '18 at 18:20
  • There’s not much you can do to prevent that, mostly you can implement controls like virus scanners and malware detection to prevent you from running the installer. – Teun Vink Oct 29 '18 at 18:26
  • @thepip3r Intended usage is the only practical difference between a hash and a checksum here. Yes, SHA-1 (or even SHA-256) values for files are typically given, but they're meant to be used as checksums, not as hashes. – Austin Hemmelgarn Oct 29 '18 at 18:50
  • @AustinHemmelgarn, their intended use is making sure that all of the bytes arrive unmolested at the end-point. There should be a distinction between the two for a number of reasons. A checksum is not a hash--cryptographically speaking, they're not even in the same realm of discussion. Hashes happen to be able to provide a level of fidelity that the bits are what the source says they should be. In the case of a MITM, I've mitigated this in the past by verifying the site checksum over distinct connections (commercial, and 2 cell phones) to ensure no MITM. – thepip3r Oct 29 '18 at 19:15
  • I think OP is talking about hashes like SHA256 or MD5. Neither is used to detect corruption - the TCP protocol does that. – Dan Dascalescu Aug 24 '20 at 04:03
  • Not all file transfer protocols use TCP. TFTP uses UDP for example. – Teun Vink Aug 24 '20 at 07:39
1

Another point worth mentioning is that the checksums can be supplied in a separate file signed with the digital signature of the author/maintainer of the content.

In this case, even if the attacker gets full control of the server on which the content resides and replaces both the payload and the checksums, they will be unable to sign the checksum file with the developer's signature (unless they also obtain the developer's private key).

End users will then be able to detect that something is wrong, because signature verification will fail.

The download procedure should then include these steps:

  1. Download the payload.
  2. Download the checksum file.
  3. Verify the signature of the checksum file.
  4. Verify the checksum of the payload against the checksum file.

With this, a user can decide to discard the payload should step 3 or 4 fail.
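Steps 3 and 4 above might look like the following in Python. This is a sketch under several assumptions that the answer does not state: the signature is a detached GnuPG signature, `gpg` is installed, the developer's public key is already in the local keyring, and the checksum file uses the `sha256sum` layout (hex digest, whitespace, file name). All function names are hypothetical.

```python
import hashlib
import subprocess


def signature_is_valid(checksum_file, signature_file):
    """Step 3: ask GnuPG to check a detached signature over the checksum file.

    Assumption: the developer's public key was imported beforehand through a
    channel you trust. gpg accepts a good signature from any key it knows, so
    the keyring should hold only keys you actually trust for this download.
    """
    result = subprocess.run(
        ["gpg", "--verify", signature_file, checksum_file],
        capture_output=True,
    )
    return result.returncode == 0


def expected_digest(checksum_file, payload_name):
    """Look up the payload's digest in a sha256sum-style checksum file."""
    with open(checksum_file, encoding="utf-8") as f:
        for line in f:
            parts = line.split(maxsplit=1)
            # sha256sum marks binary mode with a leading '*' on the file name.
            if len(parts) == 2 and parts[1].strip().lstrip("*") == payload_name:
                return parts[0].lower()
    return None


def payload_matches(payload_path, checksum_file, payload_name):
    """Step 4: compare the payload's SHA-256 with the entry in the checksum file."""
    expected = expected_digest(checksum_file, payload_name)
    if expected is None:
        return False
    h = hashlib.sha256()
    with open(payload_path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest() == expected
```

The order matters: `payload_matches` is only meaningful once `signature_is_valid` has returned True for the same checksum file, otherwise the digest being compared against could itself have been replaced.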

This makes it possible to distribute content via third-party infrastructure that is not controlled by the original author of the content, without being afraid of unauthorized changes. For example, many Linux distributions' install files are hosted on public mirrors that belong to universities or enthusiasts.

Example: Gentoo Linux publishes a list of mirrors and a list of its public keys so that users can verify downloads.

VL-80