How to verify the checksum of a downloaded file (pgp, sha, etc.)?

Usually this starts on the owner's side: the site displays the checksum for the file you wish to download, which looks something like the following:

md5: ba411cafee2f0f702572369da0b765e2

sha256: 2e17b6c1df874c4ef3a295889ba8dd7170bc5620606be9b7c14192c1b3c567aa

Once you have downloaded the file, you can compute its hash yourself; the command depends on your operating system. First navigate to the directory containing the downloaded file, then:

Windows

CertUtil -hashfile filename MD5 / CertUtil -hashfile filename SHA256

Linux

md5sum filename / sha256sum filename

MacOS

md5 filename / shasum -a 256 filename
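
If you would rather not remember a different command per operating system, the same digest can be computed with Python's standard hashlib module. This is only a sketch: the function name file_digest and the chunk size are my own choices, not anything the commands above depend on.

```python
import hashlib

def file_digest(path, algorithm="sha256", chunk_size=65536):
    """Return the hex digest of the file at `path`.

    The file is read in chunks so large downloads do not have to fit
    in memory. `algorithm` is any name hashlib.new() accepts.
    """
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Example (hypothetical file name):
#   print(file_digest("filename"))
```

The printed value is what you would compare, character for character, against the sha256 line shown on the download page.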


The problem with checking a hash published on the website is that a match does not show the file is safe, only that what you downloaded is, byte for byte, the file the hash was computed from. If the website has been compromised, you could be shown the hash of a different file, which in turn could be malicious.


Checksums vs Hashes vs Signatures

You mention checksums, PGP, and SHA in your question title, but these are all different things.

What is a checksum?

A checksum simply verifies with a high degree of confidence that there was no corruption causing a copied file to differ from the original (for varying definitions of "high"). In general a checksum provides no guarantee that intentional modifications weren't made, and in many cases it is trivial to change the file while still having the same checksum. Examples of checksums are CRCs, Adler-32, XOR (parity byte(s)).
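
To make "trivial to change the file while still having the same checksum" concrete, here is a small sketch using a one-byte XOR checksum. The function xor_checksum and the sample messages are made up for illustration; real formats use CRCs or similar, but the point is the same.

```python
from functools import reduce

def xor_checksum(data: bytes) -> int:
    """One-byte parity checksum: the XOR of every byte in the data."""
    return reduce(lambda acc, b: acc ^ b, data, 0)

original = b"pay 100 to alice"
# XOR does not care about byte order, so rearranging the digits
# changes the content without changing the checksum.
tampered = b"pay 001 to alice"

print(original != tampered)                                # True
print(xor_checksum(original) == xor_checksum(tampered))    # True
```

An accidental single-bit flip would still be caught, which is all a checksum like this is designed for.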

What is a cryptographic hash?

Cryptographic hashes provide additional properties over simple checksums (all cryptographic hashes can be used as checksums, but not all checksums are cryptographic hashes).

Cryptographic hashes (that aren't broken or weak) provide collision and preimage resistance. Collision resistance means that it isn't feasible to create two files that have the same hash, and preimage resistance (more precisely, second-preimage resistance) means that it isn't feasible to create a file with the same hash as a specific target file.

MD5 and SHA1 are both broken with regard to collisions, but are still safe against preimage attacks (because of the birthday paradox, collisions are much easier to generate). SHA256 is commonly used today, and is safe against both.
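
The birthday effect can be seen on a deliberately weakened hash. For an n-bit hash, a generic collision search needs on the order of 2^(n/2) attempts, while a generic preimage search needs on the order of 2^n. The sketch below truncates SHA-256 to 24 bits (my own toy construction, not a real algorithm) so that a collision turns up after a few thousand attempts, whereas matching one specific target would take millions. The actual attacks on MD5 and SHA1 exploit structural weaknesses and are cheaper still than the generic bound.

```python
import hashlib
from itertools import count

def truncated_hash(data: bytes, nbytes: int = 3) -> bytes:
    """First `nbytes` bytes of SHA-256: a deliberately weak toy hash."""
    return hashlib.sha256(data).digest()[:nbytes]

def find_collision(nbytes: int = 3):
    """Hash distinct inputs until any two share a truncated hash."""
    seen = {}
    for i in count():
        msg = str(i).encode()
        h = truncated_hash(msg, nbytes)
        if h in seen:
            # Two different inputs, same hash, found after i + 1 attempts.
            return seen[h], msg, i + 1
        seen[h] = msg

a, b, tries = find_collision()
print(a, b, tries)
```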

Using a cryptographic hash to verify integrity

If you plan to use a hash to verify a file, you must obtain the hash from a separate trusted source. Retrieving the hash from the same site you're downloading the files from doesn't guarantee anything. If an attacker is able to modify files on that site or intercept and modify your connection, they can simply replace the files with malicious versions and change the hashes to match.

Using a hash that isn't collision resistant may be problematic if your adversary can modify the legitimate file (for example, contributing a seemingly innocent bug fix). They may be able to create an innocent change in the original that causes it to have the same hash as a malicious file, which they could then send you.

The best example of where it makes sense to verify a hash is when retrieving the hash from the software's trusted website (using HTTPS of course), and using it to verify files downloaded from an untrusted mirror.
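
The mirror scenario can be sketched in a few lines. Everything named here is an assumption for illustration: the function names, and the idea that you have copied the trusted hash by hand from the project's HTTPS site into a string.

```python
import hashlib

def sha256_of_file(path, chunk_size=65536):
    """Hex SHA-256 of the file at `path`, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def matches_trusted_hash(path, trusted_hex):
    """True if the file's SHA-256 equals the hash from the trusted source.

    Surrounding whitespace and letter case are ignored, since sites
    differ in how they print hex digests.
    """
    return sha256_of_file(path) == trusted_hex.strip().lower()

# Example (hypothetical file name and hash variable):
#   if not matches_trusted_hash("file-from-mirror.iso", hash_from_https_site):
#       raise SystemExit("hash mismatch: do not use this file")
```

The check is only as good as the channel the trusted hash came through; if both the file and the hash come from the mirror, it proves nothing.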

How to calculate a hash for a file

On Linux you can use the md5sum, sha1sum, sha256sum, and similar utilities. Connor J's answer gives examples for Windows.


What is a signature?

Unlike checksums or hashes, a signature involves a secret. This is important, because while the hash for a file can be calculated by anyone, a signature can only be calculated by someone who has the secret.

Signatures use asymmetric cryptography, so there is a public key and a private key. A signature created with the private key can be verified by the public key, but the public key can't be used to create signatures. This way if I sign something with my key, you can know for sure it was me.
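
To illustrate the private/public split, here is a toy sketch of textbook RSA signing with tiny numbers. It is not PGP and it is not secure: real keys are thousands of bits long and real schemes add padding. The key values and function names are chosen purely for illustration.

```python
import hashlib

# Toy key pair built from the primes 61 and 53. Illustration only.
N = 61 * 53   # public modulus (3233)
E = 17        # public exponent, known to everyone
D = 2753      # private exponent, kept secret: (E * D) % 3120 == 1

def digest_as_int(message: bytes) -> int:
    """Hash the message and reduce it so it fits the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message: bytes) -> int:
    """Only the holder of D can compute this value."""
    return pow(digest_as_int(message), D, N)

def verify(message: bytes, signature: int) -> bool:
    """Anyone can check a signature using only the public E and N."""
    return pow(signature, E, N) == digest_as_int(message)

release = b"contents of the release file"
sig = sign(release)
print(verify(release, sig))   # True
```

Note that verify never touches D. That is the property a hash lacks: an attacker who swaps the file can recompute a hash, but cannot produce a matching signature without the private key.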

Of course, now the problem is how to make sure you use the right public key to verify the signature. Key distribution is a difficult problem, and in some cases you're right back where you were with hashes: you still have to get the key from a separate trusted source. But as this answer explains, you may not even need to worry about it. If you're installing software through a package manager or using signed executables, signature verification is probably handled for you automatically using preinstalled public keys (i.e. key distribution is handled by implied trust in the installation media and whoever did the installation).


Related Questions

  • Why we use GPG signatures for file verification instead of hash values?
  • Is sha1sum still secure for downloadable software packages signature?
  • Can an attacker replace the hash of a download, a download, and the public key?