Detecting corruption in HTTP and FTP

How does data downloaded over HTTP or FTP get checked for corruption?

By itself, not at all.

HTTP and FTP as protocols don't offer integrity protection¹.

However, HTTP and FTP usually run atop TCP/IP, and both TCP and IP carry checksums in their headers – if a TCP checksum fails, your operating system silently discards the segment and the sender retransmits it. So there's no need for HTTP to implement integrity checking itself.
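To make that concrete, here is a minimal sketch of the 16-bit ones'-complement "Internet checksum" that TCP, UDP and IP all use (specified in RFC 1071). The payload bytes are illustrative; a real TCP checksum also covers a pseudo-header, which is omitted here for brevity.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones'-complement sum with end-around carry (RFC 1071)."""
    if len(data) % 2:                 # pad odd-length input with a zero byte
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)   # fold the carry back in
    return ~total & 0xFFFF

seg = b"example payload!"             # illustrative segment data
cksum = internet_checksum(seg)

# A receiver verifies by checksumming the data together with the
# transmitted checksum: the result must come out as 0, otherwise the
# OS drops the segment and TCP retransmission kicks in.
assert internet_checksum(seg + cksum.to_bytes(2, "big")) == 0
```

Note that this catches random bit flips well, but an attacker who can modify the data can trivially recompute the checksum to match – which is exactly the distinction the rest of this answer is about.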

When tunneling anything (including HTTP and FTP) over TLS, you get an additional layer of integrity checks.

So how do FTP and HTTP ensure that data is not corrupt?

They don't. It's usually the transport's job to guarantee integrity, not the job of the application protocol.


¹ There is an optional header in HTTP/1.1 (Content-MD5) that allows the server to specify a checksum, but since that is practically impossible for resources generated on the fly, comes at a high cost for large files, and has little advantage over the much finer-grained TCP checksumming, it's rarely used. I don't even know whether browsers commonly support it.

I'd like to add here that it's of course harder to cause a collision within MD5 (which is what these headers use) than to forge TCP checksums, if you wanted to intentionally modify the transfer. But if that is your attack scenario, TLS is the answer, not HTTP checksums.
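For illustration, here is roughly what a client that did support that header would have to do: the Content-MD5 value is defined as the base64-encoded MD5 digest of the entity body. The body bytes below are made up.

```python
import base64
import hashlib

def content_md5_matches(body: bytes, header_value: str) -> bool:
    """Compare the body's MD5 digest to a Content-MD5 header value."""
    digest = hashlib.md5(body).digest()
    return base64.b64encode(digest).decode("ascii") == header_value

body = b"hello world"                               # hypothetical response body
header = base64.b64encode(hashlib.md5(body).digest()).decode("ascii")

print(content_md5_matches(body, header))            # True: body is intact
print(content_md5_matches(body + b"!", header))     # False: body was altered
```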


HTTP and FTP almost always use TCP as the underlying transport layer, so TCP's protections apply there as well. These, however, are only concerned with network-level accidental corruption issues. As you point out, verification that the file was written successfully is generally left to a checksum like CRC32.
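A CRC32 check of the kind mentioned above might look like this sketch (the file contents are made up); note how easily it catches an accidental byte flip, even though it offers no protection against deliberate tampering:

```python
import zlib

# CRC32, as used by zip/gzip and often published alongside downloads,
# detects accidental corruption but is trivially forgeable on purpose.
original = b"some downloaded file contents"
crc = zlib.crc32(original)

assert zlib.crc32(original) == crc                  # intact file matches
assert zlib.crc32(original[:-1] + b"X") != crc      # a flipped byte is caught
```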

If you are concerned with intentional manipulation (which I assume you are, since this is on security.SE), these aren't sufficient, because they're not cryptographically secure. On the network side, that's generally handled at a different layer by introducing TLS (combined with HTTP we get HTTPS; combined with FTP, FTPS). But if you want to be particularly certain, and also verify the integrity of the file you have on disk, a common approach for vendors is to provide a file listing the SHA-2 checksums of their downloads and to sign that file with a GPG key. You download both files, verify that the checksums file has been signed by a key you trust, and then check that the other file(s) match the checksums listed in the file you've just verified.
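The final step of that vendor workflow can be sketched as follows. This assumes the signature on the checksums file has already been verified (e.g. with `gpg --verify`), and the file names and format (one `<hex digest>  <filename>` pair per line, as produced by `sha256sum`) are assumptions for the example:

```python
import hashlib
from pathlib import Path

def verify_checksums(sums_file: str) -> bool:
    """Check every file listed in a sha256sum-style checksums file."""
    ok = True
    for line in Path(sums_file).read_text().splitlines():
        expected, name = line.split(maxsplit=1)
        name = name.lstrip("*")       # strip sha256sum's binary-mode marker
        actual = hashlib.sha256(Path(name).read_bytes()).hexdigest()
        if actual != expected:
            print(f"MISMATCH: {name}")
            ok = False
    return ok
```

In practice you can get the same result from the command line with `sha256sum -c SHA256SUMS` after verifying the signature; the point is that the GPG signature authenticates the checksums, and the checksums then authenticate the download.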

So how do FTP and HTTP ensure that data is not corrupt?

In short, from a security perspective, they don't.