How does Tor protect against MITM attacks between the client and relay nodes?

The Tor Project hosts some bootstrapping servers called directory servers. They contain a list (a directory) with information about all Tor relays currently online. This info about each relay includes the public key. The directory is signed with one of the directory keys[1]. Those keys are distributed along with your copy of the Tor client[2].

Therefore, the answer to your question is: by having a trusted third party (similar to a certificate authority). Tor relays (called onion routers in the paper) upload their info to the directory periodically[3]. When a client connects to its guard node, it checks that the connection is encrypted with the right key, as listed in the directory. Then, to send traffic to the middle node (between the guard and exit node), it sets up an encrypted connection to that node, proxied through the guard, and checks again against the directory that the right key is used (instead of a MITM key). The same goes for the final (exit) node.
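As a rough illustration of the check described above, here is a minimal sketch in Python. It assumes the client has already verified the directory's signature with a pre-loaded directory key, and it models the directory as a plain dict from relay name to key fingerprint. The relay names, the key bytes, and the use of SHA-256 are all made up for the example; this is not Tor's actual data format or code.

```python
import hashlib
import hmac


def fingerprint(public_key: bytes) -> str:
    """Return a hex fingerprint of a relay's public key (SHA-256 chosen for illustration)."""
    return hashlib.sha256(public_key).hexdigest()


# Hypothetical directory contents. In this sketch we assume the directory's
# signature was already checked against a directory key shipped with the client.
DIRECTORY = {
    "guard": fingerprint(b"guard-public-key"),
    "middle": fingerprint(b"middle-public-key"),
    "exit": fingerprint(b"exit-public-key"),
}


def key_matches_directory(relay: str, presented_key: bytes, directory: dict) -> bool:
    """True only if the key presented on the connection is the one listed for this relay."""
    expected = directory.get(relay)
    if expected is None:
        return False  # unknown relay: refuse rather than trust
    # Constant-time comparison of the two hex fingerprints.
    return hmac.compare_digest(expected, fingerprint(presented_key))
```

A MITM that substitutes its own key fails the check, because its key does not hash to the fingerprint listed in the directory: `key_matches_directory("guard", b"attacker-key", DIRECTORY)` returns `False`.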

Source: https://www.onion-router.net/Publications/tor-design.pdf

[1] "Each onion router maintains a long-term identity key and a short-term onion key. The identity key is used to sign TLS certificates, to sign the OR’s router descriptor (a summary of its keys, address, bandwidth, exit policy, and so on), and (by directory servers) to sign directories."

[2] "Client software is pre-loaded with a list of the directory servers and their keys, to bootstrap each client’s view of the network."

[3] "Tor uses a small group of redundant, well-known onion routers to track changes in network topology and node state, including keys and exit policies. Each such directory server acts as an HTTP server, so clients can fetch current network state and router lists, and so other ORs can upload state information."


In general, SSL/TLS does offer MITM protection. It encrypts data before it leaves your computer so that only the endpoint can decrypt it, and vice versa. A MITM attack is ineffective against encrypted TLS connections because even if an attacker intercepts the public keys that the endpoints exchange, the attacker still does not know their private keys.
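For ordinary TLS clients, this protection only holds if the client actually verifies the certificate the server presents. As a small sketch using Python's standard `ssl` module (the helper name `verifies_peer` is made up for this example), you can check that a client context is configured to do so:

```python
import ssl


def verifies_peer(context: ssl.SSLContext) -> bool:
    """True if the context requires a valid certificate and checks the hostname."""
    return context.verify_mode == ssl.CERT_REQUIRED and context.check_hostname


# The default client context enables both checks.
default_context = ssl.create_default_context()

# A deliberately weakened context, shown only for contrast. Hostname checking
# must be switched off before verify_mode can be set to CERT_NONE.
weak_context = ssl.create_default_context()
weak_context.check_hostname = False
weak_context.verify_mode = ssl.CERT_NONE
```

With the weakened context, a MITM could present any certificate and the client would accept it.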

As for Tor, it is not encrypting your data at the Node level. The data is encrypted by a handshake between your endpoints so it can't be altered in transit. What Tor is encrypting with those TLS connections is routing instructions. The computer sending data builds a packet of delivery instructions that is protected by layers of encryption. Basically, Node A decrypts the first layer, which tells it who Node B is, but everything else is still encrypted. Node B decrypts another layer to find out who Node C is, and the process repeats until the endpoint is reached.
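The layering described above can be sketched in a few lines of Python. This is a toy model only: the "cipher" is an XOR with a SHA-256-derived keystream (not secure, and not what Tor uses), the 8-byte next-hop header and the `"DEST"` marker are invented for the example, and the functions `wrap` and `peel` are hypothetical names. It is not Tor's real cell format or circuit-building protocol.

```python
import hashlib

HEADER_LEN = 8  # assumed fixed-size field naming the next hop


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher: XOR data with SHA-256(key || offset) blocks. Not secure."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


def wrap(route, payload: bytes) -> bytes:
    """Build the layered packet. route is [(name, key), ...] from first hop to last."""
    onion = payload
    next_hop = "DEST"  # what the last node sees as its next hop
    # Work from the innermost layer (last node) out to the first node.
    for name, key in reversed(route):
        header = next_hop.encode().ljust(HEADER_LEN, b"\0")
        onion = keystream_xor(key, header + onion)
        next_hop = name
    return onion


def peel(key: bytes, onion: bytes):
    """Remove one layer: return (next hop name, remaining still-encrypted bytes)."""
    plain = keystream_xor(key, onion)
    return plain[:HEADER_LEN].rstrip(b"\0").decode(), plain[HEADER_LEN:]


route = [("A", b"key-a"), ("B", b"key-b"), ("C", b"key-c")]
packet = wrap(route, b"hello")
hop_after_a, rest_a = peel(b"key-a", packet)   # Node A learns only "B"
hop_after_b, rest_b = peel(b"key-b", rest_a)   # Node B learns only "C"
hop_after_c, payload = peel(b"key-c", rest_b)  # Node C reaches the innermost layer
```

Each node can read only the name of the next hop. What it forwards is still encrypted under keys it does not hold.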

So, the only information any Node can read is the identity of the next Node in the chain. Everything else remains unreadable. Since there is no unencrypted data to manipulate, a CA is pretty irrelevant. (That said, Luc's answer shows that Tor maintains its own CA just to be sure.)