How is it possible that people observing an HTTPS connection being established wouldn't know how to decrypt it?

It is the magic of public-key cryptography. Mathematics are involved.

The asymmetric key exchange scheme which is easiest to understand is asymmetric encryption with RSA. Here is an oversimplified description:

Let n be a big integer (say 300 digits); n is chosen such that it is a product of two prime numbers of similar sizes (let's call them p and q). We will then compute things "modulo n": this means that whenever we add or multiply together two integers, we divide the result by n and we keep the remainder (which is between 0 and n-1, necessarily).

Given x, computing x^3 modulo n is easy: you multiply x with x and then again with x, and then you divide by n and keep the remainder. Everybody can do that. On the other hand, given x^3 modulo n, recovering x seems overly difficult (the best known methods being far too expensive for existing technology) -- unless you know p and q, in which case it becomes easy again. But computing p and q from n seems hard, too (it is the problem known as integer factorization).
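To make the "easy one way, hard the other way" idea concrete, here is a minimal Python sketch. The primes 101 and 113 and the function names are my own toy choices, not anything a real server uses; real moduli are hundreds of digits long, and real RSA also adds padding, which is skipped here.

```python
# Toy illustration only: the primes are assumed values, far too small for real use.
p, q = 101, 113          # the secret primes
n = p * q                # the public modulus


def cube_mod(x, n):
    """Easy direction: anyone can compute x^3 modulo n."""
    return pow(x, 3, n)


def cube_root_mod(c, p, q):
    """Reverse direction, easy only when p and q are known."""
    phi = (p - 1) * (q - 1)
    d = pow(3, -1, phi)  # inverse of 3 modulo (p-1)(q-1); needs Python 3.8+
    return pow(c, d, p * q)


x = 4242
c = cube_mod(x, n)
assert cube_root_mod(c, p, q) == x
```

With numbers this small anybody can factor n by trial division, so the sketch only shows the mechanics; the security comes entirely from n being too big to factor.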

So here is what the server and client do:

  • The server has an n and knows the corresponding p and q (it generated them). The server sends n to the client.
  • The client chooses a random x and computes x^3 modulo n.
  • The client sends x^3 modulo n to the server.
  • The server uses its knowledge of p and q to recover x.

At that point, both client and server know x. But an eavesdropper saw only n and x^3 modulo n; he cannot recompute p, q and/or x from that information. So x is a shared secret between the client and the server. After that, this is pretty straightforward symmetric encryption, using x as key.
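The exchange above can be played out in a few lines of Python. This is a rough sketch under loud assumptions: the primes are toy values, the function names are invented for illustration, and hashing x into a key with SHA-256 is my own simplification rather than what any real protocol specifies.

```python
import hashlib
import secrets

# Server side: toy primes, kept secret (assumed values, far too small for real use).
P, Q = 101, 113
N = P * Q  # this is the only thing the server sends in the clear


def client_hello(n):
    """Client picks a random x and computes x^3 mod n; only the cube is sent."""
    x = secrets.randbelow(n - 2) + 2  # random x in [2, n-1]
    return x, pow(x, 3, n)


def server_recover(c, p, q):
    """Server undoes the cubing using its knowledge of p and q."""
    d = pow(3, -1, (p - 1) * (q - 1))  # needs Python 3.8+
    return pow(c, d, p * q)


def derive_key(x):
    """Hypothetical helper: turn the shared x into a 32-byte symmetric key."""
    return hashlib.sha256(str(x).encode()).digest()


client_x, wire_value = client_hello(N)       # the eavesdropper sees N and wire_value
server_x = server_recover(wire_value, P, Q)  # only the server can do this step
assert derive_key(client_x) == derive_key(server_x)
```

Both sides end up with the same key without x ever crossing the wire in the clear.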

The certificate is a vessel for the server public key (n). It is used to thwart active attackers who would want to impersonate the server: such an attacker intercepts the communication and sends its value n instead of the server's n. The certificate is signed by a certification authority, so that the client may know that a given n is really the genuine n from the server he wants to talk with. Digital signatures also use asymmetric cryptography, although in a distinct way (for instance, there is also a variant of RSA for digital signatures).
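For the curious, the signature variant of RSA can be sketched with the same toy numbers: the signer applies the secret operation to a hash of the message, and anyone can check the result by cubing it. Everything below (the primes, the function names, the sample message) is assumed for illustration; real certificates use much larger keys, padding schemes, and a structured certificate format.

```python
import hashlib

# Toy certification-authority key (assumed values, far too small for real use).
p, q = 101, 113
n = p * q                          # public
e = 3                              # public exponent
d = pow(e, -1, (p - 1) * (q - 1))  # private exponent; needs Python 3.8+


def digest_mod_n(message, n):
    """Hash the message and reduce it modulo n so it fits the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n


def sign(message, d, n):
    """Only the holder of d (that is, of p and q) can produce this value."""
    return pow(digest_mod_n(message, n), d, n)


def verify(message, signature, e, n):
    """Anyone who knows the public (e, n) can check the signature."""
    return pow(signature, e, n) == digest_mod_n(message, n)


# Hypothetical certificate body: a server name bound to its public key.
cert_body = b"server.example has public key n=..."
signature = sign(cert_body, d, n)
assert verify(cert_body, signature, e, n)
```

An attacker who swaps in his own n would have to produce a matching signature too, and he cannot do that without the authority's p and q.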


Here's a really simplified version:

  1. When a client and a server negotiate HTTPS, the server sends its public key to the client.
  2. The client encrypts the session encryption key that it wants to use using the server's public key, and sends that encrypted data to the server.
  3. The server decrypts that session encryption key using its private key, and starts using it.
  4. The session is protected now, because only the client and the server can know the session encryption key. It was never transmitted in the clear, or in any way an attacker could decrypt, so only they know it.

Voilà, anyone can see the public key, but that doesn't allow them to decrypt the "hey-let's-encrypt-using-this-from-now-on" packet that's encrypted with that public key. Only the server can decrypt that, because only the server has that private key. Attackers could try to forge the response containing an encrypted key, but if the server sets up the session with that, the true client won't speak it because it isn't the key that the true client set.

It's all the magic of asymmetric key encryption. Fascinating stuff.

P.S. "really simplified" means "mangled details to make it easier to understand". The Wikipedia article "Transport Layer Security" gives an answer that is more correct in the technical particulars, but I was aiming for "easy to grok".


The other answers are good, but here's a physical analogy that may be easier to grasp:

Imagine a lock-box, the kind with a metal flap that you put a padlock on to secure. Imagine that the loop where you put the padlock is large enough to fit two padlocks. To securely send something to another party without sharing padlock keys, you would

  1. Put the "Thing" in the box, and lock it with your padlock.
  2. Send the locked box to the other party.
  3. They put their padlock on the loop also (so that there are two locks on it), and return the double-locked box to you.
  4. You remove your padlock, and return the now singly-locked box to them.
  5. They remove their own lock and open the box.

With encryption the locks and keys are math, but the general concept is vaguely like this.
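If you want to see the two-padlock idea with actual math, one known construction that matches it is Shamir's three-pass protocol, where each "padlock" is a secret exponent modulo a public prime. The sketch below is my own illustration of that protocol, not a description of what HTTPS does; the prime and all names are assumptions, and like the physical box it does nothing to prove who is on the other end.

```python
import math
import secrets

PRIME = 2**127 - 1  # a known Mersenne prime, used here as the public modulus


def make_padlock():
    """Return (lock, unlock) exponents with lock * unlock == 1 mod (PRIME - 1)."""
    while True:
        lock = secrets.randbelow(PRIME - 3) + 2
        if math.gcd(lock, PRIME - 1) == 1:
            return lock, pow(lock, -1, PRIME - 1)  # needs Python 3.8+


def three_pass(thing):
    """Send `thing` (an integer in 1..PRIME-1) without ever sharing a key."""
    my_lock, my_unlock = make_padlock()
    their_lock, their_unlock = make_padlock()

    box = pow(thing, my_lock, PRIME)     # steps 1-2: I lock the box and send it
    box = pow(box, their_lock, PRIME)    # step 3: they add their padlock
    box = pow(box, my_unlock, PRIME)     # step 4: I remove my padlock
    return pow(box, their_unlock, PRIME) # step 5: they remove theirs and open it


secret = 123456789
assert three_pass(secret) == secret
```

The trick works because the two exponentiations commute, so the padlocks can come off in a different order than they went on.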