How is the Heartbleed exploit even possible?

@paj28's comment covers the main point. OpenSSL is a shared library, so it executes in the same user-mode address space as the process using it. It can't see other processes' memory at all; anything that suggested otherwise was wrong.

However, the memory used by OpenSSL - in particular, whatever sits near the buffer that Heartbleed over-reads from - is full of sensitive data. Specifically, it's likely to contain both the ciphertext and the plaintext of any recent or forthcoming transmissions. If you attack a server, this means you'll see messages sent to the server by others, and the server's responses to those messages. That's a good way to steal session tokens and private information, and you'll probably catch somebody's login credentials too. Other data stored by OpenSSL includes symmetric encryption keys (used for bulk data encryption and integrity in TLS) and private keys (used to prove the identity of the server). An attacker who steals those can, respectively, eavesdrop on (and even modify) the compromised TLS communication in real time or successfully impersonate the server (assuming a man-in-the-middle position on the network).

Now, there is one weird thing about Heartbleed that makes it worse than you might expect. Normally, if you try to read 64 KB of data starting from an arbitrary heap address within a process, there's a good chance you'd quickly run into an unallocated memory address (virtual memory not backed by anything and therefore unusable). These holes in a process's address space are common, because when a process frees memory that it no longer needs, the OS reclaims that memory so other processes can use it. Unless your program is leaking memory like a sieve, there usually isn't that much data in memory other than what is currently being used. Attempting to read unallocated memory (for example, attempting to access memory that has been freed) causes a read access violation (on Windows) / segmentation fault (on *nix), which makes the program crash (and it crashes before it can do anything like send data back). That's still exploitable (as a denial-of-service attack), but it's not nearly as bad as letting the attacker get all that data.

With Heartbleed, the process was almost never crashing. It turns out that OpenSSL, apparently deciding that the platform memory management libraries were too slow (or something; I'm not going to try to justify this decision), pre-allocates a large amount of memory and then uses its own memory management functions within that. This means a few things:

  • When OpenSSL "frees" memory, it doesn't actually get freed as far as the OS is concerned, so that memory remains usable by the process. OpenSSL's internal memory manager might think the memory is not allocated, but as far as the OS is concerned, the OpenSSL-using process still owns that memory.
  • When OpenSSL "frees" memory, unless it explicitly wipes the data out before calling its free function, that memory retains whatever values it had before being "freed". This means a lot of data that isn't actually still in use can be read.
  • The memory heap used by OpenSSL is contiguous; there are no gaps within it as far as the OS is concerned. It's therefore very unlikely that the buffer over-read will run into a non-allocated page, so it's not likely to crash.
  • OpenSSL's memory use has very high locality - that is, it's concentrated within a relatively small range of addresses (the pre-allocated block) - rather than being spread across the address space at the whim of the OS memory allocator. As such, reading 64 KB of memory (which isn't very much, even next to a 32-bit process's typical 2 GB range, much less the enormous range of a 64-bit process) is likely to get a lot of data that is currently (or was recently) in use, even though that data resides in what are supposedly a bunch of separate allocations.
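
The points above can be illustrated with a toy model. This is only a sketch under stated assumptions: the pool, `pool_alloc`, `pool_free_last`, `leaky_echo` and `demo_leak` are hypothetical names made up for this example, and the pool is a simple bump allocator, not OpenSSL's real memory manager. It shows that "freeing" inside a pre-allocated block leaves the bytes in place, and that a read which trusts a claimed length walks straight into them without crashing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy pool (NOT OpenSSL's allocator): one block the process
   owns for its whole lifetime, handed out in pieces. */
#define POOL_SIZE 256
static unsigned char pool[POOL_SIZE];
static size_t pool_used = 0;

/* Bump-allocate n bytes from the pool; returns NULL when the pool is full. */
static unsigned char *pool_alloc(size_t n) {
    if (n > POOL_SIZE - pool_used) return NULL;
    unsigned char *p = pool + pool_used;
    pool_used += n;
    return p;
}

/* "Free" the most recent allocation of n bytes. Only the bookkeeping
   changes: the bytes are not wiped and nothing is returned to the OS. */
static void pool_free_last(size_t n) {
    assert(n <= pool_used);
    pool_used -= n;
}

/* Copy `claimed` bytes starting at buf, trusting the caller's length the way
   the vulnerable heartbeat code trusted the peer's length field. The read
   stays inside `pool`, so no page fault can occur. */
static size_t leaky_echo(const unsigned char *buf, size_t claimed,
                         unsigned char *out) {
    assert(claimed <= (size_t)(pool + POOL_SIZE - buf));
    memcpy(out, buf, claimed);
    return claimed;
}

/* Returns 1 if the "freed" secret shows up in the over-read reply. */
static int demo_leak(void) {
    unsigned char reply[64];
    pool_used = 0;
    unsigned char *req = pool_alloc(16);   /* request buffer            */
    unsigned char *key = pool_alloc(32);   /* neighbouring secret       */
    memcpy(req, "ping", 5);
    memcpy(key, "SECRET-SESSION-KEY", 19);
    pool_free_last(32);                    /* key is "freed", bytes stay */
    size_t n = leaky_echo(req, 48, reply); /* request really holds 5 bytes */
    return n == 48 && memcmp(reply + 16, "SECRET-SESSION-KEY", 19) == 0;
}
```

Calling `demo_leak()` returns 1 in this model: the 48-byte "reply" contains the key even though the pool considers it freed, and no fault is raised because every byte read belongs to memory the process still owns.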

"I would expect a segmentation fault if a process tried to access any memory that it didn't explicitly allocate."

This is where the misconception lies.

Any invalid memory access could result in a segmentation fault, but if the requested address lies within memory that is still mapped into the current process's address space (say, a variable you just freed), a fault is highly unlikely.

That's why you should not rely on segmentation faults for finding memory access bugs!
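
Since a crash can't be counted on to flag a bad read, the length has to be checked explicitly before copying. Here is a minimal sketch of that idea; `safe_echo` and its parameters are hypothetical names for illustration, not OpenSSL's actual code or API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not OpenSSL's code). Copies the payload to echo into
   `out` only when the length the peer claims fits both in what was actually
   received and in the output buffer. Returns the number of bytes copied, or
   -1 if the request should be dropped. */
static long safe_echo(const unsigned char *payload, size_t received,
                      size_t claimed, unsigned char *out, size_t out_cap) {
    assert(payload != NULL && out != NULL);
    if (claimed > received || claimed > out_cap) return -1;
    memcpy(out, payload, claimed);
    return (long)claimed;
}
```

With a 4-byte payload, a claimed length of 4 is echoed back, while a claimed length of 65535 is rejected instead of being read from whatever happens to follow the buffer.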