How do big cloud providers guard against VM escape attacks?

Highly customized and patched hypervisors, sandboxes around those hypervisors to mitigate breakouts, and heavy monitoring. Of course, any given server only hosts so many VMs, so a breakout, even one that gets past the protections outside the hypervisor, is fundamentally limited to the finite number of guests on that host. For example, QEMU can be compiled with a hardened toolchain and isolated using chroots, seccomp mode 2 (filter mode), and mandatory access controls. It can also be patched to be more secure, by using custom seccomp filters, disabling excess code or features, and adding custom, hardened drivers. Custom kernels can be configured and hardened to limit the ability of an escaped guest to compromise the kernel and escape mandatory access controls.
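To make the QEMU part a bit more concrete, here is a minimal sketch of launching QEMU with its built-in seccomp filter turned on. The `-sandbox`, `-nodefaults`, `-m` and `-drive` options are stock QEMU options; the function name, the disk image name and the memory size are placeholders I made up for illustration. It assumes a QEMU build with seccomp support, and it assumes the chroot, the mandatory access control policy and the drop to an unprivileged user are set up by whatever launches the process, which is not shown here. This is not any particular provider's configuration.

```python
from typing import List


def hardened_qemu_argv(disk_image: str, memory_mb: int = 1024) -> List[str]:
    """Build a QEMU command line with the built-in seccomp sandbox enabled.

    Hypothetical helper for illustration only. Chroot, MAC policy and
    privilege dropping are assumed to be handled by the launcher.
    """
    # Each sub-option tightens QEMU's seccomp filter.
    sandbox = ",".join([
        "on",                      # enable the seccomp filter
        "obsolete=deny",           # block obsolete syscalls
        "elevateprivileges=deny",  # block set*uid/set*gid syscalls
        "spawn=deny",              # block fork and execve
        "resourcecontrol=deny",    # block process affinity/scheduler changes
    ])
    return [
        "qemu-system-x86_64",
        "-nodefaults",             # do not create default devices
        "-m", str(memory_mb),
        "-drive", f"file={disk_image},format=raw,if=virtio",
        "-sandbox", sandbox,
    ]


if __name__ == "__main__":
    # Print the command line rather than running it.
    print(" ".join(hardened_qemu_argv("guest.img")))
```

Something like this only covers the stock filter. The custom seccomp filters and stripped-down device sets mentioned above require patching or rebuilding QEMU itself.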

Unfortunately, the question can partially be answered with "they don't". Many cloud providers do regularly get compromised; they just don't know it, or don't disclose it. I see this happening periodically, and it's not pretty. Sometimes 0days are used, and sometimes little-known public issues with the providers are leveraged. For example, Amazon's higher-end dedicated servers do not shut down when switching customers, so their GPUs do not wipe VRAM, letting the next customer see what the previous one was doing (stealing secrets in OpenCL, anything which was sent over VNC or with X11 forwarding, etc). The Amazon security team knows about this, but still has not fixed it, and they do not tell their customers. This shows a lack of effort put into security, and a lack of awareness about extant security issues. When something like this, which you'd think would become big news, is known by so few people, it's not surprising that large-scale or advanced attacks against cloud providers can happen without the general public being aware of them. The lack of awareness and news of attacks leads people to wonder how the hell these large businesses stay secure. They aren't.

To my left is a window with vim open. In it is code for a 0day used recently to break out of a hypervisor on a certain medium-sized cloud provider. No protections were in place. All that happened was the exploit broke out and gained access to the host with the privileges of the process the hypervisor was running as. No sandbox or extensive monitoring solutions were in place. A single obsolete syscall with a kernel vulnerability led to a compromised kernel and persistence in memory (grsecurity would have mitigated it, fwiw), and subsequent exfiltration of information. This company is worth hundreds of millions of dollars. I'm sure countless people have been in there before. Although this is not real evidence and just my own experience, it hopefully should show that big cloud providers do not in fact guard against VM escapes well at all, with the exceptions mentioned in the first paragraph.