Can GCC's ASAN provide the same memory safety as Rust?

The sanitizers

Both GCC and Clang ship a suite of sanitizers; historically they have been developed in Clang first and then ported to GCC, so Clang has the most advanced versions:

  • Address Sanitizer (ASan): detects out-of-bounds access, use-after-free, use-after-scope, double-free/invalid-free, and is gaining support for memory-leak detection (expected memory overhead 3x),
  • Memory Sanitizer (MemSan): detects uninitialized reads (expected slow-down 3x),
  • Thread Sanitizer (TSan): detects data races (expected slow-down 5x-15x, memory overhead 5x-10x),
  • Undefined Behavior Sanitizer (UBSan): detects various local undefined behaviors such as unaligned pointers, integral/floating-point overflows, etc. (minimal slow-down, slight code-size increase).

There is also work ongoing on a Type Sanitizer.


Sanitizers vs Rust

Unfortunately, bringing C++ up to Rust's level of safety with sanitizers is not possible; even combining all existing sanitizers would still leave gaps: they are known to be incomplete.

You can see John Regehr's presentation on Undefined Behavior at CppCon 2017 (the slides are on GitHub), from which we get the current coverage:

[Image: table showing which categories of undefined behavior each sanitizer covers, from the slides]

And that is not accounting for the fact that the sanitizers are mutually incompatible. That is, even if you were willing to accept the combined slow-down (15x-45x?) and memory overhead (15x-30x?), a C++ program would still NOT be as safe as a Rust one.


Hardening vs Debugging

The reason sanitizers are so CPU- and memory-hungry is that they are debugging tools; they attempt to give developers as precise a diagnostic as possible, so as to be most useful for debugging.

For running code in production, what you are looking for is hardening. Hardening is about eliminating Undefined Behavior with as low an overhead as possible. Clang, for example, supports multiple ways to harden a binary:

  • Control Flow Integrity (CFI): protects against control-flow hijacking (virtual calls, indirect calls, ...),
  • Safe Stack: protects return addresses against stack buffer overflows, a common building block of Return Oriented Programming (ROP) attacks,
  • Undefined Behavior Sanitizer.

Those tools can be combined and have minimal (< 1%) performance impact. Unfortunately, they cover much less ground than the sanitizers; most notably, they do not attempt to catch use-after-free/use-after-scope or data races, which are frequent attack vectors.


Conclusion

I do not see any way to bring C++ up to the level of safety that Rust provides without either:

  • very serious restrictions on the language: see the MISRA/JSF guidelines,
  • very serious loss of performance: sanitizers, disabled optimizations, ...,
  • a complete overhaul of the standard library and coding practices, of which the Core Guidelines are a start.

On the other hand, it is worth noting that Rust itself uses unsafe code; its unsafe code also needs to be vetted (see the RustBelt project) and would benefit from all of the above sanitizer/hardening instrumentation passes.


No, the two features are not comparable.

Address sanitization is not a security feature, nor does it provide memory safety: it's a debugging tool. Programmers already have tools to detect that the code they've written has memory problems, such as use-after-free or memory leaks. Valgrind is probably the best-known example. This GCC feature provides (some of) the same functionality: the only new thing is that it's integrated into the compiler, so it's easier to use.

You wouldn't have this feature turned on in production: it's for debugging only. You compile your tests with this flag, and they automatically detect any memory errors the tests trigger. If your tests aren't sufficient to trigger the problem, then you still have the problem, and it will still cause the same security flaws in production.

Rust's ownership model prevents these defects by making programs that contain such defects invalid: the compiler will not compile them. You don't have to worry about your tests not triggering the problem, because if the code compiles, there cannot be a problem.

The two features are for different sets of problems. One feature of address sanitization is to detect memory leaks (allocating memory and neglecting to free it later). Rust makes it harder to leak memory than C or C++ do, but it's still possible (e.g. with circular references). Rust's ownership model prevents data races in sequential and multithreaded situations (see below); address sanitization doesn't aim to detect either of those cases.

An example of a data race in sequential code is iterating over a collection of objects while also adding or removing elements. In C++, modifying most collections invalidates any existing iterators, but it's up to the programmer to realise this has happened: it isn't detected (though some collections add extra checks in debug builds). In Rust, it's not possible to mutate the collection while an iterator on it exists, because the ownership model prevents this.

An example of a data race in multithreaded code is having two threads that share an object, with access protected by a mutex. In C++, it's possible for the programmer to forget to lock the mutex while changing the object. In Rust, the mutex itself owns the object it protects, so it's not possible to access it unsafely. (There are many other kinds of concurrency bugs, though, so don't get carried away!)