Does "volatile" guarantee anything at all in portable C code for multi-core systems?

I'm no expert, but cppreference.com has what appears to me to be some pretty good information on volatile. Here's the gist of it:

Every access (both read and write) made through an lvalue expression of volatile-qualified type is considered an observable side effect for the purpose of optimization and is evaluated strictly according to the rules of the abstract machine (that is, all writes are completed at some time before the next sequence point). This means that within a single thread of execution, a volatile access cannot be optimized out or reordered relative to another visible side effect that is separated by a sequence point from the volatile access.

It also gives some uses:

Uses of volatile

1) static volatile objects model memory-mapped I/O ports, and static const volatile objects model memory-mapped input ports, such as a real-time clock

2) static volatile objects of type sig_atomic_t are used for communication with signal handlers.

3) volatile variables that are local to a function that contains an invocation of the setjmp macro are the only local variables guaranteed to retain their values after longjmp returns.

4) In addition, volatile variables can be used to disable certain forms of optimization, e.g. to disable dead store elimination or constant folding for microbenchmarks.
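
To make use 2) concrete, here is a minimal sketch of a volatile sig_atomic_t flag set from a signal handler. The names got_signal, on_signal and signal_flag_demo are made up for illustration; only signal, raise, SIGINT, SIG_ERR and sig_atomic_t come from the standard library.

```c
#include <signal.h>

/* Hypothetical flag shared between the handler and normal code. */
static volatile sig_atomic_t got_signal;

/* The handler only writes the flag. */
static void on_signal(int signo)
{
    (void)signo;
    got_signal = 1;
}

/* Installs the handler, raises SIGINT in this same thread and
   returns the flag value afterwards (or -1 if signal() failed). */
int signal_flag_demo(void)
{
    got_signal = 0;
    if (signal(SIGINT, on_signal) == SIG_ERR)
        return -1;
    raise(SIGINT);          /* returns only after the handler has run */
    return (int)got_signal; /* volatile read, cannot be optimized out */
}
```

Because raise does not return until the handler has finished, the flag is already set by the time it is read back.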

And of course, it mentions that volatile is not useful for thread synchronization:

Note that volatile variables are not suitable for communication between threads; they do not offer atomicity, synchronization, or memory ordering. A read from a volatile variable that is modified by another thread without synchronization or concurrent modification from two unsynchronized threads is undefined behavior due to a data race.
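
As a rough illustration of what cppreference points to instead, here is a sketch of a shared counter using C11 atomics from <stdatomic.h> rather than volatile. The names hits, record_hit and read_hits are hypothetical. With a plain volatile int, hits++ from two threads would be a data race; atomic_fetch_add_explicit makes each increment indivisible.

```c
#include <stdatomic.h>

/* Hypothetical shared counter; static storage makes it start at 0. */
static atomic_int hits;

/* Safe to call from several threads at once: the read-modify-write
   is a single atomic operation. */
void record_hit(void)
{
    atomic_fetch_add_explicit(&hits, 1, memory_order_relaxed);
}

/* Atomic read of the current count. */
int read_hits(void)
{
    return atomic_load_explicit(&hits, memory_order_relaxed);
}
```

Relaxed ordering is enough here only because nothing else is published through the counter; it gives atomicity, not ordering of other memory accesses.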


First of all, there have historically been various hiccups regarding different interpretations of what a volatile access means. See this study: Volatiles Are Miscompiled, and What to Do about It.

Apart from the various issues mentioned in that study, the behavior of volatile is portable, save for one aspect: whether it acts as a memory barrier. A memory barrier is a mechanism that prevents memory accesses from being reordered across it, so that your code is not executed in a concurrent, unsequenced manner. Using volatile as a memory barrier is certainly not portable.

Whether or not the C language guarantees memory barrier behavior from volatile is apparently arguable, though personally I think the language is clear. First we have the formal definition of side effects, C17 5.1.2.3:

Accessing a volatile object, modifying an object, modifying a file, or calling a function that does any of those operations are all side effects, which are changes in the state of the execution environment.

The standard defines the term sequencing, as a way of determining order of evaluation (execution). The definition is formal and cumbersome:

Sequenced before is an asymmetric, transitive, pair-wise relation between evaluations executed by a single thread, which induces a partial order among those evaluations. Given any two evaluations A and B, if A is sequenced before B, then the execution of A shall precede the execution of B. (Conversely, if A is sequenced before B, then B is sequenced after A.) If A is not sequenced before or after B, then A and B are unsequenced. Evaluations A and B are indeterminately sequenced when A is sequenced either before or after B, but it is unspecified which. The presence of a sequence point between the evaluation of expressions A and B implies that every value computation and side effect associated with A is sequenced before every value computation and side effect associated with B. (A summary of the sequence points is given in annex C.)

The TL;DR of the above is basically that if an expression A containing side effects is sequenced before another expression B, then A must be done executing before B executes.

Optimizations of C code are made possible through this part:

In the abstract machine, all expressions are evaluated as specified by the semantics. An actual implementation need not evaluate part of an expression if it can deduce that its value is not used and that no needed side effects are produced (including any caused by calling a function or accessing a volatile object).

This means that the program evaluates (executes) expressions in the order that the standard mandates elsewhere (order of evaluation etc.), but it need not evaluate (execute) an expression if it can deduce that the value is not used. For example, for the operation 0 * x the compiler doesn't need to evaluate x and can simply replace the expression with 0.

Unless accessing the variable is itself a side effect. That is, if x is volatile, the program must evaluate (execute) 0 * x even though the result will always be 0. Optimizing the access away is not allowed.
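
A small sketch of the difference, with made-up names (sensor, zero_times_volatile, zero_times_plain). Both functions return 0, but only the second one may be compiled down to a bare "return 0" without touching its operand:

```c
/* Hypothetical volatile object, e.g. standing in for a hardware register. */
volatile int sensor = 7;

/* The read of sensor is a side effect, so it has to be performed
   even though the product is always 0. */
int zero_times_volatile(void)
{
    return 0 * sensor;
}

/* x is not volatile, so the compiler is free to skip evaluating it
   and fold the whole expression to 0. */
int zero_times_plain(int x)
{
    return 0 * x;
}
```

The results are identical; the difference only shows up in the generated code, where the volatile version should still contain a load from sensor.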

Furthermore, the standard speaks of observable behavior:

The least requirements on a conforming implementation are:

  • Accesses to volatile objects are evaluated strictly according to the rules of the abstract machine.
    /--/ This is the observable behavior of the program.

Given all of the above, a conforming implementation (compiler + underlying system) may not execute accesses to volatile objects in an unsequenced order when the semantics of the written C source say they are sequenced.

This means that in this example

volatile int x;
volatile int y;
int z;
z = x; /* volatile read of x, must happen first */
z = y; /* volatile read of y, must happen second */

Both assignment expressions must be evaluated, and z = x; must be evaluated before z = y;. A multi-processor implementation that outsources these two operations to two different unsequenced cores is not conforming!

The dilemma is that compilers can't do much about things like pre-fetch caching and instruction pipelining, particularly not when running on top of an OS. And so compilers hand that problem over to the programmers, telling them that memory barriers are now the programmer's responsibility, even though the C standard clearly states that the problem needs to be solved by the compiler.

The compiler doesn't necessarily care to solve the problem though, and so relying on volatile to act as a memory barrier is non-portable. It has become a quality-of-implementation issue.
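
For completeness, here is a sketch of how ordering between cores can be requested portably since C11, using release/acquire operations from <stdatomic.h> instead of hoping volatile acts as a barrier. The names payload, ready, publish and try_consume are hypothetical, and this is one possible pattern under those assumptions, not the only one.

```c
#include <stdatomic.h>
#include <stdbool.h>

static int payload;        /* plain, non-atomic data */
static atomic_bool ready;  /* static storage, so it starts out false */

/* Writer side: store the data, then set the flag with release
   ordering so the data write cannot be moved after the flag write. */
void publish(int value)
{
    payload = value;
    atomic_store_explicit(&ready, true, memory_order_release);
}

/* Reader side: an acquire load that observes true guarantees that
   the earlier write to payload is visible as well. */
bool try_consume(int *out)
{
    if (!atomic_load_explicit(&ready, memory_order_acquire))
        return false;
    *out = payload;
    return true;
}
```

The intended use is publish on one thread and try_consume polled on another; here the ordering is part of the language contract rather than a quality-of-implementation matter.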