In practice, why would different compilers compute different values of int x = ++i + ++i;?

The compiler takes your code, splits it into very simple instructions, and then recombines and arranges them in whatever way it thinks is optimal.

The code

int i = 1;
int x = ++i + ++i;

consists of the following instructions:

1. store 1 in i
2. read i as tmp1
3. add 1 to tmp1
4. store tmp1 in i
5. read i as tmp2
6. read i as tmp3
7. add 1 to tmp3
8. store tmp3 in i
9. read i as tmp4
10. add tmp2 and tmp4, as tmp5
11. store tmp5 in x

Even though I wrote this as a numbered list, there are only a few ordering dependencies here: 1->2->3->4->5->10->11 and 1->6->7->8->9->10->11 must each stay in relative order. Other than that, the compiler can freely reorder, and perhaps eliminate redundancy.
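For reference, here is a rough C++ rendering of those eleven steps, with the temporaries written out explicitly (a real compiler works on an intermediate representation, not on source like this, so treat it purely as an illustration):

int i;
i = 1;                   // 1. store 1 in i
int tmp1 = i;            // 2. read i as tmp1
tmp1 = tmp1 + 1;         // 3. add 1 to tmp1
i = tmp1;                // 4. store tmp1 in i
int tmp2 = i;            // 5. read i as tmp2
int tmp3 = i;            // 6. read i as tmp3
tmp3 = tmp3 + 1;         // 7. add 1 to tmp3
i = tmp3;                // 8. store tmp3 in i
int tmp4 = i;            // 9. read i as tmp4
int tmp5 = tmp2 + tmp4;  // 10. add tmp2 and tmp4, as tmp5
int x = tmp5;            // 11. store tmp5 in x

Executed top to bottom exactly as written, this particular interleaving would happen to give x == 5; the point of what follows is that the compiler is not obliged to keep that interleaving.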

For example, you could order the list like this:

1. store 1 in i
2. read i as tmp1
6. read i as tmp3
3. add 1 to tmp1
7. add 1 to tmp3
4. store tmp1 in i
8. store tmp3 in i
5. read i as tmp2
9. read i as tmp4
10. add tmp2 and tmp4, as tmp5
11. store tmp5 in x

Why can the compiler do this? Because the side effects of the two increments are unsequenced relative to each other, so nothing pins down how they interleave. But now the compiler can simplify: for example, step 4 is a dead store, because the value it writes to i is immediately overwritten by step 8 before anything reads it. Also, tmp2 and tmp4 are really the same thing: both read i right after step 8.

1. store 1 in i
2. read i as tmp1
6. read i as tmp3
3. add 1 to tmp1
7. add 1 to tmp3
8. store tmp3 in i
5. read i as tmp2
10. add tmp2 and tmp2, as tmp5
11. store tmp5 in x

And now everything to do with tmp1 is dead code: it's never used. The re-read of i in step 5 can be eliminated too, since it would just hand back tmp3:

1. store 1 in i
6. read i as tmp3
7. add 1 to tmp3
8. store tmp3 in i
10. add tmp3 and tmp3, as tmp5
11. store tmp5 in x

Look, this code is much shorter. The optimizer is happy. The programmer is not, because i was only incremented once. Oops.
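Translated back into hypothetical straight-line C++, that first path has collapsed to something like this:

int i = 1;
int tmp3 = i + 1;     // the one surviving increment
i = tmp3;             // i == 2
int x = tmp3 + tmp3;  // x == 4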

Let's look at something else the compiler can do instead: let's go back to the original version.

1. store 1 in i
2. read i as tmp1
3. add 1 to tmp1
4. store tmp1 in i
5. read i as tmp2
6. read i as tmp3
7. add 1 to tmp3
8. store tmp3 in i
9. read i as tmp4
10. add tmp2 and tmp4, as tmp5
11. store tmp5 in x

The compiler could reorder it like this:

1. store 1 in i
2. read i as tmp1
3. add 1 to tmp1
4. store tmp1 in i
6. read i as tmp3
7. add 1 to tmp3
8. store tmp3 in i
5. read i as tmp2
9. read i as tmp4
10. add tmp2 and tmp4, as tmp5
11. store tmp5 in x

and then notice again that i is read twice, so eliminate one of them:

1. store 1 in i
2. read i as tmp1
3. add 1 to tmp1
4. store tmp1 in i
6. read i as tmp3
7. add 1 to tmp3
8. store tmp3 in i
5. read i as tmp2
10. add tmp2 and tmp2, as tmp5
11. store tmp5 in x

That's nice, but it can go further: it can reuse tmp1:

1. store 1 in i
2. read i as tmp1
3. add 1 to tmp1
4. store tmp1 in i
6. read i as tmp1
7. add 1 to tmp1
8. store tmp1 in i
5. read i as tmp2
10. add tmp2 and tmp2, as tmp5
11. store tmp5 in x

Then it can eliminate the re-read of i in 6:

1. store 1 in i
2. read i as tmp1
3. add 1 to tmp1
4. store tmp1 in i
7. add 1 to tmp1
8. store tmp1 in i
5. read i as tmp2
10. add tmp2 and tmp2, as tmp5
11. store tmp5 in x

Now 4 is a dead store:

1. store 1 in i
2. read i as tmp1
3. add 1 to tmp1
7. add 1 to tmp1
8. store tmp1 in i
5. read i as tmp2
10. add tmp2 and tmp2, as tmp5
11. store tmp5 in x

and now 3 and 7 can be merged into one instruction:

1. store 1 in i
2. read i as tmp1
3+7. add 2 to tmp1
8. store tmp1 in i
5. read i as tmp2
10. add tmp2 and tmp2, as tmp5
11. store tmp5 in x

Finally, the read of i into tmp2 in step 5 just gives back tmp1 again, so the last temporary can be eliminated:

1. store 1 in i
2. read i as tmp1
3+7. add 2 to tmp1
8. store tmp1 in i
10. add tmp1 and tmp1, as tmp5
11. store tmp5 in x

And now you get the result that Visual C++ is giving you.
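Translated back into hypothetical C++ in the same spirit, this second path boils down to roughly:

int i = 1;
int tmp1 = i + 2;     // the two increments folded into one add
i = tmp1;             // i == 3
int x = tmp1 + tmp1;  // x == 6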

Note that in both optimization paths, the important ordering dependencies were preserved, except where instructions were eliminated outright because their results were never used or their stores were immediately overwritten.


While this is UB (as the OP implied), the following are hypothetical ways a compiler could arrive at the three results. All three would give the same, well-defined x if they were applied to two different variables, int i = 1, j = 1;, instead of one and the same i.
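For comparison, a minimal sketch of that two-variable case, where there is no unsequenced modification of a single object and the result is the same everywhere:

int i = 1, j = 1;
int x = ++i + ++j;   // well defined: i == 2, j == 2, x == 4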

  1. both ++i return 2, resulting in x=4.
int i = 1;
int i1 = i, i2 = i;   // i1 = i2 = 1
++i1;                 // i1 = 2
++i2;                 // i2 = 2
int x = i1 + i2;      // x = 4
  2. one ++i returns 2 and the other returns 3, resulting in x=5.
int i = 1;
int i1 = ++i;           // i1 = 2
int i2 = ++i;           // i2 = 3
int x = i1 + i2;        // x = 5
  3. both ++i return 3, resulting in x=6.
int i = 1;
int &i1 = i, &i2 = i;
++i1;                   // i = 2
++i2;                   // i = 3
int x = i1 + i2;        // x = 6

To me, the second seems most likely.

I am going for option #4: Both ++i happen concurrently.

Newer processors lean toward some interesting optimizations, and evaluating code in parallel, where it is allowed (as it is here), is another way compilers keep producing faster code. As a practical matter, I can see compilers moving toward that kind of parallelism.

I could readily see a race condition causing non-deterministic behavior, or a bus fault due to contention on the same memory - all allowed, since the coder violated the C++ contract - hence UB.
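Just to make the "concurrent" picture concrete, here is a deliberately broken sketch of my own using two threads; it is not anything a compiler actually generates, and the unsynchronized access to i is itself a data race and therefore UB, just like the original expression:

#include <iostream>
#include <thread>

int main() {
    int i = 1;
    int a = 0, b = 0;

    // Two unsynchronized increments of the same int: a data race,
    // analogous to the two unsequenced ++i in ++i + ++i.
    std::thread t1([&] { a = ++i; });
    std::thread t2([&] { b = ++i; });
    t1.join();
    t2.join();

    int x = a + b;  // could come out as 4, 5, or 6, or as anything else: it's UB
    std::cout << x << '\n';
}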

My question is: what (reasonable) things could a C++ compiler do that would lead to a result of 4 or a result of 6?

It could, but do not count on it.

Don't use ++i + ++i, and don't expect sensible results from it.
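If both increments are what you actually want, sequence them yourself with separate statements; this is essentially scenario 2 from the earlier answer and is fully defined:

int i = 1;
int a = ++i;    // i == 2
int b = ++i;    // i == 3
int x = a + b;  // x == 5 on every conforming compiler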