Virtual tables and memory layout in multiple virtual inheritance

Virtual bases are very different from ordinary bases. Remember that "virtual" means "determined at runtime" -- thus the entire base subobject must be determined at runtime.

Imagine that you are getting a B & x reference, and you are tasked to find the A::a member. If the inheritance were real, then B has a superclass A, and thus the B-object which you are viewing through x has an A-subobject in which you can locate your member A::a. If the most-derived object of x has multiple bases of type A, then you can only see that particular copy which is the subobject of B.

But if the inheritance is virtual, none of this makes sense. We don't know which A-subobject we need -- this information simply doesn't exist at compile time. We could be dealing with an actual B-object as in B y; B & x = y;, or with a C-object like C z; B & x = z;, or something entirely different that derives virtually from A many more times. The only way to know is to find the actual base A at runtime.

This can be implemented with one more level of runtime indirection. (Note how this is entirely parallel to how virtual functions are implemented with one extra level of runtime indirection compared to non-virtual functions.) Instead of having a pointer to a vtable or base subobject, one solution is to store a pointer to a pointer to the actual base subobject. This is sometimes called a "thunk" or "trampoline".

So the actual object C z; may look as follows. The actual ordering in memory is up to the compiler and unimportant, and I've suppressed vtables.

+-+------++-+------++-----++-----+
|T|  B1  ||T|  B2  ||  C  ||  A  |
+-+------++-+------++-----++-----+
 |         |                 |
 V         V                 ^
 |         |       +-Thunk-+ |
 +--->>----+-->>---|     ->>-+
                   +-------+

Thus, no matter whether you have a B1& or a B2&, you first look up the thunk, and that one in turn tells you where to find the actual base subobject. This also explains why you cannot perform a static cast from an A& to any of the derived types: this information simply doesn't exist at compile time.

For a more in-depth explanation, take a look at this fine article. (In that description, the thunk is part of the vtable of C, and virtual inheritance always necessitates the maintenance of vtables, even if there are no virtual functions anywhere.)


I have pimped your code a bit as follows:

#include <stdio.h>
#include <stdint.h>

struct A {
   int a; 
   A() : a(32) { f(0); }
   A(int i) : a(32) { f(i); }
   virtual void f(int i) { printf("%d\n", i); }
};

struct B1 : virtual A {
   int b1;
   B1(int i) : A(i), b1(33) { f(i); }
   virtual void f(int i) { printf("%d\n", i+10); }
};

struct B2 : virtual A {
   int b2;
   B2(int i) : A(i), b2(34) { f(i); }
   virtual void f(int i) { printf("%d\n", i+20); }
};

struct C : B1, virtual B2 {
   int c;
   C() : B1(6),B2(3),A(1), c(35) {}
   virtual void f(int i) { printf("%d\n", i+30); }
};

int main() {
    C foo;
    intptr_t address = (intptr_t)&foo;
    printf("offset A = %ld, sizeof A = %ld\n", (intptr_t)(A*)&foo - address, sizeof(A));
    printf("offset B1 = %ld, sizeof B1 = %ld\n", (intptr_t)(B1*)&foo - address, sizeof(B1));
    printf("offset B2 = %ld, sizeof B2 = %ld\n", (intptr_t)(B2*)&foo - address, sizeof(B2));
    printf("offset C = %ld, sizeof C = %ld\n", (intptr_t)(C*)&foo - address, sizeof(C));
    unsigned char* data = (unsigned char*)address;
    for(int offset = 0; offset < sizeof(C); offset++) {
        if(!(offset & 7)) printf("| ");
        printf("%02x ", (int)data[offset]);
    }
    printf("\n");
}

As you see, this prints quite a bit of additional information that allows us to deduce the memory layout. The output on my machine (a 64-bit linux, little endian byte order) is this:

1
23
16
offset A = 16, sizeof A = 16
offset B1 = 0, sizeof B1 = 32
offset B2 = 32, sizeof B2 = 32
offset C = 0, sizeof C = 48
| 00 0d 40 00 00 00 00 00 | 21 00 00 00 23 00 00 00 | 20 0d 40 00 00 00 00 00 | 20 00 00 00 00 00 00 00 | 48 0d 40 00 00 00 00 00 | 22 00 00 00 00 00 00 00 

So, we can describe the layout as follows:

+--------+----+----+--------+----+----+--------+----+----+
|  vptr  | b1 | c  |  vptr  | a  | xx |  vptr  | b2 | xx |
+--------+----+----+--------+----+----+--------+----+----+

Here, xx denotes padding. Note how the compiler has placed the variable c into the padding of its non-virtual base. Note also, that all three v-pointers are different, this allows the program to deduce the correct positions of all the virtual bases.