Do I need to make a type a POD to persist it with a memory-mapped file?

To avoid confusion let me restate the problem.

You want to create an object in mapped memory in such a way that, after the application is closed and reopened, the file can be mapped again and the object used without further deserialization.
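
For concreteness, here is a minimal sketch of that goal, assuming a POSIX system (open, ftruncate, mmap). The Counter struct and the bump_persistent_counter function are hypothetical names made up for illustration; nothing in the question prescribes them.

```cpp
#include <fcntl.h>     // open
#include <sys/mman.h>  // mmap, munmap
#include <unistd.h>    // ftruncate, close, unlink

// Hypothetical persisted object: plain data, no pointers.
struct Counter { int runs; };

// Maps the file at `path`, increments the counter stored at its start,
// and returns the new value (or -1 on failure). A newly created file is
// zero-filled by ftruncate, so the first call yields 1.
int bump_persistent_counter(const char* path) {
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0) return -1;
    if (ftruncate(fd, sizeof(Counter)) != 0) { close(fd); return -1; }

    void* mem = mmap(nullptr, sizeof(Counter), PROT_READ | PROT_WRITE,
                     MAP_SHARED, fd, 0);
    close(fd);  // the mapping remains valid after the descriptor is closed
    if (mem == MAP_FAILED) return -1;

    Counter* c = static_cast<Counter*>(mem);  // used in place, no deserialization
    int result = ++c->runs;
    munmap(mem, sizeof(Counter));
    return result;
}
```

Calling the function twice on the same path returns 1 and then 2, because the second mapping sees the bytes written through the first.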

POD is something of a red herring for what you are trying to do. You don't need the type to be bitwise copyable (which is roughly what POD guarantees); you need it to be address-independent.

Address-independence requires you to:

  • avoid all absolute pointers.
  • only use offset pointers to addresses within the mapped memory.
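
A rough sketch of what an offset pointer can look like, assuming a self-relative design (the stored value is the distance from the pointer object itself to its target). OffsetPtr, Node and second_value_after_relocation are hypothetical names for illustration; this is not the Boost implementation.

```cpp
#include <cstddef>
#include <cstring>
#include <new>

// Hypothetical self-relative pointer. An offset of 0 is used as "null",
// so it cannot point at itself.
template <typename T>
struct OffsetPtr {
    std::ptrdiff_t off = 0;

    void set(T* p) {
        off = p ? reinterpret_cast<char*>(p) - reinterpret_cast<char*>(this) : 0;
    }
    T* get() {
        return off ? reinterpret_cast<T*>(reinterpret_cast<char*>(this) + off)
                   : nullptr;
    }
};

struct Node {
    int value;
    OffsetPtr<Node> next;
};

// Builds a two-node list in buffer a, copies the raw bytes to buffer b
// (standing in for "unmap, then map again at a different address"),
// and follows the link inside b.
int second_value_after_relocation() {
    alignas(Node) unsigned char a[2 * sizeof(Node)];
    alignas(Node) unsigned char b[2 * sizeof(Node)];

    Node* first  = new (a) Node{1, {}};
    Node* second = new (a + sizeof(Node)) Node{2, {}};
    first->next.set(second);

    std::memcpy(b, a, sizeof a);
    second->value = 99;  // a stale absolute pointer into a would now read 99

    Node* moved = reinterpret_cast<Node*>(b);
    return moved->next.get()->value;  // reads the copy inside b
}
```

The link survives the move because both nodes moved together and only their relative distance is stored.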

There are a few corollaries that follow from these rules.

  • You can't use virtual anything. C++ virtual functions are implemented with a hidden vtable pointer in the class instance. The vtable pointer is an absolute pointer over which you don't have any control.
  • You need to be very careful about the other C++ objects your address-independent objects use. Almost anything in the standard library may break if you use it this way. Even types that don't allocate with new may use virtual functions internally, or simply store an absolute address.
  • You can't store references in the address-independent objects. Reference members are just syntactic sugar over absolute pointers.
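
Some of these rules can be checked at compile time. The following is a sketch using standard type traits; passes_basic_checks and the three example structs are hypothetical. The check is necessary but not sufficient: it rejects vtable pointers and most standard containers, but a plain raw pointer member still gets through.

```cpp
#include <string>
#include <type_traits>

struct Plain       { int id; double weight; };
struct WithVirtual { virtual ~WithVirtual() {} int id; };
struct WithRawPtr  { int* p; };

// Hypothetical guard: rejects types that carry a vtable pointer or that
// manage resources behind the scenes.
template <typename T>
constexpr bool passes_basic_checks() {
    return std::is_trivially_copyable<T>::value
        && std::is_standard_layout<T>::value
        && !std::is_polymorphic<T>::value;
}

static_assert(passes_basic_checks<Plain>(), "plain data passes");
static_assert(!passes_basic_checks<WithVirtual>(), "vtable pointer is rejected");
static_assert(!passes_basic_checks<std::string>(), "std::string is rejected");
// Raw pointers are trivially copyable, so this one slips through and
// still has to be caught by review.
static_assert(passes_basic_checks<WithRawPtr>(), "raw pointer is not detected");
```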

Inheritance is still possible but of limited usefulness since virtual is outlawed.

Any and all constructors / destructors are fine as long as the above rules are followed.

Even Boost.Interprocess isn't a perfect fit for what you're trying to do. Boost.Interprocess also needs to manage shared access to the objects, whereas you can assume that you're the only one touching the memory.

In the end it may be simpler / saner to just use Google Protocol Buffers and conventional serialization.


Not as much an answer as a comment that grew too big:

I think it's going to depend on how much safety you're willing to trade for speed/ease of usage. In the case where you have a struct like this:

struct S { char c; double d; };

You have to consider padding and the fact that some architectures might not allow you to access a double unless it is aligned on a proper memory address. Adding accessor functions and fixing the padding tackles this and the structure is still memcpy-able, but now we're entering territory where we're not really gaining much of a benefit from using a memory mapped file.
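
A sketch of both fixes, assuming an 8-byte double: SFixed spells out the padding so the layout no longer depends on the compiler's choice, and load_double / store_double are accessors that go through memcpy and therefore work at any byte offset. All of these names are made up for illustration.

```cpp
#include <cstddef>
#include <cstring>

struct S { char c; double d; };  // padding after c is chosen by the compiler

// Same fields with the padding written out explicitly.
struct SFixed { char c; char pad[7]; double d; };
static_assert(sizeof(double) == 8, "this sketch assumes an 8-byte double");
static_assert(offsetof(SFixed, d) == 8, "d sits at a fixed offset");
static_assert(sizeof(SFixed) == 16, "no trailing surprises");

// Accessors that copy bytes instead of dereferencing a double*, so they
// do not require the address to be aligned.
inline double load_double(const unsigned char* p) {
    double v;
    std::memcpy(&v, p, sizeof v);
    return v;
}
inline void store_double(unsigned char* p, double v) {
    std::memcpy(p, &v, sizeof v);
}

// Writes and reads back a double at offset 1, which is deliberately misaligned.
double roundtrip_unaligned(double v) {
    unsigned char buf[1 + sizeof(double)] = {};
    store_double(buf + 1, v);
    return load_double(buf + 1);
}
```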

Since it seems like you'll only be using this locally and in a fixed setup, relaxing the requirements a little seems OK, so we're back to using the above struct normally. Now does the struct have to be trivially copyable? I don't necessarily think so; consider this (probably broken) class:

#include <cstddef>   // std::size_t
#include <utility>   // std::swap

enum Endian { LittleEndian, BigEndian };

// Holds a T whose bytes are stored in byte order e.
template<typename T, Endian e> struct PV {
        union {
                unsigned char b[sizeof(T)];
                T x;
        } val;

        // Assign from a PV of either byte order, reversing the bytes
        // when the two orders differ.
        template<Endian oe> PV& operator=(const PV<T,oe>& rhs) {
                val.x = rhs.val.x;
                if (e != oe) {
                        for (std::size_t i = 0; i < sizeof(T) / 2; i++) {
                                std::swap(val.b[sizeof(T)-1-i], val.b[i]);
                        }
                }
                return *this;
        }
};

It's not trivially copyable and you can't just use memcpy to move it around in general, but I don't see anything immediately wrong with using a class like this in the context of a memory mapped file (especially not if the file matches the native byte order).
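
The byte reversal itself can also be sketched without a union, by copying the value into a byte array with memcpy. byte_swapped is a hypothetical helper, not part of the class above, and it assumes T is trivially copyable.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstring>

// Hypothetical helper: returns v with the bytes of its object
// representation reversed. Assumes T is trivially copyable.
template <typename T>
T byte_swapped(T v) {
    unsigned char bytes[sizeof(T)];
    std::memcpy(bytes, &v, sizeof(T));       // value -> bytes
    std::reverse(bytes, bytes + sizeof(T));  // flip the byte order
    std::memcpy(&v, bytes, sizeof(T));       // bytes -> value
    return v;
}
```

Swapping twice returns the original value, so the same helper converts in both directions.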

Update:
Where do you draw the line?

I think a decent rule of thumb is: if the equivalent C code is acceptable, and C++ is just being used as a convenience or to enforce type safety or proper access, it should be fine.

That would make boost::interprocess::offset_ptr OK since it's just a helpful wrapper around a ptrdiff_t with special semantic rules. In the same vein struct PV above would be OK as it's just meant to byte swap automatically, though like in C you have to be careful to keep track of the byte order and assume that the structure can be trivially copied. Virtual functions wouldn't be OK as the C equivalent, function pointers in the structure, wouldn't work. However something like the following (untested) code would again be OK:

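struct Foo;
struct FooOps { void (*vfunc1)(Foo*, int); };  // hypothetical per-type function table
extern const FooOps vtables[];                 // lives in the program, not in the mapped file
struct Foo {
    unsigned char obj_type;                    // type tag; the only dispatch data that is persisted
    void vfunc1(int arg0) { vtables[obj_type].vfunc1(this, arg0); }
};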
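
A fuller, self-contained sketch of the same tag-based dispatch follows. The payload member, the two implementations, and the int return type are assumptions added so the result can be observed; only obj_type and the vtables lookup come from the snippet above.

```cpp
#include <cassert>
#include <cstddef>

struct Foo;

// Hypothetical table of plain functions, one entry per object type.
// It lives in the executable, so the function pointers are rebuilt on
// every run and never stored in the mapped file.
struct FooOps {
    int (*vfunc1)(const Foo*, int);
};

struct Foo {
    unsigned char obj_type;  // index into vtables; this tag is what gets persisted
    int payload;             // assumed data member for the demonstration
    int vfunc1(int arg0) const;
};

static int add_impl(const Foo* f, int a) { return f->payload + a; }
static int mul_impl(const Foo* f, int a) { return f->payload * a; }

static const FooOps vtables[] = { {add_impl}, {mul_impl} };

inline int Foo::vfunc1(int arg0) const {
    // A tag read from a file should be validated before it is used as an index.
    assert(obj_type < sizeof vtables / sizeof vtables[0]);
    return vtables[obj_type].vfunc1(this, arg0);
}
```

With this layout a Foo whose tag is 0 adds its payload to the argument and one whose tag is 1 multiplies, and the stored bytes contain no addresses at all.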

Even POD has pitfalls if you are interested in interoperating across different systems or across time.

You might look at Google Protocol Buffers for a way to do this in a portable fashion.


Yes, but for reasons other than the ones that seem to concern you.

You've got virtual functions and a virtual base class. These lead to a host of pointers created behind your back by the compiler. You can't turn them into offsets or anything else.

If you want to do this style of persistence, you need to eschew 'virtual'. After that, it's all a matter of the semantics. Really, just pretend you were doing this in C.