switching to another different custom allocator -> propagate to member fields

Justification

At its core, this question is asking for a way to use a custom allocator with a multi-level container. There are other stipulations, but after thinking about this, I've decided to ignore some of those stipulations. They seem to be getting in the way of solutions without a good reason. That leaves open the possibility of an answer from the standard library: std::scoped_allocator_adaptor and std::vector.

Perhaps the biggest change with this approach is tossing the idea that a container's allocator needs to be modifiable after construction (toss the setAllocator member). That idea seems questionable in general and incorrect in this specific case. Look at the criteria for deciding which allocator to use:

  • One-frame allocation requires the object be destroyed by the end of the loop over timeStep.
  • Heap allocation should be used when one-frame allocation cannot.

That is, you can tell which allocation strategy to use by looking at the scope of the object/variable in question. (Is it inside or outside the loop body?) Scope is known at construction time and does not change (as long as you don't abuse std::move). So the desired allocator is known at construction time and does not change. However, the current constructors do not permit specifying an allocator. That is something to change. Fortunately, such a change is a fairly natural extension of introducing scoped_allocator_adaptor.

The other big change is tossing the MyArray class. Standard containers exist to make your programming easier. Compared to writing your own version, the standard containers are faster to implement (as in, already done) and less prone to error (the standard strives for a higher bar of quality than "works for me this time"). So out with the MyArray template and in with std::vector.

How to do it

The code snippets in this section can be joined into a single source file that compiles. Just skip over my commentary between them. (This is why only the first snippet includes headers.)

Your current Allocator class is a reasonable starting point. It just needs a pair of methods that indicate when two instances are interchangeable (i.e. when both are able to deallocate memory that was allocated by either of them). I also took the liberty of changing amountByte to an unsigned type, since allocating a negative amount of memory does not make sense. (I left the type of align alone though, since there is no indication of what values this would take. Possibly it should be unsigned or an enumeration.)

#include <cstdlib>
#include <functional>
#include <scoped_allocator>
#include <vector>

class Allocator {
public:
    virtual void * allocate(std::size_t amountByte, int align)=0;
    virtual void deallocate(void * v)=0;
    //some complex field and algorithm

    // **** Addition ****
    // Two objects are considered equal when they are interchangeable at deallocation time.
    // There might be a more refined way to define this relation, but without the internals
    // of Allocator, I'll go with simply being the same object.
    bool operator== (const Allocator & other) const  { return this == &other; }
    bool operator!= (const Allocator & other) const  { return this != &other; }
};

Next up are the two specializations. Their details are outside the scope of the question, though. So I'll just mock up something that will compile (needed since one cannot directly instantiate an abstract base class).

// Mock-up to allow defining the two allocators.
class DerivedAllocator : public Allocator {
public:
    void * allocate(std::size_t amountByte, int)  override { return std::malloc(amountByte); }
    void   deallocate(void * v)                   override { std::free(v); }
};
DerivedAllocator oneFrameAllocator;
DerivedAllocator heapAllocator;

Now we get into the first meaty chunk – adapting Allocator to the standard's expectations. This consists of a wrapper template whose parameter is the type of object being constructed. If you can parse the Allocator requirements, this step is simple. Admitedly, parsing the requirements is not simple since they are designed to cover "fancy pointers".

// Standard interface for the allocator
template <class T>
struct AllocatorOf {

    // Some basic definitions:

    //Allocator & alloc; // A plain reference is an option if you don't support swapping.
    std::reference_wrapper<Allocator> alloc; // Or a pointer if you want to add null checks.
    AllocatorOf(Allocator & a) : alloc(a) {} // Note: Implicit conversion allowed

    // Maybe this value would come from a helper template? Tough to say, but as long as
    // the value depends solely on T, the value can be a static class constant.
    static constexpr int ALIGN = 0;

    // The things required by the Allocator requirements:

    using value_type = T;
    // Rebind from other types:
    template <class U>
    AllocatorOf(const AllocatorOf<U> & other) : alloc(other.alloc) {}
    // Pass through to Allocator:
    T *  allocate  (std::size_t n)        { return static_cast<T *>(alloc.get().allocate(n * sizeof(T), ALIGN)); }
    void deallocate(T * ptr, std::size_t) { alloc.get().deallocate(ptr); }
    // Support swapping (helps ease writing a constructor)
    using propagate_on_container_swap = std::true_type;
};
// Also need the interchangeability test at this level.
template<class T, class U>
bool operator== (const AllocatorOf<T> & a_t, const AllocatorOf<U> & a_u)
{ return a_t.get().alloc == a_u.get().alloc; }
template<class T, class U>
bool operator!= (const AllocatorOf<T> & a_t, const AllocatorOf<U> & a_u)
{ return a_t.get().alloc != a_u.get().alloc; }

Next up are the manifold classes. The lowest level (M1) does not need any changes.

The mid-levels (M2) need two additions to get the desired results.

  1. The member type allocator_type needs to be defined. Its existence indicates that the class is allocator-aware.
  2. There needs to be a constructor that takes, as parameters, an object to copy and an allocator to use. This makes the class actually allocator-aware. (Potentially other constructors with an allocator parameter would be required, depending on what you actually do with these classes. The scoped_allocator works by automatically appending the allocator to the provided construction parameters. Since the sample code makes copies inside the vectors, a "copy-plus-allocator" constructor is needed.)

In addition, for general use, the mid-levels should get a constructor whose lone parameter is an allocator. For readability, I'll also bring back the MyArray name (but not the template).

The highest level (M3) just needs the constructor taking an allocator. Still, the two type aliases are useful for readability and consistency, so I'll throw them in as well.

class M1{};   //e.g. a single-point collision site

class M2{     //e.g. analysed many-point collision site
public:
    using allocator_type = std::scoped_allocator_adaptor<AllocatorOf<M1>>;
    using MyArray        = std::vector<M1, allocator_type>;

    // Default construction still uses oneFrameAllocator, but this can be overridden.
    explicit M2(const allocator_type & alloc = oneFrameAllocator) : m1s(alloc) {}
    // "Copy" constructor used via scoped_allocator_adaptor
    //M2(const M2 & other, const allocator_type & alloc) : m1s(other.m1s, alloc) {}
    // You may want to instead delegate to the true copy constructor. This means that
    // the m1s array will be copied twice (unless the compiler is able to optimize
    // away the first copy). So this would need to be performance tested.
    M2(const M2 & other, const allocator_type & alloc) : M2(other)
    {
        MyArray realloc{other.m1s, alloc};
        m1s.swap(realloc); // This is where we need swap support.
    }

    MyArray m1s;
};

class M3{     //e.g. analysed collision surface
public:
    using allocator_type = std::scoped_allocator_adaptor<AllocatorOf<M2>>;
    using MyArray        = std::vector<M2, allocator_type>;

    // Default construction still uses oneFrameAllocator, but this can be overridden.
    explicit M3(const allocator_type & alloc = oneFrameAllocator) : m2s(alloc) {}

    MyArray m2s;
};

Let's see... two lines added to Allocator (could be reduced to just one), four-ish to M2, three to M3, eliminate the MyArray template, and add the AllocatorOf template. That's not a huge difference. Well, a little more than that count if you want to leverage the auto-generated copy constructor for M2 (but with the benefit of fully supporting the swapping of vectors). Overall, not that drastic a change.

Here is how the code would be used:

int main()
{
    M3 output_m3{heapAllocator};
    for ( int timeStep = 0; timeStep < 100; timeStep++ ) {
        //v start complex computation #2
        M3 m3;
        M2 m2;
        M1 m1;
        m2.m1s.push_back(m1);  // <-- vector uses push_back() instead of add()
        m3.m2s.push_back(m2);  // <-- vector uses push_back() instead of add()
        //^ end complex computation
        output_m3 = m3; // change to heap allocation
        //.... clean up oneFrameAllocator here ....
    }    
}

The assignment seen here preserves the allocation strategy of output_m3 because AllocatorOf does not say to do otherwise. This seems to be what should be the desired behavior, not the old way of copying the allocation strategy. Note that if both sides of an assignment already use the same allocation strategy, it doesn't matter if the strategy is preserved or copied. Hence, existing behavior should be preserved with no need for further changes.

Aside from specifying that one variable uses heap allocation, use of the classes is no messier than it was before. Since it was assumed that at some point there would be a need to specify heap allocation, I don't see why this would be objectionable. Use the standard library – it's there to help.


Since you're aiming at performance, I imply that your classes would not manage the lifetime of allocator itself, and would simply use it's raw pointer. Also, since you're changing storage, copying is inevitable. In this case, all you need is to add a "parametrized copy constructor" to each class, e.g.:

template <typename T> class MyArray {
    private:
        Allocator& _allocator;

    public:
        MyArray(Allocator& allocator) : _allocator(allocator) { }
        MyArray(MyArray& other, Allocator& allocator) : MyArray(allocator) {
            // copy items from "other", passing new allocator to their parametrized copy constructors
        }
};

class M1 {
    public:
        M1(Allocator& allocator) { }
        M1(const M1& other, Allocator& allocator) { }
};

class M2 {
    public:
        MyArray<M1> m1s;

    public:
        M2(Allocator& allocator) : m1s(allocator) { }
        M2(const M2& other, Allocator& allocator) : m1s(other.m1s, allocator) { }
};

This way you can simply do:

M3 stackM3(stackAllocator);
// do processing
M3 heapM3(stackM3, heapAllocator); // or return M3(stackM3, heapAllocator);

to create other-allocator-based copy.

Also, depeding on your actual code structure, you can add some template magic to automate things:

template <typename T> class MX {
    public:
        MyArray<T> ms;

    public:
        MX(Allocator& allocator) : ms(allocator) { }
        MX(const MX& other, Allocator& allocator) : ms(other.ms, allocator) { }
}

class M2 : public MX<M1> {
    public:
        using MX<M1>::MX; // inherit constructors
};

class M3 : public MX<M2> {
    public:
        using MX<M2>::MX; // inherit constructors
};

I realize this isn't the answer to your question - but if you only need the object for the next cycle ( and not future cycles past that ), can you just keep two one-frame allocators destroying them on alternate cycles?

Since you are writing the allocator yourself this could be handled directly in the allocator where the clean-up function knows if this is an even or odd cycle.

Your code would then look something like:

int main(){
    M3 output_m3; 
    for(int timeStep=0;timeStep<100;timeStep++){
        oneFrameAllocator.set_to_even(timeStep % 2 == 0);
        //v start complex computation #2
        M3 m3;
        M2 m2;
        M1 m1;
        m2.m1s.add(m1);
        m3.m2s.add(m2);
        //^ end complex computation
        output_m3=m3; 
        oneFrameAllocator.cleanup(timestep % 2 == 1); //cleanup odd cycle
    }
}