Getting size of dynamic C-style array vs. use of delete[]. Contradiction?

I think the reason for this is a confluence of three factors.

  1. C++ has a "you only pay for what you use" culture.
  2. C++ started its life as a preprocessor that translated to C and hence had to be built on top of what C offered.
  3. C++ is one of the most widely ported languages around. Features that make life difficult for existing ports are unlikely to be added.

C allows the programmer to free a memory block without specifying its size, but does not provide any standard way to access the size of the allocation. Furthermore, the actual amount of memory allocated may well be larger than the amount the programmer asked for.
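
To make that concrete, here is a minimal sketch of the C interface (using only the standard <cstdlib> functions): free receives nothing but the pointer, so any bookkeeping about the block's size has to live inside the allocator.

#include <cstdlib>

int main()
{
    void* p = std::malloc(100); // request 100 bytes; the allocator may reserve more
    // ... use the block ...
    std::free(p);               // no size argument: the allocator looks it up internally
}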

Following the principle of "you only pay for what you use", C++ implementations implement new[] differently for different types. Typically they store the element count only when it is necessary to do so, usually because the type has a non-trivial destructor.
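
As a rough illustration (the exact rule is an ABI/implementation detail, not something the standard spells out), the element count typically needs to be stored only for types that are not trivially destructible; std::is_trivially_destructible shows which types fall into that category:

#include <iostream>
#include <type_traits>

struct trivial    { char a; };                  // no user-provided destructor
struct nontrivial { char a; ~nontrivial() {} }; // user-provided, non-trivial destructor

int main()
{
    // delete[] must run destructors only for non-trivially destructible
    // element types, so typically only those need a stored element count.
    std::cout << std::is_trivially_destructible<trivial>::value << '\n';    // prints 1
    std::cout << std::is_trivially_destructible<nontrivial>::value << '\n'; // prints 0
}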

So while, yes, enough information is stored to free the memory block, it would be very difficult to define a sane and portable API for accessing that information. Depending on the data type and the platform:

  1. The actual requested size may be available (for types where the C++ implementation has to store it).
  2. Only the actual allocated size may be available (for types where the C++ implementation does not have to store it, on platforms where the underlying memory manager has an extension to report the allocated size).
  3. The size may not be available at all (for types where the C++ implementation does not have to store it, on platforms that provide no access to the information from the underlying memory manager).


TL;DR The operator delete[] destroys the objects and deallocates the memory. The information N ("number of elements") is required for destroying; the information S ("size of the allocated memory") is required for deallocating. S is always stored and can be queried with non-standard extensions. N is stored only if destroying the objects requires calling non-trivial destructors. If N is stored, where it is stored is implementation-dependent.


The operator delete[] has to do two things:

a) destroying the objects (calling destructors, if necessary) and

b) deallocating the memory.

Let's first discuss (de)allocation, which many compilers (like GCC) delegate to the C functions malloc and free. The function malloc takes the number of bytes to be allocated as a parameter and returns a pointer. The function free takes only a pointer; the number of bytes is not needed. This means that the allocation functions have to keep track of how many bytes were allocated. There could be a function to query how many bytes were allocated (on Linux this can be done with malloc_usable_size, on Windows with _msize), but this is not what you want, because it tells you not the size of the array but the amount of memory allocated. Since malloc does not necessarily give you exactly as much memory as you asked for, you cannot compute the array size from the result of malloc_usable_size:

#include <iostream>
#include <malloc.h>

int main()
{
    std::cout << malloc_usable_size(malloc(42)) << std::endl;
}

This example gives you 56, not 42: http://cpp.sh/2wdm4

Note that applying malloc_usable_size (or _msize) to the result of new is undefined behavior.

So, let's now discuss construction and destruction of objects. Here, you have two forms of delete: delete (for single objects) and delete[] (for arrays). In very old versions of C++, you had to pass the size of the array to the delete[] operator. As you mentioned, nowadays this is not the case: the compiler tracks this information. GCC adds a small field just before the beginning of the array in which the number of elements is stored, so that it knows how many times the destructor has to be called. You can query it:

#include <iostream>

struct foo {
    char a;
    ~foo() {}
};

int main()
{
    foo * ptr = new foo[42];
    // read the count stored just before the array (an implementation detail, and undefined behavior)
    std::cout << *(((std::size_t*)ptr)-1) << std::endl;
}

This code gives you 42: http://cpp.sh/7mbqq

Just for the record: this is undefined behavior, but with the current version of GCC it works.

So, you might ask yourself why there is no function to query this information. The answer is that GCC does not always store it. There are cases where destroying the objects is a no-op (and the compiler is able to figure that out), so nothing needs to be tracked. Consider the following example:

#include <iostream>

struct foo {
    char a;
    //~foo() {}
};

int main()
{
    foo * ptr = new foo[42];
    // the same read as above, still undefined behavior
    std::cout << *(((std::size_t*)ptr)-1) << std::endl;
}

Here, the answer is not 42 any more: http://cpp.sh/2rzfb

The answer is just garbage; the code was undefined behavior again.

Why? Because the compiler does not need to call a destructor, so it does not need to store the information. And, yes, in this case the compiler does not add code that keeps track of how many objects have been created. Only the number of allocated bytes (which might be 56, see above) is known.
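
One way to observe this from the outside (still relying on implementation details, so the numbers below are only what GCC typically does, not a guarantee) is to replace the global operator new[] and print the size the compiler requests: for the type with a destructor it asks for extra space for the count, for the trivially destructible type it does not.

#include <cstdio>
#include <cstdlib>
#include <new>

void* operator new[](std::size_t n)
{
    std::printf("operator new[] asked for %zu bytes\n", n);
    if (void* p = std::malloc(n))
        return p;
    throw std::bad_alloc();
}

void operator delete[](void* p) noexcept
{
    std::free(p);
}

struct trivial    { char a; };
struct nontrivial { char a; ~nontrivial() {} };

int main()
{
    delete[] new trivial[42];    // typically asks for exactly 42 bytes
    delete[] new nontrivial[42]; // typically asks for 42 + sizeof(std::size_t) bytes (the cookie)
}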


It does: the allocator, or some implementation detail behind it, knows exactly what the size of the block is.

But that information is not provided to you or to the "code layer" of your program.

Could the language have been designed to do this? Sure! It's probably a case of "don't pay for what you don't use": it's your responsibility to remember this information. After all, you know how much memory you asked for! Often, people won't want the cost of a number being passed up the call stack when, most of the time, they won't need it.

There are some platform-specific "extensions" that may get you what you want, like malloc_usable_size on Linux and _msize on Windows, though these assume that your allocator used malloc and didn't do any other magic that may extend the size of the allocated block at the lowest level. I'd say you're still better off tracking this yourself if you really need it… or using a vector.
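
For completeness, here is what "tracking this yourself" can look like; sized_array below is just a hypothetical helper for illustration, and std::vector already does the same job for you:

#include <cstddef>
#include <iostream>
#include <memory>
#include <vector>

// Hypothetical helper: owns the array and remembers how many elements it holds.
template <typename T>
struct sized_array {
    std::unique_ptr<T[]> data;
    std::size_t size;
};

template <typename T>
sized_array<T> make_sized_array(std::size_t n)
{
    return { std::unique_ptr<T[]>(new T[n]), n };
}

int main()
{
    sized_array<int> a = make_sized_array<int>(42);
    std::cout << a.size << '\n';   // 42, tracked by us

    std::vector<int> v(42);        // or simply let vector do the tracking
    std::cout << v.size() << '\n'; // 42
}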

Tags: C++, Arrays, Heap