Is it possible to partially free dynamically-allocated memory on a POSIX system?

If your entire buffer has to be in memory at once, then you probably will not gain much from freeing it partially later.

The main point on this post is basically to NOT tell you to do what you want to do, because the OS will not unnecessarily keep your application's memory in RAM if it's not actually needed. This is the difference between "resident memory usage" and "virtual memory usage". "Resident" is what is currently used and in RAM, "virtual" is the total memory usage of your application. And as long as your swap partition is large enough, "virtual" memory is pretty much a non-issue. [I'm assuming here that your system will not run out of virtual memory space, which is true in a 64-bit application, as long as you are not using hundreds of terabytes of virtual space!]

If you still want to do that, and want to have some reasonable portability, I would suggest building a "wrapper" that behaves kind of like std::vector and allocates lumps of some megabytes (or perhaps a couple of gigabytes) of memory at a time, and then something like:

 for (size_t i = 0; i < buf.size(); ++i) {
    do_algorithm(buf[i]);
    buf.done(i);
 }

The done method will simply check if the value if i is (one element) past the end of the current buffer, and free it. [This should inline nicely, and produce very little overhead on the average loop - assuming elements are actually used in linear order, of course].

I'd be very surprised if this gains you anything, unless do_algorithm(buf[i]) takes quite some time (certainly many seconds, probably many minutes or even hours). And of course, it's only going to help if you actually have something else useful to do with that memory. And even then, the OS will reclaim memory that isn't actively used by swapping it out to disk, if the system is short of memory.

In other words, if you allocate 100GB, fill it, leave it sitting without touching, it will eventually ALL be on the hard-disk rather than in RAM.

Further, it is not at all unusual that the heap in the application retains freed memory, and that the OS does not get the memory back until the application exits - and certainly, if only parts of a larger allocation is freed, the runtime will not release it until the whole block has been freed. So, as stated at the beginning, I'm not sure how much this will actually help your application.

As with everything regarding "tuning" and "performance improvements", you need to measure and compare a benchmark, and see how much it helps.


Is it possible to partially free dynamically-allocated memory on a POSIX system?

You can not do it using malloc()/realloc()/free().

However, you can do it in a semi-portable way using mmap() and munmap(). The key point is that if you munmap() some page, malloc() can later use that page:

  • create an anonymous mapping using mmap();
  • subsequently call munmap() for regions that you don't need anymore.

The portability issues are:

  • POSIX doesn't specify anonymous mappings. Some systems provide MAP_ANONYMOUS or MAP_ANON flag. Other systems provide special device file that can be mapped for this purpose. Linux provides both.
  • I don't think that POSIX guarantees that when you munmap() a page, malloc() will be able to use it. But I think it'll work an all systems that have mmap()/unmap().

Update

If your memory region is so large that most pages surely will be written to swap, you will not loose anything by using file mappings instead of anonymous mappings. File mappings are specified in POSIX.


If you can do without the convenience of std::vector (which won't give you much in this case anyway because you'll never want to copy / return / move that beast anyway), you can do your own memory handling. Ask the operating system for entire pages of memory (via mmap) and return them as appropriate (using munmap). You can tell mmap via its fist argument and the optional MAP_FIXED flag to map the page at a particular address (which you must ensure to be not otherwise occupied, of course) so you can build up an area of contiguous memory. If you allocate the entire memory upfront, then this is not an issue and you can do it with a single mmap and let the operating system choose a convenient place to map it. In the end, this is what malloc does internally. For platforms that don't have sys/mman.h, it's not difficult to fall back to using malloc if you can live with the fact that on those platforms, you won't return memory early.

I'm suspecting that if your allocation sizes are always multiples of the page size, realloc will be smart enough not to copy any data. You'd have to try this out and see if it works (or consult your malloc's documentation) on your particular target platform, though.