How to implement Reference counting in C?

Semi-serious answer

#include "Python.h"

Python has a great reference-counting memory manager. If I had to do this for real in production code, not homework, I'd consider embedding the Python object system in my C program, which would then make my C program scriptable in Python too. See the Python/C API documentation if you are interested!


Such a system in C requires some discipline on the part of the programmer but ...

You need to think in terms of ownership. Everything that holds references is an owner and must keep track of the objects to which it holds references, e.g. through a list. When a reference-holding thing is destroyed, it must loop over its list of referenced objects and decrement their reference counters, destroying in turn any object whose counter reaches zero.
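A minimal sketch of that idea, assuming a fixed-size reference list per object. The names `Object`, `object_new`, `object_add_ref`, `object_destroy` and the `live_objects` counter are made up for illustration and not taken from any library:

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_REFS 8                  /* fixed-size list keeps the sketch short */

typedef struct Object {
    int refcount;                   /* number of owners referring to this object */
    struct Object *refs[MAX_REFS];  /* objects this object holds references to */
    size_t nrefs;
} Object;

int live_objects = 0;               /* bookkeeping so the effect can be observed */

/* Allocate an object that nobody owns yet (refcount 0). */
Object *object_new(void) {
    Object *o = calloc(1, sizeof *o);
    if (o) live_objects++;
    return o;
}

/* owner takes a reference to target; returns 0 on success, -1 if the list is full. */
int object_add_ref(Object *owner, Object *target) {
    if (owner->nrefs == MAX_REFS) return -1;
    owner->refs[owner->nrefs++] = target;
    target->refcount++;
    return 0;
}

/* Destroy an owner: walk its list, drop each reference,
 * and destroy in turn whatever reaches zero. */
void object_destroy(Object *o) {
    for (size_t i = 0; i < o->nrefs; i++) {
        Object *t = o->refs[i];
        assert(t->refcount > 0);
        if (--t->refcount == 0)
            object_destroy(t);
    }
    free(o);
    live_objects--;
}
```

With a root that refers to two objects which both refer to a shared third one, destroying the root frees all four, and the shared object is freed only once, when its second owner goes away.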

Functions are also owners and should keep track of referenced objects, e.g. by setting up a list at the start of the function and looping through it when returning.
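One way this could look, as a rough sketch: the function sets up a `Scope` list on entry, registers every object it touches, and releases them all on the way out. `Scope`, `scope_hold`, `scope_leave`, the `Obj` type and the `live` counter are hypothetical names invented for this example:

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Obj {
    int refcount;
    int value;
} Obj;

int live = 0;                       /* number of Obj currently allocated */

Obj *obj_new(int value) {
    Obj *o = malloc(sizeof *o);
    if (!o) return NULL;
    o->refcount = 0;                /* no owner yet */
    o->value = value;
    live++;
    return o;
}

void obj_retain(Obj *o) { o->refcount++; }

void obj_release(Obj *o) {
    assert(o->refcount > 0);
    if (--o->refcount == 0) {
        free(o);
        live--;
    }
}

#define SCOPE_MAX 16

/* The list a function sets up on entry to track what it references. */
typedef struct Scope {
    Obj *held[SCOPE_MAX];
    size_t n;
} Scope;

/* Register o as referenced by the current function; returns o, or NULL on failure. */
Obj *scope_hold(Scope *s, Obj *o) {
    if (o == NULL || s->n == SCOPE_MAX) return NULL;
    obj_retain(o);
    s->held[s->n++] = o;
    return o;
}

/* Loop through the list when returning and drop every reference. */
void scope_leave(Scope *s) {
    while (s->n > 0)
        obj_release(s->held[--s->n]);
}

/* Example function acting as an owner of two temporaries. */
int sum_of_two(int x, int y) {
    Scope scope = { .n = 0 };
    Obj *a = scope_hold(&scope, obj_new(x));
    Obj *b = scope_hold(&scope, obj_new(y));
    int result = (a && b) ? a->value + b->value : 0;
    scope_leave(&scope);
    return result;
}
```

The discipline part is that every return path has to go through `scope_leave`; C will not do that for you.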

So you need to determine in which situations objects should be transferred to or shared with new owners, and wrap those situations in macros/functions that add owned objects to, or remove them from, the owning objects' lists of referenced objects (and adjust the reference counters accordingly).

Finally, you need to deal with circular references somehow, by checking for objects that are no longer reachable from objects/pointers on the stack. That could be done with some mark-and-sweep garbage collection mechanism.
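A small mark-and-sweep sketch, to show the shape of it. It assumes the roots (what would be the pointers on the stack) are handed in explicitly by the caller, since finding them automatically is platform-specific. `Node`, `node_new`, `collect` and the other names are illustrative only:

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_EDGES 2

typedef struct Node {
    struct Node *edges[MAX_EDGES];  /* outgoing references, NULL if unused */
    int marked;                     /* set during the mark phase */
    struct Node *next_alloc;        /* links every allocated node for the sweep */
} Node;

Node *all_nodes = NULL;             /* head of the list of everything allocated */
int live_nodes = 0;

Node *node_new(void) {
    Node *n = calloc(1, sizeof *n);
    if (!n) return NULL;
    n->next_alloc = all_nodes;
    all_nodes = n;
    live_nodes++;
    return n;
}

/* Mark phase: flag everything reachable from n. The marked check
 * is what stops the recursion from looping forever on a cycle. */
static void mark(Node *n) {
    if (n == NULL || n->marked) return;
    n->marked = 1;
    for (int i = 0; i < MAX_EDGES; i++)
        mark(n->edges[i]);
}

/* Sweep phase: free every unmarked node and clear the marks on survivors. */
static void sweep(void) {
    Node **link = &all_nodes;
    while (*link) {
        Node *n = *link;
        if (n->marked) {
            n->marked = 0;
            link = &n->next_alloc;
        } else {
            *link = n->next_alloc;
            free(n);
            live_nodes--;
        }
    }
}

/* Collect everything that is not reachable from the given roots. */
void collect(Node **roots, size_t nroots) {
    for (size_t i = 0; i < nroots; i++)
        mark(roots[i]);
    sweep();
}
```

Two nodes that only point at each other would keep each other's counter at one forever under pure reference counting; here they are freed as soon as neither is reachable from a root.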


An early problem with reference counting is that it is relatively easy to count the initial reference by putting code in a custom malloc / free implementation, but it is quite a bit harder to determine if the initial recipient passes that address around to others.
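To illustrate the easy half, here is a sketch of an allocator wrapper that stores the count in a header in front of the payload (it needs C11 for `max_align_t`). The `rc_` names are invented for this example. Notice that the initial reference is counted automatically, while every later sharing needs an explicit `rc_retain` call that nothing forces the caller to make:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Header stored just before the payload. The union pads it to the
 * strictest alignment so the payload that follows stays aligned. */
typedef union rc_header {
    size_t count;
    max_align_t align;
} rc_header;

static rc_header *rc_header_of(void *p) {
    return (rc_header *)p - 1;
}

/* Allocate size bytes; the caller holds the first reference. */
void *rc_malloc(size_t size) {
    rc_header *h = malloc(sizeof *h + size);
    if (!h) return NULL;
    h->count = 1;
    return h + 1;
}

size_t rc_count(void *p) {
    return rc_header_of(p)->count;
}

/* Must be called by hand whenever the address is given to another owner. */
void *rc_retain(void *p) {
    rc_header_of(p)->count++;
    return p;
}

/* Drop one reference; returns 1 if the block was freed, 0 otherwise. */
int rc_release(void *p) {
    rc_header *h = rc_header_of(p);
    assert(h->count > 0);
    if (--h->count == 0) {
        free(h);
        return 1;
    }
    return 0;
}
```

A plain `q = p;` compiles fine and leaves the count at one, which is exactly the problem described above.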

Since C lacks the ability to override the assignment operator (to count the new reference), you are left with a limited number of options. The only one that can possibly stand in for the assignment is a preprocessor macro (#define), as it can rewrite the assignment into something that inlines the increment of the reference count value.

So you need to "expand" a macro that looks like

a = b;

into

if (b is a pointer) { // this might be optional, if lookupReference does this work
  struct ref_record* ref_r = lookupReference(b);
  if (ref_r) {
    ref_r->count++;
  } else {
    // error
  }
}
a = b;

The real trick will be in writing a macro that can identify the assignment and insert the code cleanly without introducing other unwanted side effects. Since the C preprocessor is not a complete language, you might run into cases where the matching becomes impossible.
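Since the preprocessor cannot pattern-match a bare `a = b;`, one workable compromise is a function-like macro that the programmer has to write in place of the assignment. The sketch below assumes that; `ASSIGN`, `trackAddress`, `countAssignment` and the fixed-size table are hypothetical, and only `lookupReference` and `struct ref_record` come from the expansion above. It also decrements the count of the value being overwritten, which the expansion above leaves out, and it silently ignores untracked addresses instead of treating them as an error:

```c
#include <assert.h>
#include <stddef.h>

struct ref_record {
    const void *addr;   /* address being counted */
    int count;          /* number of pointers currently referring to it */
};

#define MAX_RECORDS 32
static struct ref_record records[MAX_RECORDS];
static size_t record_count = 0;

/* Start tracking an address with a count of zero (hypothetical helper). */
struct ref_record *trackAddress(const void *addr) {
    assert(addr != NULL);
    if (record_count == MAX_RECORDS) return NULL;
    records[record_count].addr = addr;
    records[record_count].count = 0;
    return &records[record_count++];
}

/* Linear search; returns NULL for addresses that are not tracked. */
struct ref_record *lookupReference(const void *addr) {
    for (size_t i = 0; i < record_count; i++)
        if (records[i].addr == addr)
            return &records[i];
    return NULL;
}

/* Bookkeeping for "dst = src": count the new value first, then uncount
 * the old one, so that assigning a pointer to itself nets out to zero. */
void countAssignment(const void *old_value, const void *new_value) {
    struct ref_record *r = lookupReference(new_value);
    if (r) r->count++;
    r = lookupReference(old_value);
    if (r && r->count > 0) r->count--;
}

/* do { } while (0) makes the macro behave like one statement.
 * src is evaluated once; dst is evaluated twice and must already
 * hold a valid value (NULL is fine), so keep it free of side effects. */
#define ASSIGN(dst, src)                         \
    do {                                         \
        void *assign_tmp_ = (src);               \
        countAssignment((dst), assign_tmp_);     \
        (dst) = assign_tmp_;                     \
    } while (0)
```

So `ASSIGN(a, b);` replaces `a = b;` everywhere a counted pointer is stored, and any plain assignment that slips through goes uncounted.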

(Jokes about seeing nails everywhere once you learn how to use a hammer have an interesting parallel here, except that when you only have a hammer, you had better learn how to make everything a nail.)

Another option (perhaps more sane, perhaps not) is to keep track of all address values returned by malloc, and then scan the program's stack and heap for matching addresses. If you find a match, you might have found a valid pointer, or you might have found a string with a lucky encoding; however, if you find no match, you can certainly free the address, provided the program isn't storing an address + offset calculated from the original address. (Perhaps you can use macros to detect such offsets, and add each offset as an additional address in the scan for the same block.)
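The scanning step could be sketched roughly as follows. This version assumes the caller supplies the memory region to scan as an array of words (locating the real stack and heap bounds is platform-specific and left out), and it handles the address + offset case by recording each block's size and testing for a range instead of an exact match. `Block`, `scan_region` and `count_unreferenced` are made-up names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One entry per address handed out by malloc. */
typedef struct Block {
    uintptr_t start;    /* address returned by the allocator */
    size_t size;        /* block size, so interior pointers also match */
    int marked;         /* set when some word in the scan points into it */
} Block;

/* Treat every word in the region as a possible pointer and mark each
 * block it points into. A word that merely looks like an address
 * (integer or string data) marks the block too: a false positive. */
void scan_region(const uintptr_t *words, size_t nwords,
                 Block *blocks, size_t nblocks) {
    assert(words != NULL || nwords == 0);
    for (size_t i = 0; i < nwords; i++) {
        for (size_t j = 0; j < nblocks; j++) {
            if (words[i] >= blocks[j].start &&
                words[i] - blocks[j].start < blocks[j].size)
                blocks[j].marked = 1;
        }
    }
}

/* Blocks left unmarked after scanning every region are safe to free. */
size_t count_unreferenced(const Block *blocks, size_t nblocks) {
    size_t n = 0;
    for (size_t j = 0; j < nblocks; j++)
        if (!blocks[j].marked)
            n++;
    return n;
}
```

The asymmetry is the useful part: a match only means "maybe referenced", but no match across all scanned regions means "definitely not referenced", as long as the program never hides pointers (for example by storing them XOR-ed or on disk).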

In the end, there isn't going to be a foolproof solution without building a referencing system in which you hand back references (pretend addresses) and hide the real addresses. The downside to such a solution is that you must use the library interface every time you want to deal with an address. This includes getting the "next" element in an array, etc. Not very C-like, but a pretty good approximation of what Java does with its references.
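As a rough sketch of such a referencing system, assuming int arrays only and a small fixed table: callers get an integer handle, the real address stays inside the table, and every access, including stepping to the next element, goes through the interface. All names here (`ref_t`, `ref_new_array`, `ref_get`, `ref_set`, ...) are invented for illustration:

```c
#include <assert.h>
#include <stdlib.h>

typedef int ref_t;                  /* opaque handle handed to callers */
#define REF_INVALID (-1)
#define TABLE_SIZE 64

typedef struct Slot {
    int *data;                      /* real address, never exposed */
    size_t len;
    int count;                      /* reference count; 0 means the slot is free */
} Slot;

static Slot table[TABLE_SIZE];

/* Map a handle to its slot, or NULL if the handle is not live. */
static Slot *slot_of(ref_t r) {
    if (r < 0 || r >= TABLE_SIZE || table[r].count == 0) return NULL;
    return &table[r];
}

/* Create a zero-filled int array and return a handle with count 1. */
ref_t ref_new_array(size_t len) {
    if (len == 0) return REF_INVALID;
    for (ref_t r = 0; r < TABLE_SIZE; r++) {
        if (table[r].count == 0) {
            table[r].data = calloc(len, sizeof *table[r].data);
            if (table[r].data == NULL) return REF_INVALID;
            table[r].len = len;
            table[r].count = 1;
            return r;
        }
    }
    return REF_INVALID;             /* table full */
}

int ref_retain(ref_t r) {
    Slot *s = slot_of(r);
    if (!s) return -1;
    s->count++;
    return 0;
}

int ref_release(ref_t r) {
    Slot *s = slot_of(r);
    if (!s) return -1;
    if (--s->count == 0) {          /* last reference gone: free the storage */
        free(s->data);
        s->data = NULL;
    }
    return 0;
}

/* Element access goes through the library; returns -1 on a bad handle or index. */
int ref_get(ref_t r, size_t index, int *out) {
    Slot *s = slot_of(r);
    assert(out != NULL);
    if (!s || index >= s->len) return -1;
    *out = s->data[index];
    return 0;
}

int ref_set(ref_t r, size_t index, int value) {
    Slot *s = slot_of(r);
    if (!s || index >= s->len) return -1;
    s->data[index] = value;
    return 0;
}
```

One weakness of a sketch this simple is that a freed slot can be reused, so a stale handle may end up naming a different array; real handle tables usually add a generation number to catch that.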