With std::byte standardized, when do we use a void* and when a byte*?

(This is a potential rule of thumb which comes off the top of my head, not condoned by anyone.)

Rule of thumb: When to use which kind of pointer?

  • Use char * for sequences of textual characters, not anything else.
  • Use void * in type-erasure scenarios, i.e. when the pointed-to data is typed but a typed pointer cannot be used for some reason, or when it cannot be determined whether the data is typed at all.
  • Use byte * for raw memory for which there is no indication of it holding any typed data.

An exception to the above:

  • Also use void */unsigned char */char * when older or non-C++ code forces you to and you would otherwise use byte * - but wrap that with a byte *-based interface as tightly as you can rather than exposing it to the rest of your C++ code.
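
A minimal sketch of such a wrapper, assuming C++17. Both function names are hypothetical: legacy_fill merely stands in for whatever older or non-C++ function forces a void * on you, and fill_bytes is the byte *-based interface the rest of the code would see.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for a legacy C-style API that insists on void*.
void legacy_fill(void* dst, int value, std::size_t n) {
    std::memset(dst, value, n);
}

// Thin byte*-based wrapper: the void* does not leak past this function.
void fill_bytes(std::byte* dst, std::byte value, std::size_t n) {
    // std::byte* converts implicitly to void*; the byte value needs an
    // explicit std::to_integer because std::byte is not an arithmetic type.
    legacy_fill(dst, std::to_integer<int>(value), n);
}
```

Callers only ever deal with std::byte, so the legacy signature stays confined to one place.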

Examples

void * my_custom_malloc(size_t size) - wrong
byte * my_custom_malloc(size_t size) - right

struct buffer_t { byte* data; size_t length; my_type_t data_type; } - wrong
struct buffer_t { void* data; size_t length; my_type_t data_type; } - right
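
To illustrate why void * fits the second case, here is a rough sketch of how such a tagged buffer might be consumed. The enumerators of my_type_t, the meaning of length (element count rather than byte count) and the sum function are all assumptions made up for the example; only the struct layout comes from the line above.

```cpp
#include <cstddef>

// Hypothetical tag values; the original only names the type my_type_t.
enum class my_type_t { int32, float32 };

struct buffer_t {
    void*       data;       // typed data, with the type erased
    std::size_t length;     // assumed here to be a count of elements
    my_type_t   data_type;  // tells us what data really points to
};

// Recover the real type from the tag before touching the elements.
double sum(const buffer_t& b) {
    double total = 0.0;
    switch (b.data_type) {
    case my_type_t::int32: {
        auto* p = static_cast<const int*>(b.data);
        for (std::size_t i = 0; i < b.length; ++i) total += p[i];
        break;
    }
    case my_type_t::float32: {
        auto* p = static_cast<const float*>(b.data);
        for (std::size_t i = 0; i < b.length; ++i) total += p[i];
        break;
    }
    }
    return total;
}
```

The data is never treated as raw bytes: it is always cast back to its actual element type, which is what makes void * rather than byte * the natural choice.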


First, void * still makes sense when you have to call a C library function or, more generally, any other extern "C" compatible function.

Next, a std::byte array still allows individual access to any of its elements. In other words, this is legal:

std::byte *arr = ...;
arr[i] = std::byte{0x2a};

It makes sense if you want to be able to allow that low level access, for example if you want to manually copy all or parts of the array.

On the other hand, void * is really an opaque pointer, in the sense that you will have to cast it (to a char * or byte *) before being able to access its individual elements.

So my opinion is that std::byte should be used as soon as you want to address elements of an array or move a pointer, while void * still makes sense to denote an opaque zone that will only be passed around as a whole (it is hard to actually process a void *).
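
A small sketch of that difference, assuming C++17. The helper names copy_bytes and first_byte are made up for the illustration.

```cpp
#include <cstddef>

// Manual copy: possible because std::byte* supports indexing and
// pointer arithmetic directly.
void copy_bytes(std::byte* dst, const std::byte* src, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        dst[i] = src[i];
    }
}

// With void*, nothing can be read until the pointer has been cast.
std::byte first_byte(const void* p) {
    // return *p;   // would not compile: void* cannot be dereferenced
    // ++p;         // would not compile either: no arithmetic on void*
    return *static_cast<const std::byte*>(p);
}
```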

But real use cases for void * should become more and more unusual in modern C++, at least at a high level, because those opaque zones should normally be hidden in higher-level classes that come with methods to process them. So IMHO void * should in the end be limited to compatibility with C (and older C++ versions) and to low-level code (such as allocation code).


What is the motivation for std::byte?

Quoting from the original paper:

Many programs require byte-oriented access to memory. Today, such programs must use either the char, signed char, or unsigned char types for this purpose. However, these types perform a “triple duty”. Not only are they used for byte addressing, but also as arithmetic types, and as character types. This multiplicity of roles opens the door for programmer error - such as accidentally performing arithmetic on memory that should be treated as a byte value - and confusion for both programmers and tools.

In essence, std::byte is there to "replace" the use of char-types when you need to deal with raw memory as bytes. It would be safe to assert that this applies whether it is used by value, by reference, through pointers or in containers.

std::byte does not have the same connotations as a char; it's about raw memory, not characters

Correct, so std::byte should be preferred over char-types when dealing with bytes in memory (as in, an array of bytes). Some lower level protocol data manipulation immediately comes to mind.
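
As a rough illustration of the protocol case, assuming C++17: std::byte supports bitwise operators and shifts but no arithmetic, so the kind of accidental arithmetic the paper worries about does not compile. The header layout (version in the high nibble, flags in the low nibble) and both function names are invented for the example.

```cpp
#include <cstddef>

// Hypothetical header byte: high nibble = version, low nibble = flags.

// Masking stays within std::byte; no conversion to an integer is needed.
std::byte flags_of(std::byte header) {
    return header & std::byte{0x0F};
}

// Getting a number out requires an explicit std::to_integer.
unsigned version_of(std::byte header) {
    return std::to_integer<unsigned>(header >> 4);
}

// The following would not compile, since std::byte has no arithmetic:
//   std::byte next = header + 1;
```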

What is a good rule of thumb, for the days of std::byte, regarding when to prefer it over void * and when it's the other way around?

I would argue that similar guidelines apply now as they did previously. When dealing with raw blocks of memory where byte addressability is required, char * etc. would have been preferred over void *. I think the same logic applies now, but prefer byte * over char *. A char * is better for character sequences.

If the desire is to pass around a pointer opaquely, void * probably still best fits the problem. void * essentially means "points to anything", but the anything is still something; we are just not saying what yet.
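
The classic instance of this is the user-data pointer in a C-style callback interface. A minimal sketch follows; callback_t, invoke and increment are hypothetical names, not part of any real library.

```cpp
// Hypothetical C-style callback API: it forwards user_data untouched and
// never needs to know what it points to.
using callback_t = void (*)(void* user_data);

void invoke(callback_t cb, void* user_data) {
    cb(user_data);
}

// Only the callback knows the real type and casts back to it.
void increment(void* user_data) {
    ++*static_cast<int*>(user_data);
}
```

Here the pointer is passed as a whole and never inspected bytewise, so byte * would say the wrong thing about it.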

Further, the types uintptr_t (and intptr_t) would probably factor in as alternatives, depending of course on the desired application.
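
One application where uintptr_t seems a better fit than either pointer type is doing integer arithmetic on the address itself, for example an alignment check. This is only a sketch: is_aligned is a made-up helper, uintptr_t is an optional type, and the pointer-to-integer mapping is implementation-defined, so the code assumes a platform with a flat address space where that mapping is the plain address.

```cpp
#include <cstddef>
#include <cstdint>

// Treat the address as a number. Assumes the implementation maps pointers
// to integers in the obvious way (true on mainstream platforms).
bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}
```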

... I mostly mean new code where you get to choose all the types.

New code generally has very limited use for void * outside of compatibility (where you don't get to choose the type). If you need byte-based processing then favour byte *.