Why doesn't C++ support functions returning arrays?

I'd wager a guess that to be concise, it was simply a design decision. More specifically, if you really want to know why, you need to work from the ground up.

Let's think about C first. In the C language, there is a clear distinction between "pass by reference" and "pass by value". To treat it lightly, the name of an array in C is really just a pointer. For all intents and purposes, the difference (generally) comes down to allocation. The code

int array[n];

would create 4*n bytes of memory (on a 32 bit system) on the stack correlating to the scope of whichever code block makes the declaration. In turn,

int* array = (int*) malloc(sizeof(int)*n);

would create the same amount memory, but on the heap. In this case, what is in that memory isn't tied to the scope, only the reference TO the memory is limited by the scope. Here's where pass by value and pass by reference come in. Passing by value, as you probably know, means that when something is passed in to or returned from a function, the "thing" that gets passed is the result of evaluating the variable. In other words,

int n = 4;
printf("%d", n);

will print the number 4 because the construct n evaluates to 4 (sorry if this is elementary, I just want to cover all the bases). This 4 has absolutely no bearing or relationship to the memory space of your program, it's just a literal, and so once you leave the scope in which that 4 has context, you lose it. What about pass by reference? Passing by reference is no different in the context of a function; you simply evaluate the construct that gets passed. The only difference is that after evaluating the passed "thing", you use the result of the evaluation as a memory address. I once had a particular cynical CS instructor who loved to state that there is no such thing as passing by reference, just a way to pass clever values. Really, he's right. So now we think about scope in terms of a function. Pretend that you can have an array return type:

int[] foo(args){
    result[n];
    // Some code
    return result;
}

The problem here is that result evaluates to the address of the 0th element of the array. But when you attempt to access this memory from outside of this function (via the return value), you have a problem because you are attempting to access memory that is not in the scope with which you are working (the function call's stack). So the way we get around this is with the standard "pass by reference" jiggery-pokery:

int* foo(args){
    int* result = (int*) malloc(sizeof(int)*n));
    // Some code
    return result;
}

We still get a memory address pointing to the 0th element of the Array, but now we have access to that memory.

What's my point? In Java, it is common to assert that "everything is pass by value". This is true. The same cynical instructor from above also had this to say about Java and OOP in general: Everything is just a pointer. And he's also right. While everything in Java is in fact pass by value, almost all of those values are actually memory addresses. So in Java, the language does let you return an array or a String, but it does so by turning it in to the version with pointers for you. It also manages your memory for you. And automatic memory management, while helpful, is not efficient.

This brings us to C++. The whole reason C++ was invented was because Bjarne Stroustrup had been experimenting with Simula (basically the original OOPL) during his PhD work, and thought it was fantastic conceptually, but he noticed that it performed rather terribly. And so he began working on what was called C with Classes, which got renamed to C++. In doing so, his goal was to make a programming language that took SOME of the best features from Simula but remained powerful and fast. He chose to extend C due to its already legendary performance, and one tradeoff was that he chose to not implement automatic memory management or garbage collecting on such a large scale like other OOPL's. Returning an array from one of the template classes works because, well, you're using a class. But if you want to return a C array, you have to do it the C way. In other words, C++ does support returning an array EXACTLY the same way that Java does; it just doesn't do all of the work for you. Because a Danish dude thought it'd be too slow.


C++ does support it - well sort of:

vector< string> func()
{
   vector<string> res;
   res.push_back( "hello" );
   res.push_back( "world" );
   return res;
}

Even C sort-of supports it:

struct somearray
{
  struct somestruct d[50];
};

struct somearray func()
{
   struct somearray res;
   for( int i = 0; i < 50; ++i )
   {
      res.d[i] = whatever;
   }
   // fill them all in
   return res;
}

A std::string is a class but when you say a string you probably mean a literal. You can return a literal safely from a function but actually you could statically create any array and return it from a function. This would be thread-safe if it was a const (read-only) array which is the case with string literals.

The array you return would degrade to a pointer though, so you would not be able to work out its size just from its return.

Returning an array, if it were possible, would have to be fixed length in the first place, given that the compiler needs to create the call stack, and then has the issue that arrays are not l-values so receiving it in the calling function would have to use a new variable with initialisation, which is impractical. Returning one may be impractical too for the same reason, atlhough they might have used a special notation for return values.

Remember in the early days of C all the variables had to be declared at the top of the function and you couldn't just declare at first use. Thus it was infeasible at the time.

They gave the workaround of putting the array into a struct and that is just how it now has to remain in C++ because it uses the same calling convention.

Note: In languages like Java, an array is a class. You create one with new. You can reassign them (they are l-values).


Arrays in C (and in C++ for backwards compatibility) have special semantics that differ from the rest of the types. In particular, while for the rest of the types, C only has pass-by-value semantics, in the case of arrays the effect of the pass-by-value syntax simulates pass-by-reference in a strange way:

In a function signature, an argument of type array of N elements of type T gets converted to pointer to T. In a function call passing an array as argument to a function will decay the array to a pointer to the first element, and that pointer is copied into the function.

Because of this particular treatment for arrays --they cannot be passed by value--, they cannot be returned by value either. In C you can return a pointer, and in C++ you can also return a reference, but the array itself cannot be allocated in the stack.

If you think of it, this is not different from the language that you are using in the question, as the array is dynamically allocated and you are only returning a pointer/reference to it.

The C++ language, on the other hand, enables different solutions to that particular problem, like using std::vector in the current standard (contents are dynamically allocated) or std::array in the upcoming standard (contents can be allocated in the stack, but it might have a greater cost, as each element will have to be copied in those cases where the copy cannot be elided by the compiler). In fact, you can use the same type of approach with the current standard by using off-the-shelf libraries like boost::array.