Are arrays in PHP copied as value or as reference to new variables, and when passed to functions?

For the second part of your question, see the array page of the manual, which states (quoting) :

Array assignment always involves value copying. Use the reference operator to copy an array by reference.

And the given example :

<?php
$arr1 = array(2, 3);
$arr2 = $arr1;
$arr2[] = 4; // $arr2 is changed,
             // $arr1 is still array(2, 3)

$arr3 = &$arr1;
$arr3[] = 4; // now $arr1 and $arr3 are the same
?>


For the first part, the best way to be sure is to try ;-)

Consider this example of code :

function my_func($a) {
    $a[] = 30;
}

$arr = array(10, 20);
my_func($arr);
var_dump($arr);

It'll give this output :

array
  0 => int 10
  1 => int 20

Which indicates the function has not modified the "outside" array that was passed as a parameter : it's passed as a copy, and not a reference.

If you want it passed by reference, you'll have to modify the function, this way :

function my_func(& $a) {
    $a[] = 30;
}

And the output will become :

array
  0 => int 10
  1 => int 20
  2 => int 30

As, this time, the array has been passed "by reference".


Don't hesitate to read the References Explained section of the manual : it should answer some of your questions ;-)


With regards to your first question, the array is passed by reference UNLESS it is modified within the method / function you're calling. If you attempt to modify the array within the method / function, a copy of it is made first, and then only the copy is modified. This makes it seem as if the array is passed by value when in actual fact it isn't.

For example, in this first case, even though you aren't defining your function to accept $my_array by reference (by using the & character in the parameter definition), it still gets passed by reference (ie: you don't waste memory with an unnecessary copy).

function handle_array($my_array) {  

    // ... read from but do not modify $my_array
    print_r($my_array);

    // ... $my_array effectively passed by reference since no copy is made
}

However if you modify the array, a copy of it is made first (which uses more memory but leaves your original array unaffected).

function handle_array($my_array) {

    // ... modify $my_array
    $my_array[] = "New value";

    // ... $my_array effectively passed by value since requires local copy
}

FYI - this is known as "lazy copy" or "copy-on-write".


TL;DR

a) the method/function only reads the array argument => implicit (internal) reference
b) the method/function modifies the array argument => value
c) the method/function array argument is explicitly marked as a reference (with an ampersand) => explicit (user-land) reference

Or this:
- non-ampersand array param: passed by reference; the writing operations alter a new copy of the array, copy which is created on the first write;
- ampersand array param: passed by reference; the writing operations alter the original array.

Remember - PHP does a value-copy the moment you write to the non-ampersand array param. That's what copy-on-write means. I'd love to show you the C source of this behaviour, but it's scary in there. Better use xdebug_debug_zval().

Pascal MARTIN was right. Kosta Kontos was even more so.

Answer

It depends.

Long version

I think I'm writing this down for myself. I should have a blog or something...

Whenever people talk of references (or pointers, for that matter), they usually end up in a logomachy (just look at this thread!).
PHP being a venerable language, I thought I should add up to the confusion (even though this a summary of the above answers). Because, although two people can be right at the same time, you're better off just cracking their heads together into one answer.

First off, you should know that you're not a pedant if you don't answer in a black-and-white manner. Things are more complicated than "yes/no".

As you will see, the whole by-value/by-reference thing is very much related to what exactly are you doing with that array in your method/function scope: reading it or modifying it?

What does PHP says? (aka "change-wise")

The manual says this (emphasis mine):

By default, function arguments are passed by value (so that if the value of the argument within the function is changed, it does not get changed outside of the function). To allow a function to modify its arguments, they must be passed by reference.

To have an argument to a function always passed by reference, prepend an ampersand (&) to the argument name in the function definition

As far as I can tell, when big, serious, honest-to-God programmers talk about references, they usually talk about altering the value of that reference. And that's exactly what the manual talks about: hey, if you want to CHANGE the value in a function, consider that PHP's doing "pass-by-value".

There's another case that they don't mention, though: what if I don't change anything - just read?
What if you pass an array to a method which doesn't explicitly marks a reference, and we don't change that array in the function scope? E.g.:

<?php
function readAndDoStuffWithAnArray($array) 
{
    return $array[0] + $array[1] + $array[2];
}

$x = array(1, 2, 3);

echo readAndDoStuffWithAnArray($x);

Read on, my fellow traveller.

What does PHP actually do? (aka "memory-wise")

The same big and serious programmers, when they get even more serious, they talk about "memory optimizations" in regards to references. So does PHP. Because PHP is a dynamic, loosely typed language, that uses copy-on-write and reference counting, that's why.

It wouldn't be ideal to pass HUGE arrays to various functions, and PHP to make copies of them (that's what "pass-by-value" does, after all):

<?php

// filling an array with 10000 elements of int 1
// let's say it grabs 3 mb from your RAM
$x = array_fill(0, 10000, 1); 

// pass by value, right? RIGHT?
function readArray($arr) { // <-- a new symbol (variable) gets created here
    echo count($arr); // let's just read the array
}

readArray($x);

Well now, if this actually was pass-by-value, we'd have some 3mb+ RAM gone, because there are two copies of that array, right?

Wrong. As long as we don't change the $arr variable, that's a reference, memory-wise. You just don't see it. That's why PHP mentions user-land references when talking about &$someVar, to distinguish between internal and explicit (with ampersand) ones.

Facts

So, when an array is passed as an argument to a method or function is it passed by reference?

I came up with three (yeah, three) cases:
a) the method/function only reads the array argument
b) the method/function modifies the array argument
c) the method/function array argument is explicitly marked as a reference (with an ampersand)


Firstly, let's see how much memory that array actually eats (run here):

<?php
$start_memory = memory_get_usage();
$x = array_fill(0, 10000, 1);
echo memory_get_usage() - $start_memory; // 1331840

That many bytes. Great.

a) the method/function only reads the array argument

Now let's make a function which only reads the said array as an argument and we'll see how much memory the reading logic takes:

<?php

function printUsedMemory($arr) 
{
    $start_memory = memory_get_usage();

    count($arr);       // read
    $x = $arr[0];      // read (+ minor assignment)
    $arr[0] - $arr[1]; // read

    echo memory_get_usage() - $start_memory; // let's see the memory used whilst reading
}

$x = array_fill(0, 10000, 1); // this is 1331840 bytes
printUsedMemory($x);

Wanna guess? I get 80! See for yourself. This is the part that the PHP manual omits. If the $arr param was actually passed-by-value, you'd see something similar to 1331840 bytes. It seems that $arr behaves like a reference, doesn't it? That's because it is a references - an internal one.

b) the method/function modifies the array argument

Now, let's write to that param, instead of reading from it:

<?php

function printUsedMemory($arr)
{
    $start_memory = memory_get_usage();

    $arr[0] = 1; // WRITE!

    echo memory_get_usage() - $start_memory; // let's see the memory used whilst reading
}

$x = array_fill(0, 10000, 1);
printUsedMemory($x);

Again, see for yourself, but, for me, that's pretty close to being 1331840. So in this case, the array is actually being copied to $arr.

c) the method/function array argument is explicitly marked as a reference (with an ampersand)

Now let's see how much memory a write operation to an explicit reference takes (run here) - note the ampersand in the function signature:

<?php

function printUsedMemory(&$arr) // <----- explicit, user-land, pass-by-reference
{
    $start_memory = memory_get_usage();

    $arr[0] = 1; // WRITE!

    echo memory_get_usage() - $start_memory; // let's see the memory used whilst reading
}

$x = array_fill(0, 10000, 1);
printUsedMemory($x);

My bet is that you get 200 max! So this eats approximately as much memory as reading from a non-ampersand param.