Tricky Tricky Refcounts…
Occasionally a PHP engineer reports this problem:
Example Code:
----------
$my_arr = array(1, 2, 3);
foreach ($my_arr as &$val) {
    var_dump($val);
}
foreach ($my_arr as $val) {
    var_dump($val);
}
Expected Output:
----------
int(1)
int(2)
int(3)
int(1)
int(2)
int(3)
Actual Output:
----------
int(1)
int(2)
int(3)
int(1)
int(2)
int(2)
The confusion comes from the expectation that the second loop will print the last element of the array as int(3) rather than int(2). The initial reaction is usually “this is a PHP bug”, but it really isn’t. There are two key aspects of this code to watch out for: 1) The scope of a foreach variable is not limited to the foreach block, so $val is still defined after the loop ends. 2) A foreach loop does not unset its variable at the start of the block, so any reference $val already holds carries over into the next loop.
With this in mind we can see that at the end of the first loop, $val is a reference to the last element of $my_arr. Each iteration over the foreach loop can be thought of as an assignment operation, in this case by reference:
$val = &$my_arr[0]
$val = &$my_arr[1]
$val = &$my_arr[2]
// last iteration $val is a reference to $my_arr[2]
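One way to confirm that $val outlives the loop and still aliases the last element is to write through it after the loop has finished. This is a minimal sketch that reuses $my_arr from the example; the value 99 is arbitrary and chosen only for illustration.

```php
<?php
$my_arr = array(1, 2, 3);

foreach ($my_arr as &$val) {
    // nothing to do here; we only care what $val refers to afterwards
}

// $val is still in scope and is still a reference to $my_arr[2],
// so this assignment changes the array itself
$val = 99;

var_dump($my_arr[2]); // int(99)
```

The write lands in the array even though the loop is long over, which is exactly the state the second foreach inherits.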
As we step through the second foreach loop, each iteration assigns an element of $my_arr to $val (by value this time):
$val = $my_arr[0]
$val = $my_arr[1]
$val = $my_arr[2]
But recall that $val is still a reference to the last element of $my_arr, carried over from the first foreach loop, so the actual assignments look more like:
$my_arr[2] = $my_arr[0]
$my_arr[2] = $my_arr[1]
$my_arr[2] = $my_arr[2]
Thus we end up with $my_arr being set as follows on each iteration:
// (array(1,2,1)) last element is set to value of first
$my_arr[2] = $my_arr[0]
// (array(1,2,2)) last element is set to value of second
$my_arr[2] = $my_arr[1]
// (array(1,2,2)) last element is set to value of itself
$my_arr[2] = $my_arr[2]
Note that the last assignment is really assigning the last element to itself!
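The step-by-step states above can be checked by recording the array inside the second loop. This is a rough sketch; the $states variable is a helper introduced here for illustration and is not part of the original example.

```php
<?php
$my_arr = array(1, 2, 3);

foreach ($my_arr as &$val) {
    // leave $val bound to $my_arr[2]
}

$states = array(); // hypothetical helper: snapshot of the array per iteration
foreach ($my_arr as $val) {
    // each by-value assignment to $val writes through to $my_arr[2]
    $states[] = implode(',', $my_arr);
}

print_r($states); // "1,2,1", then "1,2,2", then "1,2,2"
```

The last two snapshots are identical because by the third iteration the last element already holds 2 and is simply assigned to itself.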
Because PHP5 handles variables with a copy-on-write algorithm, it’s typically not necessary to assign by reference with performance gains in mind (as was the case with a lot of PHP4 code). The above code can be made to behave as expected by placing an unset($val) between the foreach loops, or by not iterating over references at all and instead assigning to the elements of $my_arr explicitly by index or key. References should be used with care and only when necessary. When code like this is present in global scope or in large functions, a leftover reference may affect later code in seemingly unpredictable ways.
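As a sketch of the unset() fix, the example can be rewritten as follows. The $seen array is a hypothetical helper added only so that the values from the second loop can be inspected after it ends.

```php
<?php
$my_arr = array(1, 2, 3);

foreach ($my_arr as &$val) {
    var_dump($val);
}
unset($val); // break the reference between $val and $my_arr[2]

$seen = array(); // hypothetical helper: values observed by the second loop
foreach ($my_arr as $val) {
    // $val is now a plain variable, so this no longer touches the array
    var_dump($val);
    $seen[] = $val;
}
```

With the reference broken, the second loop sees 1, 2, 3 and $my_arr is left unmodified. Putting unset() immediately after any by-reference foreach is a cheap habit that keeps the alias from leaking into later code.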