The problem with array_diff_assoc() on multidimensional arrays

When coding an endpoint of a REST API for a decoupled application, it is a good idea to trim the JSON response so that it does not include redundant or unnecessary information that the frontend is not going to update. This spares the frontend from processing useless data when the JSON arrives, and the smaller payload speeds up communication with the backend.

If, for example, we send the entire state (or a large part of its variables) from the frontend to the backend, it makes no sense to return in the JSON response all the variables that were only included in the state to determine which functions must be executed.

This may often occur if our backend workflow is based on:

  • Receiving a JSON payload.
  • Converting it into an array.
  • Modifying it with the newly calculated values.
  • Sending it back to the frontend so that the app’s state can be mutated.
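
The workflow above can be sketched as follows. This is only an illustration: the request body and its field names (cart, locale, step) are assumptions, not part of any real application.

```php
<?php
// Hypothetical request body; the field names are illustrative only.
$requestBody = '{"cart":{"items":2,"total":0},"locale":"en","step":"checkout"}';

// 1. Receive the JSON and convert it into an associative array.
$state = json_decode($requestBody, true);

// 2. Modify it with the newly calculated values.
$state['cart']['total'] = 40;

// 3. Send it back: the whole state is encoded again, including "locale"
//    and "step", which the backend never changed.
$response = json_encode($state);
echo $response, PHP_EOL;
// {"cart":{"items":2,"total":40},"locale":"en","step":"checkout"}
```

Only cart.total changed, yet the three top-level keys all travel back to the frontend.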

Working with that same array throughout the backend process means that many of its variables are sent back without having been modified or even needed.

To avoid this, PHP offers a very useful function called array_diff_assoc() (along with other array comparison functions such as array_diff(), array_diff_uassoc(), array_udiff_assoc(), array_udiff_uassoc(), array_intersect() or array_intersect_assoc()). It compares two arrays and returns the key-value pairs of the first one that are not present in the second one, that is, the keys whose values have changed and the key-value pairs that are new.

<?php
$array1 = array("a" => "green", "b" => "brown", "c" => "blue", "red");
$array2 = array("a" => "green", "yellow", "red");
$result = array_diff_assoc($array1, $array2);
print_r($result);
?>

That code prints:

Array
(
    [b] => brown
    [c] => blue
    [0] => red
)

However, array_diff_assoc() has a problem with multidimensional arrays: it only works properly when the arrays being compared have a single dimension.

With multidimensional arrays, array_diff_assoc() only extracts the differences between first-dimension values. It compares values by their string representation, and every nested array is converted to the string "Array", so two different nested arrays stored under the same key are treated as equal and the change is skipped.
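
A minimal sketch of the problem, using made-up data in which one nested value and one top-level value change. The "@" operator is used here only to silence the "Array to string conversion" warning that PHP raises when it compares the nested arrays.

```php
<?php
// Illustrative data: only the nested "age" and the top-level "theme" differ.
$original = ['user' => ['name' => 'Ann', 'age' => 30], 'theme' => 'dark'];
$mutated  = ['user' => ['name' => 'Ann', 'age' => 31], 'theme' => 'light'];

// "@" silences the "Array to string conversion" warning raised here.
$diff = @array_diff_assoc($mutated, $original);

print_r($diff); // only [theme] => light; the change in user.age is missed
```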

To avoid this problem, one solution would be to:

  • Use the serialize() function to first convert the possible sub-arrays into their string representation.
  • Compare the serialized arrays to extract the differences.
  • Use unserialize() to give the compared nested arrays their proper array format back.

Something like this:

// array_diff_assoc() is used so that keys are compared as well as values.
$result = array_map('unserialize',
    array_diff_assoc(array_map('serialize', $array1), array_map('serialize', $array2)));

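But this kind of implementation assumes that the keys inside the nested arrays are in the same order, because the output of serialize() depends on key order; if they are not, identical sub-arrays are reported as different. It also returns the whole nested array, not only the keys that changed.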
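
The key-order limitation can be reproduced with a small sketch. The helper name serialized_diff() is hypothetical, and it uses array_diff_assoc() so that keys are compared too, which is an assumption about the intent of the approach.

```php
<?php
// Hypothetical helper wrapping the serialize()-based comparison.
function serialized_diff(array $array1, array $array2): array
{
    return array_map(
        'unserialize',
        array_diff_assoc(array_map('serialize', $array1), array_map('serialize', $array2))
    );
}

// Both "size" sub-arrays hold the same pairs, but in a different order.
$a = ['size' => ['w' => 10, 'h' => 20], 'name' => 'box'];
$b = ['size' => ['h' => 20, 'w' => 10], 'name' => 'box'];

$diff = serialized_diff($a, $b);
print_r($diff); // [size] is reported as a difference, with the whole sub-array
```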

That’s why, for me, the best implementation is a recursive function which will:

  • Iterate over the keys of one dimension of the first array.
  • Look up each key in the second array.
  • Extract the value if it is not an array and it is different or does not exist in the second array.
  • Call itself if the value is an array.

function array_diff_assoc_recursive($array1, $array2) {
    $difference = array();
    foreach ($array1 as $key => $value) {
        if (is_array($value)) {
            if (!isset($array2[$key]) || !is_array($array2[$key])) {
                // Nothing comparable in $array2: keep the whole sub-array.
                $difference[$key] = $value;
            } else {
                // Both sides are arrays: compare them one level deeper.
                $new_diff = array_diff_assoc_recursive($value, $array2[$key]);
                if (!empty($new_diff)) {
                    $difference[$key] = $new_diff;
                }
            }
        } elseif (!array_key_exists($key, $array2) || $array2[$key] !== $value) {
            // Non-array value that is new or has changed (strict comparison).
            $difference[$key] = $value;
        }
    }
    return $difference;
}

The only caveat is to pass as the first argument the array that may have more keys than the other one, since the differences are extracted by iterating over this first array; keys that exist only in the second array are never reported.

If you follow the pattern of sending the app’s state to the backend, mutating it and giving it back to the frontend, then the mutated array should be the first parameter and the original one the second.
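
As a rough usage sketch of that argument order, the function is repeated below so the example runs on its own. The state and its field names (cart, locale, shipping) are made up for illustration.

```php
<?php
// Same recursive function as above, repeated so the example is self-contained.
function array_diff_assoc_recursive($array1, $array2) {
    $difference = array();
    foreach ($array1 as $key => $value) {
        if (is_array($value)) {
            if (!isset($array2[$key]) || !is_array($array2[$key])) {
                $difference[$key] = $value;
            } else {
                $new_diff = array_diff_assoc_recursive($value, $array2[$key]);
                if (!empty($new_diff)) {
                    $difference[$key] = $new_diff;
                }
            }
        } elseif (!array_key_exists($key, $array2) || $array2[$key] !== $value) {
            $difference[$key] = $value;
        }
    }
    return $difference;
}

// Illustrative state received from the frontend.
$original = json_decode('{"cart":{"items":2,"total":0},"locale":"en"}', true);

$mutated = $original;
$mutated['cart']['total'] = 40;     // changed nested value
$mutated['shipping'] = 'standard';  // new key

// Mutated first, original second: changed and new pairs are returned.
$response = json_encode(array_diff_assoc_recursive($mutated, $original));
echo $response, PHP_EOL; // {"cart":{"total":40},"shipping":"standard"}

// Reversed order: "shipping" exists only in the second array, so it is missed.
$reversed = array_diff_assoc_recursive($original, $mutated);
print_r($reversed); // [cart] => [total] => 0
```

The unchanged keys (cart.items and locale) no longer appear in the response, which is the payload reduction described at the start.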

Carlos Pérez

Digital manager, tech lead, product designer, full stack engineer, web and app developer, SEO, digital marketing, automation and AI expert.
