Enable Excel function implementations for use in "array formulae" #2551
With some initial work; I've decided to move the This also means that use of the Methods in the |
Logic currently supports either a single array argument, which can be
or two array arguments which can be
Currently unsupported
|
This is:
Preparation of Excel function implementations allowing them to be used in "array formulae".
Typically in MS Excel, arguments passed to a function are simple "scalar" values, e.g.
=ABS(-3)
But MS Excel also allows the argument passed to a function to be a matrix/array of values:
=ABS({-3, 2.5, -1; 0, -1, -2})
This is still considered a single argument, but the argument is a matrix rather than a simple "scalar". In this case, MS Excel will return a matrix with the same dimensions, where the function has been applied to each individual "scalar" value in that matrix. In the above case, this will result in a response of
{3, 2.5, 1; 0, 1, 2}
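The element-by-element behaviour described above can be illustrated with a small sketch. It is written in Python purely for illustration and is not PhpSpreadsheet code; `apply_elementwise` is a hypothetical helper name, and the matrix is modelled as a list of rows.

```python
from typing import Callable, List, Union

Scalar = Union[int, float]
Matrix = List[List[Scalar]]


def apply_elementwise(func: Callable[[Scalar], Scalar], arg: Union[Scalar, Matrix]):
    """Apply func to a scalar, or to every cell of a matrix (a list of rows).

    Hypothetical helper for illustration only.
    """
    if isinstance(arg, list):
        # Matrix argument: the result keeps the same dimensions as the input.
        return [[func(cell) for cell in row] for row in arg]
    # Scalar argument: behave like an ordinary function call.
    return func(arg)


# Excel's {-3, 2.5, -1; 0, -1, -2}: commas separate columns, semicolons separate rows.
matrix = [[-3, 2.5, -1], [0, -1, -2]]
print(apply_elementwise(abs, matrix))  # [[3, 2.5, 1], [0, 1, 2]]
print(apply_elementwise(abs, -3))      # 3
```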
That result may then be processed by other Excel functions in the formula (e.g. MAX()) to reduce it to a single value; or it may be displayed across multiple cells.
Currently, if we write a cell with the formula =MAX(ABS({-3, 2.5, -1; 0, -1, -12})), the Calculation Engine will make the call to ABS() with the array of arguments, but will discard all but the first value before taking the absolute value, which gives a result of 3 (the absolute value of the first array entry, -3); so the subsequent call to MAX() will only receive the value 3, giving a final (incorrect) result of 3.
With these changes, ABS() evaluates every entry in the array, and returns an array of absolute values that is then passed to MAX(), so the call to MAX() has the full set of values (3, 2.5, 1, 0, 1, 12) and can identify the correct maximum value of 12.
This series of changes will ensure that arrays are returned by function calls when appropriate (rather than single values), and are then passed correctly through the Calculation Engine call stack as arrays rather than single scalar values. In the case of our formula MAX(ABS({-3, 2.5, -1; 0, -1, -12})), it will allow a correct result of 12 rather than an incorrect result of 3 to be returned.
It will not change the way that a cell handles the result of a matrix being returned to the getCalculatedValue() call; it will still be reduced to the value of the first entry in that matrix. Handling for that change is ongoing as part of PR #2539, which includes a BC break, meaning that it will only be released with PhpSpreadsheet 2.0.0.
However, in addition to resolving returns from the Excel function implementations so that values aren't lost in a chain of calls, this is also preparatory work for that final PR #2539 handling of array results. This change ensures that a call passing an array to an Excel function results in the return of an array from that function so that the final cell-level handling in PR #2539 receives an array when it should.
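To make the difference between the existing and the proposed behaviour concrete, here is a minimal sketch of the MAX(ABS(...)) example. It is Python, not the Calculation Engine's PHP, and every function name in it (`flatten`, `abs_first_only`, `abs_array_enabled`, `excel_max`) is hypothetical; it only models the behaviour described in the text.

```python
def flatten(value):
    """Return every scalar in a matrix (list of rows), or a scalar as a one-item list."""
    if isinstance(value, list):
        return [cell for row in value for cell in row]
    return [value]


def abs_first_only(arg):
    """Existing behaviour as described: keep only the first entry, then take its absolute value."""
    return abs(flatten(arg)[0])


def abs_array_enabled(arg):
    """Proposed behaviour: return a matrix of absolute values with the same dimensions."""
    if isinstance(arg, list):
        return [[abs(cell) for cell in row] for row in arg]
    return abs(arg)


def excel_max(arg):
    """Reduce a scalar or a matrix to its largest value."""
    return max(flatten(arg))


# {-3, 2.5, -1; 0, -1, -12}
matrix = [[-3, 2.5, -1], [0, -1, -12]]
before = excel_max(abs_first_only(matrix))    # MAX() only ever sees 3
after = excel_max(abs_array_enabled(matrix))  # MAX() sees 3, 2.5, 1, 0, 1, 12
print(before, after)  # 3 12
```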
Example of an existing function (The Math/Trig ABS() function)
This is relatively straightforward to implement for functions that only accept a single argument: the following shows the changes required for a typical function.
Current implementation:
Updated (array function enabled implementation)
Changes to the unit tests to verify correct behaviour when functions are enabled for use in array formulae
Example of an existing function that accepts multiple arguments (The Math/Trig ATAN2() function)... (comments deleted to reduce code noise)
Current implementation:
Updated (array function enabled implementation)
Changes to the unit tests to verify correct behaviour when functions are enabled for use in array formulae
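The two-argument case can be sketched in the same spirit. This is Python for illustration, not the PhpSpreadsheet implementation, and `apply_pairwise`, `shape` and `excel_atan2` are hypothetical names. The sketch assumes only the two simplest pairings: a scalar is combined with every cell of a matrix, and two matrices of identical dimensions are combined cell by cell. Other size combinations are deliberately left out.

```python
import math


def excel_atan2(x_num, y_num):
    """Excel's ATAN2 takes x first, then y; Python's math.atan2 takes y first."""
    return math.atan2(y_num, x_num)


def shape(value):
    """Return (rows, columns) for a matrix, or None for a scalar."""
    return (len(value), len(value[0])) if isinstance(value, list) else None


def apply_pairwise(func, a, b):
    """Apply a two-argument scalar function where either argument may be a matrix."""
    shape_a, shape_b = shape(a), shape(b)
    if shape_a is None and shape_b is None:
        return func(a, b)
    if shape_a is None:
        # Scalar first argument paired with every cell of the matrix.
        return [[func(a, cell) for cell in row] for row in b]
    if shape_b is None:
        # Scalar second argument paired with every cell of the matrix.
        return [[func(cell, b) for cell in row] for row in a]
    if shape_a != shape_b:
        # Assumption of this sketch only; mismatched sizes are out of scope here.
        raise ValueError("this sketch only pairs matrices of identical dimensions")
    return [[func(x, y) for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


xs = [[1, 0], [-1, 1]]
ys = [[0, 1], [0, 1]]
print(apply_pairwise(excel_atan2, xs, ys))  # matrix paired with matrix
print(apply_pairwise(excel_atan2, xs, 1))   # matrix paired with scalar
```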
The function test for ROUND() contains a lot more variations on different combinations and sizes of arrays.
Functions that accept arguments using the splat operator, or more than 2 arguments, need to be assessed on a case by case basis. However, for the example of WORKDAYS(), which accepts two "static" arguments, then uses the splat operator to accept additional arguments, the evaluateArrayArgumentsSubset() method allows the call to indicate the static arguments that could be arrays, but to process the array of trailing arguments accepted by the method "normally":
We only check to see if the "static" argument values are arrays; and we call evaluateArrayArgumentsSubset() with the additional limit argument, which tells the code logic that only the first two arguments should be processed for the purposes of array testing.
Why is this required?
Besides fixing some basic array formulae when passed through the call stack in the Calculation engine (as described above):
One of the planned new features for PhpSpreadsheet 2.0 is support for array formulae, including the new array functions like SEQUENCE(), SORT(), FILTER(), etc.; and also the new Spill and Single operators.
While the PhpSpreadsheet operators already support Excel matrix/array handling, most of the function implementations don't yet; so this is preparation work in anticipation of providing full support for array formulae in version 2.0.
This is linked to the work ongoing in PR #2539 on branch CalculationEngine-Array-Formulae-Initial-Work.
These changes to the function implementations can be made in the current codebase ahead of the 2.0 release. The existing Calculation Engine will simply discard all but the very first scalar value from any matrix of values returned by an Excel function: that is already existing behaviour; but it may provide correct results when using arrays as arguments when the current implementation does not (as in the case of our =MAX(ABS({-3, 2.5, -1; 0, -1, -12})) example).
Examples of Excel array functions, and the proposed support/implementation can be found in the documentation.