
RFC: add support for computing the element-wise minimum and maximum #667


Closed
kgryte opened this issue Jul 27, 2023 · 6 comments · Fixed by #713
Labels
API extension Adds new functions or objects to the API.

Comments

@kgryte
Contributor

kgryte commented Jul 27, 2023

This RFC proposes adding support for computing the element-wise minimum and maximum when provided two input arrays.

Overview

Based on API comparison data, maximum and minimum APIs are available in all array libraries, and every library uses the same names: maximum and minimum.

Prior art

maximum

minimum

Proposal

def maximum(x1: array, x2: array, /)
def minimum(x1: array, x2: array, /)

Note: as with the already-specified min and max, while conforming implementations may support complex number inputs, inequality comparisons of complex numbers are unspecified and thus implementation-dependent.
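
For illustration only, a minimal sketch of the proposed semantics written in terms of the already-specified where and less APIs (the approach also discussed in the comments below). The helper names maximum_ref and minimum_ref are placeholders, not part of the proposal; a conforming implementation would dispatch to a native kernel, and this naive version does not propagate NaNs the way, e.g., np.maximum does:

import numpy as np

def maximum_ref(x1, x2, xp=np):
    # Reference-only sketch: element-wise maximum via comparison + selection.
    # Broadcasting and type promotion come from the underlying where/less APIs.
    return xp.where(xp.less(x1, x2), x2, x1)

def minimum_ref(x1, x2, xp=np):
    # Reference-only sketch: element-wise minimum via comparison + selection.
    return xp.where(xp.less(x2, x1), x2, x1)

maximum_ref(np.array([1.0, 5.0, 3.0]), np.array([4.0, 2.0, 3.0]))  # array([4., 5., 3.])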

Questions

  • Is the min/minimum and max/maximum naming distinction clear enough, where the former are reductions and the latter are binary element-wise operations?
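
For concreteness, a small NumPy example of the distinction the question is about (NumPy already uses these four names with exactly this split):

import numpy as np

x = np.array([[1, 7], [5, 3]])
y = np.array([[2, 6], [4, 9]])

# min/max are reductions over a single array (optionally along an axis):
np.max(x)          # 7
np.min(x, axis=0)  # array([1, 3])

# minimum/maximum are binary, element-wise comparisons of two arrays:
np.maximum(x, y)   # array([[2, 7], [5, 9]])
np.minimum(x, y)   # array([[1, 6], [4, 3]])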

Related

@kgryte kgryte added the API extension Adds new functions or objects to the API. label Jul 27, 2023
@rgommers
Member

rgommers commented Jul 27, 2023

@oleksandr-pavlyk commented: These can be implemented with where and less, sure, but direct implementation is much faster for non-JIT-ting implementations.

So I measured it:

>>> x = np.arange(100000)
>>> y = np.random.randint(0, 1000, size=100000)
>>> %timeit np.maximum(x, y)
91.1 µs ± 42.5 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
>>> %timeit np.where(x > y, x, y)
132 µs ± 180 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

>>> x = np.arange(100000).astype(np.float64)
>>> y = np.random.randint(0, 1000, size=100000).astype(np.float64)
>>> %timeit np.maximum(x, y)
93.2 µs ± 303 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
>>> %timeit np.where(x > y, x, y)
141 µs ± 194 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

Conclusions:

  • For NumPy it's a ~40% performance difference
  • For libraries with a compiler there's no difference
  • The syntax is understandable either way - that maximum is an element-wise comparison rather than a reduction like max is more confusing than helpful imho, so I think they're roughly equivalent.

Based on the above, I am not convinced that it meets the bar for inclusion. On the other hand, these functions are used regularly and all libraries have an implementation, so perhaps it is simply convenient to include them anyway. Let's say I'm +0.5 for inclusion based on making the life of array-consuming library authors a bit easier.

@seberg
Contributor

seberg commented Jul 27, 2023

FWIW I get a 2x difference in your example and a 6x difference in NumPy if I make sure there is a 50% chance of a flip each time (which, the way NumPy operates, is a lot more work).

The next thing is cache friendliness: the size above probably fits into L3 cache. If it doesn't, then (ignoring the boolean array itself) you have 5 arrays vs. 3 arrays that need to go through memory, so a difference of ~1.66x is the best-case (and likely) scenario. Maybe a 1.7x best case is not much different from 40%, but...
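
(For reference, a sketch of the flip-heavy case described above, in the same style as the earlier benchmark; with both operands drawn at random, the comparison result flips roughly every other element. Timings are machine-dependent, so none are quoted here.)

>>> rng = np.random.default_rng(0)
>>> x = rng.random(100_000)
>>> y = rng.random(100_000)
>>> %timeit np.maximum(x, y)
>>> %timeit np.where(x > y, x, y)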

@shoyer
Contributor

shoyer commented Jul 31, 2023

I think this is a good idea, if only because xp.maximum(x, y) is also somewhat more readable than xp.where(x > y, x, y).

@leofang
Contributor

leofang commented Jul 31, 2023

Given the precedent we have regarding adding new APIs mainly/only for perf reasons, I am slightly inclined toward objection, but not strongly. I am more concerned about minimum() vs min() (and likewise for maximum() vs max()), as it's been a source of confusion. I certainly have to look up the docs every time I need either of them, and that's not good ergonomics. What does everyone here think?

@oleksandr-pavlyk
Contributor

Driven by the same concern as @leofang, I considered elementwise_min and elementwise_max for the names, but I am afraid we are constrained by what's currently supported in the majority of implementations.

@rgommers
Member

Discussed today: let's add these two functions under their current names, minimum and maximum. Considerations:

  • These functions are not perfect, but good enough:

    • The performance aspect matters for adoption (it avoids having to keep code branches around, e.g., a NumPy-specific fast path in scikit-learn).
    • The names are not ideal because of the min/minimum confusion, but there are no better names available either, and any new name would come with a high implementation cost that libraries won't be happy about.
    • There are no blockers, since all libraries already have implementations.
  • The opinions were moderately positive, a couple of +1's and +0.5's, with some hesitations but no blocking concerns.
