
RFC: add support for computing the element-wise minimum and maximum #667


Closed
kgryte opened this issue Jul 27, 2023 · 6 comments · Fixed by #713
Labels
API extension Adds new functions or objects to the API.

Comments

@kgryte
Contributor

kgryte commented Jul 27, 2023

This RFC proposes adding support for computing the element-wise minimum and maximum when provided two input arrays.

Overview

Based on API comparison data, maximum and minimum APIs are available in all array libraries, and every library uses the same names: maximum and minimum.

Prior art

maximum

minimum

Proposal

def maximum(x1: array, x2: array, /)
def minimum(x1: array, x2: array, /)

Note: as with the already-specified min and max, while conforming implementations may support complex number inputs, inequality comparisons of complex numbers are unspecified and thus implementation-dependent.
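
For illustration only, a minimal sketch of the proposed semantics written in terms of the already-specified where and less APIs (the approach also discussed in the comments below). The helper names maximum_ref and minimum_ref are placeholders, not part of the proposal; a conforming implementation would dispatch to a native kernel, and this naive version does not propagate NaNs the way, e.g., np.maximum does:

import numpy as np

def maximum_ref(x1, x2, xp=np):
    # Reference-only sketch: element-wise maximum via comparison + selection.
    # Broadcasting and type promotion come from the underlying where/less APIs.
    return xp.where(xp.less(x1, x2), x2, x1)

def minimum_ref(x1, x2, xp=np):
    # Reference-only sketch: element-wise minimum via comparison + selection.
    return xp.where(xp.less(x2, x1), x2, x1)

maximum_ref(np.array([1.0, 5.0, 3.0]), np.array([4.0, 2.0, 3.0]))  # array([4., 5., 3.])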

Questions

  • Is the min/minimum and max/maximum naming distinction clear enough, where the former are reductions and the latter are binary element-wise operations?
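
For concreteness, a small NumPy example of the distinction the question is about (NumPy already uses these four names with exactly this split):

import numpy as np

x = np.array([[1, 7], [5, 3]])
y = np.array([[2, 6], [4, 9]])

# min/max are reductions over a single array (optionally along an axis):
np.max(x)          # 7
np.min(x, axis=0)  # array([1, 3])

# minimum/maximum are binary, element-wise comparisons of two arrays:
np.maximum(x, y)   # array([[2, 7], [5, 9]])
np.minimum(x, y)   # array([[1, 6], [4, 3]])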

Related

@kgryte kgryte added the API extension Adds new functions or objects to the API. label Jul 27, 2023
@rgommers
Member

rgommers commented Jul 27, 2023

@oleksandr-pavlyk commented: These can be implemented with where and less, sure, but direct implementation is much faster for non-JIT-ting implementations.

So I measured it:

>>> x = np.arange(100000)
>>> y = np.random.randint(0, 1000, size=100000)
>>> %timeit np.maximum(x, y)
91.1 µs ± 42.5 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
>>> %timeit np.where(x > y, x, y)
132 µs ± 180 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

>>> x = np.arange(100000).astype(np.float64)
>>> y = np.random.randint(0, 1000, size=100000).astype(np.float64)
>>> %timeit np.maximum(x, y)
93.2 µs ± 303 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
>>> %timeit np.where(x > y, x, y)
141 µs ± 194 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

Conclusions:

  • For NumPy it's a ~40% performance difference
  • For libraries with a compiler there's no difference
  • The syntax is understandable either way - that maximum is an element-wise comparison rather than a reduction like max is more confusing than helpful imho, so I think they're roughly equivalent.

Based on the above, I am not convinced that it meets the bar for inclusion. On the other hand, these functions are used regularly and all libraries have an implementation, so perhaps it is simply convenient to include them anyway. Let's say I'm +0.5 for inclusion based on making the life of array-consuming library authors a bit easier.

@seberg
Contributor

seberg commented Jul 27, 2023

FWIW I get a 2x difference in your example and a 6x difference in NumPy if I make sure there is a 50% chance of a flip each time (which, the way NumPy operates, is a lot more work).

The next thing is cache friendliness: the size above probably fits into L3 cache. If it doesn't, then (ignoring the boolean array itself) you have 5 arrays vs. 3 arrays that need to go through memory, so a difference of ~1.66x is the best-case (and likely) scenario. Maybe a 1.7x best case is not much different from 40%, but...
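
(For reference, a sketch of the flip-heavy case described above, in the same style as the earlier benchmark; with both operands drawn at random, the comparison result flips roughly every other element. Timings are machine-dependent, so none are quoted here.)

>>> rng = np.random.default_rng(0)
>>> x = rng.random(100_000)
>>> y = rng.random(100_000)
>>> %timeit np.maximum(x, y)
>>> %timeit np.where(x > y, x, y)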

@shoyer
Contributor

shoyer commented Jul 31, 2023

I think this is a good idea, if only because xp.maximum(x, y) is also somewhat more readable than xp.where(x > y, x, y).

@leofang
Contributor

leofang commented Jul 31, 2023

Given the precedent we have regarding adding new APIs mainly/only for perf reasons, I am slightly inclined toward objection, but not strongly. I am more concerned about minimum() vs min() (and likewise for maximum() vs max()), as it's been a source of confusion. I certainly have to look up the docs every time I need either of them, and that's not good ergonomics. What does everyone here think?

@oleksandr-pavlyk
Contributor

Driven by the same concern as @leofang, I considered elementwise_min and elementwise_max for the names, but I am afraid we are constrained by what's currently supported in the majority of implementations.

@rgommers
Member

Discussed today: let's add these two functions under their current names, minimum and maximum. Considerations:

  • These functions are not perfect, but good enough:

    • The performance aspect matters for adoption (it avoids having to keep code branches around, e.g., a NumPy-specific fast path in scikit-learn).
    • The names are not ideal because of the min/minimum confusion, but there are no better names available either, and any new name would come with a high implementation cost that libraries won't be happy about.
    • There are no blockers, since all libraries already have implementations.
  • The opinions were moderately positive, a couple of +1's and +0.5's, with some hesitations but no blocking concerns.
