i32.clamp() suggested by Clippy produces worse code than i32.min().max() #141915
Comments
The issue here is the assert in the implementation of clamp:

    fn clamp(self, min: Self, max: Self) -> Self
    where
        Self: Sized,
    {
        assert!(min <= max);
        if self < min {
            min
        } else if self > max {
            max
        } else {
            self
        }
    }

Unfortunately, the panic is documented behavior (https://doc.rust-lang.org/std/cmp/trait.Ord.html#method.clamp), so I'm not sure what can actually be done here.
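To make the semantic difference concrete, here is a minimal sketch (the helper names clamp_chained and clamp_std are made up for illustration, not taken from the issue). It assumes the default unwinding panic strategy. With valid bounds the two forms agree; with inverted bounds only clamp panics, which is why the two cannot be treated as interchangeable in general.

```rust
use std::panic;

/// Hypothetical helper: clamp via the chained form.
/// Never panics, even if `lo > hi`.
fn clamp_chained(x: i32, lo: i32, hi: i32) -> i32 {
    x.max(lo).min(hi)
}

/// Hypothetical helper: clamp via `Ord::clamp`.
/// Panics if `lo > hi`, as documented.
fn clamp_std(x: i32, lo: i32, hi: i32) -> i32 {
    x.clamp(lo, hi)
}

fn main() {
    // With valid bounds (lo <= hi) the two forms return the same value.
    for x in [-5, 0, 100, 255, 300] {
        assert_eq!(clamp_chained(x, 0, 255), clamp_std(x, 0, 255));
    }

    // With inverted bounds the chained form silently returns `hi`...
    assert_eq!(clamp_chained(7, 255, 0), 0);

    // ...while `clamp` hits its assertion and panics.
    let result = panic::catch_unwind(|| clamp_std(7, 255, 0));
    assert!(result.is_err());
}
```

When the bounds are literals such as 0 and 255, the assertion is trivially true, so this behavioral difference should not matter for the case reported here.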
Oh, I can confirm that's the case on godbolt. That's weird; I expected the panic to be removed by constant propagation, and it is, but it messes with optimization anyway.
This is a simplified example: |
@JonathanBrouwer Inverting the order in which |
Related/duplicate of #125738. I reported the lint to Clippy back when it was in beta, I believe. LLVM upstream issue: llvm/llvm-project#104875
@Shnatsel if the issue is the order in which |
There is no performance difference between |
On this code in image-webp, following the Clippy lint to replace .max(0).min(255) with .clamp(0, 255) on an i32 value causes a performance regression: https://github.com/image-rs/image-webp/blob/93baf7de7df50977a1fcb3a0bb53036d4780bff3/src/vp8.rs#L994-L999
It's unfortunate that .min().max() and .clamp() are not equivalent, and doubly so when Clippy nags us to rewrite the code in a way that makes it slower.
I've posted a self-contained sample that reproduces the issue on godbolt:
Generated assembly for .min().max(): https://rust.godbolt.org/z/zr7PK8vz3
Generated assembly for .clamp(): https://rust.godbolt.org/z/b898M45vo
You can see that the .clamp() version results in far more assembly; the vectorized loop has roughly twice as many instructions.
I've confirmed that the issue exists in rustc 1.75, 1.82 and 1.87, which is the latest as of this writing.
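For readers who don't want to open the links, the kind of loop being compared looks roughly like the sketch below. This is a hypothetical stand-in, not the godbolt sample or the image-webp code: the function names and the i32-to-u8 saturation are assumptions chosen for illustration. The two functions produce identical results; the reported difference is only in the generated assembly, which this snippet does not measure.

```rust
/// Hypothetical kernel: saturate each i32 into 0..=255 using the chained form.
fn saturate_chained(input: &[i32], output: &mut [u8]) {
    for (dst, &src) in output.iter_mut().zip(input) {
        *dst = src.max(0).min(255) as u8;
    }
}

/// Same kernel, rewritten the way the Clippy lint suggests.
fn saturate_clamp(input: &[i32], output: &mut [u8]) {
    for (dst, &src) in output.iter_mut().zip(input) {
        *dst = src.clamp(0, 255) as u8;
    }
}

fn main() {
    let input = [-300, -1, 0, 128, 255, 256, 1000];
    let mut a = [0u8; 7];
    let mut b = [0u8; 7];

    saturate_chained(&input, &mut a);
    saturate_clamp(&input, &mut b);

    // Both variants compute the same values; only codegen differs.
    assert_eq!(a, b);
    assert_eq!(a, [0, 0, 0, 128, 255, 255, 255]);
}
```

To compare codegen, one could compile both functions with optimizations enabled and inspect the two loops, for example on godbolt as in the links above.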