-
Notifications
You must be signed in to change notification settings - Fork 6k
[Impeller] improve blur performance for Android and iPad Pro. #39291
Conversation
It looks like this pull request may not have tests. Please make sure to add tests before merging. If you need an exemption to this rule, contact Hixie on the #hackers channel in Chat (don't just cc him here, he won't see it! He's on Discord!). If you are not sure if you need tests, consider this rule of thumb: the purpose of a test is to make sure someone doesn't accidentally revert the fix. Ask yourself, is there anything in your PR that you feel it is important we not accidentally revert back to how it was before your fix? Reviewers: Read the Tree Hygiene page and make sure this patch meets those guidelines before LGTMing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM assuming all of the TileModes still behave properly in Play/EntityTest.GaussianBlurFilter/Metal
.
(When checking, note that some of the rarely used "BlurStyles" have been known broken for a while due to the blur optimizations, we need to get around to fixing it)
For future onlookers: Note that times in frame captures are extremely inaccurate and can vary up to 5x in my experience. However, they're loosely precise, and so by normalizing results against the length of items in the capture which are expected to not be changed, loose performance comparisons can be made.
In the "before" screenshot, everything happens at ~half speed, and so at first glance, it appears that the first blur pass is 4x speed in the "after" screenshot, but it's actually closer to 2x. Another interesting observation is that the second blur pass is actually relatively slower in the "after" screenshot!
@@ -281,7 +308,7 @@ std::optional<Rect> DirectionalGaussianBlurFilterContents::GetFilterCoverage( | |||
|
|||
auto transform = inputs[0]->GetTransform(entity) * effect_transform; | |||
auto transformed_blur_vector = | |||
transform.TransformDirection(blur_direction_ * Radius{blur_sigma_}.radius) | |||
transform.TransformDirection(blur_direction_* Radius{blur_sigma_}.radius) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Weird change. Is the formatter forcing this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup :(
input_descriptor.height_address_mode = SamplerAddressMode::kRepeat; | ||
source_descriptor.width_address_mode = SamplerAddressMode::kRepeat; | ||
source_descriptor.height_address_mode = SamplerAddressMode::kRepeat; | ||
break; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yeah, we should totally be doing this! This has come up a couple of times before when the blur was under pressure.
I have visually verified the examples and they seem identical :) |
…119765) * 4b77e7793 [Impeller] improve blur performance for Android and iPad Pro. (flutter/engine#39291) * 97d27ff59 [Impeller] let images with opacity and filters passthrough. (flutter/engine#39237)
…r#39291) * [Impeller] improve gaussian blur performance on Android * ++ * ++
…r#39291) * [Impeller] improve gaussian blur performance on Android * ++ * ++
…r#39291) * [Impeller] improve gaussian blur performance on Android * ++ * ++
#39367) * [Impeller] improve gaussian blur performance on Android * ++ * ++ Co-authored-by: Jonah Williams <[email protected]>
Create a shader variant for blur that uses natively supported texture addressing modes, and a shader variant that emulates decal tile mode, removing the branch from the loop. I think this is legal? :)
Running some local experiments on wonderous on an iPad pro, the gaussian blur passes are currently taking around 8-16ms. With this change applied, they're more consistently 4-8ms - roughly doubling the performance. While I didn't see an improvement on iOS in my past investigation of this change, this was likely because we were always running below frame time + I wasn't using the frame debugger. I'll investigate this separately, but I believe we'll also see a performance improvment on iPhones - and on Android as well.
While this does increase the number of shaders, its only by 1 - and for the most expensive filter we have Its probably worth it. Note that even 2xing performance, we still need to find a way to make some more gains to stay within budget for the iPad, probably at least one more doubling.
#37927 (comment) contains details on the malioc output which confirm perfomance numbers for Android as well.
Before
After