Vectorize box encoding in FCOS #6278
Hi @abhi-glitchhg, thanks for working on this.
On a first pass this looks good. However, it is important to test that the new implementation is equivalent to the old one and to measure its performance. Would you be able to provide these test scripts? If you need support running the benchmark on GPU, for instance, I am happy to help.
@jdsgomes thanks for the review. I have checked this implementation on the CPU device only.
Running the above script, I get roughly a 5x speedup. Could you please check the same on GPU? BTW, this script is a slight modification of @datumbox's script in #6203 (review) :)
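For readers who want to reproduce this kind of comparison, here is a rough timing sketch. It is not the script referred to above: the helper names (`encode`, `make_boxes`, `time_fn`), the batch size, and the anchor count are all assumptions made for illustration, and `encode` is a simplified stand-in for the FCOS box encoding rather than the torchvision code.

```python
import time

import torch
from torch import Tensor


def encode(anchors: Tensor, gt_boxes: Tensor) -> Tensor:
    # Simplified stand-in (assumption): (l, t, r, b) distances from anchor centers.
    ctr_x = 0.5 * (anchors[..., 0] + anchors[..., 2])
    ctr_y = 0.5 * (anchors[..., 1] + anchors[..., 3])
    return torch.stack(
        (
            ctr_x - gt_boxes[..., 0],
            ctr_y - gt_boxes[..., 1],
            gt_boxes[..., 2] - ctr_x,
            gt_boxes[..., 3] - ctr_y,
        ),
        dim=-1,
    )


def make_boxes(n: int, device: torch.device) -> Tensor:
    # Hypothetical helper: random valid boxes in (x1, y1, x2, y2) format.
    xy = torch.rand(n, 2, device=device) * 100
    wh = torch.rand(n, 2, device=device) * 50 + 1
    return torch.cat([xy, xy + wh], dim=1)


def time_fn(fn, repeats: int = 20) -> float:
    # Average wall-clock seconds per call; synchronize so GPU work is counted.
    if torch.cuda.is_available():
        torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    if torch.cuda.is_available():
        torch.cuda.synchronize()
    return (time.perf_counter() - start) / repeats


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
torch.manual_seed(0)
batch, num_anchors = 16, 2000  # arbitrary sizes chosen for the sketch
anchors = [make_boxes(num_anchors, device) for _ in range(batch)]
gts = [make_boxes(num_anchors, device) for _ in range(batch)]


def run_loop() -> Tensor:
    # Per-image loop: one encode call per image, then stack the results.
    return torch.stack([encode(a, g) for a, g in zip(anchors, gts)])


def run_vectorized() -> Tensor:
    # Vectorized: stack the inputs first, then encode the whole batch at once.
    return encode(torch.stack(anchors), torch.stack(gts))


# Check equivalence before comparing speed.
same = torch.allclose(run_loop(), run_vectorized())
loop_time = time_fn(run_loop)
vec_time = time_fn(run_vectorized)
print(f"equivalent: {same}, loop: {loop_time:.6f}s, vectorized: {vec_time:.6f}s")
```

The measured ratio depends on the device, batch size, and number of anchors, so the figure it prints should not be expected to match the numbers reported in this thread.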
Thanks for the confirmation. I ran a benchmark on GPU and got similar results.
LGTM, thanks for working on this
Hey @jdsgomes! You merged this PR, but no labels were added. The list of valid labels is available at https://github.com/pytorch/vision/blob/main/.github/process_commit.py
Summary:
* initial structure
* fixed types of few variables
* remove the commented code
* list -> List
* encode method will take input as tensors instead of list of tensor

Reviewed By: datumbox
Differential Revision: D38154574
fbshipit-source-id: ee4936b9968fc1b0cf751c4884c3d5c8064e7d10
Co-authored-by: Joao Gomes <[email protected]>
Just like #5247, I think it is possible to vectorize the following loop at this line.
Here is my attempt.
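To illustrate the idea, here is a minimal sketch of what vectorizing a per-image box-encoding loop can look like. It is not the code from this PR: `encode_boxes`, `encode_loop`, and `encode_vectorized` are hypothetical names, the encoding is a simplified (l, t, r, b) distance from the anchor center, and it assumes every image contributes the same number of matched boxes so the per-image tensors can be stacked.

```python
from typing import List

import torch
from torch import Tensor


def encode_boxes(anchors: Tensor, gt_boxes: Tensor) -> Tensor:
    # Indexing with `...` instead of `:` lets the same code accept
    # inputs of shape (N, 4) as well as batched inputs of shape (B, N, 4).
    ctr_x = 0.5 * (anchors[..., 0] + anchors[..., 2])
    ctr_y = 0.5 * (anchors[..., 1] + anchors[..., 3])
    left = ctr_x - gt_boxes[..., 0]
    top = ctr_y - gt_boxes[..., 1]
    right = gt_boxes[..., 2] - ctr_x
    bottom = gt_boxes[..., 3] - ctr_y
    return torch.stack((left, top, right, bottom), dim=-1)


def encode_loop(anchors: List[Tensor], gt_boxes: List[Tensor]) -> Tensor:
    # Loop style: encode each image separately, then stack the results.
    return torch.stack([encode_boxes(a, g) for a, g in zip(anchors, gt_boxes)])


def encode_vectorized(anchors: List[Tensor], gt_boxes: List[Tensor]) -> Tensor:
    # Vectorized style: stack the inputs once and encode the whole batch.
    return encode_boxes(torch.stack(anchors), torch.stack(gt_boxes))


torch.manual_seed(0)
# Toy data (assumption): 4 images with 8 boxes each, in (x1, y1, x2, y2) format.
xy = [torch.rand(8, 2) * 100 for _ in range(4)]
anchors = [torch.cat([p, p + 10], dim=1) for p in xy]
gt_boxes = [torch.cat([p - 5, p + 20], dim=1) for p in xy]

looped = encode_loop(anchors, gt_boxes)
vectorized = encode_vectorized(anchors, gt_boxes)
print(torch.allclose(looped, vectorized))
```

Because both paths apply the same elementwise arithmetic, they should agree up to floating-point tolerance; the vectorized path simply replaces many small tensor operations with a few batched ones.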