Skip big models per platform/device #6539

Merged
datumbox merged 4 commits into pytorch:main on Sep 5, 2022

Conversation

datumbox
Contributor

@datumbox datumbox commented Sep 5, 2022

Addressing broken CI caused by memory issues on Windows GPU.

It might be worth narrowing the skips for vit_h_14 and regnet_y_128gf, if possible. Perhaps skipping only Windows will do the trick (see #6189).

Collaborator

@pmeier pmeier left a comment

LGTM, thanks Vasilis!

Comment on lines 352 to 353
"vit_h_14": set(product(_all_platforms, _all_devices)),
"regnet_y_128gf": set(product(_all_platforms, _all_devices)),
Collaborator

I see we already did this before. Is this a temporary thing? If not, how are we going to make sure that we don't introduce bugs in the future?

Contributor Author

Good point. I've pinged @YosuaMichael offline, who had to introduce those skips, to see if we can narrow them instead of blocking every configuration. I'll follow up in this PR with some commits to try to narrow the skips down to Windows GPU (and maybe also CPU), which, from memory, is what was breaking. I'll turn this into a draft to avoid accidental merges until we know the exact skips required.

Contributor

@YosuaMichael YosuaMichael Sep 5, 2022

I have chatted with @datumbox about vit_h_14 and regnet_y_128gf. TL;DR: they failed on Windows for both CPU and GPU. For some context on the failures: they first failed on Windows GPU on 27 April 2022, and about two months later they also started failing on Windows CPU.

I think this PR is a good idea: loosening the skips for the big models means we can still test them on Linux and Mac where possible :)

I have looked at the code and it looks good to me too.

Contributor Author

@datumbox datumbox Sep 5, 2022

Perfect, thanks for the references. I just pushed a commit that skips these tests only on Windows. Let's see how this goes. If CI is green, I'll clean up the PR and merge using the more specific skips, as you both proposed.
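
For illustration only, here is a minimal sketch of how a per-platform/device skip table like the one in the reviewed lines could be narrowed to Windows and then consulted. The names `_all_platforms` and `_all_devices` come from the reviewed snippet, but their values here are assumptions, and `skipped_big_models` and `should_skip` are hypothetical names that are not taken from the PR.

```python
from itertools import product

# Assumed values: the PR's real definitions of these two names are not shown
# in this conversation. The platform strings follow the sys.platform convention.
_all_platforms = ("darwin", "linux", "win32")
_all_devices = ("cpu", "cuda")

# Hypothetical table: model name -> set of (platform, device) pairs to skip.
# Using ("win32",) instead of _all_platforms restricts the skip to Windows.
skipped_big_models = {
    "vit_h_14": set(product(("win32",), _all_devices)),
    "regnet_y_128gf": set(product(("win32",), _all_devices)),
}


def should_skip(model_name, platform, device):
    """Return True if the model is listed as skipped for this platform/device pair."""
    return (platform, device) in skipped_big_models.get(model_name, set())
```

In a test, one could call `pytest.skip(...)` when `should_skip(model_name, sys.platform, device)` returns true, so the big models would still run on Linux and Mac.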

@datumbox datumbox marked this pull request as draft September 5, 2022 14:30
@datumbox datumbox changed the title from "[WIP] Skip big models per platform/device" to "Skip big models per platform/device" Sep 5, 2022
@datumbox datumbox marked this pull request as ready for review September 5, 2022 15:32
@datumbox datumbox merged commit 74feb19 into pytorch:main Sep 5, 2022
@datumbox datumbox deleted the tests/skip_models branch September 5, 2022 16:42
facebook-github-bot pushed a commit that referenced this pull request Sep 9, 2022
Summary:
* Skip big models per platform/device

* Specifying skips on Windows only.

* Simplify and clean up code.

Reviewed By: YosuaMichael

Differential Revision: D39381956

fbshipit-source-id: a8bd8e73dd7377580beafff0619af34b5904ae83

4 participants