Dreambooth class sampling to use xformers if enabled #3312


Closed · wants to merge 2 commits

Conversation

@mu94-csl (Contributor) commented May 2, 2023

Currently, training can use xFormers, but the inference pass used for class sampling before training cannot.
The prior-preservation class sampling of ~200 images is a major bottleneck right now (~6x the actual training time).
This PR allows xFormers to be used for both training and inference, for both full and LoRA DreamBooth.
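
For context, the change is conceptually small. Here is a minimal sketch (not the exact diff) of applying the existing xFormers flag to the pipeline that generates the class images; the model id is a placeholder and the flag name stands in for the argument the training path already uses:

```python
# Minimal sketch: honor the same xFormers flag on the class-sampling pipeline
# that the training loop already respects.
import torch
from diffusers import StableDiffusionPipeline

pipeline = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # placeholder model id
    torch_dtype=torch.float16,
)
pipeline.to("cuda")

enable_xformers = True  # stands in for args.enable_xformers_memory_efficient_attention
if enable_xformers:
    # Same call the training loop makes on the UNet, here applied to the whole pipeline.
    pipeline.enable_xformers_memory_efficient_attention()

# ... then generate the ~200 prior-preservation class images with `pipeline(prompt)` as usual ...
```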

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

@patrickvonplaten (Contributor)

Hmm, given that much higher speed-ups can be gained with PT 2.0 now, should we maybe just advertise PT 2.0? cc @sayakpaul

@sayakpaul (Member)

> Hmm, given that much higher speed-ups can be gained with PT 2.0 now, should we maybe just advertise PT 2.0?

I think the community still uses xFormers a lot. Maybe it's better to make it clear in the docs that if someone is using PyTorch 2.0, the efficient attention processor is used by default and they shouldn't have to enable xFormers for that.

`AttnProcessor2_0() if hasattr(F, "scaled_dot_product_attention") and scale_qk else AttnProcessor()`

WDYT?
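
For reference, a minimal sketch of the check from the user's side (assuming only standard PyTorch): if `scaled_dot_product_attention` exists, diffusers already picks `AttnProcessor2_0` by default, so enabling xFormers is unnecessary.

```python
# Sketch: detect whether the PyTorch 2.0 SDPA path applies.
import torch.nn.functional as F

if hasattr(F, "scaled_dot_product_attention"):
    print("PyTorch 2.0 SDPA available: AttnProcessor2_0 is used by default.")
else:
    print("No SDPA: consider pipeline.enable_xformers_memory_efficient_attention().")
```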

@mu94-csl (Contributor, Author) commented May 4, 2023

> Hmm, given that much higher speed-ups can be gained with PT 2.0 now, should we maybe just advertise PT 2.0?

Actually yes, I found no difference between PT 2.0 and xFormers on Turing and Ampere GPUs.

My PR was 'misled' by my experiments on a Pascal-generation GPU (still usable :D), for which PT 2.0 perhaps does not launch the correct kernels, whereas xFormers helps tremendously.

Also, the current DreamBooth script fails on PT 2.0 with some CUDA errors but works with xFormers, suggesting that the PT backend is still quirky; see #3325.

@patrickvonplaten (Contributor)

> Hmm, given that much higher speed-ups can be gained with PT 2.0 now, should we maybe just advertise PT 2.0?
>
> I think the community still uses xFormers a lot. Maybe it's better to make it clear in the docs that if someone is using PyTorch 2.0, the efficient attention processor is used by default and they shouldn't have to enable xFormers for that.
>
> `AttnProcessor2_0() if hasattr(F, "scaled_dot_product_attention") and scale_qk else AttnProcessor()`
>
> WDYT?

Yes, good idea. Maybe we could throw a warning at init if we detect that both PT 2.0 and xFormers are installed?
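
A hypothetical sketch of such a warning (not existing diffusers behavior); `is_xformers_available` is the existing helper in `diffusers.utils`:

```python
# Sketch: warn when both the PT 2.0 SDPA path and xFormers are available,
# since explicitly enabling xFormers is then usually unnecessary.
import warnings
import torch.nn.functional as F
from diffusers.utils import is_xformers_available

if hasattr(F, "scaled_dot_product_attention") and is_xformers_available():
    warnings.warn(
        "PyTorch 2.0's scaled_dot_product_attention is available; "
        "explicitly enabling xFormers memory-efficient attention is usually unnecessary."
    )
```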

@github-actions (bot) commented Jun 1, 2023

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions bot added the `stale` label (Issues that haven't received updates) on Jun 1, 2023
github-actions bot closed this on Jun 9, 2023