Higher VRAM usage with PyTorch2 without xformers under certain situations #3441

Closed

wfng92 opened this issue May 16, 2023 · 3 comments

Labels: bug (Something isn't working)

Comments

@wfng92 (Contributor) commented May 16, 2023

Describe the bug

Edited to reflect the actual issue

In the latest development version, PR #3365 introduces confusion by leading users to believe that the PT2 variants of the attention processors are fully supported and that xformers is no longer needed. This results in higher VRAM usage in certain situations (e.g. using LoRA or custom diffusion without xformers).

If the environment has both PyTorch 2.0 and xformers installed, diffusers will

  • raise a warning
  • default to PyTorch's native efficient flash attention

diffusers currently supports the following PT 2.0 variants of attention processors:

  • AttnProcessor => AttnProcessor2_0
  • AttnAddedKVProcessor => AttnAddedKVProcessor2_0

The following processors do not yet have PT 2.0 variants:

  • SlicedAttnProcessor
  • SlicedAttnAddedKVProcessor
  • LoRAAttnProcessor
  • CustomDiffusionAttnProcessor

It would be great if users could still use xformers when calling pipe.enable_xformers_memory_efficient_attention().
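For reference, one way to see which processors actually end up on the UNet after requesting xformers is to inspect pipe.unet.attn_processors. This is a minimal sketch, assuming a standard Stable Diffusion 1.5 checkpoint; the exact class names reported depend on the installed diffusers version:

import torch
from collections import Counter
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
).to("cuda")

# Request xformers, then check which processor classes were actually assigned.
pipe.enable_xformers_memory_efficient_attention()

# With PyTorch 2.0 installed, this may still report AttnProcessor2_0 rather than
# XFormersAttnProcessor, which is the silent fallback described above.
print(Counter(type(proc).__name__ for proc in pipe.unet.attn_processors.values()))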

Reproduction

import torch
from diffusers import (
    DPMSolverMultistepScheduler,
    StableDiffusionPipeline,
)

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
).to("cuda")

pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)

# Load LoRA attention processors, then request xformers memory-efficient attention
pipe.unet.load_attn_procs("pytorch_lora_weights.bin")
pipe.enable_xformers_memory_efficient_attention()

prompt = "a photo of a dog"
image = pipe(prompt=prompt, cross_attention_kwargs={"scale": 1.0}).images
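As a side note, the "higher VRAM usage" can be quantified by comparing the peak CUDA memory of this run against a run without the LoRA weights (or with xformers actually active). A minimal sketch that continues from the reproduction code above; the absolute numbers will vary with GPU, resolution, and batch size:

# Reset the CUDA peak-memory counter, run one generation, and report the peak.
# Repeat for each configuration being compared (with/without LoRA, with/without xformers).
torch.cuda.empty_cache()
torch.cuda.reset_peak_memory_stats()

image = pipe(prompt=prompt, cross_attention_kwargs={"scale": 1.0}).images

print(f"peak VRAM: {torch.cuda.max_memory_allocated() / 1024**3:.2f} GiB")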

Logs

"You have specified using flash attention using xFormers but you have PyTorch 2.0 already installed. "
"We will default to PyTorch's native efficient flash attention implementation provided by PyTorch 2.0.

System Info

  • diffusers version: 0.17.0.dev0
  • Platform: Windows-10-10.0.19045-SP0
  • Python version: 3.10.11
  • PyTorch version (GPU?): 2.0.0+cu118 (True)
  • Huggingface_hub version: 0.13.4
  • Transformers version: 4.28.1
  • Accelerate version: 0.18.0
  • xFormers version: 0.0.19
  • Using GPU in script?: Yes
  • Using distributed or parallel set-up in script?: No
wfng92 added the bug (Something isn't working) label on May 16, 2023
@patrickvonplaten (Contributor) commented
Thanks for the issue @wfng92, I agree we should maybe not force-disable xformers - @sayakpaul can we maybe revert/change your PR here?

sayakpaul self-assigned this on May 17, 2023
@sayakpaul (Member) commented
@patrickvonplaten on it. I will drop a follow-up PR to clear this regression.

@sayakpaul (Member) commented
@wfng92 opened #3457. Let's continue the discussion there :)

wfng92 closed this as completed on May 17, 2023
wfng92 changed the title from "Higher VRAM usage with PyTorch2 under certain situations" to "Higher VRAM usage with PyTorch2 without xformers under certain situations" on May 17, 2023