Describe the bug

Edited to reflect the actual issue.

In the latest development version, PR #3365 introduced confusion that leads users to believe the PyTorch 2.0 variants of the attention processors are fully supported and that xformers is no longer needed. This results in higher VRAM usage in certain situations (e.g. using LoRA or Custom Diffusion without xformers).

If the environment has both PyTorch 2.0 and xformers installed, calling pipe.enable_xformers_memory_efficient_attention() will raise a warning and default to PyTorch's native efficient flash attention.
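A quick way to confirm the silent fallback (a minimal sketch; it assumes pipe is a StableDiffusionPipeline set up as in the Reproduction section below) is to inspect the attention processor classes before and after the call:

# Sketch: check which attention processor classes are actually in use.
# Assumes `pipe` is set up as in the Reproduction section below.
before = {type(p).__name__ for p in pipe.unet.attn_processors.values()}
pipe.enable_xformers_memory_efficient_attention()
after = {type(p).__name__ for p in pipe.unet.attn_processors.values()}

# With PyTorch 2.0 installed, the warning shown under "Logs" is printed and
# the classes remain the PT 2.0 variants (e.g. AttnProcessor2_0) instead of
# switching to XFormersAttnProcessor.
print(before, "->", after)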
diffusers currently supports the following PyTorch 2.0 variants of attention processors:
AttnProcessor => AttnProcessor2_0
AttnAddedKVProcessor => AttnAddedKVProcessor2_0
The following are not supported:
SlicedAttnProcessor
SlicedAttnAddedKVProcessor
LoRAAttnProcessor
CustomDiffusionAttnProcessor
It would be great if users could still use xformers when calling the pipe.enable_xformers_memory_efficient_attention() function.
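Until that is addressed, a possible stopgap for the plain (non-LoRA) case is to bypass the helper and set an xformers processor directly. This is only a sketch; it does not cover LoRA or Custom Diffusion, which would need per-layer LoRAXFormersAttnProcessor instances — exactly the gap described above:

# Sketch of a manual workaround, non-LoRA case only.
from diffusers.models.attention_processor import XFormersAttnProcessor

# Replaces every attention processor on the UNet with the xformers one.
# Note: this would also drop any LoRA / Custom Diffusion processors that
# were loaded, so it is not a substitute for the requested fix.
pipe.unet.set_attn_processor(XFormersAttnProcessor())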
Reproduction
import torch

from diffusers import (
    DPMSolverMultistepScheduler,
    StableDiffusionPipeline,
)

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
).to("cuda")
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)

# Load LoRA weights, then explicitly request xformers attention
pipe.unet.load_attn_procs("pytorch_lora_weights.bin")
pipe.enable_xformers_memory_efficient_attention()

prompt = "a photo of a dog"
image = pipe(prompt=prompt, cross_attention_kwargs={"scale": 1.0}).images
Logs
"You have specified using flash attention using xFormers but you have PyTorch 2.0 already installed. "
"We will default to PyTorch's native efficient flash attention implementation provided by PyTorch 2.0.
System Info
diffusers version: 0.17.0.dev0
Platform: Windows-10-10.0.19045-SP0
Python version: 3.10.11
PyTorch version (GPU?): 2.0.0+cu118 (True)
Huggingface_hub version: 0.13.4
Transformers version: 4.28.1
Accelerate version: 0.18.0
xFormers version: 0.0.19
Using GPU in script?: Yes
Using distributed or parallel set-up in script?: No