
[regression] StableDiffusionPipeline.from_single_file() does not handle SD 1.5 models with prediction_type v_prediction #9171


Closed
lstein opened this issue Aug 14, 2024 · 13 comments
Labels
bug Something isn't working stale Issues that haven't received updates

Comments

@lstein
Contributor

lstein commented Aug 14, 2024

Describe the bug

There are a few Stable Diffusion 1.5 models that use a prediction type of v_prediction rather than epsilon. In version 0.27.0, StableDiffusionPipeline.from_single_file() correctly detected and rendered images from such models. However, in version 0.30.0, these models are always treated as epsilon, even when the correct prediction_type and original_config arguments are set.

Reproduction

You will need to download the original config file, EasyFluffV11.yaml, into the current directory for this to work. After running, the file sushi.png will show incorrect rendering.

from diffusers import StableDiffusionPipeline
import torch

model_id = 'https://huggingface.co/zatochu/EasyFluff/blob/main/EasyFluffV11.safetensors'
yaml_path = './EasyFluffV11.yaml'

pipe = StableDiffusionPipeline.from_single_file(model_id,
                                                original_config=yaml_path,
                                                prediction_type='v_prediction',
                                                torch_dtype=torch.float16,
                                                ).to("cuda")
prompt = "banana sushi"
image = pipe(prompt, num_inference_steps=25).images[0]
image.save("sushi.png")

Logs

Fetching 11 files: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 11/11 [00:00<00:00, 7330.37it/s]
Loading pipeline components...:   0%|                                                                                                                    | 0/6 [00:00<?, ?it/s]Some weights of the model checkpoint were not used when initializing CLIPTextModel: 
 ['text_model.embeddings.position_ids']
Loading pipeline components...: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:00<00:00, 26.26it/s]
You have disabled the safety checker for <class 'diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.StableDiffusionPipeline'> by passing `safety_checker=None`. Ensure that you abide to the conditions of the Stable Diffusion license and do not expose unfiltered results in services or applications open to the public. Both the diffusers team and Hugging Face strongly recommend to keep the safety filter enabled in all public facing circumstances, disabling it only for use-cases that involve analyzing network behavior or auditing its results. For more information, please have a look at https://github.com/huggingface/diffusers/pull/254 .
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 25/25 [00:01<00:00, 16.72it/s]


### System Info

  • 🤗 Diffusers version: 0.30.0
  • Platform: Linux-5.15.0-113-generic-x86_64-with-glibc2.35
  • Running on Google Colab?: No
  • Python version: 3.10.12
  • PyTorch version (GPU?): 2.2.2+cu121 (True)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Huggingface_hub version: 0.23.5
  • Transformers version: 4.41.1
  • Accelerate version: 0.31.0
  • PEFT version: 0.11.1
  • Bitsandbytes version: not installed
  • Safetensors version: 0.4.3
  • xFormers version: 0.0.25.post1
  • Accelerator: NVIDIA GeForce RTX 4070, 12282 MiB
  • Using GPU in script?:
  • Using distributed or parallel set-up in script?:

### Who can help?

@yiyixuxu @asomoza
@lstein lstein added the bug Something isn't working label Aug 14, 2024
@DN6
Collaborator

DN6 commented Aug 14, 2024

@lstein Can you share the image outputs from v0.29.2 and v0.30.0?

@DN6
Collaborator

DN6 commented Aug 14, 2024

By any chance do you have runwayml/stable-diffusion-v1-5 saved in your HF Cache directory?

@lstein
Contributor Author

lstein commented Aug 14, 2024

@lstein Can you share the image outputs from v0.29.2 and v0.30.0?

My bad. The regression is present in 0.29.2 as well. The previous working version was 0.27.0. I have amended the bug report.

Here is the output from the script run with diffusers 0.27.0 vs 0.30.0. Also note the difference in image size. 0.27.0 apparently thinks this is an sd-2 model.

0.27.0: [image: sushi-0 27]
0.30.0: [image: sushi-0 30]

@lstein
Contributor Author

lstein commented Aug 14, 2024

By any chance do you have runwayml/stable-diffusion-v1-5 saved in your HF Cache directory?

Indeed yes. I've seen that from_single_file() downloads it into the cache if it isn't there already. This seems to be the way it gets the component .json config files for the base model of the checkpoint file being loaded.

@DN6
Collaborator

DN6 commented Aug 16, 2024

Hi @lstein, yes, we updated single file loading to rely on the model cache/configs to set up the pipelines. It enables us to support single file loading for a larger range of models. The prediction_type argument is deprecated and will be removed eventually, although we should show a warning here. I will open a PR for it.

I noticed that the scheduler in the repo you linked does contain a config that sets v_prediction. You can configure your pipeline in the following way to enable correct inference.

from diffusers import StableDiffusionPipeline
import torch

model_id = 'https://huggingface.co/zatochu/EasyFluff/blob/main/EasyFluffV11.safetensors'

pipe = StableDiffusionPipeline.from_single_file(
    model_id,
    config="zatochu/EasyFluff",
    torch_dtype=torch.float16,
).to("cuda")

prompt = "banana sushi"
image = pipe(prompt, num_inference_steps=25).images[0]
image.save("sushi.png")

@lstein
Contributor Author

lstein commented Aug 17, 2024

I noticed that the scheduler in the repo you linked does contain a config that sets v_prediction. You can configure your pipeline in the following way to enable correct inference.

I'm a developer of InvokeAI, and am trying to support users who import arbitrary .safetensors models, so it will be difficult to find a general mechanism to identify the diffusers model with a config that matches what the safetensors file needs. Can you suggest how to do this?

@DN6
Collaborator

DN6 commented Aug 19, 2024

In most cases we can auto-match to the appropriate config, provided that the .safetensors file is in the original format and not the diffusers format. If you compare the keys of a single-file checkpoint with those of a diffusers checkpoint, you will notice that they differ.
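The key comparison DN6 describes can be sketched as follows. This is a hedged illustration, not the diffusers API: the helper name `looks_like_original_format` is hypothetical, and the key prefixes are the commonly seen ones for SD 1.x checkpoints.

```python
# Hedged sketch (not the diffusers API): telling an original LDM-format
# checkpoint apart from a diffusers-format one by its tensor keys.
# Original single-file checkpoints prefix UNet weights with
# "model.diffusion_model.", while diffusers UNet checkpoints use paths
# like "down_blocks.0.resnets.0.conv1.weight".

def looks_like_original_format(keys):
    """True if the key set resembles an original-format checkpoint."""
    return any(k.startswith("model.diffusion_model.") for k in keys)

# With a real .safetensors file the keys can be read lazily:
#   from safetensors import safe_open
#   with safe_open("EasyFluffV11.safetensors", framework="pt") as f:
#       keys = list(f.keys())

print(looks_like_original_format(
    ["model.diffusion_model.input_blocks.0.0.weight"]))  # True
print(looks_like_original_format(
    ["down_blocks.0.resnets.0.conv1.weight"]))  # False
```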

In this particular case you're setting the prediction_type argument anyway since the YAML configs do not contain that information either.

You could configure a scheduler beforehand with the prediction type and set it on the pipeline, e.g.:

from diffusers import StableDiffusionPipeline, DDIMScheduler

ckpt_path = "https://huggingface.co/zatochu/EasyFluff/blob/main/EasyFluffV11.safetensors"
pipe = StableDiffusionPipeline.from_single_file(ckpt_path)
pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config, prediction_type="v_prediction")
print(pipe.scheduler.config.prediction_type)

from_single_file operates on the assumption that you are loading a checkpoint saved in the original format. We could update/add a util function in diffusers.loaders.single_file_utils that raises an error if we can't match to an appropriate config. The current behaviour is to default to SD 1.5, which can be confusing.
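The match-or-raise util described above could look something like this. Everything here is hypothetical, not actual diffusers code: the function name, the prefix table, and the config names are illustrative stand-ins for the real detection logic.

```python
# Hypothetical sketch of the util described above (not actual diffusers
# code): instead of silently defaulting to SD 1.5, raise when no known
# architecture matches the checkpoint keys. The prefixes and names here
# are illustrative; real detection inspects more specific keys.

KNOWN_PREFIXES = {
    # Checked first: SDXL original checkpoints also contain
    # "model.diffusion_model." keys, so the more specific prefix wins.
    "conditioner.embedders.": "stable-diffusion-xl (original format)",
    "model.diffusion_model.": "stable-diffusion-v1 (original format)",
}

def infer_config_or_raise(keys):
    """Return a config name for the checkpoint, or raise if unmatched."""
    for prefix, name in KNOWN_PREFIXES.items():
        if any(k.startswith(prefix) for k in keys):
            return name
    raise ValueError(
        "Could not match checkpoint to a known config; "
        "pass `config=` explicitly."
    )
```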

Do you happen to have a list of the arbitrary .safetensors models that would need support? Just so I understand your requirements a bit better.

@yiyixuxu
Collaborator

The YAML file does specify v_prediction, though: https://huggingface.co/zatochu/EasyFluff/blob/main/EasyFluffV11.yaml#L5

Should we consider adding a special check for this config when a YAML is passed? I think this is really an edge case where a fine-tuned checkpoint has a different configuration from the base checkpoint.
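The YAML check suggested here could be sketched as below. This is a hedged illustration, not diffusers code: the function name `prediction_type_from_yaml` is hypothetical, and it assumes the config follows the LDM layout of the linked file, where the model params contain a line like `parameterization: "v"`.

```python
import re

# Hedged sketch: reading the prediction type out of an original LDM YAML
# config, assuming it contains a line like `parameterization: "v"` as in
# the EasyFluffV11.yaml linked above. Function name is hypothetical,
# not part of diffusers.

def prediction_type_from_yaml(yaml_text):
    match = re.search(r'parameterization:\s*"?(\w+)"?', yaml_text)
    if match and match.group(1) == "v":
        return "v_prediction"
    return "epsilon"  # default when no parameterization is given

sample = """
model:
  base_learning_rate: 1.0e-04
  params:
    parameterization: "v"
"""
print(prediction_type_from_yaml(sample))  # v_prediction
```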

@DN6
Collaborator

DN6 commented Aug 20, 2024

Ah, my bad. I missed that. But even in earlier versions, we relied on the prediction_type argument to configure the scheduler; it wasn't set from the YAML.

scheduler_config["prediction_type"] = prediction_type

In the current version, setting via prediction_type only works if:

  1. local_files_only=True
  2. A cached config for the model isn't present locally.

The reasoning was to encourage setting the prediction type via the Scheduler object and passing that object to the pipeline, like we do for from_pretrained. I think I missed this potential path during the refactor, so it is a breaking change. We can add additional checks for legacy kwargs and update the loading, but these kwargs are slated to be removed and this is a bit of an edge case. I would recommend following the same configuration process as from_pretrained when doing single file loading: configure the scheduler object beforehand or use the config argument.

@yiyixuxu
Collaborator

@lstein
can you let us know if the solution @DN6 proposed here works for you? #9171 (comment)

@DN6
Collaborator

DN6 commented Aug 20, 2024

PR to address the current issue: #9229

Contributor

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions github-actions bot added the stale Issues that haven't received updates label Sep 14, 2024
@DN6
Collaborator

DN6 commented Jan 16, 2025

Closing since #9229 was merged to fix the issue.

@DN6 DN6 closed this as completed Jan 16, 2025