
failed to use the feature of supporting for A1111 LoRA #3725

Closed
icech opened this issue Jun 9, 2023 · 21 comments · Fixed by #4147

Comments

@icech

icech commented Jun 9, 2023

I am glad to see that diffusers has added support for A1111 LoRA. However, I failed to use this feature after updating diffusers.
It no longer raises an error as it did before, but the LoRA has no effect on the generated images.
I use it as follows:

import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(repo_id, torch_dtype=torch.float16,
                                               revision="fp16", safety_checker=None).to("cuda")
pipe.load_lora_weights("./Lora", weight_name="xxx.safetensors")
generator = torch.Generator(device='cuda')
prompt = "a photograph of a man running with dog"
image = pipe(prompt, width=768, height=1280, 
                       generator=generator, num_inference_steps=20, 
                       cross_attention_kwargs={"scale": 1}
                       ).images[0]

and this is how I used to add LoRA in the past, which is from #3064 by @pdoane:

from collections import defaultdict

import torch
from safetensors.torch import load_file


def load_lora_weights(pipeline, checkpoint_path, multiplier, device, dtype):
    LORA_PREFIX_UNET = "lora_unet"
    LORA_PREFIX_TEXT_ENCODER = "lora_te"
    # load LoRA weight from .safetensors
    state_dict = load_file(checkpoint_path, device=device)

    updates = defaultdict(dict)
    for key, value in state_dict.items():
        # it is suggested to print out the key, it usually will be something like below
        # "lora_te_text_model_encoder_layers_0_self_attn_k_proj.lora_down.weight"

        layer, elem = key.split('.', 1)
        updates[layer][elem] = value

    # directly update weight in diffusers model
    for layer, elems in updates.items():

        if "text" in layer:
            layer_infos = layer.split(LORA_PREFIX_TEXT_ENCODER + "_")[-1].split("_")
            curr_layer = pipeline.text_encoder
        else:
            layer_infos = layer.split(LORA_PREFIX_UNET + "_")[-1].split("_")
            curr_layer = pipeline.unet

        # find the target layer
        temp_name = layer_infos.pop(0)
        while len(layer_infos) > -1:
            try:
                curr_layer = curr_layer.__getattr__(temp_name)
                if len(layer_infos) > 0:
                    temp_name = layer_infos.pop(0)
                elif len(layer_infos) == 0:
                    break
            except Exception:
                if len(temp_name) > 0:
                    temp_name += "_" + layer_infos.pop(0)
                else:
                    temp_name = layer_infos.pop(0)

        # get elements for this layer
        weight_up = elems['lora_up.weight'].to(dtype)
        weight_down = elems['lora_down.weight'].to(dtype)
        alpha = elems['alpha']
        if alpha:
            alpha = alpha.item() / weight_up.shape[1]
        else:
            alpha = 1.0

        # update weight
        if len(weight_up.shape) == 4:
            curr_layer.weight.data += multiplier * alpha * torch.mm(weight_up.squeeze(3).squeeze(2), weight_down.squeeze(3).squeeze(2)).unsqueeze(2).unsqueeze(3)
        else:
            curr_layer.weight.data += multiplier * alpha * torch.mm(weight_up, weight_down)

    return pipeline
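
For context on how this snippet is meant to be called: each target layer's weight is updated in place as W ← W + multiplier · (alpha / rank) · (up @ down). Below is a minimal usage sketch; the base model ID and LoRA path are placeholders, and the dtype is assumed to match the pipeline's weights:

import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16, safety_checker=None
).to("cuda")

# Merge the LoRA weights directly into the UNet and text encoder (A1111-style merge).
pipe = load_lora_weights(pipe, "./Lora/xxx.safetensors", multiplier=1.0,
                         device="cuda", dtype=torch.float16)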

Is my usage incorrect, or is there a difference between the current code and what @pdoane provided?

@patrickvonplaten
Contributor

cc @sayakpaul

@sayakpaul
Member

Hi @icech.

Could you share your LoRA file so that we can debug it on our end?

Cc: @takuma104

@icech
Author

icech commented Jun 13, 2023

Hi @icech.

Could you share your LoRA file so that we can debug it on our end?

Cc: @takuma104

Here is the link to my LoRA file:
https://civitai.com/models/7501/vivid-watercolors-lora-extraction

@sayakpaul
Member

sayakpaul commented Jun 14, 2023

@icech I am able to use the LoRA without any problems. See my Colab:
https://colab.research.google.com/gist/sayakpaul/1fff0ff9c5a059364f80ac0b64920592/scratchpad.ipynb

Of course, I don't know about the base pipeline associated with that. So, you will have to work that one out.

@icech
Author

icech commented Jun 21, 2023

@icech I am able to use the LoRA without any problems. See my Colab: https://colab.research.google.com/gist/sayakpaul/1fff0ff9c5a059364f80ac0b64920592/scratchpad.ipynb

Of course, I don't know about the base pipeline associated with that. So, you will have to work that one out.

I'm sorry for the delayed response. I have reviewed your Colab notebook and made some modifications: it now compares generations with and without the LoRA, producing four images for each run. Only one of the four images differs between the two runs.
The code is below:

import os

import torch

from diffusers import StableDiffusionPipeline, DPMSolverMultistepScheduler

# make sure the output directory exists
os.makedirs("images", exist_ok=True)

pipeline = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16, safety_checker=None
).to("cuda")

pipeline.scheduler = DPMSolverMultistepScheduler.from_config(
    pipeline.scheduler.config, use_karras_sigmas=True
)


images = pipeline(prompt="masterpiece, best quality, mountain landscape",
    negative_prompt="bad quality",
    width=512,
    height=512,
    num_inference_steps=15,
    num_images_per_prompt=4,
    generator=torch.manual_seed(0)
).images


for i, image in enumerate(images):
    image.save(f'images/nonelora_{i}.png')
    
    
pipeline.load_lora_weights(".", weight_name="vividWatercolors_10.safetensors")
images = pipeline(prompt="masterpiece, best quality, mountain landscape",
    negative_prompt="bad quality",
    width=512,
    height=512,
    num_images_per_prompt=4,
    num_inference_steps=15,
    generator=torch.manual_seed(0)
).images

for i, image in enumerate(images):
    image.save(f'images/lora_{i}.png')

@sayakpaul
Member

I see what you're saying. Reproduced here: https://colab.research.google.com/gist/sayakpaul/b645715d9144a3a6dc40c93bdceee929/scratchpad.ipynb.

Some questions:

  • Could you also share some images (generated with the LoRA) so that we can have some one-to-one comparisons?
  • Are the results being affected because we're likely not using the right base model? For example, here, we're using the right base model to load the parameters into. We need to ensure the base model being used here is indeed the correct one.

@takuma104
Contributor

That's quite an interesting result. It seems that the effect varies in magnitude, but it's not just the first one that differs; there appears to be some change in all of them. I've posted an image created by merging the two results using the difference mode in Photoshop.

Without LoRA: [image]

With LoRA: [image]

Diff: [image]

@icech
Author

icech commented Jun 22, 2023

I see what you're saying. Reproduced here: https://colab.research.google.com/gist/sayakpaul/b645715d9144a3a6dc40c93bdceee929/scratchpad.ipynb.

Some questions:

  • Could you also share some images (generated with the LoRA) so that we can have some one-to-one comparisons?
  • Are the results being affected because we're likely not using the right base model? For example, here, we're using the right base model to load the parameters into. We need to ensure the base model being used here is indeed the correct one.

I've seen the graphs in your Colab; they are the same as what I got before. I'm sorry that I can't provide any pictures these days, since I'm on vacation and don't have access to my machine. However, you could test the load_lora_weights(pipeline, checkpoint_path, multiplier, device, dtype) function that I originally mentioned as a replacement for pipeline.load_lora_weights when loading the LoRA. The usage is pipe = load_lora_weights(pipe, lora_path, 1.0, 'cuda', torch.float32), so you can make a comparison. Based on my previous experience, the two are not consistent, and @pdoane's version gives the expected result, consistent with A1111.

@icech
Author

icech commented Jun 22, 2023

That's quite an interesting result. It seems that the effect varies in magnitude, but it's not just the first one that differs; there appears to be some change in all of them. I've posted an image created by merging the two results using the difference mode in Photoshop.

Without LoRA: [image]

With LoRA: [image]

Diff: [image]

Your analysis is very rigorous. Indeed, there are differences in the subsequent images, but these differences are not what I expected.

@sayakpaul
Member

I will dive deeper to find out what we're missing :) But expect some delay as I am on the move and away for some time.

@alexblattner

@icech use this:
loraLoader.txt

thank me later.

@icech
Author

icech commented Jun 27, 2023

@icech use this: loraLoader.txt

thank me later.

I've tried this and I know it's feasible, but I mainly want to use the official API for easier maintenance in the future. Thank you anyway.

@patrickvonplaten
Contributor

patrickvonplaten commented Jun 28, 2023

Let's try to fix this this week so it's in the next release. cc @sayakpaul, could this maybe be fixed with #3778?

@sayakpaul
Member

…the load_lora_weights(pipeline, checkpoint_path, multiplier, device, dtype) function that I originally mentioned as a replacement for pipeline.load_lora_weights when loading the LoRA. The usage is pipe = load_lora_weights(pipe, lora_path, 1.0, 'cuda', torch.float32), so you can make a comparison. Based on my previous experience, the two are not consistent, and @pdoane's version gives the expected result, consistent with A1111.

@icech I am trying to understand this better and would appreciate your input here. If I do pipeline.load_lora_weights(".", weight_name="vividWatercolors_10.safetensors") (as done in my Colab), it actually uses "cuda" along with a LoRA scale of 1. Is that not what you used in your experiments too?

Regardless, I will dive deeper into the loaded parameters and see what we're missing out on :)
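
As a side note for anyone following along: with the built-in pipeline.load_lora_weights, the LoRA strength is controlled at inference time through cross_attention_kwargs, as in the first post. A minimal sketch, assuming the pipeline from the earlier snippet has already loaded the LoRA:

# A scale of 1.0 applies the LoRA at full strength; lower values reduce its influence.
image = pipeline(
    prompt="masterpiece, best quality, mountain landscape",
    num_inference_steps=15,
    cross_attention_kwargs={"scale": 0.5},
    generator=torch.manual_seed(0),
).images[0]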

@icech
Author

icech commented Jun 29, 2023

…the load_lora_weights(pipeline, checkpoint_path, multiplier, device, dtype) function that I originally mentioned as a replacement for pipeline.load_lora_weights when loading the LoRA. The usage is pipe = load_lora_weights(pipe, lora_path, 1.0, 'cuda', torch.float32), so you can make a comparison. Based on my previous experience, the two are not consistent, and @pdoane's version gives the expected result, consistent with A1111.

@icech I am trying to understand this better and would appreciate your input here. If I do pipeline.load_lora_weights(".", weight_name="vividWatercolors_10.safetensors") (as done in my Colab), it actually uses "cuda" along with a LoRA scale of 1. Is that not what you used in your experiments too?

Regardless, I will dive deeper into the loaded parameters and see what we're missing out on :)

This is the same as in my experiments. I will provide some images of my results tomorrow (about 10 hours later) for you to compare.

@sayakpaul
Member

Went deep into this issue.

TL;DR: With the current support for loading A1111 LoRAs in Diffusers, we are unable to load certain keys, especially the ones containing 'mlp' and 'ff'. This is what is causing the differences in the quality of the generated outputs. I believe we'll be able to resolve this with #3756, which we're working on with @takuma104.

I was able to use @pdoane's script and generate the expected outputs. Check out this Colab. You'd notice that their method merges all the weights, whereas the current diffusers support doesn't. We cannot go the merging route in diffusers as it doesn't allow for easily switching to a different attention processor later. This is the primary reason. With #3756, this should be addressed and hopefully resolved.

To make this finding even more concrete, I prepared this script: https://gist.github.com/sayakpaul/c269da54270f6d866ef5acafd4bf8319. This shows us that, indeed, we're not loading all the keys and it's actually a known phenomenon.
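
For reference, here is a minimal sketch of the kind of key inspection that script performs; the file name is just the LoRA from this thread, and the 'mlp'/'ff' patterns are taken from the TL;DR above:

from collections import Counter

from safetensors.torch import load_file

# A1111/Kohya-style LoRA keys look like
# "lora_unet_..._ff_net_0_proj.lora_down.weight" or
# "lora_te_text_model_encoder_layers_0_mlp_fc1.lora_up.weight".
state_dict = load_file("vividWatercolors_10.safetensors")

modules = {key.split(".", 1)[0] for key in state_dict}
counts = Counter(
    "mlp/ff" if ("_mlp_" in name or "_ff_" in name) else "other" for name in modules
)
print(counts)  # how many LoRA modules target MLP/feed-forward layers vs. everything else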

Thanks for bringing this to our attention. And hopefully, we should be able to fix this soon.

Cc: @patrickvonplaten

@CapsAdmin

I was interested in trying to add LoRAs to diffusers and stumbled upon this code snippet. It works for some LoRAs but not all.

So I took some new code from AUTOMATIC1111/stable-diffusion-webui#11821, which implements more LoRA support, and converted it to work like the snippet above. So it merges the layers into the pipeline as opposed to hooking onto torch forward functions.

The code can be found here: https://github.com/CapsAdmin/diffusers-a1111/blob/main/src/merge_lora_to_pipeline.py

I tested all the models mentioned in the A1111 PR, so it supports hada, ia3, lokr, "full", and lora. The script is self-contained apart from importing "shared", which is just some dtype and device variables. So perhaps this is of interest to you @sayakpaul for testing on Colab or something.

@sayakpaul
Member

@CapsAdmin thanks so much! Does that script work for SDXL LoRA checkpoints too? Maybe it would make sense if you created a converter space with your script to let people easily use it (like this one: https://huggingface.co/spaces/diffusers/sd-to-diffusers)?

So it merges the layers into the pipeline as opposed to hooking onto torch forward functions.

Unfortunately, by our design, we're a bit hesitant to directly merge the weights into the concerned modules. So, we will have to think about it a bit.

@CapsAdmin

CapsAdmin commented Jul 19, 2023

@CapsAdmin thanks so much! Does that script work for SDXL LoRA checkpoints too?

I can try to get SDXL working; there was a very small amount of additional code that supposedly enabled it, but I left it out to focus on getting it working with 1.5.

Maybe it would make sense if you created a converter space with your script to let people easily use it (like this one: https://huggingface.co/spaces/diffusers/sd-to-diffusers)?

It was just intended as something you'd plug into someone's diffusers backend; e.g., SD.Next (the A1111 fork) is currently moving to diffusers, but it does not yet support loading LoRAs the way the original backend does.

Since I'm merging this into the pipeline, I guess this is not far from a "merge LoRAs into a diffusers checkpoint" utility, but I don't really see the necessity for something like that. The other use case I intended for this is that it could be an example/debug implementation for diffusers to do it properly.

So it merges the layers into the pipeline as opposed to hooking onto torch forward functions.

Unfortunately, by our design, we're a bit hesitant to directly merge the weights into the concerned modules. So, we will have to think about it a bit.

When it comes to merging into the pipeline, I see pros and cons and I'm honestly not sure which is better. Keep in mind I'm not very versed in this space.

pros:

  • Can potentially be simpler
    • if the API is explicit enough, you could make it so you can only load LoRAs and even embeddings in the same function where you create a pipeline from a checkpoint
  • Faster inference because you don't need to do any extra calculations

cons:

  • Increases loading time
  • Complicates things for a frontend that loads the pipeline once when the checkpoint changes, while the user expects to be able to load and change LoRA weights quickly
    • It seems like all of the popular Stable Diffusion frontends right now assume you can load and unload LoRAs on the fly
  • Changing the weight of a LoRA now requires reloading the entire pipeline

If you wanted to support loading and unloading on the fly, there are ways to merge internally by keeping track of the changes a LoRA makes to a pipeline, but this is very messy. Maybe you could even unload a LoRA by reversing the calculation; however, with this method I would worry about losing precision.
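
A minimal sketch of the merge/unmerge idea described above, with hypothetical helper names; it records each merged delta so the change can be subtracted again, subject to the precision caveat mentioned:

import torch

def merge_lora_delta(layer, weight_up, weight_down, multiplier, alpha):
    # Merge a LoRA update into a layer (linear, 2-D case as in the snippet earlier in
    # the thread) and return the delta so it can later be undone.
    delta = (multiplier * alpha * torch.mm(weight_up, weight_down)).to(layer.weight.dtype)
    layer.weight.data += delta
    return delta

def unmerge_lora_delta(layer, delta):
    # Reverse a previous merge; repeated merge/unmerge in fp16 can accumulate rounding error.
    layer.weight.data -= delta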

@sayakpaul
Member

sayakpaul commented Jul 19, 2023

Hey @icech, could you give #4147 a try?

Just install diffusers using pip install git+https://github.com/isidentical/diffusers@kohya-lora-aux-features.

Here I have hosted a couple of samples for you: https://huggingface.co/datasets/sayakpaul/3725_test/

Here's a side-by-side comparison:

Non-LoRA vs. LoRA: [images]

Let us know your findings!

Cc: @isidentical.

@sayakpaul
Member

@CapsAdmin, also thanks for explaining this. We're trying to improve the support in #4147, thanks to @isidentical. Watch out :)
