"joint_attention_kwargs" don't pass the parameters to AttentionProcessor #8855

Closed
cyysc1998 opened this issue Jul 12, 2024 · 4 comments · Fixed by #9818

Comments

@cyysc1998

I want to train my IP-Adapter on SD3. I noticed that joint_attention_kwargs is documented as passing parameters to the AttentionProcessor (https://github.com/huggingface/diffusers/blob/main/src/diffusers/models/transformers/transformer_sd3.py#L309), but it seems joint_attention_kwargs is never actually passed to the transformer_blocks. Is my usage wrong, or has this parameter not been implemented yet? Thank you!

@DN6
Collaborator

DN6 commented Jul 16, 2024

cc: @sayakpaul, could you take a look here?

@sayakpaul
Member

Hi.

We can make that happen, perhaps with an IPAdapterSD3Processor, but for now joint_attention_kwargs is only used to supply lora_scale, which is consumed here

scale_lora_layers(self, lora_scale)

and here

unscale_lora_layers(self, lora_scale)
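
To make that concrete, here is a rough sketch of what the SD3 transformer forward does with those kwargs today. The `scale` key, the `temb` variable, and the loop structure are paraphrased assumptions, not the exact diffusers source:

```python
# Rough sketch (assumed structure): joint_attention_kwargs only feeds the LoRA
# scale and is NOT forwarded to the attention processors inside the blocks.
if joint_attention_kwargs is not None:
    lora_scale = joint_attention_kwargs.get("scale", 1.0)
else:
    lora_scale = 1.0

scale_lora_layers(self, lora_scale)        # scale LoRA layers up front

for block in self.transformer_blocks:      # blocks are called without the kwargs
    encoder_hidden_states, hidden_states = block(
        hidden_states=hidden_states,
        encoder_hidden_states=encoder_hidden_states,
        temb=temb,
    )

unscale_lora_layers(self, lora_scale)      # undo the scaling afterwards
```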

Does this make sense?

@Nerogar
Contributor

Nerogar commented Jul 27, 2024

Is there any chance this can be changed? Not being able to pass parameters (like an attention mask) to the attention processor makes things really difficult.

If this can't or won't be changed, I suggest at least changing the doc string. It currently says that the kwargs are passed to the processor:

joint_attention_kwargs (`dict`, *optional*):
    A kwargs dictionary that if specified is passed along to the `AttentionProcessor` as defined under
    `self.processor` in
    [diffusers.models.attention_processor](https://github.com/huggingface/diffusers/blob/main/src/diffusers/models/attention_processor.py).
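
In other words, going by that docstring you would expect a call like the following to reach the processors. This is a hypothetical example (the checkpoint, the placeholder tensor shapes, and the `attention_mask` key are all illustrative); today the extra kwargs are simply dropped before the blocks:

```python
import torch
from diffusers import SD3Transformer2DModel

# Example checkpoint; any SD3 transformer would do.
transformer = SD3Transformer2DModel.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers", subfolder="transformer"
)

# Dummy inputs with placeholder shapes, just to illustrate the call.
latents = torch.randn(1, 16, 64, 64)
prompt_embeds = torch.randn(1, 154, 4096)
pooled_prompt_embeds = torch.randn(1, 2048)
timestep = torch.tensor([1000])

out = transformer(
    hidden_states=latents,
    encoder_hidden_states=prompt_embeds,
    pooled_projections=pooled_prompt_embeds,
    timestep=timestep,
    joint_attention_kwargs={"attention_mask": None},  # never reaches the processors
)
```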

@sayakpaul
Member

> Is there any chance this can be changed? Not being able to pass parameters (like an attention mask) to the attention processor makes things really difficult.

We usually update that when there is a concrete use case. If you can showcase a viable one, we will definitely welcome the change. For SD3 (where joint_attention_kwargs is used) there is no use of an attention mask or an IP-Adapter mask yet.
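
If such a use case does show up, the change itself would likely be small. Purely as an illustration (a sketch of one possible shape, not necessarily how the eventual fix works), the remaining kwargs could be threaded through the block call:

```python
# Illustrative sketch only: forward the remaining kwargs to each block so the
# attention processor can pick them up. The block and processor signatures
# would need to accept them as well.
for block in self.transformer_blocks:
    encoder_hidden_states, hidden_states = block(
        hidden_states=hidden_states,
        encoder_hidden_states=encoder_hidden_states,
        temb=temb,
        joint_attention_kwargs=joint_attention_kwargs,
    )
```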

> If this can't or won't be changed, I suggest at least changing the doc string. It currently says that the kwargs are passed to the processor:

You're right. Thanks for pointing that out. Could you open a PR for that? Since you know the solution, we want to honor your contribution via a PR :)
