"joint_attention_kwargs" don't pass the parameters to AttentionProcessor #8855

Closed
cyysc1998 opened this issue Jul 12, 2024 · 4 comments · Fixed by #9818

Comments

@cyysc1998

I want to train my IP-Adapter on SD3. I noticed that joint_attention_kwargs is documented as passing parameters to the AttentionProcessor (https://github.com/huggingface/diffusers/blob/main/src/diffusers/models/transformers/transformer_sd3.py#L309), but it seems joint_attention_kwargs is never actually passed to the transformer_blocks. Is my usage wrong, or has this parameter not been implemented yet? Thank you!

@DN6
Collaborator

DN6 commented Jul 16, 2024

cc: @sayakpaul, could you take a look here?

@sayakpaul
Member

Hi.

We can make that happen, perhaps with an IPAdapterSD3Processor, but for now joint_attention_kwargs is only used to supply lora_scale, which is consumed here

scale_lora_layers(self, lora_scale)

and here

unscale_lora_layers(self, lora_scale)
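
To make that concrete, here is a rough sketch of what the SD3 transformer forward does with those kwargs today. The `scale` key, the `temb` variable, and the loop structure are paraphrased assumptions, not the exact diffusers source:

```python
# Rough sketch (assumed structure): joint_attention_kwargs only feeds the LoRA
# scale and is NOT forwarded to the attention processors inside the blocks.
if joint_attention_kwargs is not None:
    lora_scale = joint_attention_kwargs.get("scale", 1.0)
else:
    lora_scale = 1.0

scale_lora_layers(self, lora_scale)        # scale LoRA layers up front

for block in self.transformer_blocks:      # blocks are called without the kwargs
    encoder_hidden_states, hidden_states = block(
        hidden_states=hidden_states,
        encoder_hidden_states=encoder_hidden_states,
        temb=temb,
    )

unscale_lora_layers(self, lora_scale)      # undo the scaling afterwards
```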

Does this make sense?

@Nerogar
Contributor

Nerogar commented Jul 27, 2024

Is there any chance this can be changed? Not being able to pass parameters (like an attention mask) to the attention processor makes things really difficult.

If this can't or won't be changed, I suggest at least changing the doc string. It currently says that the kwargs are passed to the processor:

joint_attention_kwargs (`dict`, *optional*):
    A kwargs dictionary that if specified is passed along to the `AttentionProcessor` as defined under
    `self.processor` in
    [diffusers.models.attention_processor](https://github.com/huggingface/diffusers/blob/main/src/diffusers/models/attention_processor.py).
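
In other words, going by that docstring you would expect a call like the following to reach the processors. This is a hypothetical example (the checkpoint, the placeholder tensor shapes, and the `attention_mask` key are all illustrative); today the extra kwargs are simply dropped before the blocks:

```python
import torch
from diffusers import SD3Transformer2DModel

# Example checkpoint; any SD3 transformer would do.
transformer = SD3Transformer2DModel.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers", subfolder="transformer"
)

# Dummy inputs with placeholder shapes, just to illustrate the call.
latents = torch.randn(1, 16, 64, 64)
prompt_embeds = torch.randn(1, 154, 4096)
pooled_prompt_embeds = torch.randn(1, 2048)
timestep = torch.tensor([1000])

out = transformer(
    hidden_states=latents,
    encoder_hidden_states=prompt_embeds,
    pooled_projections=pooled_prompt_embeds,
    timestep=timestep,
    joint_attention_kwargs={"attention_mask": None},  # never reaches the processors
)
```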

@sayakpaul
Member

> Is there any chance this can be changed? Not being able to pass parameters (like an attention mask) to the attention processor makes things really difficult.

We usually update that when there is a concrete use case. If you can showcase a viable one, we will definitely welcome the change. For SD3 (where joint_attention_kwargs is used) there is no use of an attention mask or an IP-Adapter mask yet.
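
If such a use case does show up, the change itself would likely be small. Purely as an illustration (a sketch of one possible shape, not necessarily how the eventual fix works), the remaining kwargs could be threaded through the block call:

```python
# Illustrative sketch only: forward the remaining kwargs to each block so the
# attention processor can pick them up. The block and processor signatures
# would need to accept them as well.
for block in self.transformer_blocks:
    encoder_hidden_states, hidden_states = block(
        hidden_states=hidden_states,
        encoder_hidden_states=encoder_hidden_states,
        temb=temb,
        joint_attention_kwargs=joint_attention_kwargs,
    )
```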

> If this can't or won't be changed, I suggest at least changing the doc string. It currently says that the kwargs are passed to the processor:

You're right. Thanks for pointing that out. Could you open a PR for that? Since you know the solution, we want to honor your contribution via a PR :)
