Commit 047797e

[Bugfix] Triton FA function takes no keyword arguments (#16902)
Signed-off-by: vllmellm <[email protected]>
1 parent eb8ef42 commit 047797e

File tree

1 file changed: +8 −1 lines changed


vllm/attention/backends/mla/common.py

Lines changed: 8 additions & 1 deletion
@@ -1091,7 +1091,14 @@ def _flash_attn_varlen_diff_headdims(self, q, k, v, softmax_scale,
             q,
             k,
             maybe_padded_v,
-            **kwargs,
+            None,  # output
+            kwargs["cu_seqlens_q"],
+            kwargs["cu_seqlens_k"],
+            kwargs["max_seqlen_q"],
+            kwargs["max_seqlen_k"],
+            kwargs["causal"],
+            softmax_scale,
+            None,  # bias
         )
         if is_vllm_fa:
             attn_out = self.flash_attn_varlen_func(
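The failure mode behind this fix can be reproduced in plain Python: a callable whose parameters are positional-only rejects keyword arguments with a TypeError, so the patch unpacks the entries of `kwargs` into positional arguments in the order the Triton function expects. A minimal sketch, assuming a hypothetical `fa_func` stand-in (the real Triton FA entry point's signature is only inferred from the diff):

```python
def fa_func(q, k, v, output, cu_seqlens_q, cu_seqlens_k,
            max_seqlen_q, max_seqlen_k, causal, softmax_scale, bias, /):
    # The trailing "/" makes every parameter positional-only, mimicking a
    # wrapper that cannot accept keyword arguments (the reported bug).
    return (q, k, causal, softmax_scale)

kwargs = {"cu_seqlens_q": [0, 4], "cu_seqlens_k": [0, 4],
          "max_seqlen_q": 4, "max_seqlen_k": 4, "causal": True}

# The pre-fix call style (forwarding **kwargs) is rejected:
try:
    fa_func("q", "k", "v", None, softmax_scale=0.5, bias=None, **kwargs)
except TypeError as exc:
    print("keyword call rejected:", exc)

# The fixed call style: every argument passed positionally, in order.
out = fa_func("q", "k", "v", None,
              kwargs["cu_seqlens_q"], kwargs["cu_seqlens_k"],
              kwargs["max_seqlen_q"], kwargs["max_seqlen_k"],
              kwargs["causal"], 0.5, None)
print(out)
```

The positional call succeeds because argument order, not names, now carries the binding, which is exactly what the diff does with `kwargs["cu_seqlens_q"]` and friends.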
