Commit 4c33d67

[Bugfix] fix tmp_out and exp_sums dimensions (vllm-project#17438)
Signed-off-by: Hui Liu <[email protected]>
1 parent cb23495 commit 4c33d67

1 file changed: +1 -1 lines changed

vllm/attention/ops/chunked_prefill_paged_decode.py

Lines changed: 1 addition & 1 deletion
@@ -289,7 +289,7 @@ def chunked_prefill_paged_decode(
     max_num_partitions = ((max_seq_len + _PARTITION_SIZE_ROCM - 1) //
                           _PARTITION_SIZE_ROCM)
     assert _PARTITION_SIZE_ROCM % block_size == 0
-    total_num_seq = query.shape[0]
+    total_num_seq = block_table.shape[0]
     tmp_output = torch.empty(
         size=(total_num_seq, num_query_heads, max_num_partitions,
               head_size),
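
The commit message does not explain why the two shapes differ, so the following is only a sketch under stated assumptions: that `query` is laid out with one row per token, that `block_table` has one row per sequence, and that a batch can contain sequences with more than one query token. The partition size, batch contents, and helper names are hypothetical and chosen for illustration. It uses plain tuples rather than tensors to show how the first dimension of `tmp_output` changes.

```python
# Assumed value for illustration; the real _PARTITION_SIZE_ROCM is defined in vLLM.
PARTITION_SIZE = 256


def num_partitions(max_seq_len: int, partition_size: int = PARTITION_SIZE) -> int:
    """Ceiling division, the same form as max_num_partitions in the diff."""
    return (max_seq_len + partition_size - 1) // partition_size


def tmp_output_shape(total_num_seq: int, num_query_heads: int,
                     max_seq_len: int, head_size: int) -> tuple:
    """Shape tuple passed to torch.empty for tmp_output in the diff."""
    return (total_num_seq, num_query_heads,
            num_partitions(max_seq_len), head_size)


# Hypothetical batch: three sequences contributing 1, 1 and 5 query tokens.
query_lens = [1, 1, 5]
num_query_heads, head_size, max_seq_len = 8, 128, 1000

# Assumed layouts: query is (num_tokens, heads, head_size),
# block_table is (num_seqs, max_blocks_per_seq).
query_shape = (sum(query_lens), num_query_heads, head_size)
block_table_shape = (len(query_lens), 16)

# Before the fix: first dimension follows the token count.
before = tmp_output_shape(query_shape[0], num_query_heads, max_seq_len, head_size)
# After the fix: first dimension follows the sequence count.
after = tmp_output_shape(block_table_shape[0], num_query_heads, max_seq_len, head_size)

print("before:", before)
print("after: ", after)
```

Under these assumptions the two shapes agree only when every sequence contributes exactly one query token; otherwise the pre-fix buffer has one row per token rather than one per sequence.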
