Commit ac60dc7

njhill and dashanji authored
[V1][BugFix] Fix for mixed top_k batch (#14301)
Signed-off-by: Nick Hill <[email protected]>
Co-authored-by: Ye Cao <[email protected]>
1 parent a4f1ee3 commit ac60dc7

1 file changed: +5 -2 lines changed

vllm/v1/worker/gpu_input_batch.py

Lines changed: 5 additions & 2 deletions
@@ -264,9 +264,12 @@ def add_request(
         self.top_p_cpu[req_index] = sampling_params.top_p
         if sampling_params.top_p < 1:
             self.top_p_reqs.add(req_id)
-        self.top_k_cpu[req_index] = sampling_params.top_k
-        if sampling_params.top_k > 0:
+        top_k = sampling_params.top_k
+        if 0 < top_k < self.vocab_size:
             self.top_k_reqs.add(req_id)
+        else:
+            top_k = self.vocab_size
+        self.top_k_cpu[req_index] = top_k
         self.min_p_cpu[req_index] = sampling_params.min_p
         self.frequency_penalties_cpu[
             req_index] = sampling_params.frequency_penalty
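
The normalization in the diff above can be read in isolation as follows. This is a minimal standalone sketch, not the vLLM implementation: `normalize_top_k` and `apply_top_k` are hypothetical helpers written for illustration, and the list-based masking only stands in for whatever batched kernel actually consumes the stored values. It assumes that a `top_k` outside the open range `(0, vocab_size)` means "no top-k filtering", which is how the new `else` branch treats it.

```python
def normalize_top_k(top_k: int, vocab_size: int) -> tuple[int, bool]:
    """Return (effective_top_k, filters_anything) for one request.

    Hypothetical helper mirroring the branch in the diff: only values
    strictly between 0 and vocab_size actually restrict sampling.
    """
    if 0 < top_k < vocab_size:
        return top_k, True
    # Disabled (<= 0) or too large (>= vocab_size): store vocab_size,
    # so that keeping the "top k" logits keeps all of them.
    return vocab_size, False


def apply_top_k(logits: list[float], k: int) -> list[float]:
    """Mask all but the k largest logits with -inf (ties at the cutoff are kept).

    Illustrative only; assumes 1 <= k <= len(logits).
    """
    cutoff = sorted(logits, reverse=True)[k - 1]
    return [x if x >= cutoff else float("-inf") for x in logits]


vocab_size = 4
requested = [2, 0, -1, 10]  # a mixed batch: one real top_k, three "disabled" values
effective = [normalize_top_k(k, vocab_size) for k in requested]

logits = [0.1, 2.0, -1.0, 0.5]
filtered = [apply_top_k(logits, k) for k, _ in effective]
```

Under these assumptions, every row of a mixed batch carries a valid `k` in `[1, vocab_size]`, and rows that did not ask for top-k filtering pass through unchanged because `k == vocab_size` keeps every logit.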
