
Commit 93d364d

[Bugfix] Include encoder prompts len to non-stream api usage response (#8861)

1 parent d9cfbc8 commit 93d364d

File tree

1 file changed: +2 additions, −0 deletions

vllm/entrypoints/openai/serving_chat.py

Lines changed: 2 additions & 0 deletions

```diff
@@ -726,6 +726,8 @@ async def chat_completion_full_generator(

         assert final_res.prompt_token_ids is not None
         num_prompt_tokens = len(final_res.prompt_token_ids)
+        if final_res.encoder_prompt_token_ids is not None:
+            num_prompt_tokens += len(final_res.encoder_prompt_token_ids)
         num_generated_tokens = sum(
             len(output.token_ids) for output in final_res.outputs)
         usage = UsageInfo(
```
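The change above can be illustrated with a small, self-contained sketch of the usage accounting. The `RequestOutput` and `CompletionOutput` classes below are hypothetical stand-ins for vLLM's own result types, reduced to the fields the diff touches; the point is that for encoder-decoder models the encoder prompt tokens must also count toward `prompt_tokens` in the usage response.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CompletionOutput:
    """Stand-in for one generated completion (hypothetical, for illustration)."""
    token_ids: List[int]


@dataclass
class RequestOutput:
    """Stand-in for vLLM's final request result (hypothetical, for illustration)."""
    prompt_token_ids: List[int]
    encoder_prompt_token_ids: Optional[List[int]]
    outputs: List[CompletionOutput]


def count_usage(final_res: RequestOutput) -> dict:
    """Compute the usage fields the way the patched code does."""
    assert final_res.prompt_token_ids is not None
    num_prompt_tokens = len(final_res.prompt_token_ids)
    # The bugfix: encoder prompt tokens (present for encoder-decoder
    # models) were previously omitted from the prompt token count.
    if final_res.encoder_prompt_token_ids is not None:
        num_prompt_tokens += len(final_res.encoder_prompt_token_ids)
    num_generated_tokens = sum(
        len(output.token_ids) for output in final_res.outputs)
    return {
        "prompt_tokens": num_prompt_tokens,
        "completion_tokens": num_generated_tokens,
        "total_tokens": num_prompt_tokens + num_generated_tokens,
    }
```

For example, a request with a 3-token decoder prompt, a 2-token encoder prompt, and a 4-token completion now reports 5 prompt tokens rather than 3.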
