Proposal: Adding more Prometheus metrics #2650

Closed

ronensc opened this issue Jan 29, 2024 · 3 comments

Comments

@ronensc (Contributor) commented Jan 29, 2024

Once #2316 is merged, I'm willing to contribute the following metrics, which I believe would be helpful for monitoring the usage of vLLM.

| # | Metric | Type | Labels | Description |
|---|--------|------|--------|-------------|
| 1 | `vllm:request_success` | Counter | `finish_reason=stop\|length` | Count of successfully processed requests. |
| 2 | `vllm:request_params_max_tokens` | Histogram | | Value of the `max_tokens` request parameter. |
| 3 | `vllm:request_params_n` | Histogram | | Value of the `n` request parameter. |
| 4 | `vllm:request_total_tokens` | Histogram | | Total sequence length of the request (input tokens + generated tokens). |
| 5 | `vllm:request_prompt_tokens` | Histogram | | Number of prefill tokens processed. |
| 6 | `vllm:request_generation_tokens` | Histogram | | Number of generation tokens processed. |

Notes:
Metrics 5 and 6 already exist, but as counters (`vllm:prompt_tokens_total` and `vllm:generation_tokens_total`). I think a histogram is more meaningful. For backward compatibility, we can keep both types (counters and histograms).
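For illustration only, here is a minimal sketch of how these metrics could be declared with the standard `prometheus_client` library. The metric names follow the table above; the bucket boundaries and surrounding code are assumptions of mine, not part of this proposal:

```python
# Illustrative sketch only -- not vLLM's implementation.
# Bucket boundaries are placeholders chosen for readability.
from prometheus_client import Counter, Histogram

# 1. Counter labeled by how the request finished ("stop" or "length").
request_success = Counter(
    "vllm:request_success",
    "Count of successfully processed requests.",
    labelnames=["finish_reason"],
)

# 2./3. Histograms over request parameters.
request_params_max_tokens = Histogram(
    "vllm:request_params_max_tokens",
    "Value of the max_tokens request parameter.",
    buckets=(1, 32, 128, 512, 2048, 8192),  # placeholder buckets
)
request_params_n = Histogram(
    "vllm:request_params_n",
    "Value of the n request parameter.",
    buckets=(1, 2, 5, 10, 20),  # placeholder buckets
)

# 4.-6. Histograms over token counts; prompt and generation token
# histograms would follow the same pattern as this one.
request_total_tokens = Histogram(
    "vllm:request_total_tokens",
    "Total sequence length of the request (input + generated tokens).",
    buckets=(64, 256, 1024, 4096, 16384),  # placeholder buckets
)
```

On request completion the engine would then do something like `request_success.labels(finish_reason="stop").inc()` and `request_total_tokens.observe(num_prompt_tokens + num_generation_tokens)`.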

Please let me know what you think.

@robertgshaw2-redhat (Collaborator) commented Jan 29, 2024

  • Another thing would be the number of aborted requests (aborted_requests).

A big limitation of the current profiling of E2E latency is that there is no normalization by the number of tokens processed. There is probably nothing we could do to normalize this perfectly, but dividing E2E latency by the number of generation tokens would give a better normalized metric, so something that expands on this would be good.
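To make the idea concrete, here is a rough sketch of observing E2E latency normalized by the number of generation tokens. It assumes `prometheus_client`; the metric name, buckets, and helper function are hypothetical, not anything that exists in vLLM:

```python
# Hypothetical sketch of the normalization idea above; the metric name and
# bucket boundaries are made up for illustration.
from prometheus_client import Histogram

e2e_latency_per_generation_token = Histogram(
    "vllm:e2e_request_latency_per_generation_token_seconds",  # hypothetical name
    "E2E request latency divided by the number of generation tokens.",
    buckets=(0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1.0),  # placeholder buckets
)

def record_normalized_latency(e2e_latency_s: float, num_generation_tokens: int) -> None:
    # Skip requests that produced no tokens to avoid division by zero.
    if num_generation_tokens > 0:
        e2e_latency_per_generation_token.observe(e2e_latency_s / num_generation_tokens)
```

Observing the ratio per request preserves the per-request distribution; dividing the separate latency and token histograms at query time would only give an aggregate average.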

@ronensc (Contributor, Author) commented Jan 31, 2024

Thanks for your suggestion! An aborted_requests metric sounds like an important addition. However, I'm not entirely clear on how it contributes to normalizing E2E latency by the number of tokens processed. Could you please provide more details?

@robertgshaw2-redhat (Collaborator) commented

Oh sorry, those are completely separate and should have been two bullet points.
