Allow partial metric updates #561

liu-cong · 2025-03-22T04:22:07Z

This PR allows partial metric updates even when the PodMetricsClient returns an error because it failed to process a subset of the metrics. A partial update is considered better than no update.

A practical issue this fixes: in vLLM v1, if the LoRA adapter is not enabled, the LoRA metrics are not emitted at all. The scrape then reports an error, and as a result none of the metrics get refreshed.

This also helps when a metric is transiently missing from a scrape.

Note 1: For the LoRA metric, technically we can have a flag to indicate whether we can skip scraping it. That can be a separate follow-up.

Note 2: There should be a separate "conformance test" effort to make sure the supported model servers emit the metrics required by our protocol. The PodMetricsClient shouldn't be responsible for that. Therefore it's safe to optimistically allow partial updates.
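
The behavior described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `Metrics` struct, the `applyScrape` and `refresh` functions, and the metric names (`queue_size`, `kv_cache_usage`, `lora_info`) are all hypothetical stand-ins. It shows one way to collect per-metric errors while still keeping every value that was parsed successfully.

```go
package main

import (
	"errors"
	"fmt"
)

// Metrics is a hypothetical, simplified stand-in for a pod's metrics state.
type Metrics struct {
	WaitingQueueSize  int
	KVCacheUsage      float64
	MaxActiveAdapters int
}

// applyScrape copies every metric present in scraped onto a copy of existing.
// Missing metrics keep their previous value and are reported via the error.
// The metric names are illustrative, not the names a real server emits.
func applyScrape(existing Metrics, scraped map[string]float64) (Metrics, error) {
	updated := existing
	var errs []error

	if v, ok := scraped["queue_size"]; ok {
		updated.WaitingQueueSize = int(v)
	} else {
		errs = append(errs, errors.New("metric queue_size not found"))
	}
	if v, ok := scraped["kv_cache_usage"]; ok {
		updated.KVCacheUsage = v
	} else {
		errs = append(errs, errors.New("metric kv_cache_usage not found"))
	}
	if v, ok := scraped["lora_info"]; ok {
		updated.MaxActiveAdapters = int(v)
	} else {
		errs = append(errs, errors.New("metric lora_info not found"))
	}

	// errors.Join returns nil when no errors were collected.
	return updated, errors.Join(errs...)
}

// refresh keeps the partial result even when applyScrape reports an error,
// instead of discarding the whole update.
func refresh(current Metrics, scraped map[string]float64) Metrics {
	updated, err := applyScrape(current, scraped)
	if err != nil {
		fmt.Printf("partial metrics update: %v\n", err)
	}
	return updated
}

func main() {
	current := Metrics{WaitingQueueSize: 1, KVCacheUsage: 0.1, MaxActiveAdapters: 4}
	// The LoRA metric is absent, as when LoRA is not enabled on the server.
	scraped := map[string]float64{"queue_size": 7, "kv_cache_usage": 0.5}
	fmt.Printf("%+v\n", refresh(current, scraped))
}
```

In this sketch the queue size and cache usage are refreshed while the adapter count keeps its previous value, which is the "partial update is better than no update" trade-off the description argues for.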

netlify · 2025-03-22T04:22:23Z

✅ Deploy Preview for gateway-api-inference-extension ready!

Name	Link
🔨 Latest commit	`0190c16`
🔍 Latest deploy log	https://app.netlify.com/sites/gateway-api-inference-extension/deploys/67de3af17ddda6000851f406
😎 Deploy Preview	https://deploy-preview-561--gateway-api-inference-extension.netlify.app

liu-cong · 2025-03-24T21:39:30Z

/assign @ahg-g

ahg-g · 2025-03-24T22:11:19Z

/approve
/lgtm

We need to flag stale metrics without completely spamming the logs
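
One possible way to flag stale metrics without repeating the message on every scrape is to report a metric only when it first crosses a staleness threshold. The sketch below is an assumption about how that could look, not something from this PR: the `staleTracker` type, its `observe` method, and the miss-count threshold are all hypothetical.

```go
package main

import "fmt"

// staleTracker is a hypothetical helper that counts consecutive scrape
// cycles in which a metric was not refreshed.
type staleTracker struct {
	maxMisses int
	misses    map[string]int
}

func newStaleTracker(maxMisses int) *staleTracker {
	return &staleTracker{maxMisses: maxMisses, misses: map[string]int{}}
}

// observe records whether name was refreshed in this scrape cycle. It returns
// true exactly once per stale streak: on the cycle where the metric reaches
// maxMisses consecutive misses. A successful refresh resets the streak.
func (t *staleTracker) observe(name string, refreshed bool) bool {
	if refreshed {
		t.misses[name] = 0
		return false
	}
	t.misses[name]++
	return t.misses[name] == t.maxMisses
}

func main() {
	tracker := newStaleTracker(3)
	// The metric name is illustrative. It is missing in every cycle here,
	// but the warning is printed only once.
	for cycle := 1; cycle <= 5; cycle++ {
		if tracker.observe("lora_info", false) {
			fmt.Printf("cycle %d: metric lora_info is stale\n", cycle)
		}
	}
}
```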

k8s-ci-robot · 2025-03-24T22:11:26Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ahg-g, liu-cong

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [ahg-g]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

liu-cong · 2025-03-24T22:15:05Z

> We need to flag stale metrics without completely spamming the logs

Right. Today these logs are printed at TRACE level, so you shouldn't see them by default.

What do you think about this?

Short term: Add a flag to indicate whether LoRA is enabled. If it is not, don't scrape the LoRA metrics.
Long term: We should emphasize in the model server protocol that if a metric is not applicable, the model server should emit a safe "zero" value. Alternatively, we can formalize a new metric to indicate whether a certain feature is enabled or disabled.
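
The short-term idea could look something like the sketch below. It is only an illustration under stated assumptions: the `-lora-enabled` flag name, the `requiredMetrics` and `missingMetrics` helpers, and the metric names are hypothetical and do not exist in the project as far as this thread shows.

```go
package main

import (
	"flag"
	"fmt"
)

// requiredMetrics returns the metric names the scraper should expect.
// The LoRA metric is only required when LoRA is enabled.
// All names here are illustrative placeholders.
func requiredMetrics(loraEnabled bool) []string {
	names := []string{"queue_size", "kv_cache_usage"}
	if loraEnabled {
		names = append(names, "lora_info")
	}
	return names
}

// missingMetrics reports which required metrics are absent from a scrape.
func missingMetrics(required []string, scraped map[string]float64) []string {
	var missing []string
	for _, name := range required {
		if _, ok := scraped[name]; !ok {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	// Hypothetical flag; the real flag name, if one is added, may differ.
	loraEnabled := flag.Bool("lora-enabled", false, "whether the model server has LoRA enabled")
	flag.Parse()

	// A scrape from a server that does not emit the LoRA metric.
	scraped := map[string]float64{"queue_size": 3, "kv_cache_usage": 0.4}
	fmt.Println("missing:", missingMetrics(requiredMetrics(*loraEnabled), scraped))
}
```

With the flag off, the missing LoRA metric is no longer treated as an error, so nothing needs to be logged for it. The long-term option would remove the need for the flag, since a server emitting a "zero" value would always satisfy the required set.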

Allow partial metric updates

0190c16

k8s-ci-robot requested a review from kfswain March 22, 2025 04:22

k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Mar 22, 2025

k8s-ci-robot requested a review from robscott March 22, 2025 04:22

k8s-ci-robot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Mar 22, 2025

k8s-ci-robot assigned ahg-g Mar 24, 2025

k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 24, 2025

k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 24, 2025

k8s-ci-robot merged commit 6b1fbfd into kubernetes-sigs:main Mar 24, 2025
8 checks passed
