Refactor: Define PodMetricsClient interface and hide implementation details of vllm metrics processing #26

liu-cong · 2024-10-22T00:07:23Z

No description provided.

liu-cong · 2024-10-22T00:10:15Z

pkg/ext-proc/backend/provider.go

@@ -130,3 +130,37 @@ func (p *Provider) refreshPodsOnce() error {
 	p.podMetrics.Range(mergeFn)
 	return nil
 }
+
+func (p *Provider) refreshMetricsOnce() error {


This was moved from metrics.go; no new code.

kfswain: The name of this file is slightly confusing. It reads as if it is intended to be a factory class, but it also holds metrics-related funcs, even though there is a separate metrics.go file.

What is this file intended to do?

liu-cong: Yeah, the name is probably too broad. It is intended to "provide info about the backend", hence "provider", and metrics are part of that "info".
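To make the "provider" role concrete, here is a minimal sketch of how a provider could refresh metrics through the PodMetricsClient interface. Only the FetchMetrics signature, the podMetrics map, and the refreshMetricsOnce name come from this PR; the Pod and PodMetrics fields, the pmc field name, the error aggregation, and fakeClient are assumptions made for illustration and may differ from the actual code.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Pod and PodMetrics are simplified stand-ins for the backend types.
// The fields are assumptions; the real definitions are not shown in this thread.
type Pod struct {
	Name    string
	Address string
}

type PodMetrics struct {
	Pod
	WaitingQueueSize int // hypothetical metric field
}

// PodMetricsClient mirrors the FetchMetrics signature quoted in this PR.
type PodMetricsClient interface {
	FetchMetrics(pod Pod, existing *PodMetrics) (*PodMetrics, error)
}

// Provider holds the latest known metrics per pod.
type Provider struct {
	pmc        PodMetricsClient // hypothetical field name
	podMetrics sync.Map         // Pod -> *PodMetrics
}

// refreshMetricsOnce asks the client for fresh metrics for every known pod.
// A failing pod keeps its previous metrics; all failures are joined into one error.
func (p *Provider) refreshMetricsOnce() error {
	var errs []error
	p.podMetrics.Range(func(k, v any) bool {
		pod := k.(Pod)
		existing := v.(*PodMetrics)
		updated, err := p.pmc.FetchMetrics(pod, existing)
		if err != nil {
			errs = append(errs, fmt.Errorf("pod %s: %w", pod.Name, err))
			return true // keep going; stale metrics stay in place
		}
		p.podMetrics.Store(pod, updated)
		return true
	})
	return errors.Join(errs...)
}

// fakeClient is a hypothetical in-memory client used only for this example.
type fakeClient struct{ queue map[string]int }

func (f fakeClient) FetchMetrics(pod Pod, existing *PodMetrics) (*PodMetrics, error) {
	q, ok := f.queue[pod.Name]
	if !ok {
		return nil, fmt.Errorf("no metrics for %s", pod.Name)
	}
	updated := *existing // copy so the caller's value is not mutated
	updated.WaitingQueueSize = q
	return &updated, nil
}

func main() {
	p := &Provider{pmc: fakeClient{queue: map[string]int{"pod-a": 3}}}
	a := Pod{Name: "pod-a", Address: "10.0.0.1:8000"}
	p.podMetrics.Store(a, &PodMetrics{Pod: a})
	if err := p.refreshMetricsOnce(); err != nil {
		fmt.Println("refresh error:", err)
	}
	v, _ := p.podMetrics.Load(a)
	fmt.Println(v.(*PodMetrics).WaitingQueueSize)
}
```

Because the provider only depends on the interface, swapping in a different model server's client should not require changes to the refresh loop.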

liu-cong · 2024-10-22T00:10:34Z

pkg/ext-proc/backend/vllm/metrics.go

+}
+
+// FetchMetrics fetches metrics from a given pod.
+func (p *PodMetricsClientImpl) FetchMetrics(pod backend.Pod, existing *backend.PodMetrics) (*backend.PodMetrics, error) {


This was moved from pod_client.go; no new code.

ahg-g · 2024-10-22T16:09:00Z

pkg/ext-proc/main.go

@@ -75,7 +73,7 @@ func main() {

 	s := grpc.NewServer()

-	pp := backend.NewProvider(&backend.PodMetricsClientImpl{}, &backend.FakePodLister{Pods: pods})
+	pp := backend.NewProvider(&vllm.PodMetricsClientImpl{}, &backend.FakePodLister{Pods: pods})


Why did we move it to a vllm package? Shouldn't this be agnostic to the model server?

liu-cong: Model servers don't always share the same metric names, so I expect we will need some "adapter code" for each model server.

We will likely share some helper functions across model servers, but until we integrate with the next model server, I have put the metrics-scraping code in vllm.
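As a rough illustration of the "adapter code" idea, the sketch below keeps the parsing helper shared and isolates the per-server part in a small table of metric names. Everything here is an assumption made for the example: the metricNames type, the parseMetrics helper, the PodMetrics fields, and the "other_*" names are hypothetical, and the vLLM metric names are quoted from memory rather than from this PR. The parser also only handles simple `name value` lines without spaces inside labels.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// PodMetrics is a simplified, server-agnostic view of a pod's load.
// The fields are assumptions for this sketch.
type PodMetrics struct {
	WaitingQueueSize    int
	KVCacheUsagePercent float64
}

// metricNames is the per-model-server adapter: it maps the common fields
// to whatever names that server exposes.
type metricNames struct {
	waitingQueue string
	kvCacheUsage string
}

// vllmNames uses metric names believed to be exposed by vLLM (an assumption).
var vllmNames = metricNames{
	waitingQueue: "vllm:num_requests_waiting",
	kvCacheUsage: "vllm:gpu_cache_usage_perc",
}

// otherNames is a purely hypothetical second model server.
var otherNames = metricNames{
	waitingQueue: "other_queue_size",
	kvCacheUsage: "other_kv_cache_utilization",
}

// parseMetrics is the shared helper: it reads text-format samples and fills
// PodMetrics using the names supplied by the adapter.
func parseMetrics(body string, names metricNames) (PodMetrics, error) {
	var m PodMetrics
	sc := bufio.NewScanner(strings.NewReader(body))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue // skip blank lines and HELP/TYPE comments
		}
		fields := strings.Fields(line)
		if len(fields) < 2 {
			continue
		}
		name := fields[0]
		if i := strings.IndexByte(name, '{'); i >= 0 {
			name = name[:i] // drop a {label="..."} suffix
		}
		val, err := strconv.ParseFloat(fields[1], 64)
		if err != nil {
			return m, fmt.Errorf("parse %q: %w", line, err)
		}
		switch name {
		case names.waitingQueue:
			m.WaitingQueueSize = int(val)
		case names.kvCacheUsage:
			m.KVCacheUsagePercent = val
		}
	}
	return m, sc.Err()
}

func main() {
	body := "# TYPE vllm:num_requests_waiting gauge\n" +
		"vllm:num_requests_waiting 4\n" +
		"vllm:gpu_cache_usage_perc 0.25\n"
	m, err := parseMetrics(body, vllmNames)
	fmt.Println(m, err)
}
```

With this split, supporting another model server would mostly mean adding another name table, which is one way the shared helpers mentioned above could take shape.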

ahg-g · 2024-10-22T19:43:33Z

/lgtm
/approve

just so we move forward

k8s-ci-robot · 2024-10-22T19:43:40Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ahg-g, liu-cong

Needs approval from an approver in each of these files:

~~OWNERS~~ [ahg-g]


Refactor: Define PodMetricsClient interface and hide implementation details of vllm metrics processing

3fa65ae

k8s-ci-robot requested review from ahg-g and kfswain October 22, 2024 00:07

k8s-ci-robot added the cncf-cla: yes (indicates the PR's author has signed the CNCF CLA) and size/L (denotes a PR that changes 100-499 lines, ignoring generated files) labels Oct 22, 2024

liu-cong commented Oct 22, 2024

ahg-g reviewed Oct 22, 2024

k8s-ci-robot assigned ahg-g Oct 22, 2024

k8s-ci-robot added the lgtm label ("Looks good to me"; indicates that a PR is ready to be merged) Oct 22, 2024

k8s-ci-robot added the approved label (indicates a PR has been approved by an approver from all required OWNERS files) Oct 22, 2024

k8s-ci-robot merged commit 18bc3a2 into kubernetes-sigs:main Oct 22, 2024
2 checks passed

liu-cong deleted the metrics branch October 28, 2024 18:51
