Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revert name change to make pool name more descriptive. #516

Closed
wants to merge 1 commit into from

Conversation

kfswain
Copy link
Collaborator

@kfswain kfswain commented Mar 17, 2025

With the addition of the CPU deployment path, we recently changed the name to my-pool which is so generic as to not mean much of anything.

Reverting this change to give some context clues as to what an InferencePool is intended to house. vllm-llama2-7b-pool is fairly descriptive in what the pool contains.

We will soon have helm charts: #416 so this name can be more configurable. But for the getting started guide in the short term I would rather make an opinionated decision on the naming.

@k8s-ci-robot k8s-ci-robot requested review from ahg-g and liu-cong March 17, 2025 20:08
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: kfswain

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. approved Indicates a PR has been approved by an approver from all required OWNERS files. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Mar 17, 2025
Copy link

netlify bot commented Mar 17, 2025

Deploy Preview for gateway-api-inference-extension ready!

Name Link
🔨 Latest commit ff4f6fb
🔍 Latest deploy log https://app.netlify.com/sites/gateway-api-inference-extension/deploys/67d881528863980008af2054
😎 Deploy Preview https://deploy-preview-516--gateway-api-inference-extension.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

@ahg-g
Copy link
Contributor

ahg-g commented Mar 17, 2025

I sent #515 to also change the name of the ext_proc.yaml file to inferencepool.yaml and to align the naming of resources that gets created for the pool (the deployment and service). The two PRs will conflict now :)

@kfswain
Copy link
Collaborator Author

kfswain commented Mar 17, 2025

Closing in favor of #515 since they were in the same files, thanks @ahg-g!

@kfswain kfswain closed this Mar 17, 2025
@nirrozenbaum
Copy link
Contributor

@ahg-g @kfswain the name is now descriptive for the gpu deployment, but using the same name for the cpu deployment may be very confusing, as the cpu deployment do NOT use the llama2 model.

my original intention was to have a label that can be used for any of the deployments and describe a general pool.
btw, in many projects quickstart guides, the first sample is called "my-sample" (or "my-xxx", you get the point).

I think this change should be reverted and maybe if we want to describe the deployment with the concrete model we can put additional label with model: llama2 or model: qwen, but for the pods selector should keep using the app: my-pool.

@ahg-g
Copy link
Contributor

ahg-g commented Mar 19, 2025

I agree, we can call it pool-1, we will have #416 merge soon and I think a generic name makes sense.

@kfswain kfswain deleted the naming-change branch March 19, 2025 23:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants