Skip to content

API Shift/Refactor #93

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Dec 19, 2024
Merged

API Shift/Refactor #93

merged 10 commits into from
Dec 19, 2024

Conversation

kfswain
Copy link
Collaborator

@kfswain kfswain commented Dec 11, 2024

This PR modfies the CRD name(s) and some of the CRD structure as discussed in the Dec 5th, 2024 inference-gateway contributors meeting.

Highlights being:

  • LLMServerPool -> InferencePool
  • LLMService -> InferenceModel
  • LLMService referencing plural 'LLM Use Cases' and 'LLMServerPool' -> singular reference to use case and pool

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. approved Indicates a PR has been approved by an approver from all required OWNERS files. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Dec 11, 2024
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 11, 2024
@kfswain kfswain force-pushed the api-updates branch 2 times, most recently from 49760cd to 4cfac64 Compare December 11, 2024 17:45
@kfswain
Copy link
Collaborator Author

kfswain commented Dec 11, 2024

This is part of the effort for #68

@kfswain
Copy link
Collaborator Author

kfswain commented Dec 11, 2024

And #69 (nice)

Copy link
Contributor

@ahg-g ahg-g left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kfswain
Copy link
Collaborator Author

kfswain commented Dec 12, 2024

/label tide/merge-method-squash

@k8s-ci-robot k8s-ci-robot added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Dec 12, 2024
Copy link
Member

@robscott robscott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @kfswain!

// LLMService represents a set of LLM services that are multiplexed onto one
// or more LLMServerPools. This resource is managed by the "LLM Service Owner"
// persona. The Service Owner persona is: a team that trains, verifies, and
// InferenceModelSpec represents a set of Models/Adapters that are multiplexed onto one
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit confused by the model vs. adapter terminology here. Would it be more accurate to say "models or adapters"?

Secondly, I think this shift includes a move to a singular model/adapter, so recommend changing the terminology here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated wording and broke formatting out into loose paragraphs to show separate ideas. LMKWYT

@ahg-g
Copy link
Contributor

ahg-g commented Dec 13, 2024

/hold

just so it doesn't get merged by an accidental lgtm

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 13, 2024
@liu-cong
Copy link
Contributor

There are a couple nits left but generally LGTM. Thanks for the big update!

Will leave defer approval to @ahg-g and @robscott .

Comment on lines +114 to +116
// TargetModel represents a deployed model or a LoRA adapter. The
// Name field is expected to match the name of the LoRA adapter
// (or base model) as it is registered within the model server. Inference
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if TargetModel is a bit misleading if it also represents an adapter.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does, but this mirrors the Open AI API spec, which just expects a modelName. So we chose not to deviate from that pattern.

LMKWYT

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it may not be necessarily be an adapter too

Critical Criticality = "Critical"
// More important than Sheddable, less important than Critical.
// Requests in this band will be shed before critical traffic.
Default Criticality = "Default"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about representing these in numbers like priorities instead of abstract names? Everything is relative to each other.

Copy link
Member

@terrytangyuan terrytangyuan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kfswain Catchy PR title but let's be more descriptive :-)

@kfswain kfswain changed the title Big dawg api change API Shift/Refactor Dec 17, 2024
@kfswain
Copy link
Collaborator Author

kfswain commented Dec 17, 2024

@kfswain Catchy PR title but let's be more descriptive :-)

No worries! I was suprised it lasted as long as it did

@ahg-g
Copy link
Contributor

ahg-g commented Dec 17, 2024

/hold
/lgtm

I am good with this current incarnation of the api

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Dec 17, 2024
Copy link
Member

@robscott robscott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @kfswain! A few more nits, otherwise LGTM.

Critical Criticality = "Critical"
// More important than Sheddable, less important than Critical.
// Requests in this band will be shed before critical traffic.
Default Criticality = "Default"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can get on board with starting with 3 simple values here, but likely worth revisiting in the future to see if additional granularity would be helpful. @terrytangyuan did you have any specific use cases in mind that would benefit from having additional levels here?

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: kfswain, robscott

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Dec 18, 2024
@ahg-g
Copy link
Contributor

ahg-g commented Dec 19, 2024

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Dec 19, 2024
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Dec 19, 2024
@kfswain
Copy link
Collaborator Author

kfswain commented Dec 19, 2024

/unhold

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 19, 2024
@ahg-g
Copy link
Contributor

ahg-g commented Dec 19, 2024

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Dec 19, 2024
@k8s-ci-robot k8s-ci-robot merged commit dabf28c into kubernetes-sigs:main Dec 19, 2024
2 checks passed
kfswain added a commit to kfswain/llm-instance-gateway that referenced this pull request Dec 19, 2024
* Big dawg api change

* Feedback revisions

* more feedback revisions

* more feedback updates

* (hopefully) final cleanup

* adding required decorator, some more validation and defaulting

* ObjRef name update

* grammatical tunes

* Feedback updates

* Comment fix
@kfswain kfswain deleted the api-updates branch January 23, 2025 20:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants