llmservice reconciler implementation #48

Merged
merged 1 commit into from
Nov 21, 2024
2 changes: 1 addition & 1 deletion api/v1alpha1/llmserverpool_types.go
@@ -37,7 +37,7 @@ type LLMServerPoolSpec struct {
// TargetPort is the port number that the model servers within the pool expect
// to receive traffic from.
// This maps to the TargetPort in: https://pkg.go.dev/k8s.io/api/core/v1#ServicePort
TargetPort int32
TargetPort int32 `json:"targetPort,omitempty"`
}

// LLMServerPoolStatus defines the observed state of LLMServerPool
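
The one-line change above adds a JSON struct tag to `TargetPort`. A minimal sketch of the effect follows, assuming a hypothetical `poolSpec` type that stands in for `LLMServerPoolSpec` and carries only this field: with the tag, the key is serialized as `targetPort` rather than the Go field name, and a zero value is omitted.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// poolSpec is a hypothetical stand-in for LLMServerPoolSpec,
// reduced to the single field touched by this change.
type poolSpec struct {
	TargetPort int32 `json:"targetPort,omitempty"`
}

// encode marshals a poolSpec and returns the JSON as a string.
func encode(s poolSpec) string {
	b, err := json.Marshal(s)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	fmt.Println(encode(poolSpec{TargetPort: 8000})) // {"targetPort":8000}
	fmt.Println(encode(poolSpec{}))                 // {} (zero value omitted)
}
```

Without the tag, `encoding/json` would emit the key as `TargetPort`, which would not line up with the lower-camel-case `targetPort` property in the generated CRD schema.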

@@ -40,55 +40,24 @@ spec:
description: LLMServerPoolSpec defines the desired state of LLMServerPool
properties:
modelServerSelector:
additionalProperties:
type: string
description: |-
ModelServerSelector uses label selection to watch model server pods
ModelServerSelector uses a map of labels to watch model server pods
that should be included in the LLMServerPool. ModelServers should not
be shared with any other Service or LLMServerPool; that behavior is not supported
and will result in sub-optimal utilization.
properties:
matchExpressions:
description: matchExpressions is a list of label selector requirements.
The requirements are ANDed.
items:
description: |-
A label selector requirement is a selector that contains values, a key, and an operator that
relates the key and values.
properties:
key:
description: key is the label key that the selector applies
to.
type: string
operator:
description: |-
operator represents a key's relationship to a set of values.
Valid operators are In, NotIn, Exists and DoesNotExist.
type: string
values:
description: |-
values is an array of string values. If the operator is In or NotIn,
the values array must be non-empty. If the operator is Exists or DoesNotExist,
the values array must be empty. This array is replaced during a strategic
merge patch.
items:
type: string
type: array
x-kubernetes-list-type: atomic
required:
- key
- operator
type: object
type: array
x-kubernetes-list-type: atomic
matchLabels:
additionalProperties:
type: string
description: |-
matchLabels is a map of {key,value} pairs. A single {key,value} in the matchLabels
map is equivalent to an element of matchExpressions, whose key field is "key", the
operator is "In", and the values array contains only "value". The requirements are ANDed.
type: object
Because this selector is translated to a Service, a simple map is used instead
of: https://pkg.go.dev/k8s.io/apimachinery/pkg/apis/meta/v1#LabelSelector
This avoids the footgun errors that could occur if https://pkg.go.dev/k8s.io/apimachinery/pkg/apis/meta/v1#LabelSelectorAsMap were used.
type: object
x-kubernetes-map-type: atomic
targetPort:
description: |-
TargetPort is the port number that the model servers within the pool expect
to receive traffic from.
This maps to the TargetPort in: https://pkg.go.dev/k8s.io/api/core/v1#ServicePort
format: int32
type: integer
type: object
status:
description: LLMServerPoolStatus defines the observed state of LLMServerPool
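
The schema change above replaces the full label selector (`matchExpressions` plus `matchLabels`) with a plain string map. A minimal sketch of what map-based selection amounts to is shown below. The function name `selectorMatches` is hypothetical and not part of this PR, and treating an empty selector as matching nothing is a conservative choice made for this sketch, not something the source specifies.

```go
package main

import "fmt"

// selectorMatches reports whether every key/value pair in selector
// is present with the same value in podLabels.
// Assumption for this sketch: an empty selector matches no pods.
func selectorMatches(selector, podLabels map[string]string) bool {
	if len(selector) == 0 {
		return false
	}
	for key, want := range selector {
		if got, ok := podLabels[key]; !ok || got != want {
			return false
		}
	}
	return true
}

func main() {
	selector := map[string]string{"app": "vllm"}
	fmt.Println(selectorMatches(selector, map[string]string{"app": "vllm", "tier": "gpu"}))
	fmt.Println(selectorMatches(selector, map[string]string{"app": "other"}))
}
```

A plain map can only express equality on each key, which is the same expressive power a Service selector has, so nothing is lost when the selector is later translated to a Service.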
71 changes: 0 additions & 71 deletions examples/poc/README.md

This file was deleted.

Binary file removed examples/poc/envoy-gateway-bootstrap.png
Binary file not shown.
23 changes: 23 additions & 0 deletions examples/poc/manifests/llmservice.yaml
@@ -0,0 +1,23 @@
apiVersion: inference.networking.x-k8s.io/v1alpha1
kind: LLMService
metadata:
  labels:
    app.kubernetes.io/name: api
    app.kubernetes.io/managed-by: kustomize
  name: llmservice-sample
spec:
  models:
  - name: sql-code-assist
  - name: npc-bot
    objective:
      desiredAveragePerOutputTokenLatencyAtP95OverMultipleRequests: 50
    targetModels:
    - name: npc-bot-v1
      weight: 50
    - name: npc-bot-v2
      weight: 50
  poolRef:
  - kind: LLMServerPool
    name: test-pool
  - name: gemini-pool
    kind: LLMServerPool
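
The sample manifest gives `npc-bot-v1` and `npc-bot-v2` a `weight` of 50 each. The PR does not show how weights are consumed, so the following is only a sketch under the assumption that weights describe a relative traffic split. The types and the functions `totalWeight` and `pick` are hypothetical names introduced for illustration.

```go
package main

import (
	"fmt"
	"math/rand"
)

// targetModel is a hypothetical stand-in for one targetModels entry.
type targetModel struct {
	Name   string
	Weight int
}

// totalWeight sums the weights of all targets.
func totalWeight(targets []targetModel) int {
	sum := 0
	for _, t := range targets {
		sum += t.Weight
	}
	return sum
}

// pick maps a roll in [0, totalWeight) to a target by walking the
// cumulative weights. It returns "" if the roll is at or past the total.
func pick(targets []targetModel, roll int) string {
	for _, t := range targets {
		if roll < t.Weight {
			return t.Name
		}
		roll -= t.Weight
	}
	return ""
}

func main() {
	targets := []targetModel{
		{Name: "npc-bot-v1", Weight: 50},
		{Name: "npc-bot-v2", Weight: 50},
	}
	counts := map[string]int{}
	for i := 0; i < 1000; i++ {
		counts[pick(targets, rand.Intn(totalWeight(targets)))]++
	}
	fmt.Println(counts) // roughly even split under the assumed semantics
}
```

Keeping the roll as a parameter makes the selection deterministic and easy to test, with the randomness confined to the caller.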
1 change: 1 addition & 0 deletions pkg/ext-proc/backend/datastore.go
@@ -10,6 +10,7 @@ import (
// The datastore is a local cache of relevant data for the given LLMServerPool (currently all pulled from k8s-api)
type K8sDatastore struct {
	LLMServerPool *v1alpha1.LLMServerPool
	LLMServices   *sync.Map
	Pods          *sync.Map
}
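
The new `LLMServices` field is a `*sync.Map`, so it has to point at an allocated map before the reconciler calls `Store` or `Delete` on it; calling either on a nil pointer would panic. A minimal sketch of initializing and reading such a cache follows. The types `datastore` and `llmService` and the helpers `newDatastore` and `serviceNames` are hypothetical stand-ins, not code from this PR.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// llmService is a hypothetical stand-in for v1alpha1.LLMService.
type llmService struct {
	Name string
}

// datastore mirrors only the LLMServices field of K8sDatastore.
type datastore struct {
	LLMServices *sync.Map
}

// newDatastore allocates the map so Store/Delete never hit a nil pointer.
func newDatastore() *datastore {
	return &datastore{LLMServices: &sync.Map{}}
}

// serviceNames returns the cached service names in sorted order.
func (d *datastore) serviceNames() []string {
	var names []string
	d.LLMServices.Range(func(key, _ any) bool {
		if name, ok := key.(string); ok {
			names = append(names, name)
		}
		return true // keep iterating
	})
	sort.Strings(names)
	return names
}

func main() {
	ds := newDatastore()
	ds.LLMServices.Store("llmservice-sample", &llmService{Name: "llmservice-sample"})
	fmt.Println(ds.serviceNames())
}
```

`sync.Map` suits this access pattern because reconciler goroutines write while request-handling code reads, and no external lock is needed for individual operations.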

57 changes: 57 additions & 0 deletions pkg/ext-proc/backend/llmservice_reconciler.go
@@ -0,0 +1,57 @@
package backend

import (
	"context"
	"strings"

	"inference.networking.x-k8s.io/llm-instance-gateway/api/v1alpha1"
	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/client-go/tools/record"
	"k8s.io/klog/v2"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// LLMServiceReconciler keeps the datastore's LLMServices cache in sync with
// the LLMService objects that reference this server pool.
type LLMServiceReconciler struct {
	client.Client
	Scheme         *runtime.Scheme
	Record         record.EventRecorder
	Datastore      *K8sDatastore
	ServerPoolName string
	Namespace      string
}

func (c *LLMServiceReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	// Ignore objects outside the namespace this reconciler is scoped to.
	if req.Namespace != c.Namespace {
		return ctrl.Result{}, nil
	}
	klog.V(1).Infof("reconciling LLMService %v", req.NamespacedName)

	service := &v1alpha1.LLMService{}
	if err := c.Get(ctx, req.NamespacedName, service); err != nil {
		klog.Errorf("unable to get LLMService %v: %v", req.NamespacedName, err)
		return ctrl.Result{}, err
	}

	c.updateDatastore(service)
	return ctrl.Result{}, nil
}

func (c *LLMServiceReconciler) SetupWithManager(mgr ctrl.Manager) error {
	return ctrl.NewControllerManagedBy(mgr).
		For(&v1alpha1.LLMService{}).
		Complete(c)
}

// updateDatastore stores the service if any of its pool references points at
// this reconciler's server pool, and removes it from the cache otherwise.
func (c *LLMServiceReconciler) updateDatastore(service *v1alpha1.LLMService) {
	for _, ref := range service.Spec.PoolRef {
		if strings.Contains(strings.ToLower(ref.Kind), strings.ToLower("LLMServerPool")) && ref.Name == c.ServerPoolName {
			klog.V(2).Infof("Adding/Updating service: %v", service.Name)
			c.Datastore.LLMServices.Store(service.Name, service)
			return
		}
	}
	klog.V(2).Infof("Removing/Not adding service: %v", service.Name)
	// If we get here, the service is not relevant to this pool; remove it.
	c.Datastore.LLMServices.Delete(service.Name)
}
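
The relevance check in `updateDatastore` can be exercised in isolation. The sketch below reproduces the same comparison on a hypothetical `poolRef` type, and adds a stricter variant based on `strings.EqualFold` for contrast. The names `referencesPool` and `referencesPoolStrict` are illustrative and not part of this PR.

```go
package main

import (
	"fmt"
	"strings"
)

// poolRef is a hypothetical stand-in for one entry of Spec.PoolRef.
type poolRef struct {
	Kind string
	Name string
}

// referencesPool mirrors the check in updateDatastore: the kind must
// contain "llmserverpool" (case-insensitive) and the name must match exactly.
func referencesPool(refs []poolRef, poolName string) bool {
	for _, ref := range refs {
		if strings.Contains(strings.ToLower(ref.Kind), "llmserverpool") && ref.Name == poolName {
			return true
		}
	}
	return false
}

// referencesPoolStrict is a stricter variant: the kind must equal
// "LLMServerPool" ignoring case, rather than merely contain it.
func referencesPoolStrict(refs []poolRef, poolName string) bool {
	for _, ref := range refs {
		if strings.EqualFold(ref.Kind, "LLMServerPool") && ref.Name == poolName {
			return true
		}
	}
	return false
}

func main() {
	// The two pool references from the sample manifest in this PR.
	refs := []poolRef{
		{Kind: "LLMServerPool", Name: "test-pool"},
		{Kind: "LLMServerPool", Name: "gemini-pool"},
	}
	fmt.Println(referencesPool(refs, "test-pool"))  // true
	fmt.Println(referencesPool(refs, "other-pool")) // false
}
```

The substring match also accepts kinds such as `MyLLMServerPoolV2`. Whether that leniency is intended is not stated in the PR; the strict variant is shown only to make the difference visible.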