Commit 0562977

Allow defining a default base model in the lora syncer configuration
1 parent 8fdd1fa commit 0562977

6 files changed: +95 -40 lines changed

Diff for: config/manifests/vllm/gpu-deployment.yaml

+3 -4
@@ -246,11 +246,10 @@ data:
       vLLMLoRAConfig:
         name: vllm-llama3.1-8b-instruct
         port: 8000
+        defaultBaseModel: meta-llama/Llama-3.1-8B-Instruct
         ensureExist:
           models:
-          - base-model: meta-llama/Llama-3.1-8B-Instruct
-            id: food-review
+          - id: food-review
             source: Kawon/llama3.1-food-finetune_v14_r8
-          - base-model: meta-llama/Llama-3.1-8B-Instruct
-            id: cad-fabricator
+          - id: cad-fabricator
             source: redcathode/fabricator

Diff for: site-src/guides/adapter-rollout.md

+6 -8
@@ -33,13 +33,12 @@ Change the ConfigMap to match the following (note the new entry under models):
       vLLMLoRAConfig:
         name: vllm-llama3-8b-instruct-adapters
         port: 8000
+        defaultBaseModel: meta-llama/Llama-3.1-8B-Instruct
         ensureExist:
           models:
-          - base-model: meta-llama/Llama-3.1-8B-Instruct
-            id: food-review-1
+          - id: food-review-1
             source: vineetsharma/qlora-adapter-Llama-2-7b-hf-TweetSumm
-          - base-model: meta-llama/Llama-3.1-8B-Instruct
-            id: food-review-2
+          - id: food-review-2
             source: mahimairaja/tweet-summarization-llama-2-finetuned
 ```
 
@@ -118,15 +117,14 @@ Unload the older versions from the servers by updating the LoRA syncer ConfigMap
       vLLMLoRAConfig:
         name: sql-loras-llama
         port: 8000
+        defaultBaseModel: meta-llama/Llama-3.1-8B-Instruct
         ensureExist:
           models:
-          - base-model: meta-llama/Llama-3.1-8B-Instruct
-            id: food-review-2
+          - id: food-review-2
             source: mahimairaja/tweet-summarization-llama-2-finetuned
         ensureNotExist:
           models:
-          - base-model: meta-llama/Llama-3.1-8B-Instruct
-            id: food-review-1
+          - id: food-review-1
             source: vineetsharma/qlora-adapter-Llama-2-7b-hf-TweetSumm
 ```
 
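
The second hunk of this guide retires `food-review-1` by moving its entry from `ensureExist` to `ensureNotExist`. A minimal sketch of that edit on the parsed config follows. The `retire_adapter` helper is hypothetical and is not part of the sidecar; it assumes the `vLLMLoRAConfig` mapping has already been loaded into a plain dict.

```python
from copy import deepcopy


def retire_adapter(config: dict, adapter_id: str) -> dict:
    """Return a copy of `config` with `adapter_id` moved from
    `ensureExist` to `ensureNotExist` (hypothetical helper)."""
    result = deepcopy(config)
    exist = result.setdefault("ensureExist", {}).setdefault("models", [])
    retired = [entry for entry in exist if entry["id"] == adapter_id]
    if not retired:
        raise KeyError(f"adapter {adapter_id!r} is not listed under ensureExist")
    # Keep every other adapter loaded, in its original order.
    exist[:] = [entry for entry in exist if entry["id"] != adapter_id]
    not_exist = result.setdefault("ensureNotExist", {}).setdefault("models", [])
    not_exist.extend(retired)
    return result


# Config as it looks after the first hunk of the guide.
config = {
    "name": "vllm-llama3-8b-instruct-adapters",
    "port": 8000,
    "defaultBaseModel": "meta-llama/Llama-3.1-8B-Instruct",
    "ensureExist": {
        "models": [
            {
                "id": "food-review-1",
                "source": "vineetsharma/qlora-adapter-Llama-2-7b-hf-TweetSumm",
            },
            {
                "id": "food-review-2",
                "source": "mahimairaja/tweet-summarization-llama-2-finetuned",
            },
        ]
    },
}

updated = retire_adapter(config, "food-review-1")
```

Because the base model now comes from `defaultBaseModel`, the moved entry needs no per-adapter `base-model` key in either list.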

Diff for: tools/dynamic-lora-sidecar/README.md

+58 -11
@@ -60,20 +60,67 @@ The sidecar supports the following command-line arguments:
 
 ## Configuration Fields
 - `vLLMLoRAConfig`[**required**] base key
-  - `host` [*optional*]Model server's host. defaults to localhost
+  - `host` [*optional*] Model server's host. defaults to localhost
   - `port` [*optional*] Model server's port. defaults to 8000
-  - `name`[*optional*] Name of this config
-  - `ensureExist`[*optional*] List of models to ensure existence on specified model server.
-    - `models`[**required**] [list]
-      - `base-model`[*optional*] Base model for lora adapter
-      - `id`[**required**] unique id of lora adapter
-      - `source`[**required**] path (remote or local) to lora adapter
+  - `name` [*optional*] Name of this config
+  - `defaultBaseModel` [*optional*] Default base model to use for all adapters when not specified individually
+  - `ensureExist` [*optional*] List of models to ensure existence on specified model server.
+    - `models` [**required**] [list]
+      - `id` [**required**] unique id of lora adapter
+      - `source` [**required**] path (remote or local) to lora adapter
+      - `base-model` [*optional*] Base model for lora adapter (overrides defaultBaseModel)
   - `ensureNotExist` [*optional*]
-    - `models`[**required**] [list]
-      - `id`[**required**] unique id of lora adapter
-      - `source`[**required**] path (remote or local) to lora adapter
-      - `base-model`[*optional*] Base model for lora adapter
+    - `models` [**required**] [list]
+      - `id` [**required**] unique id of lora adapter
+      - `source` [**required**] path (remote or local) to lora adapter
+      - `base-model` [*optional*] Base model for lora adapter (overrides defaultBaseModel)
 
+## Example Configuration
+
+Here's an example of using the `defaultBaseModel` field to avoid repetition in your configuration:
+
+```yaml
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: vllm-llama2-7b-adapters
+data:
+  configmap.yaml: |
+    vLLMLoRAConfig:
+      name: vllm-llama2-7b
+      port: 8000
+      defaultBaseModel: meta-llama/Llama-2-7b-hf
+      ensureExist:
+        models:
+        - id: tweet-summary-1
+          source: vineetsharma/qlora-adapter-Llama-2-7b-hf-TweetSumm
+        - id: tweet-summary-2
+          source: mahimairaja/tweet-summarization-llama-2-finetuned
+```
+
+In this example, both adapters will use `meta-llama/Llama-2-7b-hf` as their base model without needing to specify it for each adapter individually.
+
+You can still override the default base model for specific adapters when needed:
+
+```yaml
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: vllm-mixed-adapters
+data:
+  configmap.yaml: |
+    vLLMLoRAConfig:
+      name: vllm-mixed
+      port: 8000
+      defaultBaseModel: meta-llama/Llama-2-7b-hf
+      ensureExist:
+        models:
+        - id: tweet-summary-1
+          source: vineetsharma/qlora-adapter-Llama-2-7b-hf-TweetSumm
+        - id: code-assistant
+          source: huggingface/code-assistant-lora
+          base-model: meta-llama/Llama-2-13b-hf  # Override for this specific adapter
+```
 ## Example Deployment
 
 The [deployment.yaml](deployment.yaml) file shows an example of deploying the sidecar with custom parameters:
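
For existing configs that repeat `base-model` on every adapter, the same de-duplication done by hand in this commit can be sketched in code. The `hoist_default_base_model` helper below is hypothetical and not part of the sidecar. It assumes the `vLLMLoRAConfig` mapping is already a plain dict, and it deliberately does nothing when some adapter has no `base-model`, since introducing a default would change that adapter's effective base model.

```python
from collections import Counter
from copy import deepcopy


def hoist_default_base_model(config: dict) -> dict:
    """Return a copy of `config` with the most common per-adapter
    `base-model` hoisted into `defaultBaseModel` (hypothetical helper)."""
    result = deepcopy(config)
    entries = [
        entry
        for section in ("ensureExist", "ensureNotExist")
        for entry in result.get(section, {}).get("models", [])
    ]
    # Leave the config alone if a default already exists, there is nothing
    # to hoist, or any adapter relies on having no base model at all.
    if "defaultBaseModel" in result or not entries:
        return result
    if any("base-model" not in entry for entry in entries):
        return result
    counts = Counter(entry["base-model"] for entry in entries)
    default, _ = counts.most_common(1)[0]
    result["defaultBaseModel"] = default
    for entry in entries:
        if entry["base-model"] == default:
            del entry["base-model"]  # now inherited from the default
    return result


old_style = {
    "name": "vllm-mixed",
    "port": 8000,
    "ensureExist": {
        "models": [
            {
                "id": "tweet-summary-1",
                "source": "vineetsharma/qlora-adapter-Llama-2-7b-hf-TweetSumm",
                "base-model": "meta-llama/Llama-2-7b-hf",
            },
            {
                "id": "tweet-summary-2",
                "source": "mahimairaja/tweet-summarization-llama-2-finetuned",
                "base-model": "meta-llama/Llama-2-7b-hf",
            },
            {
                "id": "code-assistant",
                "source": "huggingface/code-assistant-lora",
                "base-model": "meta-llama/Llama-2-13b-hf",
            },
        ]
    },
}

new_style = hoist_default_base_model(old_style)
```

Adapters whose base model differs from the hoisted default keep their explicit `base-model`, which matches the override behaviour described in the README text.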

Diff for: tools/dynamic-lora-sidecar/deployment.yaml

+6 -11
@@ -66,7 +66,7 @@ spec:
       - name: lora-adapter-syncer
         tty: true
         stdin: true
-        image: <SIDECAR_IMAGE>
+        image: us-central1-docker.pkg.dev/k8s-staging-images/gateway-api-inference-extension/lora-syncer:main
         restartPolicy: Always
         imagePullPolicy: Always
         env:
@@ -106,22 +106,17 @@ metadata:
 data:
   configmap.yaml: |
     vLLMLoRAConfig:
-      host: modelServerHost
       name: sql-loras-llama
-      port: modelServerPort
+      defaultBaseModel: meta-llama/Llama-2-7b-hf
       ensureExist:
         models:
-        - base-model: meta-llama/Llama-3.1-8B-Instruct
-          id: sql-lora-v1
+        - id: sql-lora-v1
           source: yard1/llama-2-7b-sql-lora-test
-        - base-model: meta-llama/Llama-3.1-8B-Instruct
-          id: sql-lora-v3
+        - id: sql-lora-v3
           source: yard1/llama-2-7b-sql-lora-test
-        - base-model: meta-llama/Llama-3.1-8B-Instruct
-          id: sql-lora-v4
+        - id: sql-lora-v4
           source: yard1/llama-2-7b-sql-lora-test
       ensureNotExist:
         models:
-        - base-model: meta-llama/Llama-3.1-8B-Instruct
-          id: sql-lora-v2
+        - id: sql-lora-v2
           source: yard1/llama-2-7b-sql-lora-test

Diff for: tools/dynamic-lora-sidecar/sidecar/sidecar.py

+15 -2
@@ -135,15 +135,24 @@ def port(self):
     def model_server(self):
         """Model server {host}:{port}"""
         return f"{self.host}:{self.port}"
+
+    @property
+    def default_base_model(self):
+        """Default base model to use when not specified at adapter level"""
+        return self.config.get("defaultBaseModel", "")
 
     @property
     def ensure_exist_adapters(self):
         """Lora adapters in config under key `ensureExist` in set"""
         adapters = self.config.get("ensureExist", {}).get("models", set())
+        default_model = self.default_base_model
+
         return set(
             [
                 LoraAdapter(
-                    adapter["id"], adapter["source"], adapter.get("base-model", "")
+                    adapter["id"],
+                    adapter["source"],
+                    adapter.get("base-model", default_model)
                 )
                 for adapter in adapters
             ]
@@ -153,10 +162,14 @@ def ensure_exist_adapters(self):
     def ensure_not_exist_adapters(self):
         """Lora adapters in config under key `ensureNotExist` in set"""
         adapters = self.config.get("ensureNotExist", {}).get("models", set())
+        default_model = self.default_base_model
+
         return set(
             [
                 LoraAdapter(
-                    adapter["id"], adapter["source"], adapter.get("base-model", "")
+                    adapter["id"],
+                    adapter["source"],
+                    adapter.get("base-model", default_model)
                 )
                 for adapter in adapters
             ]
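
The fallback in both properties rests on `adapter.get("base-model", default_model)`. The sketch below reproduces that lookup outside the sidecar so the behaviour can be checked in isolation. The `LoraAdapter` dataclass and the `resolve_adapters` function are simplified, hypothetical stand-ins, not the sidecar's own classes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoraAdapter:
    """Simplified stand-in for the sidecar's adapter record (hypothetical)."""

    id: str
    source: str
    base_model: str = ""


def resolve_adapters(config: dict, section: str) -> set:
    """Build the adapters listed under `section`, using `defaultBaseModel`
    for entries that do not carry their own `base-model` key."""
    default_model = config.get("defaultBaseModel", "")
    entries = config.get(section, {}).get("models", [])
    return {
        LoraAdapter(
            entry["id"],
            entry["source"],
            # dict.get applies the fallback only when the key is absent.
            entry.get("base-model", default_model),
        )
        for entry in entries
    }


config = {
    "defaultBaseModel": "meta-llama/Llama-2-7b-hf",
    "ensureExist": {
        "models": [
            {
                "id": "tweet-summary-1",
                "source": "vineetsharma/qlora-adapter-Llama-2-7b-hf-TweetSumm",
            },
            {
                "id": "code-assistant",
                "source": "huggingface/code-assistant-lora",
                "base-model": "meta-llama/Llama-2-13b-hf",
            },
        ]
    },
}

resolved = {a.id: a.base_model for a in resolve_adapters(config, "ensureExist")}
```

One consequence worth noting: because `dict.get` falls back only on a missing key, an entry written as `base-model:` with an empty value (which a YAML parser loads as `None`) would resolve to `None` instead of the default. The schema's `type: string` would presumably reject such an entry first, but that path is not shown in this diff.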

Diff for: tools/dynamic-lora-sidecar/sidecar/validation.yaml

+7 -4
@@ -16,6 +16,9 @@ properties:
       name:
         type: string
         description: Name of this config
+      defaultBaseModel:
+        type: string
+        description: Default base model to use when not specified at adapter level
       ensureExist:
         type: object
         description: List of models to ensure existence on specified model server
@@ -26,9 +29,9 @@ properties:
             items:
               type: object
               properties:
-                base_model:
+                base-model:
                   type: string
-                  description: Base model for LoRA adapter
+                  description: Base model for LoRA adapter (overrides defaultBaseModel)
                 id:
                   type: string
                   description: Unique ID of LoRA adapter
@@ -50,9 +53,9 @@ properties:
             items:
               type: object
               properties:
-                base_model:
+                base-model:
                   type: string
-                  description: Base model for LoRA adapter
+                  description: Base model for LoRA adapter (overrides defaultBaseModel)
                 id:
                   type: string
                   description: Unique ID of LoRA adapter
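
The rename from `base_model` to `base-model` aligns the schema with the key that `sidecar.py` actually reads. A config written against the old schema spelling would have its per-adapter value ignored by that lookup. The sketch below is a hypothetical lint, not part of the sidecar, that flags such entries in an already-parsed `vLLMLoRAConfig` dict. Whether the schema itself rejects unknown keys is not visible in these hunks, so this check assumes it may not.

```python
def find_underscore_keys(config: dict) -> list:
    """Return (section, adapter id) pairs whose entry uses the
    `base_model` spelling that the sidecar lookup does not read."""
    hits = []
    for section in ("ensureExist", "ensureNotExist"):
        for entry in config.get(section, {}).get("models", []):
            if "base_model" in entry:
                hits.append((section, entry.get("id", "<missing id>")))
    return hits


config = {
    "defaultBaseModel": "meta-llama/Llama-2-7b-hf",
    "ensureExist": {
        "models": [
            # Underscore spelling: the sidecar would fall back to the default.
            {
                "id": "code-assistant",
                "source": "huggingface/code-assistant-lora",
                "base_model": "meta-llama/Llama-2-13b-hf",
            },
            {
                "id": "tweet-summary-1",
                "source": "vineetsharma/qlora-adapter-Llama-2-7b-hf-TweetSumm",
            },
        ]
    },
}

problems = find_underscore_keys(config)
```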
