Commit a885ea9

nit
Signed-off-by: Kuromesi <[email protected]>
1 parent 6712198 commit a885ea9

File tree

4 files changed: +15 -176 lines changed


config/charts/inferencepool/README.md (+13 -29)
@@ -1,34 +1,20 @@
-# Gateway Api Inference Extension
+# InferencePool
 
-A chart to deploy the inference extension and a InferencePool managed by the extension.
+A chart to deploy an InferencePool and a corresponding EndpointPicker (epp) deployment.
 
-## Install
-
-Suppose now a vllm service with label `app: vllm-llama2-7b` and served on port `8000` is deployed in `default` namespace in the cluster.
+## Install
 
-To deploy the inference extension, you can run the following command:
+To install an InferencePool named `pool-1` that selects from endpoints with label `app: vllm-llama2-7b` listening on port `8000`, you can run the following command:
 
 ```txt
-$ helm install my-release . -n default \
-  --set inferencePool.targetPortNumber=8000 \
-  --set inferencePool.selector.app=vllm-llama2-7b
-```
-
-Or you can change the `values.yaml` to:
-
-```yaml
-inferencePool:
-  name: pool-1
-  targetPortNumber: 8000
-  selector:
-    app: vllm-llama2-7b
+$ helm install my-release ./config/charts/inferencepool \
+  --set inferencePool.name=pool-1 \
+  --set inferencePool.selector.app=vllm-llama2-7b \
+  --set inferencePool.targetPortNumber=8000
 ```
 
-where `inferencePool.targetPortNumber` is the pod that vllm backends served on and `inferencePool.selector` is the selector to match the vllm backends. And then run:
-
-```txt
-$ helm install my-release .
-```
+where `inferencePool.targetPortNumber` is the port that the vllm backends serve on and `inferencePool.selector` is the label selector used to match the vllm backends.

@@ -44,18 +30,16 @@ The following table list the configurable parameters of the chart.
 
 | **Parameter Name** | **Description** |
 |---------------------------------------------|-------------------------------------------------------------------------------------------------------------------|
+| `inferencePool.name` | Name of the InferencePool; the inference extension will be named `${inferencePool.name}-epp`. |
+| `inferencePool.targetPortNumber` | Target port number of the vllm backends; used by the inference extension to scrape metrics. |
+| `inferencePool.selector` | Label selector to match vllm backends managed by the inference pool. |
 | `inferenceExtension.replicas` | Number of replicas for the inference extension service. Defaults to `1`. |
 | `inferenceExtension.image.name` | Name of the container image used for the inference extension. |
 | `inferenceExtension.image.hub` | Registry URL where the inference extension image is hosted. |
 | `inferenceExtension.image.tag` | Image tag of the inference extension. |
 | `inferenceExtension.image.pullPolicy` | Image pull policy for the container. Possible values: `Always`, `IfNotPresent`, or `Never`. Defaults to `Always`. |
 | `inferenceExtension.extProcPort` | Port where the inference extension service is served for external processing. Defaults to `9002`. |
-| `inferencePool.name` | Name for the InferencePool, and inference extension will be named as `${inferencePool.name}-epp`. |
-| `inferencePool.targetPortNumber` | Target port number for the vllm backends, will be used to scrape metrics by the inference extension. |
-| `inferencePool.selector` | Label selector to match vllm backends managed by the inference pool. |
 
 ## Notes
 
-This chart will only deploy the inference extension and InferencePool, before install the chart, please make sure that the inference extension CRDs have already been installed in the cluster. And You need to apply traffic policies to route traffic to the inference extension from the gateway after the inference extension is deployed.
-
-For more details, please refer to the [website](https://gateway-api-inference-extension.sigs.k8s.io/guides/).
+This chart deploys only an InferencePool and its corresponding EndpointPicker extension. Before installing the chart, please make sure the inference extension CRDs are installed in the cluster. For more details, please refer to the [getting started guide](https://gateway-api-inference-extension.sigs.k8s.io/guides/).
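The `--set` flags in the updated install command correspond one-to-one to keys in the chart's `values.yaml`. As a sketch (the file name `values-override.yaml` is just an example, not part of the chart), the same configuration could be kept in an override file instead:

```yaml
# values-override.yaml (hypothetical file name)
# Mirrors the --set flags from the README's install command.
inferencePool:
  name: pool-1
  targetPortNumber: 8000    # port the vllm backends serve on; also used for metrics scraping
  selector:
    app: vllm-llama2-7b     # label selector matching the vllm backend pods
```

which could then be installed with the standard `-f` flag, e.g. `helm install my-release ./config/charts/inferencepool -f values-override.yaml`.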
@@ -1 +1 @@
-Gateway api inference extension deployed.
+InferencePool {{ .Values.inferencePool.name }} deployed.
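The new NOTES.txt line uses Helm's Go-template syntax, substituting `.Values.inferencePool.name` at install time. With the chart's default `values.yaml` shown in this commit (`name: pool-1`), it would render as:

```txt
InferencePool pool-1 deployed.
```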

config/charts/inferencepool/values.yaml (+1 -1)
@@ -11,4 +11,4 @@ inferencePool:
   name: pool-1
   targetPortNumber: 8000
   selector:
-    app: vllm-llama2-7b
+    app: vllm-llama2-7b

config/manifests/generated.yaml (-145)

This file was deleted.
