Commit 388993c

Testing CLA
1 parent 5b82374 commit 388993c

File tree: 1 file changed (+5 −3 lines)


Diff for: README.md

@@ -1,4 +1,6 @@
-# Gateway API Inference Extension
+# DO NOT MERGE
+
+# Gateway API Inference Extension
 
 This extension upgrades an [ext-proc](https://www.envoyproxy.io/docs/envoy/latest/configuration/http/http_filters/ext_proc_filter)-capable proxy or gateway - such as Envoy Gateway, kGateway, or the GKE Gateway - to become an **inference gateway** - supporting inference platform teams self-hosting large language models on Kubernetes. This integration makes it easy to expose and control access to your local [OpenAI-compatible chat completion endpoints](https://platform.openai.com/docs/api-reference/chat) to other workloads on or off cluster, or to integrate your self-hosted models alongside model-as-a-service providers in a higher level **AI Gateway** like LiteLLM, Solo AI Gateway, or Apigee.
 
@@ -26,8 +28,8 @@ See our website at https://gateway-api-inference-extension.sigs.k8s.io/ for deta
 ## Roadmap
 
 As Inference Gateway builds towards a GA release. We will continue to expand our capabilities, namely:
-1. Prefix-cache aware load balancing with interfaces for remote caches
-1. Recommended LoRA adapter pipeline for automated rollout
+1. Prefix-cache aware load balancing with interfaces for remote caches
+1. Recommended LoRA adapter pipeline for automated rollout
 1. Fairness and priority between workloads within the same criticality band
 1. HPA support for autoscaling on aggregate metrics derived from the load balancer
 1. Support for large multi-modal inputs and outputs

0 commit comments

Comments
 (0)