1 file changed, +10 -1 lines changed
@@ -25,7 +25,16 @@ See our website at https://gateway-api-inference-extension.sigs.k8s.io/ for deta
## Roadmap
- Coming soon!
+ As Inference Gateway builds towards a GA release, we will continue to expand our capabilities, namely:
+ 1. Prefix-cache aware load balancing with interfaces for remote caches
+ 1. Recommended LoRA adapter pipeline for automated rollout
+ 1. Fairness and priority between workloads within the same criticality band
+ 1. HPA support for autoscaling on aggregate metrics derived from the load balancer
+ 1. Support for large multi-modal inputs and outputs
+ 1. Support for other GenAI model types (diffusion and other non-completion protocols)
+ 1. Heterogeneous accelerators - serve workloads on multiple types of accelerator using latency and request cost-aware load balancing
+ 1. Disaggregated serving support with independently scaling pools
+
## End-to-End Tests