5 files changed: +40 -3 lines changed
@@ -1,5 +1,8 @@
 ## Quickstart
 
+### Requirements
+The current manifests rely on Envoy Gateway [v1.2.1](https://gateway.envoyproxy.io/docs/install/install-yaml/#install-with-yaml) or higher.
+
 ### Steps
 
 1. **Deploy Sample vLLM Application**
@@ -61,7 +61,17 @@
         request:
           body: Buffered
         response:
-      messageTimeout: 5s
+      # The timeouts are likely not needed here. We can experiment with removing/tuning them slowly.
+      # The connection limits are more important and will cause the opaque: ext_proc_gRPC_error_14 error in Envoy GW if not configured correctly.
+      messageTimeout: 1000s
+      backendSettings:
+        circuitBreaker:
+          maxConnections: 40000
+          maxPendingRequests: 40000
+          maxParallelRequests: 40000
+        timeout:
+          tcp:
+            connectTimeout: 24h
   targetRef:
     group: gateway.networking.k8s.io
     kind: HTTPRoute
@@ -44,4 +44,7 @@
   - backendRefs:
     - group: gateway.envoyproxy.io
       kind: Backend
-      name: backend-dummy
+      name: backend-dummy
+    timeouts:
+      request: "24h"
+      backendRequest: "24h"
@@ -26,9 +26,14 @@
           original_dst_lb_config:
             use_http_header: true
             http_header_name: "target-pod"
-          connect_timeout: 6s
+          connect_timeout: 1000s
           lb_policy: CLUSTER_PROVIDED
           dns_lookup_family: V4_ONLY
+          circuit_breakers:
+            thresholds:
+              - max_connections: 40000
+                max_pending_requests: 40000
+                max_requests: 40000
 
     - type: "type.googleapis.com/envoy.config.route.v3.RouteConfiguration"
       name: default/<GATEWAY-NAME>/llm-gw
@@ -0,0 +1,16 @@
+apiVersion: gateway.envoyproxy.io/v1alpha1
+kind: BackendTrafficPolicy
+metadata:
+  name: high-connection-route-policy
+spec:
+  targetRefs:
+  - group: gateway.networking.k8s.io
+    kind: HTTPRoute
+    name: llm-route
+  circuitBreaker:
+    maxConnections: 40000
+    maxPendingRequests: 40000
+    maxParallelRequests: 40000
+  timeout:
+    tcp:
+      connectTimeout: 24h
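
The diff mixes units (`5s`, `6s`, `1000s`, `24h`), which makes the size of each change hard to compare at a glance. The following is a minimal sketch, not part of this change, that converts those duration strings to seconds. The helper name `to_seconds` is hypothetical, and it handles only the single-unit forms that appear in this diff, not the full duration grammar accepted by Envoy or Kubernetes.

```python
# Hypothetical helper, not part of the PR: convert single-unit duration
# strings such as "5s" or "24h" to a whole number of seconds.
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def to_seconds(duration: str) -> int:
    """Return the number of seconds for a string like '1000s' or '24h'."""
    value, unit = duration[:-1], duration[-1:]
    if unit not in UNIT_SECONDS or not value.isdigit():
        raise ValueError(f"unsupported duration: {duration!r}")
    return int(value) * UNIT_SECONDS[unit]


# Old and new values copied from the hunks in this diff.
changes = {
    "messageTimeout": ("5s", "1000s"),
    "connect_timeout": ("6s", "1000s"),
}

for field, (old, new) in changes.items():
    print(f"{field}: {to_seconds(old)}s -> {to_seconds(new)}s")

# The newly added connectTimeout / request / backendRequest values.
print(f"24h = {to_seconds('24h')}s")
```

Compound forms such as `1h30m` are rejected with a `ValueError` instead of being partially parsed, so an unexpected value cannot be silently misread.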