1 parent 203e03c commit 44769cf
examples/online_serving/disaggregated_prefill.sh
@@ -3,6 +3,8 @@
 # We will launch 2 vllm instances (1 for prefill and 1 for decode),
 # and then transfer the KV cache between them.
 
+set -xe
+
 echo "🚧🚧 Warning: The usage of disaggregated prefill is experimental and subject to change 🚧🚧"
 sleep 1
 
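The hunk above adds `set -xe`, which makes the script trace each command (`-x`) and exit on the first failing command (`-e`). A minimal sketch of that behavior follows; `run_demo` and the `echo before; false; echo after` body are hypothetical and not part of the commit.

```shell
# Hypothetical demo (not from the commit): run the same tiny script body in a
# child bash with and without the flags that `set -xe` enables.
run_demo() {
    # $1: optional bash flags such as "-xe"; left unquoted so an empty value vanishes.
    # The -x trace is written to stderr, which is discarded here so that only
    # the regular output of the body is captured.
    bash $1 -c 'echo before; false; echo after' 2>/dev/null
}

without_e=$(run_demo "" || true)     # `false` fails, but execution continues
with_e=$(run_demo "-xe" || true)     # -e stops the body at `false`

echo "without flags: $(echo $without_e)"
echo "with -xe:      $(echo $with_e)"
```

In the example script this presumably means a failed launch step aborts the run instead of letting later steps proceed against a missing server.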
@@ -69,7 +71,7 @@ wait_for_server 8200
 # instance
 # NOTE: the usage of this API is subject to change --- in the future we will
 # introduce "vllm connect" to connect between prefill and decode instances
-python3 ../benchmarks/disagg_benchmarks/disagg_prefill_proxy_server.py &
+python3 ../../benchmarks/disagg_benchmarks/disagg_prefill_proxy_server.py &
 
 
 # serve two example requests
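The second hunk changes the proxy server path from `../benchmarks/...` to `../../benchmarks/...`, which is consistent with the script living two directories below the repository root at `examples/online_serving/`. Both forms are resolved against the caller's current working directory, so the script still has to be launched from its own directory. As a sketch of one possible alternative, not something this commit does, the path could be anchored to the script's location instead; `resolve_from_script` and `PROXY_SERVER` are hypothetical names.

```shell
# Hypothetical helper (not part of the commit): turn a path that is relative
# to a script's directory into an absolute one, independent of the caller's cwd.
resolve_from_script() {
    # $1: path to the script, $2: path relative to that script's directory
    script_dir="$(cd "$(dirname -- "$1")" && pwd)" || return 1
    printf '%s/%s\n' "$script_dir" "$2"
}

# Assumed usage inside the example script, where "$0" is the script's own path.
PROXY_SERVER="$(resolve_from_script "$0" "../../benchmarks/disagg_benchmarks/disagg_prefill_proxy_server.py")"
echo "proxy server path: $PROXY_SERVER"
```

With this approach the launch line would become `python3 "$PROXY_SERVER" &`, at the cost of a few extra lines in the example.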