
[Usage]: Is pipeline parallelism supported on machines that are not in the same local network? #11285


Closed
1 task done
oldcpple opened this issue Dec 18, 2024 · 4 comments
Labels
stale (Over 90 days of inactivity), usage (How to use vllm)

Comments

@oldcpple

oldcpple commented Dec 18, 2024

How would you like to use vllm

Hi there, since communication between nodes is done by NCCL (which I guess typically relies on RDMA), I wonder if I can set up an inference pipeline with machines on different networks, for example one on Google Cloud and another on AWS, through vLLM's pipeline parallelism?
Thanks a lot if anyone can answer this.
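For context, multi-node pipeline parallelism in vLLM is normally launched within a single low-latency network on top of a Ray cluster. A minimal sketch (the model name and head-node address are placeholders, not from this thread):

```shell
# On the head node (assumption: both nodes share a fast local network)
ray start --head --port=6379

# On each worker node, join the same Ray cluster
ray start --address=<head-node-ip>:6379

# Back on the head node: serve with 2 pipeline stages, one per node
# (model name is a placeholder)
vllm serve meta-llama/Llama-3.1-8B-Instruct --pipeline-parallel-size 2
```

The question here is whether the same setup works when the nodes sit in different clouds, where NCCL's usual fast transports are unavailable.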

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
@oldcpple oldcpple added the usage How to use vllm label Dec 18, 2024
@noooop
Contributor

noooop commented Dec 18, 2024

Do you really want to do this?

A typical vLLM step takes about 20 ms, while copying an intermediate result (a large tensor) over the network is very slow.

vLLM is also currently scheduled synchronously, so the delay from transmitting intermediate results over the network will greatly reduce GPU utilization, increase latency, and reduce throughput.
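A rough back-of-envelope comparison illustrates the point. All numbers below are assumptions for illustration (batch size, hidden dimension, bandwidths, and RTTs are not from this thread); only the ~20 ms step time comes from the comment above.

```python
# Each pipeline-parallel hop must ship the hidden-state tensor for the
# current batch to the next stage at every decoding step.

def transfer_ms(tensor_bytes: int, bandwidth_gbps: float, rtt_ms: float) -> float:
    """Time to move one intermediate tensor: round-trip latency plus wire time."""
    return rtt_ms + (tensor_bytes * 8) / (bandwidth_gbps * 1e9) * 1e3

# Assumed workload: batch of 32 sequences, hidden dim 4096, fp16 (2 bytes)
hidden_bytes = 32 * 4096 * 2

lan_ms = transfer_ms(hidden_bytes, bandwidth_gbps=100, rtt_ms=0.05)  # RDMA-class link
wan_ms = transfer_ms(hidden_bytes, bandwidth_gbps=1, rtt_ms=50)      # cross-cloud link

step_ms = 20  # typical vLLM step, per the comment above
print(f"LAN hop: {lan_ms:.2f} ms, WAN hop: {wan_ms:.2f} ms, step: {step_ms} ms")
```

Under these assumptions a single cross-cloud hop costs more than an entire compute step, so with synchronous scheduling the GPUs would spend most of their time idle waiting on the network.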

You can keep an eye on the progress of disaggregated prefilling instead.

It seems to be asynchronous, which is awesome.

@lihuahua123
Contributor

Do you really want to do this?

A typical vLLM step takes about 20 ms, while copying an intermediate result (a large tensor) over the network is very slow.

vLLM is also currently scheduled synchronously, so the delay from transmitting intermediate results over the network will greatly reduce GPU utilization, increase latency, and reduce throughput.

You can keep an eye on the progress of disaggregated prefilling instead.

It seems to be asynchronous, which is awesome.

Is disaggregated prefilling a better fit when the network between the machines is slow? And I guess it also needs roughly double the memory?


This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you!

@github-actions github-actions bot added the stale Over 90 days of inactivity label Mar 22, 2025

This issue has been automatically closed due to inactivity. Please feel free to reopen if you feel it is still relevant. Thank you!

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Apr 21, 2025
Projects
None yet
Development

No branches or pull requests

3 participants