How to stream model generation with the vllm.LLM programming API?
#15239
I mean, I know that running vLLM as a server and using the OpenAI-compatible API can achieve this. But I don't want to start a server. Can I generate output with streaming through the programming API? I've walked through the docs and found nothing about it.

Replies: 1 comment

- This cannot be done.