Skip to content

Feature request: add support for streaming tool use #1883

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
lsorber opened this issue Dec 25, 2024 · 2 comments · Fixed by superlinear-ai/raglite#71 · May be fixed by #1884
Open

Feature request: add support for streaming tool use #1883

lsorber opened this issue Dec 25, 2024 · 2 comments · Fixed by superlinear-ai/raglite#71 · May be fixed by #1884

Comments

@lsorber
Copy link
Contributor

lsorber commented Dec 25, 2024

The combination stream=True, tool_choice="auto" raises an exception right now, which means that developers are stuck with one of two unfortunate choices:

  1. Developing an application that streams the response but cannot use tools
  2. Developing an LLM application that can use tools but cannot stream the response

Relevant discussion: #1615

@SaymV
Copy link

SaymV commented Apr 22, 2025

Admittedly this is the wrong place to ask this question but as a beginner I feel like you're the right person to answer:

Does something need to be done to llama.cpp directly in order to handle streaming tool calling? I see from your feature branch that you added a RAG layer to this python implementation. I ask because I built llamma.cpp from source figuring it would be better optimized for my system, but I am stuck with this server error
{"code":500,"message":"Cannot use tools with stream","type":"server_error"}.

Is it the case that if I installed the pre-built python version that this would go away?

Edit: I see here that there's a PR in draft. We're too close to the bleeding edge!

@edmcman
Copy link

edmcman commented Apr 22, 2025

Llama.cpp is still waiting on ggml-org/llama.cpp#12379

I'm not sure how this python library handles tools. I think it is somewhat different though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants