Feature: Async handling of sampling calls #840
I believe the one failing test fails because of a race condition. When I add sleeps to the test to force the order of execution, it fails reliably even without my change, as far as I can tell. I will keep digging, but from what I can see, the second "await session.send_request" returns without ever triggering the message handler. Without the sleeps the test passes every time on my machine; with them, it is clear that session.send_request returns before the message handler is called.
… sleeps and prints and I have no idea why?
OKAY! I've tracked down the test_streamablehttp_client_resumption failure to the following behavior (seemingly)... General layout of the test:
Then, the intended flow seems to be:
However, this only seems to work correctly if the tool has sent another notification before we reconnect to it. If the tool has not yet sent another notification, the call to send_request hangs forever. At this point this is well outside the scope of the change I'm trying to make, since I've verified that it happens both with and without my change. But in the spirit of not breaking everyone else, I'll try to fix this as well. If folks want to keep this out and track it in a separate bug, that works for me too. I can revert whatever I do to these files, as they're unrelated to the change I wanted to present.
I've tracked this down as far as I can and concluded that it should ultimately be out of scope for this PR, even if I could find out what's happening, which I haven't managed to do. So I've added a bit more logic to make the test more reliable and opened another issue to capture this error case.
Dispatch _received_request in asynchronous tasks inside the session's _receive_loop.
Motivation and Context
When writing MCP servers that can return large amounts of data, one workable pattern is to "map/reduce" the results: chunk up the backend response, summarize each chunk with an LLM, then combine the summaries and return the combined summary as the tool response. Sampling is the perfect tool for this, but it is currently locked into sequential execution. If I break my data up into 10 chunks, I have to summarize them one after another before my tool can respond. This can lead to very long runtimes, which is unnecessary since each sampling call only needs its own data.
This change should allow much more efficient "map/reduce" using sampling from MCP Servers (without them implementing their own LLM integration server-side).
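To make the map/reduce pattern concrete, here is a minimal sketch of the difference between summarizing chunks one at a time and issuing all the calls concurrently. It uses plain asyncio rather than the SDK, and `fake_sample` is a hypothetical stand-in for a real sampling round trip to the client, so treat it as an illustration under those assumptions rather than as this PR's implementation.

```python
import asyncio
import time


async def fake_sample(chunk: str, delay: float = 0.05) -> str:
    """Hypothetical stand-in for one sampling round trip to the client."""
    await asyncio.sleep(delay)  # simulates the LLM latency
    return f"summary({chunk})"


async def summarize_sequential(chunks: list[str]) -> list[str]:
    # Each call waits for the previous one: total time is roughly n * delay.
    return [await fake_sample(chunk) for chunk in chunks]


async def summarize_concurrent(chunks: list[str]) -> list[str]:
    # All calls are in flight at once: total time is roughly one delay.
    # gather() returns results in the order of its arguments.
    return list(await asyncio.gather(*(fake_sample(chunk) for chunk in chunks)))


async def compare(chunks: list[str]) -> tuple[list[str], list[str], float, float]:
    start = time.perf_counter()
    sequential = await summarize_sequential(chunks)
    sequential_time = time.perf_counter() - start

    start = time.perf_counter()
    concurrent = await summarize_concurrent(chunks)
    concurrent_time = time.perf_counter() - start
    return sequential, concurrent, sequential_time, concurrent_time


if __name__ == "__main__":
    seq, con, seq_t, con_t = asyncio.run(compare([f"chunk-{i}" for i in range(10)]))
    print(f"sequential: {seq_t:.2f}s, concurrent: {con_t:.2f}s, same results: {seq == con}")
```

The concurrent version only helps if the receiving side handles the requests in parallel too; if the session processes them one after another, the calls still complete serially.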
How Has This Been Tested?
I've tested this with fast-agent, which is one of the few MCP clients that implement sampling. It greatly speeds up my applications.
Breaking Changes
No. Unless a client sends sampling requests concurrently (rather than awaiting each one immediately, which is more standard), the behavior will not change.
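As a rough illustration of why the behavior only differs when requests overlap, the sketch below contrasts a receive loop that awaits each handler inline with one that dispatches each handler as a task. It is written against plain asyncio with hypothetical names (`handle_request`, `receive_loop_inline`, `receive_loop_with_tasks`); it is not the session code from this PR.

```python
import asyncio


async def handle_request(request_id: int, delay: float, done: list[int]) -> None:
    """Hypothetical handler: simulates work, then records completion order."""
    await asyncio.sleep(delay)
    done.append(request_id)


async def receive_loop_inline(incoming: list[tuple[int, float]], done: list[int]) -> None:
    # The loop blocks on each handler, so requests finish in arrival order.
    for request_id, delay in incoming:
        await handle_request(request_id, delay, done)


async def receive_loop_with_tasks(incoming: list[tuple[int, float]], done: list[int]) -> None:
    # Each handler runs in its own task, so a slow request no longer
    # holds up the ones that arrive after it.
    tasks = [
        asyncio.create_task(handle_request(request_id, delay, done))
        for request_id, delay in incoming
    ]
    await asyncio.gather(*tasks)


def completion_order(loop_fn, incoming: list[tuple[int, float]]) -> list[int]:
    done: list[int] = []
    asyncio.run(loop_fn(incoming, done))
    return done


if __name__ == "__main__":
    overlapping = [(1, 0.06), (2, 0.01)]  # request 1 is slow, request 2 is fast
    print(completion_order(receive_loop_inline, overlapping))
    print(completion_order(receive_loop_with_tasks, overlapping))
```

With a single in-flight request the two loops behave identically; only overlapping requests can complete in a different order.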
Types of changes
Checklist
Additional context