stdio_client hangs during session.initialize() due to failed message transfer via internal anyio memory stream #382

Closed
RansSelected opened this issue Mar 27, 2025 · 4 comments

@RansSelected

Labels: bug, transport:stdio, client, server, anyio

Body:

Environment:

  • OS: Debian GNU/Linux 11 (bullseye) / kernel: Linux 5.10.0-34-cloud-amd64 / x86-64 (on GCP Vertex AI Workbench)
  • Python Version: 3.10.16
  • MCP Python SDK (modelcontextprotocol) version: v1.5.0
  • anyio version: v4.9.0

Description:

When using the documented mcp.client.stdio.stdio_client to connect to a mcp.server.fastmcp.FastMCP server running via the stdio transport (await mcp.run_stdio_async()), the client consistently hangs during the await session.initialize() call, eventually timing out.

Extensive debugging using monkeypatching revealed the following sequence:

  1. The client connects successfully via stdio_client.
  2. The client sends the initialize request.
  3. The server process starts correctly.
  4. The background task within mcp.server.stdio.stdio_server successfully reads the initialize request from the process's stdin (using anyio.wrap_file(TextIOWrapper(...))).
  5. This background task successfully sends the validated JSONRPCMessage onto the anyio memory stream (read_stream_writer) intended for the server's main processing loop.
  6. The server's main processing loop, specifically within mcp.shared.session.BaseSession._receive_loop, awaits messages on the receiving end of that same anyio memory stream (async for message in self._read_stream:).
  7. Crucially, the async for loop in BaseSession._receive_loop never yields the message that was sent to the memory stream. It remains blocked.
  8. Because the initialize message is never received by the BaseSession loop, no response is generated.
  9. The client eventually times out waiting for the initialize response.

This indicates a failure in message passing across the anyio memory stream used internally by the stdio transport implementation, specifically between the task group managing stdio bridging and the task group managing session message processing, when running under the asyncio backend in this configuration.

A separate test confirmed that replacing the internal anyio memory streams with standard asyncio.Queues does allow the message to be transferred successfully between these task contexts, allowing initialization and subsequent communication to proceed. This strongly suggests the issue lies within the anyio memory stream implementation or its usage in this specific cross-task-group stdio scenario.

Steps to Reproduce:

  1. Save the following server code as mcp_file_server.py:
    (Use the original, unpatched version that calls await mcp.run_stdio_async())

    # mcp_file_server.py (Original - Demonstrates Hang)
    import asyncio
    import sys
    from pathlib import Path
    import logging
    
    logging.basicConfig(level=logging.DEBUG, format='%(asctime)s [%(name)s] %(levelname)s: %(message)s')
    log = logging.getLogger("MCPFileServer_Original")
    
    try:
        import pandas as pd
        from mcp.server.fastmcp import FastMCP
        import mcp.server.stdio as mcp_stdio
    except ImportError as e:
        log.error(f"Import error: {e}")
        sys.exit(1)
    
    mcp = FastMCP("FileToolsServer")
    log.info("FastMCP server 'FileToolsServer' initialized.")
    
    @mcp.tool()
    def FileReaderTool(uri: str) -> str:
        log.info(f"Tool 'FileReaderTool' called with URI: {uri}")
        if not uri.startswith("file:"): return "Error: Invalid URI scheme."
        try:
            fp = Path(uri.replace("file://", "")).resolve()
            if not fp.is_file(): return f"Error: File not found: {fp}"
            content = fp.read_text(encoding="utf-8")
            log.info(f"Read {len(content)} chars from {fp}")
            return content
        except Exception as e: log.exception(f"Error reading file {uri}"); return f"Error: Failed to read file '{uri}'. Reason: {str(e)}"
    
    @mcp.tool()
    def CsvReaderTool(uri: str) -> str:
        log.info(f"Tool 'CsvReaderTool' called with URI: {uri}")
        if not uri.startswith("file:"): return "Error: Invalid URI scheme."
        try:
            fp = Path(uri.replace("file://", "")).resolve()
            if not fp.is_file(): return f"Error: CSV file not found: {fp}"
            df = pd.read_csv(fp)
            content_str = df.to_string(index=False)
            log.info(f"Read and formatted CSV from {fp}")
            return content_str
        except Exception as e: log.exception(f"Error reading CSV file {uri}"); return f"Error: Failed to read CSV file '{uri}'. Reason: {str(e)}"
    
    async def main():
        log.info("Starting MCP server main() coroutine.")
        try:
            log.info("Entering stdio_server context manager...")
            # stdio_server yields anyio memory streams
            async with mcp_stdio.stdio_server() as (read_stream, write_stream):
                log.debug(f"stdio_server provided read_stream: {type(read_stream)}")
                log.debug(f"stdio_server provided write_stream: {type(write_stream)}")
                log.info("stdio streams established. Calling mcp.run_stdio_async()...")
                log.debug(">>> About to await mcp.run_stdio_async()")
                # This internally calls Server.run which uses BaseSession._receive_loop
                await mcp.run_stdio_async()
                log.debug("<<< mcp.run_stdio_async() completed") # This is never reached before client disconnect
                log.info("mcp.run_stdio_async() finished.")
            log.info("stdio_server context exited.")
        except Exception as e:
            log.exception("Exception occurred within stdio_server or mcp.run_stdio_async()")
        finally:
            log.info("MCP server main() function exiting.")
    
    if __name__ == "__main__":
        log.info(f"Executing server script: {__file__}")
        try:
            asyncio.run(main())
        except KeyboardInterrupt: log.info("Server stopped by user.")
        except Exception as e: log.exception("An unexpected error occurred at the top level.")
  2. Save the following client code as minimal_client.py:
    (Use the version corrected for Python 3.10 timeouts and list_tools processing)

    # minimal_client.py
    import asyncio
    import sys
    import logging
    from pathlib import Path
    
    logging.basicConfig(level=logging.INFO, format='%(asctime)s [Minimal Client] %(levelname)s: %(message)s')
    log = logging.getLogger("MinimalClient")
    
    try:
        from mcp import ClientSession, StdioServerParameters, types as mcp_types
        from mcp.client.stdio import stdio_client
    except ImportError as e:
        sys.exit(f"Import Error: {e}. Ensure the 'mcp' package is installed.")
    
    SERVER_SCRIPT_PATH = Path("./mcp_file_server.py").resolve()
    
    async def run_minimal_test_inner():
        log.info("Starting minimal client test.")
        if not SERVER_SCRIPT_PATH.is_file():
            log.error(f"Server script not found: {SERVER_SCRIPT_PATH}")
            return False
        server_params = StdioServerParameters(command=sys.executable, args=[str(SERVER_SCRIPT_PATH)])
        log.info(f"Server params: {sys.executable} {SERVER_SCRIPT_PATH}")
        init_successful = False
        try:
            log.info("Attempting to connect via stdio_client...")
            async with stdio_client(server_params) as (reader, writer):
                log.info("stdio_client connected. Creating ClientSession...")
                async with ClientSession(reader, writer) as session:
                    log.info("ClientSession created. Initializing...")
                    try:
                        init_timeout = 30.0
                        init_result = await asyncio.wait_for(session.initialize(), timeout=init_timeout)
                        log.info(f"Initialize successful! Server capabilities: {init_result.capabilities}")
                        init_successful = True
                        try:
                            list_timeout = 15.0
                            list_tools_response = await asyncio.wait_for(session.list_tools(), timeout=list_timeout)
                            log.info(f"Raw tools list response: {list_tools_response!r}")
                            tools_list = getattr(list_tools_response, 'tools', None)
                            if tools_list is not None and isinstance(tools_list, list):
                                tool_names = [t.name for t in tools_list if hasattr(t, 'name')]
                                if tool_names: log.info(f"Successfully listed tools: {tool_names}")
                                else: log.warning("Tools list present but no tool names found.")
                            else: log.warning("Could not get tools list from response.")
                        except asyncio.TimeoutError: log.error("Timeout listing tools.")
                        except Exception as e_list: log.exception("Error listing tools.")
                    except asyncio.TimeoutError: log.error(f"Timeout ({init_timeout}s) waiting for session.initialize().")
                    except Exception as e_init: log.exception("Error during session.initialize().")
                log.info("Exiting ClientSession context.")
            log.info("Exiting stdio_client context.")
        except Exception as e_main: log.exception(f"An error occurred connecting or during session: {e_main}")
        return init_successful
    
    async def main_with_overall_timeout():
        overall_timeout = 45.0
        log.info(f"Running test with overall timeout: {overall_timeout}s")
        try:
            success = await asyncio.wait_for(run_minimal_test_inner(), timeout=overall_timeout)
            if success: log.info("Minimal client test: INITIALIZATION SUCCEEDED.")
            else: log.error("Minimal client test: INITIALIZATION FAILED (within timeout).")
        except asyncio.TimeoutError: log.error(f"Minimal client test: OVERALL TIMEOUT ({overall_timeout}s) REACHED.")
        except Exception as e: log.exception("Unexpected error in main_with_overall_timeout")
    
    if __name__ == "__main__":
        try: asyncio.run(main_with_overall_timeout())
        except KeyboardInterrupt: log.info("Test interrupted.")
  3. Install dependencies: pip install mcp pandas (or the uv equivalent)

  4. Run the client: python minimal_client.py

Expected Behavior:

The client connects, initializes successfully, lists tools, and exits cleanly.

Actual Behavior:

The client connects but hangs at the Initializing... step. After the 30-second timeout expires for session.initialize(), it logs the timeout error and exits. Server logs confirm that mcp.run_stdio_async() was awaited but never processed the incoming message until after the client disconnected.

Logs:

(Logs showing the client timeout and the server hanging after >>> About to await mcp.run_stdio_async())

uv run minimal_client.py 
2025-03-27 09:59:39,140 [Minimal Client] INFO: Running test with overall timeout: 45.0s
2025-03-27 09:59:39,140 [Minimal Client] INFO: Starting minimal client test.
2025-03-27 09:59:39,140 [Minimal Client] INFO: Server params: /home/jupyter/MCP_TEST/.venv/bin/python3 /home/jupyter/MCP_TEST/mcp_file_server.py
2025-03-27 09:59:39,140 [Minimal Client] INFO: Attempting to connect via stdio_client...
2025-03-27 09:59:39,144 [Minimal Client] INFO: stdio_client connected. Creating ClientSession...
2025-03-27 09:59:39,144 [Minimal Client] INFO: ClientSession created. Initializing...
2025-03-27 09:59:39,807 [mcp.server.lowlevel.server] DEBUG: Initializing server 'FileToolsServer'
2025-03-27 09:59:39,807 [mcp.server.lowlevel.server] DEBUG: Registering handler for ListToolsRequest
2025-03-27 09:59:39,807 [mcp.server.lowlevel.server] DEBUG: Registering handler for CallToolRequest
2025-03-27 09:59:39,807 [mcp.server.lowlevel.server] DEBUG: Registering handler for ListResourcesRequest
2025-03-27 09:59:39,807 [mcp.server.lowlevel.server] DEBUG: Registering handler for ReadResourceRequest
2025-03-27 09:59:39,807 [mcp.server.lowlevel.server] DEBUG: Registering handler for PromptListRequest
2025-03-27 09:59:39,807 [mcp.server.lowlevel.server] DEBUG: Registering handler for GetPromptRequest
2025-03-27 09:59:39,807 [mcp.server.lowlevel.server] DEBUG: Registering handler for ListResourceTemplatesRequest
2025-03-27 09:59:39,811 [MCPFileServer_Original] INFO: FastMCP server 'FileToolsServer' initialized.
2025-03-27 09:59:39,813 [MCPFileServer_Original] INFO: Executing server script: /home/jupyter/MCP_TEST/mcp_file_server.py
2025-03-27 09:59:39,813 [asyncio] DEBUG: Using selector: EpollSelector
2025-03-27 09:59:39,813 [MCPFileServer_Original] INFO: Starting MCP server main() coroutine.
2025-03-27 09:59:39,814 [MCPFileServer_Original] INFO: Entering stdio_server context manager...
2025-03-27 09:59:39,817 [MCPFileServer_Original] DEBUG: stdio_server provided read_stream: <class 'anyio.streams.memory.MemoryObjectReceiveStream'>
2025-03-27 09:59:39,817 [MCPFileServer_Original] DEBUG: stdio_server provided write_stream: <class 'anyio.streams.memory.MemoryObjectSendStream'>
2025-03-27 09:59:39,817 [MCPFileServer_Original] INFO: stdio streams established. Calling mcp.run_stdio_async()...
2025-03-27 09:59:39,817 [MCPFileServer_Original] DEBUG: >>> About to await mcp.run_stdio_async()
2025-03-27 10:00:09,175 [Minimal Client] ERROR: Timeout (30.0s) waiting for session.initialize().
2025-03-27 10:00:09,175 [Minimal Client] INFO: Exiting ClientSession context.
2025-03-27 10:00:09,176 [MCPFileServer_Original] DEBUG: <<< mcp.run_stdio_async() completed
2025-03-27 10:00:09,176 [MCPFileServer_Original] INFO: mcp.run_stdio_async() finished.
2025-03-27 10:00:24,165 [Minimal Client] ERROR: Minimal client test: OVERALL TIMEOUT (45.0s) REACHED.

Additional Context:

  • Further debugging using extensive monkeypatching confirmed that the background task in mcp.server.stdio.stdio_server does successfully read the initialize request from stdin and sends it to the internal anyio memory stream.
  • However, the async for loop within mcp.shared.session.BaseSession._receive_loop (which reads from that memory stream) never yields the message.
  • Replacing the internal anyio memory streams with standard asyncio.Queues allowed the communication to succeed, isolating the problem to the anyio memory stream communication between the stdio bridging task group and the session processing task group.

This appears to be a bug in the stdio transport implementation related to anyio memory streams and task group interaction under the asyncio backend.

The patched working version, which uses asyncio.Queue, is attached:

[working_code.zip](https://github.com/user-attachments/files/19485125/working_code.zip)

Run via uv run minimal_client.py

@dsp-ant
Member

dsp-ant commented Mar 27, 2025

This might be the same issue as #201, which is fixed on main, but has not been released. Do you mind trying this with the new version from main?

@Stark-X

Stark-X commented Mar 28, 2025

> This might be the same issue as #201, which is fixed on main, but has not been released. Do you mind trying this with the new version from main?

I tried with the latest version from main, and the error still occurs.

❯ mcp version
MCP version 1.6.1.dev4+2ea1495

@j-z10

j-z10 commented Mar 28, 2025

@Stark-X I believe this is the root cause of the timeout. You created a read_stream and write_stream externally, while mcp.run_stdio_async() also creates its own read and write streams. Replacing await mcp.run_stdio_async() with: await mcp._mcp_server.run(read_stream, write_stream, mcp._mcp_server.create_initialization_options()) would fix it.
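
For later readers, here is a dependency-free sketch of the diagnosis above. It is a model only: asyncio.Queue stands in for an anyio memory stream pair so the example runs without the SDK, and the names deliver, outer, inner, and INIT_REQUEST are hypothetical. The "outer" channel represents the pair created by the script's own stdio_server() block, and the "inner" channel represents the second pair that mcp.run_stdio_async() is described as creating for itself.

```python
import asyncio

# Hypothetical stand-in for the JSON-RPC initialize request read from stdin.
INIT_REQUEST = '{"jsonrpc": "2.0", "id": 0, "method": "initialize"}'


async def deliver(shared_pair: bool, timeout: float = 0.2) -> str:
    """Model one message travelling from the stdin bridge to the session loop.

    shared_pair=False: the session listens on a second, independent channel
    (the situation described in this issue).
    shared_pair=True: the session is handed the same channel the bridge
    writes to (the suggested fix).
    """
    outer = asyncio.Queue()  # models the pair from the script's stdio_server()
    inner = asyncio.Queue()  # models the pair created inside run_stdio_async()

    # The session loop only ever reads from the channel it was given.
    session_channel = outer if shared_pair else inner

    async def outer_bridge() -> None:
        # The outer bridge task wins the read on stdin and forwards the message.
        await outer.put(INIT_REQUEST)

    bridge = asyncio.create_task(outer_bridge())
    try:
        return await asyncio.wait_for(session_channel.get(), timeout)
    except asyncio.TimeoutError:
        # Nothing was ever written to the channel the session is reading.
        return "timeout"
    finally:
        await bridge


if __name__ == "__main__":
    print(asyncio.run(deliver(shared_pair=False)))  # prints "timeout"
    print(asyncio.run(deliver(shared_pair=True)))   # prints the initialize request
```

In this model the send always succeeds, which matches the monkeypatched observation that the bridge "successfully sends" the message. The receive loop blocks only because it is attached to a different channel, not because the channel implementation drops anything.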

@dsp-ant
Member

dsp-ant commented Mar 31, 2025

> @Stark-X I believe this is the root cause of the timeout. You created a read_stream and write_stream externally, while mcp.run_stdio_async() also creates its own read and write streams. Replacing await mcp.run_stdio_async() with: await mcp._mcp_server.run(read_stream, write_stream, mcp._mcp_server.create_initialization_options()) would fix it.

Yes, I think @Stark-X is right here. I think this is a matter of mixing low-level primitives and FastMCP.

@dsp-ant dsp-ant closed this as completed Mar 31, 2025