Pluggable Transport Abstractions #1591
Conversation
The session objects aren't transport specific. What problem are you trying to solve? Why does this PR have so many "👍"?
Hey @Kludex, thanks for your reply. Since the gRPC transport does not require read/write streams, this PR creates abstract classes following the interface segregation principle, so that the abstract classes can be inherited by the current transports in this SDK as well as any future transport like

This is part of the modelcontextprotocol/modelcontextprotocol#1352 initiative. Please let me know if I need to explain further. Thanks.
Yeah, I don't think there's any problem in that. Again, the session classes do not depend on any transport.
The minimum is having the streams, check the
Are you talking about what is implemented in #1936? Streams are the same as queues; you do require queues in that PR. Replace the
I've added a PR to merge in here - it's a full implementation of MCP using gRPC. I've been using gRPC/thrift/avro for a while, and this PR implements full backward compatibility with the proto I created. I'm totally open to changes and hope I can contribute to the effort. Most important: it implements true streaming calls. I've added three documents in the proto directory that cover the following:
If you want to make a point, please make it here. I don't think it's reasonable to tell me to read documents somewhere else.
Use

I'll be closing this, since a new interface or abstraction is not needed.
Fair point on the documentation - I will put them in a separate reply. I didn't want to inundate you - apologies for that, as I'd love to have a discussion about the tunneling. I've refactored to use

I see Google Cloud just pushed their mcp-grpc-transport-proto today. This validates the typed RPC approach - they're not wrapping JSON-RPC in protobuf, they have typed RPCs for each MCP operation:

```proto
service Mcp {
  rpc ListResources(ListResourcesRequest) returns (ListResourcesResponse);
  rpc ListTools(ListToolsRequest) returns (ListToolsResponse);
  rpc CallTool(CallToolRequest) returns (stream CallToolResponse);
  // ...
}
```

This aligns with what I implemented. However, looking at their proto, I think there's room for improvement: Google's proto is mostly unary RPCs. Only

They use a

Here's what I propose: I've drafted an extension that adds true streaming while staying compatible with Google's base proto:

```proto
service McpStreaming {
  // Bidirectional session for multiplexed operations
  rpc Session(stream SessionRequest) returns (stream SessionResponse);

  // Push notifications for resource changes
  rpc WatchResources(WatchResourcesRequest) returns (stream WatchResourcesResponse);

  // Stream large resources in chunks
  rpc ReadResourceChunked(ReadResourceChunkedRequest) returns (stream ResourceChunk);

  // Parallel tool execution
  rpc StreamToolCalls(stream StreamToolCallsRequest) returns (stream StreamToolCallsResponse);
}
```

Another point I'd like to discuss: schema registry integration. There's also an opportunity here for dynamic schema management. MCP tools declare JSON schemas for inputs, but with gRPC we could integrate with schema registries (like Confluent Schema Registry, Apicurio, or AWS Glue). This would allow us to:
I've forked Google's proto repo and drafted a proposal: https://github.com/ai-pipestream/mcp-grpc-transport-proto/tree/streaming-extensions (I'll push more changes in a moment). Questions I'd like feedback on:

Happy to open PRs to either Google's repo or the MCP SDK to continue the discussion with concrete code.
I think you are anticipating yourself a bit here. I have no interest nor opinion in how the specific gRPC transport implementation should look.

How do you see that happening, and why does the user need to define the chunks the server will send themselves? I'm asking those questions based on the snippet I saw in your branch:

```python
import asyncio

from mcp import StreamPromptCompletionChunk
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("Streaming Prompt Completion")


@mcp.stream_prompt_completion()
async def stream_prompt_completion(name: str, arguments: dict[str, str] | None):
    query = (arguments or {}).get("q", "")
    tokens = [f"Prompt {name}: ", query, " ...done"]
    for token in tokens[:-1]:
        yield StreamPromptCompletionChunk(token=token)
        await asyncio.sleep(0.05)
    yield StreamPromptCompletionChunk(
        token=tokens[-1],
        isFinal=True,
        finishReason="stop",
    )
```

I don't think the above is how we want to do it at any level - we have a lower level server and the high level (more user-friendly
I don't have an opinion nor do I care about gRPC, but it should not live in this repository, given that it's an extension. Going back to the original intent of this PR: the gRPC transport implemented can't be compliant, given that MCP is tightly coupled with JSON-RPC, which is reflected by the schemas in https://github.com/modelcontextprotocol/modelcontextprotocol/blob/main/schema/2025-11-25/schema.ts and the documentation. So... now I understand why the

If we want to have an abstraction that will make sense for transport implementers, the definition of the MCP types itself needs to change.
Totally understand - I should've explained my motivation, and apologies for trying it here. I initially saw the gRPC CTA: I submitted an email and it was suggested I join the discussion here for the gRPC work being done - my mistake, though; I didn't realize this was the wrong place for it. To answer your question about the

I meant to show what native gRPC streaming could enable, not a thought-out UX. The chunking abstraction definitely shouldn't be exposed to users like that at the high level. I'll go ahead and focus on the gRPC proto definition in Google's repo. Thanks for the clarification on where this discussion belongs.
Just to be clear, it's the specifics of gRPC that I don't care about. I'm happy to discuss ways to create an interface to make it easier for people to work on their own transport implementations. :)
That's actually what I was exploring with the tunneling proposal - a way to create an interface that works for both streaming transports (like gRPC) and cursor-based transports (like JSON-RPC) without changing either wire protocol.

The Problem

Right now, if I want to implement a gRPC transport, I'm forced into "fake streaming" - the server has to buffer everything into a

No memory or latency benefits - just extra steps.

The Idea: StreamingAdapter

What if streaming was the internal abstraction, and cursor-based transports just emulated it?

```python
class StreamingAdapter:
    """Unified streaming interface over any transport."""

    def __init__(self, transport: ClientTransportSession):
        self._transport = transport

    async def stream_list_tools(self) -> AsyncIterator[types.Tool]:
        if isinstance(self._transport, GrpcClientTransport):
            # Native streaming - zero overhead
            async for tool in self._transport._stream_list_tools_native():
                yield tool
        else:
            # Cursor-based - iterate pages internally
            cursor = None
            while True:
                result = await self._transport.list_tools(cursor=cursor)
                for tool in result.tools:
                    yield tool
                cursor = result.nextCursor
                if cursor is None:
                    break

    async def list_tools(self) -> types.ListToolsResult:
        """Backward compatible - collects stream into result."""
        tools = [t async for t in self.stream_list_tools()]
        return types.ListToolsResult(tools=tools, nextCursor=None)
```

The key points:
What Changes, What Doesn't
The JSON-RPC wire protocol stays identical. Existing servers and clients don't need updates. Streaming is purely additive.

Backpressure Reality

One thing I didn't want to hide - transports have different backpressure characteristics:
The adapter preserves gRPC's backpressure. For cursor-based transports, "backpressure" is implicit in when the client requests the next page. I think adapters should preserve these realities rather than pretending they don't exist.

Why I like this approach

Simply put, I think this gives us the best of both worlds for transport implementers. This approach:
I'm not saying this is the right answer - there are probably many ways to do this. I was inspired by how IPv6 tunneling works - the same concept of letting the new protocol work natively where supported while transparently bridging over the old one. I can write this up in Java and Python. The real win is that MCP implementations could react instantly to requests and maintain two-way communication between agents without blocking on replies.
The types in the spec itself need to be decoupled from JSON-RPC. Once they are, we can look for ways to make the BaseSession more "pluggable".
@Kludex that's exactly the direction I'd like to work on. I'd suggest starting with the gRPC proto definition in that repo, with the goal that the JSON-RPC interface stays unchanged and works alongside it - not a separate spec maintained in parallel. The current spec can do this, but it needs true streaming added to the design.

The biggest win from the gRPC spec would be introducing true streaming without cursors - this is where we see significant performance (memory and network) gains, as well as better integration with AI workloads in data mesh scenarios and client chatbot services. Once we have a solid spec, a tunneling approach will emerge because the gRPC service would be streaming out of the box.

Also, it's best to design the gRPC definition using gRPC conventions and avoid a 100% 1:1 mapping of the JSON-RPC API, since the streaming aspect already deviates from it. The reason is that gRPC specs prioritize backward compatibility and cross-language design with an emphasis on schemas. Working through the gRPC definition first will surface the right design questions for the transport layer.

I noticed https://github.com/GoogleCloudPlatform/mcp-grpc-transport-proto was pushed - if we address some of the spec issues there (I opened 2 issues and an initial PR that surfaces some of the gRPC concerns), it will make it easier for the individual SDK conversations to work out how a transport layer can be defined.
Motivation and Context
Add support for Pluggable Transport Abstractions in the MCP Python SDK.

This PR mainly adds two abstract classes defining the APIs that every transport must implement:

- `src/mcp/client/transport_session.py` -> `ClientTransportSession`
- `src/mcp/server/transport_session.py` -> `ServerTransportSession`

Both classes have the minimal API that every transport must implement in order to achieve the features defined in the MCP Specification.
Additionally, the existing transport classes, which are based on `JSONRPC`, inherit from these two new classes:

- `src/mcp/client/session.py` -> `ClientSession` inherits from `src/mcp/client/transport_session.py` -> `ClientTransportSession`
- `src/mcp/server/session.py` -> `ServerSession` inherits from `src/mcp/server/transport_session.py` -> `ServerTransportSession`

Type Hints Fixes
Since `ClientSession` and `ServerSession` now sit under a higher-level abstraction, this PR also updates type hints to the parent classes. Precisely: places where we use `ClientSession` are updated to use `ClientTransportSession`, and similarly `ServerSession` type hints are updated to `ServerTransportSession`.

How Has This Been Tested?
Tested using `pyright` and `uv run pytest`. Changes are also validated via `CI` runs.

Breaking Changes
No.
Types of changes
Checklist
Additional context
If we want to add more transports in the future, they can implement the abstract classes introduced in this PR: `ClientTransportSession` and `ServerTransportSession`.
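As an illustration of the type-hint widening described above, here's a toy sketch. The class bodies and the `count_tools` helper are invented stand-ins, not the SDK's real implementations; the point is only that a function previously annotated with `ClientSession` accepts any transport once the hint is widened to the parent class:

```python
import asyncio
from abc import ABC, abstractmethod


class ClientTransportSession(ABC):
    """Minimal stand-in for the abstract base added by this PR."""

    @abstractmethod
    async def list_tools(self) -> list[str]: ...


class ClientSession(ClientTransportSession):
    """JSON-RPC-backed session (toy)."""

    async def list_tools(self) -> list[str]:
        return ["echo"]


class GrpcClientSession(ClientTransportSession):
    """A hypothetical future transport that benefits from the widened hint."""

    async def list_tools(self) -> list[str]:
        return ["echo", "search"]


# Before the PR: def count_tools(session: ClientSession) -> int
# After: the hint is widened to the parent class, so any transport works.
async def count_tools(session: ClientTransportSession) -> int:
    return len(await session.list_tools())


print(asyncio.run(count_tools(ClientSession())))      # 1
print(asyncio.run(count_tools(GrpcClientSession())))  # 2
```

Because the helper only depends on the abstract interface, a future transport needs no changes anywhere type hints already use `ClientTransportSession`.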