Add stop/cancel mechanism for SSE streaming generation #4823

@ferponse

Problem

When using runner.run_async() with StreamingMode.SSE, there is no way for consumers to signal the flow to stop generating mid-stream. This is needed for implementing a "stop generating" button in chat UIs.

Current workarounds

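  • Task cancellation (asyncio.Task.cancel()): Works, but it is a hard stop: CancelledError is raised at whatever await the flow is suspended on, so cleanup may be incomplete and the flow cannot return gracefully.
  • Breaking out of async for: The generator chain is closed only once aclose() runs (called explicitly, or later when the event loop finalizes the abandoned generator), and the LLM may continue generating in the background until the connection is closed.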

Neither approach gives the flow a chance to stop cleanly between chunks.
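To make the two workarounds concrete, here is a minimal sketch using a stand-in async generator rather than the real runner. The names chunks, break_then_close, and cancel_midstream are hypothetical and exist only for illustration. The sketch shows that an explicit aclose() makes cleanup after a break deterministic, and that cancellation surfaces as CancelledError in the middle of an await.

```python
import asyncio
from typing import AsyncIterator, List


async def chunks(log: List[str]) -> AsyncIterator[int]:
    """Hypothetical stand-in for a streaming generator chain."""
    try:
        for i in range(100):
            await asyncio.sleep(0)  # simulated wait for the next chunk
            yield i
    finally:
        log.append("generator closed")


async def break_then_close() -> List[str]:
    """Break out of the loop, then close the generator explicitly."""
    log: List[str] = []
    gen = chunks(log)
    async for i in gen:
        if i == 2:
            break
    log.append("after break")  # generator is still suspended here
    await gen.aclose()         # runs the generator's finally block now
    log.append("after aclose")
    return log


async def cancel_midstream() -> List[str]:
    """Cancel the consuming task while it waits for a chunk."""
    log: List[str] = []

    async def consume() -> None:
        async for _ in chunks(log):
            pass

    task = asyncio.create_task(consume())
    await asyncio.sleep(0)  # let the task start and suspend
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        log.append("task cancelled")
    return log
```

In both cases the generator's finally block runs, but neither path lets the flow finish its current step and return normally, which is the gap this issue describes.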

Proposed solution

Add an optional stop_event: asyncio.Event parameter to runner.run_async() that:

  • Is checked before each LLM call in the while True loop of run_async
  • Is checked before yielding each streaming chunk in _run_one_step_async
  • When set (stop_event.set()), causes the flow to stop yielding and return cleanly
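The checks listed above could look roughly like the following sketch. It does not use the real flow code: fake_llm_chunks and run_with_stop are hypothetical stand-ins, and only asyncio.Event and async generator semantics are real. The assumption is that the flow checks the event once before starting the LLM call and again before yielding each chunk.

```python
import asyncio
from typing import AsyncIterator, List, Optional


async def fake_llm_chunks(n: int) -> AsyncIterator[str]:
    """Hypothetical stand-in for a streaming LLM call."""
    for i in range(n):
        await asyncio.sleep(0)  # give other tasks a chance to run
        yield f"chunk-{i}"


async def run_with_stop(
    n_chunks: int, stop_event: Optional[asyncio.Event] = None
) -> AsyncIterator[str]:
    """Yield chunks until the stream ends or stop_event is set."""
    # Check before starting the (simulated) LLM call.
    if stop_event is not None and stop_event.is_set():
        return
    stream = fake_llm_chunks(n_chunks)
    try:
        async for chunk in stream:
            # Check before yielding each streaming chunk.
            if stop_event is not None and stop_event.is_set():
                return
            yield chunk
    finally:
        # Close the upstream generator so it does not linger.
        await stream.aclose()


async def demo() -> List[str]:
    stop_event = asyncio.Event()
    received: List[str] = []
    async for chunk in run_with_stop(10, stop_event):
        received.append(chunk)
        if len(received) == 3:
            stop_event.set()  # simulate the user clicking "stop"
    return received
```

Because the event is only polled between chunks, the stop takes effect at the next chunk boundary rather than in the middle of an await, and the generator returns normally instead of raising.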

Usage

import asyncio

stop_event = asyncio.Event()

# Wrap the runner's stream; stop_event is the proposed new parameter
async def stream():
    async for event in runner.run_async(
        user_id=user_id,
        session_id=session_id,
        new_message=content,
        run_config=run_config,
        stop_event=stop_event,
    ):
        yield event

# When the user clicks "stop generating" (e.g. from the stop-message handler):
stop_event.set()

Use case

We are building a chat UI that streams responses over WebSocket. While the agent is generating, we show a "stop" button. When the user clicks it, the frontend sends a stop message and the backend sets stop_event, so the flow stops at its next check instead of letting the LLM keep producing tokens.
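A rough sketch of that backend flow, with the WebSocket replaced by an asyncio.Queue so it runs on its own. Everything here is hypothetical scaffolding (generate, forward_to_client, handle_session, the "[done]" marker); it only assumes the stream honors a stop event between tokens as proposed.

```python
import asyncio
from typing import AsyncIterator, List, Tuple


async def generate(stop_event: asyncio.Event) -> AsyncIterator[str]:
    """Hypothetical agent stream that honors the stop event between tokens."""
    i = 0
    while True:
        await asyncio.sleep(0)  # simulated wait for the next token
        if stop_event.is_set():
            return
        yield f"token-{i}"
        i += 1


async def forward_to_client(stop_event: asyncio.Event, outbox: asyncio.Queue) -> None:
    """Push tokens to the client, then a final marker once the stream ends."""
    async for token in generate(stop_event):
        await outbox.put(token)
    await outbox.put("[done]")


async def handle_session() -> Tuple[List[str], bool]:
    stop_event = asyncio.Event()
    outbox: asyncio.Queue = asyncio.Queue()  # stands in for the WebSocket
    streaming = asyncio.create_task(forward_to_client(stop_event, outbox))

    sent: List[str] = []
    for _ in range(3):
        sent.append(await outbox.get())

    # Simulate the frontend's stop message arriving after three tokens.
    stop_event.set()
    await streaming  # completes normally; the task is never cancelled

    # Drain whatever was queued before the stop took effect.
    while not outbox.empty():
        sent.append(outbox.get_nowait())
    return sent, streaming.cancelled()
```

The streaming task ends on its own and can still send a closing message to the client, which a hard Task.cancel() would not allow without extra handling.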
