fix(session): fix root causes and reconstruction of tool_use/tool_result mismatch (#16749) #16751
altendky wants to merge 3 commits into anomalyco:dev
…smatch (anomalyco#16749) Add failing test demonstrating that when step-finish/step-start parts are missing (due to retryable stream errors), toModelMessages produces a single assistant block with interleaved tool-call and text parts. Currently fails with: Content types in this message: [text, tool-call, text, tool-call].
Thanks for your contribution! This PR doesn't have a linked issue. All PRs must reference an existing issue. Please:
See CONTRIBUTING.md for details.
The following comment was made by an LLM, it may be inaccurate: Found related PRs that may address similar issues:
These PRs may already be working on fixes for the issues this test case is reproducing. Check if they've been merged or if there's overlap in the solution approaches.
…terleaved due to missing step boundaries When the finish-step handler throws during a retryable error, step-finish for step 1 and step-start for step 2 are never saved. Both steps' content merges into one DB message without boundaries. On replay, convertToModelMessages() produces a single assistant block with interleaved tool_use/text, which the Anthropic API rejects. Fix: track whether we've seen a tool part in the current step. If text or reasoning appears after a tool part without an intervening step-start, inject a synthetic step-start to force the AI SDK to split content into separate blocks.
Thanks for updating your PR! It now meets our contributing guidelines. 👍
Issue for this PR
Closes #16749
Related: #10616, #8377, #2720, #1662, #5750, #2214, #8312, #8010
Type of change
What does this PR do?
Fixes the root causes and provides a reconstruction-time safety net for the widespread `tool_use ids were found without tool_result blocks immediately after` error that corrupts sessions and makes them unrecoverable.

The fix is three layers of defense-in-depth, each catching what the previous one misses:
Layer 1 (`processor.ts`): Tool-error race condition (line 211)

The `tool-error` handler only processed errors for tools in `"running"` status. Due to the AI SDK's merged-stream event ordering, `tool-error` can arrive before `tool-call`, when the tool is still `"pending"`. The error was silently dropped, leaving the tool in `"pending"` state to be cleaned up later as `"Tool execution aborted"` with empty `input: {}`.

Fix: Accept `tool-error` for both `"running"` and `"pending"` status. Uses `Date.now()` as start time for pending tools (which don't have a `time.start` field).

Layer 2 (`processor.ts`): Recovery step-finish before retry (line 374)

When a stream error interrupts processing before `finish-step` is reached, or the `finish-step` handler itself throws, the step boundary is never written. The retry loop's `continue` creates a new stream whose events are appended to the same DB message without a `step-finish`/`step-start` boundary. Both steps' content merges into one message, and `toModelMessages()` produces a single assistant block with interleaved `tool_use`/`text` that the Anthropic API rejects.

Fix: Before `continue`ing the retry loop, scan parts backward for an unclosed step (`step-start` without a matching `step-finish`). If found, write a recovery `step-finish` with `reason: "error"` and zero tokens/cost. Wrapped in try/catch so recovery failures don't block the retry.

Layer 3 (`message-v2.ts`): Synthetic step-start injection (line 623)

A reconstruction-time safety net that handles already-corrupted DB data regardless of how step boundaries were lost.

Fix: In `toModelMessages()`, track whether we've seen a tool part in the current step (`sawTool` flag). If `text` or `reasoning` appears after a tool part without an intervening `step-start`, inject a synthetic `{ type: "step-start" }` to force the AI SDK to split content into separate assistant+tool blocks.

How layers interact
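As a rough illustration of how Layers 2 and 3 relate, here is a minimal sketch, not the code in this PR. It assumes a simplified `Part` shape; `Part`, `hasUnclosedStep`, and `injectStepStarts` are hypothetical names chosen for the example, and only the `sawTool` flag and the `{ type: "step-start" }` marker come from the description above. `hasUnclosedStep` mirrors the backward scan Layer 2 performs before retrying, and `injectStepStarts` mirrors the reconstruction-time pass of Layer 3.

```typescript
// Hypothetical, simplified part shape; the real types in message-v2.ts are richer.
type Part =
  | { type: "step-start" }
  | { type: "step-finish"; reason: string }
  | { type: "text"; text: string }
  | { type: "reasoning"; text: string }
  | { type: "tool"; status: "pending" | "running" | "completed" | "error" }

// Layer 2 idea: walk the parts backward and report whether the most recent
// step boundary is a step-start with no step-finish after it.
function hasUnclosedStep(parts: Part[]): boolean {
  for (const part of [...parts].reverse()) {
    if (part.type === "step-finish") return false
    if (part.type === "step-start") return true
  }
  return false
}

// Layer 3 idea: when text or reasoning follows a tool part without an
// intervening step-start, insert a synthetic step-start before it.
function injectStepStarts(parts: Part[]): Part[] {
  const out: Part[] = []
  let sawTool = false
  for (const part of parts) {
    if (part.type === "step-start") {
      sawTool = false
    } else if (part.type === "tool") {
      sawTool = true
    } else if ((part.type === "text" || part.type === "reasoning") && sawTool) {
      out.push({ type: "step-start" }) // synthetic boundary
      sawTool = false
    }
    out.push(part)
  }
  return out
}
```

For the corrupted sequence `[step-start, text, tool, text, tool]`, `hasUnclosedStep` returns `true` and `injectStepStarts` yields `[step-start, text, tool, step-start, text, tool]`, so the second `text` begins a new step even if the recovery `step-finish` was never written.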
Real-world evidence
Session `ses_32fb35486ffeeJAHmplKU1gB2t`, message `msg_cd05ba534001gICo48Lsy1NHWp`:
- … `input: {}`: `tool-error` was dropped because status was `"pending"` (Layer 1 root cause)
- … `step-finish`/`step-start` boundary between the two groups (Layer 2 root cause)

How did you verify your code works?
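The structural invariant checked by the new test can be written as a small predicate. This is a minimal sketch under assumptions: `ContentPart` is a hypothetical, simplified stand-in for the content of one assistant `ModelMessage`, and `hasTextAfterToolCall` is a name invented for this example rather than a helper from the test file.

```typescript
// Hypothetical, simplified stand-in for the content parts of one assistant message.
type ContentPart = { type: "text" | "reasoning" | "tool-call" }

// Returns true when a text or reasoning part appears after a tool-call part
// in the same message, i.e. when the invariant is violated.
function hasTextAfterToolCall(content: ContentPart[]): boolean {
  let sawToolCall = false
  for (const part of content) {
    if (part.type === "tool-call") {
      sawToolCall = true
    } else if (sawToolCall) {
      return true
    }
  }
  return false
}
```

The content types from the failing case, `[text, tool-call, text, tool-call]`, violate the invariant, while `[text, tool-call]` satisfies it.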
- … `[step-start, text, tool(error), text, tool(completed)]`) and asserts the structural invariant: no `text` or `reasoning` part appears after a `tool-call` part in the same assistant `ModelMessage`
- … `Content types in this message: [text, tool-call, text, tool-call]`

Files changed
- `packages/opencode/src/session/processor.ts`: … `"pending"`) + recovery step-finish before retry
- `packages/opencode/src/session/message-v2.ts`: … `toModelMessages()`
- `packages/opencode/test/session/message-v2.test.ts`

Checklist