feat(session): rethink tool budget with per-call counting and graceful wind-down #280
Merged
…ge, and graceful wind-down (#277)

Replace the iteration-based tool circuit breaker (MaxToolIterationsPerTurn=10) with per-call counting (MaxToolCallsPerTurn=30). Each tool call in a batch counts individually, giving a more accurate budget.

Three-phase approach inspired by OpenCode and LangGraph:
1. At ~75% budget: inject a system nudge so the model can self-regulate
2. At 100% budget: strip tools AND inject a "summarize your work" prompt
3. If the model still hallucinates tool calls: fail with a specific message ("I used all available tool calls") instead of the generic "I didn't produce a reply"

The key insight: give the model awareness and agency to wrap up gracefully, rather than just pulling the rug out.
This was referenced Mar 18, 2026
Summary
Continues #277 — replaces the iteration-based tool circuit breaker with a smarter three-phase approach inspired by OpenCode (tool stripping + summary prompt) and LangGraph (budget awareness nudge).
Problem: The old circuit breaker counted iteration rounds (default 10), then hard-stripped tools. Models (especially Qwen) often hallucinated tool calls after tools were removed, causing a turn_force_no_tools_violation with the misleading "I didn't produce a reply" message.

Solution: Three-phase budget with per-call counting:

- Per-call counting (MaxToolCallsPerTurn=30): Each tool call counts individually. 3 parallel calls in one batch = 3 against the budget. More accurate than counting rounds.
- Budget nudge at ~75%: Injects a system message: "You have used X of Y tool calls. Start wrapping up." Gives the model awareness to self-regulate.
- Graceful wind-down at 100%: Strips tools AND injects "Summarize your work and answer the user's question." The model gets one turn to produce a proper summary instead of being abruptly cut off.
- Specific violation message: If the model still hallucinates tool calls after tools are stripped, the error says "I used all available tool calls" instead of the generic fallback.
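For reviewers who want a concrete picture of the counting and phase logic described above, here is a minimal sketch. It is not the code in LlmSessionActor.cs: the type and member names (ToolBudget, BudgetPhase, RecordBatch, ResetForNewTurn) are hypothetical, and both the rounding of the "~75%" threshold and the choice to send the nudge only once per turn are assumptions. Only the MaxToolCallsPerTurn default of 30 and the nudge wording come from this PR.

```csharp
using System;

// Hypothetical illustration only; names below are not taken from the PR's source files.
public enum BudgetPhase
{
    Normal,   // keep going, tools available
    Nudge,    // inject the "start wrapping up" system message
    WindDown  // strip tools and inject the "summarize your work" prompt
}

public sealed class ToolBudget
{
    private readonly int _maxToolCallsPerTurn;
    private readonly int _nudgeThreshold;
    private int _callsUsed;
    private bool _nudgeSent;

    public ToolBudget(int maxToolCallsPerTurn = 30, double nudgeFraction = 0.75)
    {
        if (maxToolCallsPerTurn <= 0)
            throw new ArgumentOutOfRangeException(nameof(maxToolCallsPerTurn));

        _maxToolCallsPerTurn = maxToolCallsPerTurn;
        // Assumption: round up. The PR only says "~75%".
        _nudgeThreshold = (int)Math.Ceiling(maxToolCallsPerTurn * nudgeFraction);
    }

    public int CallsUsed => _callsUsed;

    // Each call in a batch counts individually: 3 parallel calls cost 3.
    public BudgetPhase RecordBatch(int toolCallsInBatch)
    {
        if (toolCallsInBatch < 0)
            throw new ArgumentOutOfRangeException(nameof(toolCallsInBatch));

        _callsUsed += toolCallsInBatch;

        if (_callsUsed >= _maxToolCallsPerTurn)
            return BudgetPhase.WindDown;

        // Assumption: the nudge is emitted once per turn, not on every later batch.
        if (_callsUsed >= _nudgeThreshold && !_nudgeSent)
        {
            _nudgeSent = true;
            return BudgetPhase.Nudge;
        }

        return BudgetPhase.Normal;
    }

    // Nudge wording quoted from the PR description, with X and Y filled in.
    public string NudgeMessage() =>
        $"You have used {_callsUsed} of {_maxToolCallsPerTurn} tool calls. Start wrapping up.";

    // Counters are per turn, so a new user turn starts from zero.
    public void ResetForNewTurn()
    {
        _callsUsed = 0;
        _nudgeSent = false;
    }
}
```

Under these assumptions, a turn that issues batches of 3 parallel calls stays in Normal through batch 7 (21 calls), gets the nudge on batch 8 (24 calls, past the threshold of 23), and reaches WindDown on batch 10 (30 calls). Under the old round-based limit of 10, the same ten batches would have been cut off at the same point whether each contained one call or many, which is the imprecision per-call counting removes.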
Files changed
- SessionConfig.cs: MaxToolIterationsPerTurn → MaxToolCallsPerTurn (default 30)
- LlmSessionActor.cs
- MaxToolIterationTests.cs
- configuration.md

Test plan

- Tool_iteration_limit_forces_text_response: budget triggers at correct call count
- Tool_iteration_limit_fails_turn_when_model_keeps_emitting_tool_calls_without_tools: violation uses specific message
- Tool_iteration_counter_resets_between_turns: counters reset correctly
- Normal_tool_use_within_limit_works_unchanged: normal flow unaffected