fix(agent): compact at end of a final-answer turn to prevent context overflow #4079
Merged
…overflow

A turn that ends with a final answer (no trailing tool batch) skipped compaction entirely: maybeCompact ran only after tool batches and in the retry paths. So a large context carried into the next turn un-folded, and across a multi-turn session it accumulated until the next request exceeded the model's hard context limit and the provider returned 400, breaking the session.

Compact at the end of the final-answer path too. It is a no-op below the trigger, so normal turns keep their warm cache; it folds only when the context is already over the threshold, exactly when the next turn would risk overflow.
Found while stress-testing long multi-turn sessions with repeated heavy compaction.
The bug
maybeCompact runs only after a tool batch (and in the readiness/empty-final retry paths). A turn that ends with a clean final answer (no trailing tool call) returns without compacting. So when a turn ends with a large context, it carries into the next turn un-folded; across a multi-turn session the context accumulates until a request exceeds the model's hard limit and the provider returns 400 (maximum context length). The session then breaks, because every --continue re-sends the over-limit context.

Most acute for turns that add large content but call no tools (pasting big input, pure Q&A over a large context).
Repro (real provider)
6-turn session, each turn fed a ~380k-token batch, context_window 700k:

400: maximum context length is 1048565 tokens, requested 1158757 → session dead

After the fix, the same session runs to completion (9 folds, no 400).
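
The accumulation in the repro can be checked with a back-of-the-envelope model. The sketch below is not the agent's code: `simulate`, the 50k folded size, and treating the 700k context_window value as the fold trigger are all assumptions made for illustration, and the model ignores response tokens and tool batches, so its fold count will not match the real session.

```go
package main

import "fmt"

// simulate models a session where every turn appends `batch` tokens to the
// carried context. It returns the 1-based turn whose request would exceed
// hardLimit (0 if none does) and the number of folds performed.
// All names and parameters here are hypothetical, for illustration only.
func simulate(turns, batch, trigger, hardLimit, foldedSize int, compactAfterFinal bool) (overflowTurn, folds int) {
	ctx := 0
	for turn := 1; turn <= turns; turn++ {
		request := ctx + batch
		if request > hardLimit {
			return turn, folds // the provider would answer 400 here
		}
		ctx = request
		// End of a final-answer turn: fold only when over the trigger.
		if compactAfterFinal && ctx > trigger {
			ctx = foldedSize // assumed size of the folded context
			folds++
		}
	}
	return 0, folds
}

func main() {
	const (
		turns      = 6
		batch      = 380_000
		trigger    = 700_000   // assumption: trigger equals context_window
		hardLimit  = 1_048_565 // limit quoted in the 400 response
		foldedSize = 50_000    // assumption: size after a fold
	)
	for _, fixed := range []bool{false, true} {
		overflowTurn, folds := simulate(turns, batch, trigger, hardLimit, foldedSize, fixed)
		fmt.Printf("compactAfterFinal=%v overflowTurn=%d folds=%d\n", fixed, overflowTurn, folds)
	}
}
```

In this model the uncompacted session overflows on turn 3, since 760,000 carried tokens plus a 380,000-token batch is 1,140,000, which is above 1,048,565. With end-of-turn compaction every request stays under the limit.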
The fix
Call maybeCompact on the final-answer path too. It's a no-op below the trigger (normal turns keep their warm cache); it folds only when the context is already over the threshold, exactly when the next turn would otherwise overflow.

Test
TestRunCompactsAfterFinalAnswer: a final-answer turn over the trigger must compact. Verified it fails without the one-line change and passes with it. Full internal/agent suite green.
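
For readers without the diff at hand, here is a minimal sketch of the control flow the fix and its test describe. It is not the code from internal/agent: the `agent` and `step` types, the `compactOnFinal` flag (which only exists here to compare pre-fix and post-fix behavior), and the divide-by-ten fold are hypothetical stand-ins.

```go
package main

import "fmt"

// agent is a hypothetical stand-in for the real agent state.
type agent struct {
	contextTokens  int
	trigger        int
	folds          int
	compactOnFinal bool // false models the pre-fix behavior
}

// maybeCompact folds the context only when it is over the trigger.
// Below the trigger it does nothing, so the warm cache is kept.
func (a *agent) maybeCompact() {
	if a.contextTokens <= a.trigger {
		return
	}
	a.contextTokens /= 10 // assumed fold ratio, for illustration only
	a.folds++
}

// step is one model response within a turn.
type step struct {
	addedTokens int
	toolCall    bool // false means the step is the final answer
}

// runTurn consumes steps until it reaches a final answer.
func (a *agent) runTurn(steps []step) {
	for _, s := range steps {
		a.contextTokens += s.addedTokens
		if s.toolCall {
			a.maybeCompact() // already present before the fix
			continue
		}
		// Final-answer path: before the fix the turn returned here
		// without compacting.
		if a.compactOnFinal {
			a.maybeCompact()
		}
		return
	}
}

func main() {
	for _, fixed := range []bool{false, true} {
		a := &agent{trigger: 700_000, compactOnFinal: fixed}
		a.runTurn([]step{{addedTokens: 760_000}}) // large input, no tools
		fmt.Printf("compactOnFinal=%v folds=%d context=%d\n", fixed, a.folds, a.contextTokens)
	}
}
```

A test in the spirit of TestRunCompactsAfterFinalAnswer would assert that the over-trigger final-answer turn produces one fold with the flag on and none with it off, and that a small turn is left untouched either way.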