fix(agent): include tools schema in post-compression token estimate (#14695)#15433
Closed
Tranquil-Flow wants to merge 1 commit into
Closed
fix(agent): include tools schema in post-compression token estimate (#14695)#15433Tranquil-Flow wants to merge 1 commit into
Tranquil-Flow wants to merge 1 commit into
Conversation
…ousResearch#14695) After compression, the token estimate was computed using only the system prompt and compressed messages, ignoring the tools schema entirely. With 50+ tools this can add 20-30K tokens — a significant blind spot that caused pressure heuristics to under-report context usage and trigger premature re-compression. Switch from the two-function estimate (estimate_tokens_rough + estimate_messages_tokens_rough) to estimate_request_tokens_rough() which already accepts a tools parameter, matching what the API call actually sends.
Collaborator
Contributor
Author
|
Closing — the fix is now on main. On current
Same diagnosis, same fix. No further action needed on this PR. Thanks for the original write-up. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What does this PR do?
After context compression,
last_prompt_tokensis set usingestimate_tokens_rough(system_prompt) + estimate_messages_tokens_rough(compressed), which counts only the system prompt and message content. Tools schema tokens are excluded, causing the system to underestimate context usage and delay the next compression cycle.The codebase already has
estimate_request_tokens_rough()which accepts atoolsparameter — it just wasn't being used in the post-compression path. This PR routes the post-compression estimate through that function so the tools schema overhead is included.Related Issue
Fixes #14695
Type of Change
Changes Made
run_agent.py: Replace the two-function estimate with a single call toestimate_request_tokens_rough(compressed, system_prompt=new_system_prompt, tools=self.tools), which includes tools schema overhead in the token count.How to Test
estimate_request_tokens_roughwith 30 tools produces a substantially larger estimate than without (2000+ tokens overhead)._compress_context()end-to-end: storedlast_prompt_tokensincludes tools overhead.self.toolsisNone.Tested on macOS (Python 3.11).
Checklist
Code
fix(scope):,feat(scope):, etc.)pytest tests/ -qand all tests passDocumentation & Housekeeping
docs/, docstrings) — or N/Acli-config.yaml.exampleif I added/changed config keys — or N/ACONTRIBUTING.mdorAGENTS.mdif I changed architecture or workflows — or N/AScreenshots / Logs
N/A — see commit description and PR diff.