feat(core): Unified Context Management and Tool Distillation #24157
joshualitt merged 2 commits into main
Conversation
Hi @joshualitt, thank you so much for your contribution to Gemini CLI! We really appreciate the time and effort you've put into this. We're making some updates to our contribution process to improve how we track and review changes. Please take a moment to review our recent discussion post: Improving Our Contribution Process & Introducing New Guidelines. Key Update: Starting January 26, 2026, the Gemini CLI project will require all pull requests to be associated with an existing issue. Any pull requests not linked to an issue by that date will be automatically closed. Thank you for your understanding and for being a part of our community!
🧠 Model Steering Guidance
This PR modifies files that affect the model's behavior (prompts, tools, or instructions).
This is an automated guidance message triggered by steering logic signatures.
Size Change: +19.5 kB (+0.07%) Total Size: 26.5 MB
Force-pushed from b2ba95f to 90334df
This commit introduces a comprehensive, multi-tiered approach to managing the
agent's context window, ensuring stability and long-term continuity during
complex multi-turn workflows.
Key Changes:
1. Unified Configuration: Consolidates history and distillation settings into a
new `contextManagement` schema, configurable via CLI settings.
2. Progressive Message Normalization: Introduces `normalTokenLimit` and
`maximumTokenLimit` to dynamically bound message sizes. Messages are kept
at full fidelity within a "grace zone" and proportionally compressed as
they age or if they exceed extreme limits.
3. Tool Distillation: `ToolOutputDistillationService` intercepts massive tool
outputs (e.g., heavy compiler logs, raw web fetches), saving the full
content to disk and providing the agent with a structurally truncated
version. Extremely large outputs trigger a secondary LLM to generate an
intent/factual summary.
4. Intelligent Truncation: Calculates truncation boundaries based on a precise
token budget (`targetRetainedTokens`), falling back to an LLM-generated
state summary ("Agent Continuity") to prevent the agent from losing its
strategic context when the oldest messages are dropped.
Force-pushed from 90334df to 8e06e71
Summary of Changes
Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces a robust, multi-tiered system for managing the agent's context window. By centralizing configuration and implementing advanced techniques like progressive message normalization and tool output distillation, the changes ensure stability and long-term continuity during complex, multi-turn workflows while optimizing token usage.
Highlights
Code Review
This pull request introduces a comprehensive Context Management system that replaces the previous experimental agent history truncation logic. Key enhancements include the implementation of a ToolOutputDistillationService, which handles oversized tool outputs by offloading them to disk and providing summarized previews to the LLM, and a refactored AgentHistoryProvider that uses token-based budgeting and proportional message normalization. Configuration schemas and documentation have been updated to support these new settings. Review feedback identifies critical security vulnerabilities related to prompt injection in the summarization logic of the history provider and distillation service, as well as within the web-fetch tool's handling of untrusted content.
Force-pushed from 730eaf1 to 448f289
Force-pushed from 448f289 to 513d83c
Addresses #21889