Fix bulk of remaining issues with generalist profile #26073
Size Change: +12.9 kB (+0.04%) Total Size: 33.9 MB
- Inject stable synthetic IDs in GeminiChat for tool calls lacking them.
- Remove brittle positional matching in ContextGraphBuilder.
- Transition ContextGraph to deterministic ID-based lookup for tool results.
- Update system snapshots to reflect cleaner history mapping.

- Move synthetic ID generation to GeminiChat stream processing.
- Inject IDs directly into chunk objects to synchronize executor (Turn.ts) and history.
- Implement stateful deduplication during streaming to handle SDK repeats/deltas.
- Use stable synth_${promptId}_${index} format for absolute determinism.
- Standardize Turn.ts to trust and fallback to the same stable ID scheme.
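
As a rough illustration of the synth_${promptId}_${index} scheme with stateful deduplication, here is a minimal sketch. The ToolCallChunk shape, the SyntheticIdAssigner class, and the name-plus-args fingerprint are assumptions made for this example and are not taken from GeminiChat. The sketch also returns a copy rather than mutating the chunk in place.

```typescript
// Hypothetical chunk shape; the real SDK types are richer than this.
interface ToolCallChunk {
  name: string;
  args: Record<string, unknown>;
  id?: string;
}

// Assigns synth_${promptId}_${index} IDs to tool calls that arrive without one.
class SyntheticIdAssigner {
  private readonly promptId: string;
  // fingerprint -> ID already handed out during this stream
  private readonly seen = new Map<string, string>();
  private nextIndex = 0;

  constructor(promptId: string) {
    this.promptId = promptId;
  }

  assign(call: ToolCallChunk): ToolCallChunk {
    // Trust an API-provided ID when it is a non-empty string.
    if (typeof call.id === 'string' && call.id.length > 0) return call;

    // A repeated chunk (same name and args) reuses the ID it got the first time.
    const fingerprint = `${call.name}:${JSON.stringify(call.args)}`;
    let id = this.seen.get(fingerprint);
    if (id === undefined) {
      id = `synth_${this.promptId}_${this.nextIndex++}`;
      this.seen.set(fingerprint, id);
    }
    return { ...call, id };
  }
}
```

One limitation of fingerprinting on name and args: two intentionally identical calls in the same stream would share an ID, so a real implementation would likely need an extra signal to tell a repeat from a second call.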

- Refactor ContextGraphBuilder to process history in a single, stateless pass.
- Eliminate turn-to-turn persistent state to prevent mapping drift.
- Simplify ContextGraphMapper to utilize atomic reconstruction.
- Ensure structural resilience to history truncation and reordering.

- Introduce HistoryHardener utility to enforce Gemini API invariants.
- Patch role alternation, user-anchoring, and trailing-user requirements.
- Automatically repair orphaned tool calls with sentinel response nodes.
- Integrate hardener as the final step in context rendering.
- Update tests and snapshots to reflect defensive sentinel turns.
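
The hardening steps listed above can be sketched roughly as follows. This is not the HistoryHardener from the PR: the Part and Content shapes are simplified, and the function name, sentinel text, and merge strategy are assumptions chosen to keep the example short. It covers role alternation, user-anchoring, and sentinel responses for orphaned calls only.

```typescript
interface FunctionCall { id: string; name: string }
interface FunctionResponse { id: string; name: string; response: Record<string, unknown> }
interface Part { text?: string; functionCall?: FunctionCall; functionResponse?: FunctionResponse }
interface Content { role: 'user' | 'model'; parts: Part[] }

function hardenHistory(history: readonly Content[]): Content[] {
  // Pass 1: merge consecutive same-role turns so roles strictly alternate.
  const merged: Content[] = [];
  for (const turn of history) {
    const last = merged[merged.length - 1];
    if (last !== undefined && last.role === turn.role) last.parts.push(...turn.parts);
    else merged.push({ role: turn.role, parts: [...turn.parts] });
  }

  // Pass 2: anchor the history on a user turn (placeholder text is an assumption).
  if (merged[0]?.role === 'model') {
    merged.unshift({ role: 'user', parts: [{ text: '[earlier history omitted]' }] });
  }

  // Pass 3: give every unanswered function call a sentinel response.
  const out: Content[] = [];
  merged.forEach((turn, i) => {
    out.push(turn);
    if (turn.role !== 'model') return;
    const calls = turn.parts.flatMap((p) => (p.functionCall ? [p.functionCall] : []));
    if (calls.length === 0) return;

    const next = merged[i + 1]; // after pass 1 this is a user turn or undefined
    const answered = new Set(
      (next?.parts ?? []).flatMap((p) => (p.functionResponse ? [p.functionResponse.id] : [])),
    );
    const sentinels: Part[] = calls
      .filter((c) => !answered.has(c.id))
      .map((c) => ({
        functionResponse: { id: c.id, name: c.name, response: { error: 'No response recorded.' } },
      }));
    if (sentinels.length === 0) return;

    if (next !== undefined) next.parts.unshift(...sentinels);
    else out.push({ role: 'user', parts: sentinels });
  });
  return out;
}
```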

- Update graph nodes to store and serialize thoughtSignatures.
- Fix structural misalignment where signatures were dropped during projection.
- Fix bug in GeminiChat where fixed contents with signatures were discarded.
- Ensure tool calls are correctly linked to thoughts for Gemini-3 models.

- Pivot IR from a reconstructive model to a lossless 1:1 mirror of Gemini Parts.
- Update ContextGraphBuilder to perform 1:1 part-to-node projection.
- Update ContextProcessors (Truncation, Distillation, Masking, Snapshotting) to operate directly on pristine Part payloads.
- Simplify reconstruction logic in fromGraph to preserve original Part objects.
- Refactor all context tests and utilities to align with the flattened structure.
- Add safe Part cloning helpers with justified lint suppressions.
- Update system golden snapshots to reflect improved token estimation efficiency.

…ping
- Remove the logic that dropped orphaned function responses, as it was causing 'function name mismatch' errors when the API state became inconsistent.
- Instead, rely on the existing logic to inject missing responses for calls, which is the safer way to restore API invariants.
- This simplification ensures that we don't inadvertently create state mismatches by removing one half of a call/response pair while keeping the other.

- Re-enable the logic in HistoryHardener to drop 'functionResponse' parts if their preceding 'functionCall' was truncated. This is required by the Gemini API to maintain turn pairing invariants.
- Add ID-based deduplication in 'fromGraph' to ensure each node appears at most once in the history, preventing 'function name mismatch' and duplicate ID errors.
- These changes together ensure that the reconstructed history is both valid and consistent with the API's expectations.
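
To make the two rules in this commit concrete (drop a response whose call was truncated, and emit each node at most once), here is a small sketch over a flattened node list. The HistoryNode shape and the sanitizeNodes name are hypothetical. The actual logic lives in HistoryHardener and 'fromGraph' and operates on full Part payloads.

```typescript
// Hypothetical flat node shape used only for this illustration.
interface HistoryNode {
  id: string;
  kind: 'text' | 'functionCall' | 'functionResponse';
  callId?: string; // links a response back to its call
}

function sanitizeNodes(nodes: readonly HistoryNode[]): HistoryNode[] {
  const seenIds = new Set<string>();
  const liveCalls = new Set<string>();
  const out: HistoryNode[] = [];

  for (const node of nodes) {
    // Rule 1: each node ID may appear at most once.
    if (seenIds.has(node.id)) continue;

    if (node.kind === 'functionCall' && node.callId !== undefined) {
      liveCalls.add(node.callId);
    }

    // Rule 2: a response is kept only if its call appeared earlier in the history.
    if (
      node.kind === 'functionResponse' &&
      (node.callId === undefined || !liveCalls.has(node.callId))
    ) {
      continue;
    }

    seenIds.add(node.id);
    out.push(node);
  }
  return out;
}
```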

…ardening
- Added 'Node Pinning' in ContextManager to prevent 'in-flight' tool calls (calls without responses) from being truncated during context management.
- Refactored HistoryHardener into a multi-pass validator that strictly enforces role alternation and pairing invariants.
- Fixed 'starts with model' and 'mismatched call/response' API errors by ensuring structural integrity is enforced AFTER all tool pairing modifications.
- Updated golden snapshots to reflect improved sentinel messaging.

- Implementation of globally unique synthetic IDs in GeminiChat and Turn.
- IDs now include promptId, turn timestamp, and a persistent counter to prevent collisions.
- Updated ContextGraphBuilder (toGraph.ts) to prioritize API/Synthetic IDs for node identity.
- Ensured that tool execution results preserve the call's unique ID, enabling 1:1 mapping in the graph.
- Verified with system lifecycle tests.

… observation
- Modified HistoryObserver to always process the FULL history from AgentChatHistory. Previously, it was only processing the incremental PUSH payload, which caused the ContextManager to erroneously prune all existing history nodes.
- Updated renderHistory in ContextManager to correctly identify and protect active tool calls during the synchronous rendering phase.
- These changes ensure that the Context Graph remains stable and only prunes nodes when they actually exceed the configured token budget.

…plication and token overcounting
- Implementation of stable, content-based IDs in 'toGraph.ts' using type-safe salted hashing (content + turn/part indices). This ensures node IDs remain consistent even when history objects are re-created (e.g., during thought stripping).
- Used idiomatic TypeScript type guards to safely handle Part subtypes while complying with strict ESLint rules.
- Added ID-based deduplication in 'ContextTokenCalculator' to ensure each unique node is only counted once toward the token budget.
- Refactored 'appendPristineNodes' in 'ContextWorkingBuffer' to skip nodes already present in the buffer by ID.
- These changes resolve the 'cutting too deep' issue where re-synchronization events caused the history to double in the internal buffer, triggering premature and aggressive truncation/summarization.
- Verified with all 113 context and system lifecycle tests.

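
The 'cutting too deep' fix comes down to keying both the buffer and the token count on node ID. The sketch below shows that idea in isolation. BufferNode, WorkingBuffer, and countUniqueTokens are illustrative stand-ins, not the actual 'ContextWorkingBuffer' or 'ContextTokenCalculator' code.

```typescript
interface BufferNode {
  id: string;
  tokens: number;
}

// Counts each unique node ID once, even if the list contains repeats.
function countUniqueTokens(nodes: readonly BufferNode[]): number {
  const seen = new Set<string>();
  let total = 0;
  for (const node of nodes) {
    if (seen.has(node.id)) continue;
    seen.add(node.id);
    total += node.tokens;
  }
  return total;
}

class WorkingBuffer {
  private readonly nodes: BufferNode[] = [];
  private readonly ids = new Set<string>();

  // Appends only nodes whose ID is not already buffered; returns how many were added.
  appendPristineNodes(incoming: readonly BufferNode[]): number {
    let added = 0;
    for (const node of incoming) {
      if (this.ids.has(node.id)) continue;
      this.ids.add(node.id);
      this.nodes.push(node);
      added++;
    }
    return added;
  }

  snapshot(): readonly BufferNode[] {
    return this.nodes;
  }
}
```

With this shape, a re-sync that replays the full history adds only the nodes that are new, so the measured token pressure does not double.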
- Fixed context graph ID collisions by namespacing tool call and response IDs ('call_' and 'resp_'). This ensures responses are not filtered out as duplicates of their corresponding calls.
- Refined token estimation heuristics to exclude massive 'thoughtSignature' metadata blobs (saving 100k+ phantom tokens in complex sessions).
- Updated image token estimates to Gemini 1.5 standards (258 tokens vs 3000).
- Verified with all context unit tests and system lifecycle golden tests.
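
A heuristic along the lines described in this commit might look like the sketch below. Only the 258-token image figure and the decision to ignore 'thoughtSignature' come from the commit message. The simplified Part shape and the four-characters-per-token ratio are assumptions for illustration.

```typescript
// Simplified Part shape for this example only.
interface EstimablePart {
  text?: string;
  inlineData?: { mimeType: string; data: string };
  thoughtSignature?: string;
}

const IMAGE_TOKENS = 258; // per-image estimate named in the commit message
const CHARS_PER_TOKEN = 4; // assumed ratio, not taken from the PR

function estimatePartTokens(part: EstimablePart): number {
  let tokens = 0;
  if (part.text !== undefined) {
    tokens += Math.ceil(part.text.length / CHARS_PER_TOKEN);
  }
  if (part.inlineData?.mimeType.startsWith('image/')) {
    tokens += IMAGE_TOKENS;
  }
  // thoughtSignature is deliberately ignored: counting the blob's length
  // would add large numbers of phantom tokens to the estimate.
  return tokens;
}
```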

- Added 'trace-improvements.md' documenting observability goals.
- Implemented 'Budget Audit' tracing to track max/retained tokens and pressure.
- Implemented 'Protection Audit' with structured reasons (e.g., system_prompt, in_flight_tool_call).
- Implemented 'Transformation Lineage' tracing in Orchestrator to track N->M node replacements and size deltas.
- Implemented 'Estimation Calibration' breakdown (Text, Media, Tool, Overhead) to verify token heuristics.
- All 113 context tests and system lifecycle tests passed.

- Updated 'render.ts' to protect all nodes in the most recent turn (logical episode) rather than just the single last node. This prevents partial pruning of multi-part turns which was causing orphaned tool calls.
- Updated 'toGraph.ts' with robust ID extraction that only uses API IDs if they are valid strings, and added content-based hashing fallback to ensure stability across history re-syncs even for tool parts without IDs.
- Verified with system lifecycle tests.
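
Protecting the whole most recent turn, rather than only the last node, can be sketched as below. The TurnNode shape with an explicit turnIndex and the protectLatestTurn name are assumptions. The actual 'render.ts' may identify the latest turn differently.

```typescript
interface TurnNode {
  id: string;
  turnIndex: number; // which turn the node's Part came from
}

// Returns the IDs of every node belonging to the most recent turn.
function protectLatestTurn(nodes: readonly TurnNode[]): Set<string> {
  const protectedIds = new Set<string>();
  if (nodes.length === 0) return protectedIds;

  const latest = Math.max(...nodes.map((n) => n.turnIndex));
  for (const node of nodes) {
    if (node.turnIndex === latest) protectedIds.add(node.id);
  }
  return protectedIds;
}
```

Pinning the turn as a unit means a text part and the tool call next to it are kept or dropped together, which avoids leaving a call without its sibling parts.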

- Implemented index-independent stable IDs in toGraph.ts using content-based turn salts and occurrence counters. This prevents IDs from changing when history is re-indexed.
- Updated renderHistory to return metadata about whether permanent context management was applied.
- Updated GeminiClient to only overwrite chat history if real management (pruning/summarization) occurred, preventing transient hardening sentinels from poisoning the pristine conversation record.
- Ensures atomic protection of all parts in the most recent turn.
- Verified with system lifecycle tests.

…calls
- Fixed timing bug where tool responses were missing during context rendering by adopting requests into history immediately in 'GeminiClient'.
- Implemented 'refineToolResponses' in 'HistoryHardener' to hoist and re-order tool responses, ensuring strict API compliance and better model follow-up.
- Upgraded 'getStableId' to use SHA-256 for robust collision resistance in large graphs.
- Updated 'hardening-improvements.md' to track long-term stability goals.
- Verified with system lifecycle tests (Scenario 1-3).
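
One way to get index-independent IDs from content hashing plus occurrence counters, using SHA-256 via Node's built-in crypto module, is sketched here. It is not the 'getStableId' from the PR: the salt used here (role plus serialized part), the 16-character truncation, and the function names are assumptions.

```typescript
import { createHash } from 'node:crypto';

interface Turn {
  role: string;
  parts: unknown[];
}

function sha256(input: string): string {
  return createHash('sha256').update(input).digest('hex');
}

// Returns one ID per part, grouped by turn. IDs depend on content and on how
// many identical parts came before, never on the turn or part index.
function assignStableIds(history: readonly Turn[]): string[][] {
  const occurrences = new Map<string, number>();
  return history.map((turn) =>
    turn.parts.map((part) => {
      // Note: JSON.stringify is sensitive to key order, so re-created objects
      // must keep the same key order to hash identically.
      const contentKey = sha256(`${turn.role}\u0000${JSON.stringify(part)}`);
      const n = occurrences.get(contentKey) ?? 0;
      occurrences.set(contentKey, n + 1);
      return sha256(`${contentKey}:${n}`).slice(0, 16);
    }),
  );
}
```

Because the counter only advances for identical content, inserting or removing unrelated turns earlier in the history leaves every other ID unchanged, while two identical parts still receive distinct IDs.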

- Updated hardening roadmap to reflect the decision to preserve mixed turns (tool responses + text) for better hinting support and role alternation stability.
- Confirmed core structural improvements (re-ordering, hoisting, SHA-256) are complete.

…d 'SILENT_SYNC' to break history re-sync loops.
- Added Synchronous Pressure Barrier in 'renderHistory' to resolve sync/async race conditions.
- Centralized node protection in 'ContextManager' to globally pin System Prompt and Recent Context.
- Implemented render caching using graph-hash to reduce redundant overhead.
- Refined processor thresholds for increased pruning sensitivity.
- Fixed core test failures and updated system lifecycle golden snapshots.
- Updated 'bugs.md' to reflect resolved status for all identified anomalies.
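
Render caching keyed on a graph hash could take roughly this form. The RenderCache class, the single-entry cache policy, and the choice to hash the ordered list of node IDs are assumptions for this sketch. The PR does not show how its graph hash is computed.

```typescript
import { createHash } from 'node:crypto';

// Single-entry cache: re-renders only when the ordered set of node IDs changes.
class RenderCache<T> {
  private key: string | undefined;
  private value: T | undefined;
  private filled = false;

  get(nodeIds: readonly string[], render: () => T): T {
    const hash = createHash('sha256').update(JSON.stringify(nodeIds)).digest('hex');
    if (this.filled && this.key === hash) return this.value as T;

    const fresh = render();
    this.key = hash;
    this.value = fresh;
    this.filled = true;
    return fresh;
  }
}
```

Hashing only the IDs assumes that a node's content never changes without its ID changing, which holds when the IDs are themselves content-based.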
Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly improves the robustness and reliability of the context management system. By introducing history hardening, refining graph-to-history reconstruction, and enhancing token calculation accuracy, the changes ensure that the agent's context remains valid and within budget constraints. Additionally, the orchestrator has been updated to handle asynchronous pipeline execution more safely, and a new render cache reduces unnecessary computation.
Code Review
This pull request refactors the context management system to improve fidelity by implementing a 1:1 mapping between graph nodes and Gemini Part objects, removing logical nodes like Episodes and Tasks. It introduces a history hardening pass to enforce API invariants and synchronous pressure barriers to ensure state consistency. Token estimation heuristics and profile configurations were also updated. A security vulnerability was identified in the BlobDegradationProcessor where unsanitized mimeType strings could lead to path traversal; a suggestion was provided to sanitize the file extension.
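
The reviewer's concrete suggestion is not reproduced on this page. As a hedged illustration of the general mitigation, a file extension can be derived from a mimeType through an allow-list of characters so that path separators and dots never reach the filesystem. The function name, the 10-character cap, and the 'bin' fallback below are assumptions.

```typescript
// Derives a filesystem-safe extension from an untrusted mimeType string.
function safeExtensionFromMimeType(mimeType: string): string {
  // Take the subtype ("png" in "image/png") and drop any parameters.
  const subtype = mimeType.split('/')[1] ?? '';
  const base = subtype.split(';')[0] ?? '';

  // Keep only lowercase alphanumerics, so '.', '/', and '\\' cannot survive.
  const cleaned = base.toLowerCase().replace(/[^a-z0-9]/g, '').slice(0, 10);
  return cleaned.length > 0 ? cleaned : 'bin';
}
```

A stricter alternative would be to map a fixed set of known MIME types to extensions and reject everything else.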
logging) mitigations
Fixes #26072