fix(agent): prevent silent tool result loss during context compression#1976
Closed
Gutslabs wants to merge 1 commit into
Closed
fix(agent): prevent silent tool result loss during context compression#1976Gutslabs wants to merge 1 commit into
Gutslabs wants to merge 1 commit into
Conversation
_align_boundary_backward only checked messages[idx-1] when deciding whether the compress-end boundary splits a tool_call/result group. When an assistant issues 3+ parallel tool calls, the resulting tool messages span multiple consecutive positions. If the tail protection boundary (n_messages - protect_last_n) fell in the MIDDLE of that group, the method saw another tool result at idx-1 (not the parent assistant) and did nothing. This caused a silent data loss chain: 1. The parent assistant message was summarized away (middle region) 2. The orphaned tool result(s) remained in the tail 3. _sanitize_tool_pairs removed them as orphans 4. The content was neither in the summary nor preserved — gone forever The fix walks backward through all consecutive tool results to find the parent assistant, then pulls the boundary before the entire group. Includes 6 regression tests covering: - Clean boundaries (no adjustment needed) - Original case (assistant at idx-1) - NEW: boundary in middle of tool result group - NEW: boundary after last tool result in a group - NEW: consecutive tool groups (only walks to nearest parent) - End-to-end: compress() with 7 parallel tool calls
Contributor
|
Merged via PR #1993. Your core fix for |
spiky02plateau
added a commit
to spiky02plateau/hermes-agent
that referenced
this pull request
Jun 3, 2026
…xt override unset
Subprocess env builders (_sanitize_subprocess_env, _make_run_env) pin
HOME to the profile's home/ dir but only inject HERMES_HOME from the
_HERMES_HOME_OVERRIDE ContextVar, which is unset for background/PTY/cron
spawns. The child then has HOME={profile}/home but no HERMES_HOME, so
get_hermes_home() falls back to ~/.hermes and reads the default profile's
config/auth/memory instead of its own — cross-profile data corruption.
Add a single os.getenv("HERMES_HOME") fallback in the shared
_inject_context_hermes_home() so the common single-profile-gateway case
is covered. The ContextVar keeps precedence (per-session profile
mutation, NousResearch#1976); only one key (HERMES_HOME, a non-secret path) is
touched, so the secret-isolation invariant is intact.
Fixes NousResearch#4707
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
11 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fix silent data loss in context compression when parallel tool calls (3+) span multiple messages and the compression boundary falls in the middle of a tool-result group.
The Bug
_align_boundary_backwardonly checkedmessages[idx-1]to decide if the compress-end boundary splits a tool_call/result group. When an assistant issues 3+ parallel tool calls, their results span multiple consecutive messages. If the tail protection boundary (n_messages - protect_last_n) fell in the middle of that group, the method saw another tool result atidx-1(not the parent assistant) and did nothing.This caused a silent data loss chain:
_sanitize_tool_pairsremoved them as orphansNo error, no warning, no indication that data was lost. The agent continued operating with incomplete information.
Triggering scenario
Any conversation where the model issues >= 3 parallel tool calls AND
(total_messages - protect_last_n)falls in the middle of the resulting tool result messages. With defaultprotect_last_n=4, this happens whenever the last 4 messages start inside a tool result group.Concrete example with 15 messages (protect_first_n=3, protect_last_n=4):
tc_G's result at [11] is orphaned: parent assistant [4] was summarized, result [11] is in the tail.
_sanitize_tool_pairsdeletes it. Content lost forever.The Fix
_align_boundary_backwardnow walks backward through all consecutive tool results to find the parent assistant, then pulls the boundary before the entire group so it gets summarized together.Test Plan
6 regression tests added (
tests/test_compression_boundary.py):compress()with 7 parallel tool calls — verifies no orphans and no stub resultsTests confirmed: 3/6 FAIL with old code, 6/6 PASS with fix.