fix(agent): prevent silent tool result loss during context compression by Gutslabs · Pull Request #1976 · NousResearch/hermes-agent

Gutslabs · 2026-03-18T19:21:15Z

Summary

Fix silent data loss in context compression when parallel tool calls (3+) span multiple messages and the compression boundary falls in the middle of a tool-result group.

The Bug

_align_boundary_backward only checked messages[idx-1] to decide if the compress-end boundary splits a tool_call/result group. When an assistant issues 3+ parallel tool calls, their results span multiple consecutive messages. If the tail protection boundary (n_messages - protect_last_n) fell in the middle of that group, the method saw another tool result at idx-1 (not the parent assistant) and did nothing.

This caused a silent data loss chain:

The parent assistant message was summarized away (included in the middle region)
The orphaned tool result(s) remained in the tail
_sanitize_tool_pairs removed them as orphans
The content was neither in the summary nor preserved — gone forever

No error, no warning, no indication that data was lost. The agent continued operating with incomplete information.

Triggering scenario

Any conversation where the model issues >= 3 parallel tool calls AND (total_messages - protect_last_n) falls in the middle of the resulting tool result messages. With default protect_last_n=4, this happens whenever the last 4 messages start inside a tool result group.

Concrete example with 15 messages (protect_first_n=3, protect_last_n=4):

tc_G's result at [11] is orphaned: parent assistant [4] was summarized, result [11] is in the tail. _sanitize_tool_pairs deletes it. Content lost forever.

The Fix

_align_boundary_backward now walks backward through all consecutive tool results to find the parent assistant, then pulls the boundary before the entire group so it gets summarized together.

Test Plan

6 regression tests added (tests/test_compression_boundary.py):

Clean boundary (no adjustment needed)
Original case: assistant at idx-1
NEW: boundary in middle of tool result group (the bug)
NEW: boundary after last tool result in a group
NEW: consecutive tool groups (only walks to nearest parent)
End-to-end: compress() with 7 parallel tool calls — verifies no orphans and no stub results

Tests confirmed: 3/6 FAIL with old code, 6/6 PASS with fix.

_align_boundary_backward only checked messages[idx-1] when deciding whether the compress-end boundary splits a tool_call/result group. When an assistant issues 3+ parallel tool calls, the resulting tool messages span multiple consecutive positions. If the tail protection boundary (n_messages - protect_last_n) fell in the MIDDLE of that group, the method saw another tool result at idx-1 (not the parent assistant) and did nothing. This caused a silent data loss chain: 1. The parent assistant message was summarized away (middle region) 2. The orphaned tool result(s) remained in the tail 3. _sanitize_tool_pairs removed them as orphans 4. The content was neither in the summary nor preserved — gone forever The fix walks backward through all consecutive tool results to find the parent assistant, then pulls the boundary before the entire group. Includes 6 regression tests covering: - Clean boundaries (no adjustment needed) - Original case (assistant at idx-1) - NEW: boundary in middle of tool result group - NEW: boundary after last tool result in a group - NEW: consecutive tool groups (only walks to nearest parent) - End-to-end: compress() with 7 parallel tool calls

teknium1 · 2026-03-18T22:22:57Z

Merged via PR #1993. Your core fix for _align_boundary_backward was cherry-picked onto current main with authorship preserved. The api_key constructor addition and print→logger cleanup you included were already independently merged to main, so only the boundary fix and your tests were needed. Thank you for catching this silent data loss bug!

…xt override unset Subprocess env builders (_sanitize_subprocess_env, _make_run_env) pin HOME to the profile's home/ dir but only inject HERMES_HOME from the _HERMES_HOME_OVERRIDE ContextVar, which is unset for background/PTY/cron spawns. The child then has HOME={profile}/home but no HERMES_HOME, so get_hermes_home() falls back to ~/.hermes and reads the default profile's config/auth/memory instead of its own — cross-profile data corruption. Add a single os.getenv("HERMES_HOME") fallback in the shared _inject_context_hermes_home() so the common single-profile-gateway case is covered. The ContextVar keeps precedence (per-session profile mutation, NousResearch#1976); only one key (HERMES_HOME, a non-secret path) is touched, so the secret-isolation invariant is intact. Fixes NousResearch#4707 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

teknium1 mentioned this pull request Mar 18, 2026

fix(agent): prevent silent tool result loss during context compression #1993

Merged

teknium1 closed this Mar 18, 2026

spiky02plateau mentioned this pull request Jun 3, 2026

fix(agent): inject process HERMES_HOME into subprocess env when context override unset #38308

Open

11 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(agent): prevent silent tool result loss during context compression#1976

fix(agent): prevent silent tool result loss during context compression#1976
Gutslabs wants to merge 1 commit into
NousResearch:mainfrom
Gutslabs:fix/compression-tool-result-loss

Gutslabs commented Mar 18, 2026

Uh oh!

teknium1 commented Mar 18, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

Gutslabs commented Mar 18, 2026

Summary

The Bug

Triggering scenario

The Fix

Test Plan

Uh oh!

teknium1 commented Mar 18, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants