Skip to content

fix: preserve context when summary generation fails#31242

Closed
Hinotoi-agent wants to merge 3 commits into
NousResearch:mainfrom
Hinotoi-agent:fix/deterministic-compression-fallback
Closed

fix: preserve context when summary generation fails#31242
Hinotoi-agent wants to merge 3 commits into
NousResearch:mainfrom
Hinotoi-agent:fix/deterministic-compression-fallback

Conversation

@Hinotoi-agent

@Hinotoi-agent Hinotoi-agent commented May 24, 2026

Copy link
Copy Markdown
Contributor

Summary

This PR hardens the context-compaction fallback that runs when Hermes cannot generate an LLM-written handoff summary.

  • Replaces the previous content-free “messages were removed” marker with a deterministic local fallback summary.
  • Preserves recoverable continuity anchors from compacted turns: recent user asks, assistant/tool actions, summarized tool results, relevant file paths, and blocker/error snippets.
  • Keeps secrets redacted before fallback content is persisted into the handoff summary.
  • Adds regression coverage for summary-generation failures so future compactions do not silently lose actionable context.
  • Incorporates the overlap called out with fix(compression): add deterministic handoff for fallback compaction #26189 by preserving bounded last-dropped-turn snippets from the exact compacted window in addition to user asks, tool calls, file paths, and error snippets.

Issues covered

Issue Impact Severity
Summary-generation fallback dropped useful compacted-turn details After a summarizer/provider failure, the next model only learned that earlier messages were removed, making it more likely to lose task continuity or redo work unnecessarily. Reliability / context-continuity fix

Before this PR

  • If context compression needed to remove middle turns and the LLM summary call failed, Hermes inserted a static marker saying that messages were removed.
  • That marker did not preserve locally recoverable details from the removed messages.
  • Tool calls, file paths, user asks, and error snippets that were still available locally were not carried into the fallback handoff.
  • Regression coverage only verified that compression fell back, not that the fallback contained useful continuity context.

After this PR

  • Failed summary generation now produces a deterministic, structured fallback summary using the same handoff-style sections as normal context summaries.
  • The fallback includes bounded, redacted context that can be recovered without another model call.
  • The fallback explicitly tells future turns to verify current file/system state instead of assuming the fallback is complete.
  • The regression tests verify that a failed summary keeps user asks, tool names, relevant file paths, error/blocker snippets, and bounded last-dropped-turn context while avoiding the old content-free removal text.
  • The fallback is built from only the compacted window; protected tail messages remain as live conversation context and are not copied back into the dropped-turn receipt.

Why this matters

Hermes can hit context pressure at exactly the moment summary generation is unavailable. In that state, the safest behavior is not just to acknowledge that context was removed; it should preserve the useful facts that are still locally available. This keeps long-running coding and operations sessions recoverable without depending on a second successful LLM call.

Affected code

Area Files
Context compression fallback agent/context_compressor.py
Regression tests tests/agent/test_context_compressor.py

Root cause

The old fallback path treated summary failure as an all-or-nothing case:

  • LLM summary succeeds: preserve a structured handoff.
  • LLM summary fails: insert a generic marker with no extracted task details.

That meant Hermes already had the compacted turns locally, but did not mine them for safe, low-cost continuity signals before dropping them.

Changes in this PR

  • Adds _build_static_fallback_summary() to construct a deterministic fallback summary from compacted turns.
  • Extracts and deduplicates recent user asks, assistant actions, tool calls, tool-result summaries, relevant file paths, and blocker/error snippets.
  • Redacts fallback content before it is inserted into the summary.
  • Replaces the old static marker path with the deterministic fallback builder.
  • Adds regression coverage for failed summary generation with recovered context.
  • Adds explicit coverage for bounded last-dropped-turn snippets from the exact compacted window without duplicating protected tail messages.
  • Adds GitHub-token-prefix redaction coverage in the deterministic fallback receipt so truncated turn snippets do not reintroduce secret-looking values.
  • Updates the existing fallback test comment to reflect the deterministic fallback behavior.

Files changed

Category Files What changed
Runtime behavior agent/context_compressor.py Builds a structured deterministic handoff when summary generation is unavailable.
Tests tests/agent/test_context_compressor.py Verifies fallback summaries preserve actionable context and avoid the old content-free marker.

Related PRs / duplicate note

  • fix(compression): add deterministic handoff for fallback compaction #26189: same core direction. This PR now folds in the review-relevant behavior from that branch: a bounded deterministic handoff from the exact dropped window, preservation of user asks, tool activity, file/path mentions, error snippets, and recent dropped-turn snippets, plus redaction before insertion. The current branch keeps that behavior in the existing structured handoff-summary format and adds tests for dict-style tool calls, object-style tool calls, nested path extraction, Windows/POSIX paths, bounded output, redaction, and protected-tail exclusion.
  • fix: preserve context on compression failures #26051: broader compression-failure work that preserves full conversation history / native Codex compaction behavior. This PR is narrower: it only changes the lossy fallback path when abort_on_summary_failure is false and Hermes is already dropping a middle window.
  • fix: harden context compression against silent failures #5457: related resilience/logging and post-compaction recovery work. This PR does not change provider routing or empty-response retry behavior; it only makes the summary-failure marker actionable when compaction still proceeds.

If maintainers prefer one canonical branch, this PR is intended as the focused current-base version of the deterministic fallback handoff and can supersede #26189, rather than landing both independently.

Maintainer impact

  • Narrow change limited to context-compression failure handling and its tests.
  • No provider routing, model-calling, gateway, CLI, tool execution, or storage behavior is changed outside the fallback path.
  • Normal LLM-generated summaries are unchanged.
  • The fallback remains deliberately conservative: it is less rich than an LLM summary and tells the next turn to verify current state before making claims.

Fix rationale

  • Deterministic extraction is the right fallback because it works when the summary model/provider is unavailable.
  • Redaction keeps the fallback aligned with existing summary-safety expectations.
  • Bounded snippets and deduplication keep the generated fallback concise enough for context recovery.
  • The test locks in the important behavior: failed summary generation should still preserve the most useful local continuity anchors.

Type of change

  • Security fix
  • Tests
  • Documentation update
  • Refactor with no behavior change
  • Bug fix / reliability improvement

Test plan

  • Focused context-compressor regression suite passed.
  • Touched Python files compile.
  • Diff whitespace check passed.
  • Full repository test suite was not run locally; CI is running the broader matrix.

Executed with:

  • python -m pytest tests/agent/test_context_compressor.py -q -o 'addopts='
  • python -m py_compile agent/context_compressor.py tests/agent/test_context_compressor.py
  • git diff --check

Notes

  • No unrelated source files are changed in this PR.

@alt-glitch alt-glitch added type/bug Something isn't working P2 Medium — degraded but workaround exists comp/agent Core agent loop, run_agent.py, prompt builder labels May 24, 2026
@alt-glitch

Copy link
Copy Markdown
Collaborator

Likely duplicate of #26189 — both add a deterministic handoff fallback builder for ContextCompressor when the LLM summary call fails, extracting user asks, tool calls, file paths, and error snippets from compacted turns. Also related to #26051 and #5457. One of these should be closed in favor of the other.

@Hinotoi-agent

Copy link
Copy Markdown
Contributor Author

Thanks for calling out the overlap. I updated this PR to make the relationship with #26189 explicit and to fold in the missing coverage from that branch rather than leaving two near-identical fallback implementations.

What changed here:

  • The deterministic fallback now also includes a bounded ## Last Dropped Turns section built from the exact compacted window, so it preserves recent dropped-turn snippets in addition to the existing user asks, tool calls, file paths, and error/blocker snippets.
  • The fallback continues to use the structured context-handoff format already used by this PR, with redaction before insertion and an overall fallback-size cap.
  • Added regression coverage that the last dropped turns are preserved while protected tail messages are not copied back into the fallback receipt.
  • Added/kept coverage for dict-style tool calls, object-style tool calls, nested path extraction, POSIX/~/Windows path mentions, error snippets, bounded output, and token-prefix redaction.
  • Updated the PR description with a “Related PRs / duplicate note” section explaining the relationship to fix(compression): add deterministic handoff for fallback compaction #26189, fix: preserve context on compression failures #26051, and fix: harden context compression against silent failures #5457.

My intended resolution is that this branch is the focused current-base version of the deterministic fallback handoff and can supersede #26189, rather than landing both independently. #26051 and #5457 remain related but broader/different in scope: this PR only changes the lossy summary-failure fallback path when compaction still proceeds.

@teknium1

Copy link
Copy Markdown
Contributor

Merged via #34310 (commit 042c1d6 on main). Your 3 commits were cherry-picked onto current main with authorship preserved in git log. Thanks for the careful fallback design — extracting user asks, tool calls, file paths, error snippets, and bounded dropped-turn excerpts with redaction and size caps is exactly the right shape for a deterministic handoff when the LLM summarizer is unavailable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/agent Core agent loop, run_agent.py, prompt builder P2 Medium — degraded but workaround exists type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants