fix(compressor,webui): keep last visible assistant reply readable after compaction (#29824) by xxxigm · Pull Request #29862 · NousResearch/hermes-agent

xxxigm · 2026-05-21T14:18:45Z

What does this PR do?

Fix for #29824 — after context compaction the WebUI session viewer (and the TUI chat panel) can show the [CONTEXT COMPACTION — REFERENCE ONLY] handoff block in the slot where the user had just been reading the assistant's actual reply. The reporter put it best:

i cant see the output of the last message you sent, i did see it previously, however now see 'context compaction'

The underlying work isn't lost — the assistant's reply is rolled into the compaction summary written into the session DB, so when the operator scrolls back they see opaque metadata where their answer used to be.

Two-pronged fix:

Agent layer (the real fix — agent/context_compressor.py). _find_tail_cut_by_tokens already had _ensure_last_user_message_in_tail anchoring the most recent user request out of the compressed middle (#10896), but no symmetric anchor on the assistant side. When the conversation has an oversized recent tool result or a long stretch of tool_call / tool_result pairs after the assistant's last visible reply, the token-budget walk can stop with the previously-visible reply on the wrong side of cut_idx and the summariser rolls it into the single handoff blob.

Added a mirror, _ensure_last_assistant_message_in_tail, that walks cut_idx back to include the most recent assistant message with non-empty text content. It deliberately skips tool_calls-only stubs, which the UI renders as small "calling tool X" indicators rather than a readable bubble, so they aren't what the reporter means by "the last message you sent". It then re-runs the standard _align_boundary_backward so we don't split a preceding tool group. The two anchors are chained: each only walks cut_idx backward, so the tail can only grow.
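The assistant-side anchor can be sketched roughly as follows. This is a minimal illustration, not the code in agent/context_compressor.py: the function names drop the leading underscore, the signatures are assumptions, head_end is assumed to be the first index outside the protected head, and align_boundary_backward is a simplified stand-in for the real _align_boundary_backward.

```python
from typing import Any

Message = dict[str, Any]


def has_visible_text(msg: Message) -> bool:
    """True when a message carries readable text rather than only tool_calls."""
    content = msg.get("content")
    if isinstance(content, str):
        return bool(content.strip())
    if isinstance(content, list):  # multimodal: any non-empty text block counts
        return any(
            isinstance(block, dict)
            and block.get("type") == "text"
            and str(block.get("text", "")).strip()
            for block in content
        )
    return False


def find_last_assistant_message_idx(messages: list[Message], head_end: int) -> int:
    """Newest content-bearing assistant index at or after head_end.

    Falls back to the newest assistant message of any kind, or -1 if none.
    """
    fallback = -1
    for idx in range(len(messages) - 1, head_end - 1, -1):
        if messages[idx].get("role") != "assistant":
            continue
        if has_visible_text(messages[idx]):
            return idx
        if fallback == -1:
            fallback = idx
    return fallback


def align_boundary_backward(messages: list[Message], cut_idx: int, head_end: int) -> int:
    """Simplified stand-in (assumption): if tool results sit directly before
    cut_idx, move the cut to their parent assistant tool_calls message so the
    whole group stays in the tail."""
    check = cut_idx - 1
    while check >= head_end and messages[check].get("role") == "tool":
        check -= 1
    walked_past_tools = check < cut_idx - 1
    if (
        walked_past_tools
        and check >= head_end
        and messages[check].get("role") == "assistant"
        and messages[check].get("tool_calls")
    ):
        return check
    return cut_idx


def ensure_last_assistant_message_in_tail(
    messages: list[Message], cut_idx: int, head_end: int
) -> int:
    """Walk cut_idx backward (never forward) so the last readable reply is in the tail."""
    last_idx = find_last_assistant_message_idx(messages, head_end)
    if last_idx == -1 or last_idx >= cut_idx:
        return cut_idx  # nothing to anchor, or the reply is already in the tail
    return align_boundary_backward(messages, last_idx, head_end)
```

Because the function only ever returns a value less than or equal to the cut_idx it was given, chaining it after the user-side anchor cannot shrink the tail.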

WebUI layer (clarity + double-collision handling — web/src/pages/SessionsPage.tsx).

Added isCompactionMessage/splitCompactionContent detection of the [CONTEXT COMPACTION — REFERENCE ONLY] (current) and [CONTEXT SUMMARY]: (legacy) prefixes, plus a new compaction entry in MessageBubble's ROLE_STYLES map. Standalone summary messages now render as muted, italicised Context handoff rows instead of pretending to be regular assistant replies.
The compressor's double-collision fallback _merge_summary_into_tail prepends the summary + --- END OF CONTEXT SUMMARY — respond to the message below … --- marker onto the first tail message. With the assistant anchor that tail message is now usually the visible reply, so the persisted assistant turn becomes [SUMMARY] … END MARKER … <real reply>. Without splitting it, the viewer renders one big "Context handoff" bubble with the reply buried inside the metadata — same bug, one layer deeper. MessageBubble now detects the marker via splitCompactionContent and recurses on the two halves, so the summary renders as the muted handoff row and the actual reply gets its normal styling.
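The split-and-recurse behaviour can be illustrated with a short sketch. The real helpers are TypeScript in web/src/pages/SessionsPage.tsx; the Python below only mirrors the logic as described here, and the function names, return shapes, and the render_rows helper are assumptions made for illustration. The prefix and marker strings are the ones quoted in this PR.

```python
from typing import Optional

# Strings quoted in the PR; the source of truth is agent/context_compressor.py.
COMPACTION_PREFIXES = (
    "[CONTEXT COMPACTION — REFERENCE ONLY]",
    "[CONTEXT SUMMARY]:",
)
COMPACTION_END_MARKER = (
    "--- END OF CONTEXT SUMMARY — respond to the message below, "
    "not the summary above ---"
)


def is_compaction_message(content: str) -> bool:
    """True when the content starts with a current or legacy compaction prefix."""
    return content.lstrip().startswith(COMPACTION_PREFIXES)


def split_compaction_content(content: str) -> Optional[tuple[str, str]]:
    """Return (summary, remainder), or None when content is not a compaction.

    remainder is empty for a standalone summary with nothing after the marker.
    """
    if not is_compaction_message(content):
        return None
    summary, marker, remainder = content.partition(COMPACTION_END_MARKER)
    if not marker:
        return content.strip(), ""
    return (summary + marker).strip(), remainder.strip()


def render_rows(role: str, content: str) -> list[tuple[str, str]]:
    """Hypothetical stand-in for MessageBubble: a list of (style, text) rows."""
    split = split_compaction_content(content)
    if split is None:
        return [(role, content)]
    summary, remainder = split
    rows = [("compaction", summary)]
    if remainder:
        # Recurse so the real reply keeps its original role styling.
        rows.extend(render_rows(role, remainder))
    return rows
```

A merged assistant turn yields two rows, a "compaction" row followed by an "assistant" row, while an ordinary message or a standalone summary yields exactly one.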

Related Issue

Fixes #29824.

Type of Change

🐛 Bug fix (non-breaking change that fixes an issue)
✨ New feature (non-breaking change that adds functionality)
🔒 Security fix
📝 Documentation update
✅ Tests (adding or improving test coverage)
♻️ Refactor (no behavior change)
🎯 New skill (bundled or hub)

Changes Made

agent/context_compressor.py — added _find_last_assistant_message_idx and _ensure_last_assistant_message_in_tail (mirror of the existing user-side anchor; prefers content-bearing assistant messages so tool_calls-only stubs don't shadow the real reply). Wired the new anchor into _find_tail_cut_by_tokens immediately after the user anchor — chained, monotonic.
web/src/pages/SessionsPage.tsx — new COMPACTION_PREFIXES / COMPACTION_END_MARKER constants (cross-referenced to the Python source-of-truth), splitCompactionContent helper, compaction entry in ROLE_STYLES, and a recursive split in MessageBubble so merged-into-tail compactions render as two visually distinct rows.
tests/agent/test_compressor_assistant_tail_anchor.py (new, 518 lines) — 21 regression cases across five classes covering the helper contract, the anchor function, integration with the existing tail-cut path, the end-to-end compress() rollup repro, and a source guardrail that fails the build if the anchor is silently removed or re-ordered.
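One way a source guardrail of this kind could check call ordering is sketched below. It is an assumption about technique, not a copy of the test file: the actual suite may use plain string searches, and the SAMPLE source and its method signatures are invented for the example.

```python
import ast


def _callee_name(call: ast.Call) -> str:
    """Name of the called function for `f(...)` or `obj.f(...)`, else ''."""
    func = call.func
    if isinstance(func, ast.Attribute):
        return func.attr
    if isinstance(func, ast.Name):
        return func.id
    return ""


def anchor_call_order(source: str, func_name: str = "_find_tail_cut_by_tokens") -> list[str]:
    """Names of the _ensure_last_* calls inside func_name, in source order."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef) and node.name == func_name:
            calls = sorted(
                (call.lineno, call.col_offset, _callee_name(call))
                for call in ast.walk(node)
                if isinstance(call, ast.Call)
            )
            return [name for _, _, name in calls if name.startswith("_ensure_last_")]
    return []


# Hypothetical source text standing in for agent/context_compressor.py.
SAMPLE = '''
class ContextCompressor:
    def _find_tail_cut_by_tokens(self, messages, head_end):
        cut_idx = len(messages)
        cut_idx = self._ensure_last_user_message_in_tail(messages, cut_idx, head_end)
        cut_idx = self._ensure_last_assistant_message_in_tail(messages, cut_idx, head_end)
        return cut_idx
'''
```

A guardrail test would then assert that the returned list is the user anchor followed by the assistant anchor, so removing or re-ordering either call fails the build.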

How to Test

Check out this branch and ensure .venv is set up: python3 -m venv .venv && source .venv/bin/activate && pip install -e ".[all,dev]".
Run the new regression suite:
scripts/run_tests.sh tests/agent/test_compressor_assistant_tail_anchor.py
Expected: 21 passed.
Run the broader compression-related test sweep to confirm no cross-file regressions:
scripts/run_tests.sh tests/agent/test_context_compressor.py tests/agent/test_compressor_assistant_tail_anchor.py tests/agent/test_context_compressor_summary_continuity.py tests/agent/test_compressor_image_tokens.py tests/run_agent/test_compression_boundary.py tests/run_agent/test_compression_boundary_hook.py tests/run_agent/test_compression_feasibility.py tests/run_agent/test_compression_persistence.py tests/run_agent/test_413_compression.py tests/cli/test_manual_compress.py
Expected: 170 passed.
Type-check the WebUI:
(cd web && npm install --no-audit --no-fund && npx tsc -b)
Expected: exits 0 with no diagnostics.

Checklist

Code

I've read the Contributing Guide
My commit messages follow Conventional Commits (fix(compressor): …, fix(webui): …, test(compressor): …)
I searched for existing PRs to make sure this isn't a duplicate
My PR contains only changes related to this fix (no unrelated commits)
I've run scripts/run_tests.sh on the affected suites and all tests pass
I've added tests for my changes (21 new cases in tests/agent/test_compressor_assistant_tail_anchor.py, including a source-guardrail class so a refactor can't silently drop the anchor)
I've tested on my platform: macOS 15.2 (Darwin 24.6.0), Python 3.12, Node 22

Documentation & Housekeeping

I've updated relevant documentation (README, docs/, docstrings) — extensive docstrings on the new helpers describing the WebUI symptom and pointing at web/src/pages/SessionsPage.tsx; the WebUI side has source-of-truth cross-reference comments back to agent/context_compressor.py. No user-facing docs change required (no new config knobs, env vars, or behaviour the operator needs to know about — the fix is invisible until they look for the previously-disappeared reply).
I've updated cli-config.yaml.example if I added/changed config keys — N/A (no new config).
I've updated CONTRIBUTING.md or AGENTS.md if I changed architecture or workflows — N/A.
I've considered cross-platform impact (Windows, macOS) per the compatibility guide — purely Python list ops + TSX rendering, no platform-specific code paths.
I've updated tool descriptions/schemas if I changed tool behavior — N/A.

Screenshots / Logs

$ scripts/run_tests.sh tests/agent/test_compressor_assistant_tail_anchor.py
Discovered 1 test files (21 tests) under [...]; running with -j 24
[100.0% |    21/21 | ✓21 | ✗ 0] ✓ tests/agent/test_compressor_assistant_tail_anchor.py (21✓, 0.4s)
=== Summary: 1 files, 21 tests passed, 0 failed (100% complete) in 0.4s (24 workers) ===

$ scripts/run_tests.sh tests/agent/test_context_compressor.py tests/agent/test_compressor_assistant_tail_anchor.py tests/agent/test_context_compressor_summary_continuity.py tests/agent/test_compressor_image_tokens.py tests/run_agent/test_compression_boundary.py tests/run_agent/test_compression_boundary_hook.py tests/run_agent/test_compression_feasibility.py tests/run_agent/test_compression_persistence.py tests/run_agent/test_413_compression.py tests/cli/test_manual_compress.py
Discovered 10 test files (170 tests) under [...]; running with -j 24
[100.0% |   170/170 | ✓170 | ✗  0] ✓ ... (10 files)
=== Summary: 10 files, 170 tests passed, 0 failed (100% complete) in 9.3s (24 workers) ===

$ ./web/node_modules/.bin/tsc -b web
(exit 0, no diagnostics)

…summary + label handoffs in WebUI (NousResearch#29824)

Two-pronged fix for the WebUI "context compaction block in place of last assistant response" regression.

Agent layer (the real fix). ``_find_tail_cut_by_tokens`` already had ``_ensure_last_user_message_in_tail`` to keep the most recent user request out of the compressed middle (NousResearch#10896), but no symmetric anchor for the assistant side. When the conversation has an oversized recent tool result or a long stretch of tool-call/result pairs *after* the assistant's last visible reply, the token-budget walk can stop with the previously-visible reply on the wrong side of ``cut_idx``. The summariser then rolls it into the single ``[CONTEXT COMPACTION — REFERENCE ONLY]`` block persisted as ``role="user"`` or ``role="assistant"``, and from the operator's perspective the WebUI session viewer (``web/src/pages/SessionsPage.tsx``) and the TUI chat panel both suddenly show the opaque "Context compaction" block in the slot where they were just reading the actual answer:

User: "i cant see the output of the last message you sent, i did see it previously, however now see 'context compaction'"

Added ``_ensure_last_assistant_message_in_tail``, a mirror of the user-side anchor. It looks for the most recent assistant message with non-empty text content (skipping tool-call-only assistant "stubs" which the UI renders as small "calling tool X" indicators rather than a readable bubble) and walks ``cut_idx`` back through the standard ``_align_boundary_backward`` so we don't split a tool_call/result group that immediately precedes it. The two anchors are chained: each only walks ``cut_idx`` backward, so the tail can only grow.

Falls back to "most recent assistant of any kind" only when no content-bearing reply exists in the compressible region (fresh multi-step tool sequence with no prior reply). In that case the agent-side fix is effectively a no-op and the existing user-message anchor carries the load.

WebUI layer (clarity). Added ``isCompactionMessage`` detector that recognises the ``[CONTEXT COMPACTION — REFERENCE ONLY]`` (current) and ``[CONTEXT SUMMARY]:`` (legacy) prefixes from ``agent/context_compressor.py``, and a new ``compaction`` entry in ``MessageBubble``'s ``ROLE_STYLES`` map. Compaction blocks now render as muted, italicised system-style rows labelled ``Context handoff`` (clearly metadata, not the assistant's actual reply) so an operator scrolling back through a long session can't mistake the summary for a real answer.

The detected prefixes are kept inline (rather than imported) because the WebUI bundle has no Python interop. A guardrail comment points readers at the source-of-truth constants in ``agent/context_compressor.py``.

…own bubble (NousResearch#29824)

The compressor has a "double-collision" fallback path: when the chosen ``summary_role`` collides with the first tail message AND the flipped role would collide with the last head message, it can't emit a standalone summary turn (consecutive same-role messages break Anthropic and friends). It instead prepends the summary + end-of-summary marker to the first tail message's content via ``_merge_summary_into_tail``.

With the matching anchor from the previous commit, that first tail message is now usually the user's previously-visible assistant reply, so the persisted assistant turn ends up shaped as ``[CONTEXT COMPACTION ...] ... --- END OF CONTEXT SUMMARY --- ... THE ACTUAL REPLY``. Without splitting it, the session viewer renders one big "Context handoff" bubble and the reply text is buried inside the metadata blob, which is exactly the "can't see the last reply" experience NousResearch#29824 reports, just one layer deeper.

Added ``splitCompactionContent`` that detects the merge marker (kept in sync with ``--- END OF CONTEXT SUMMARY — respond to the message below, not the summary above ---`` in ``agent/context_compressor.py``) and ``MessageBubble`` now recurses on the two halves: the prefix half renders as the muted "Context handoff" row, the remainder half renders with the original assistant styling. Pure (non-merged) summary messages hit the no-remainder branch and still render as a single "Context handoff" row, preserving the original behaviour.
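The double-collision decision described in this commit can be sketched as below. This is a simplified illustration under stated assumptions, not the compressor's implementation: place_summary and flip_role are hypothetical names, message content is assumed to be a plain string, and the caller is assumed to have already picked a ``summary_role`` that does not collide with the last head message.

```python
from typing import Any

Message = dict[str, Any]

# Marker text as quoted in this PR; the real constant lives in
# agent/context_compressor.py.
SUMMARY_END_MARKER = (
    "--- END OF CONTEXT SUMMARY — respond to the message below, "
    "not the summary above ---"
)


def flip_role(role: str) -> str:
    """Swap between the two conversational roles."""
    return "assistant" if role == "user" else "user"


def place_summary(
    head: list[Message], tail: list[Message], summary: str, summary_role: str
) -> list[Message]:
    """Insert the summary between head and tail without creating two
    consecutive messages that share a role."""
    last_head_role = head[-1]["role"] if head else None
    first_tail_role = tail[0]["role"] if tail else None

    if summary_role != first_tail_role:
        # Normal path: standalone summary turn with the chosen role.
        return head + [{"role": summary_role, "content": summary}] + tail

    flipped = flip_role(summary_role)
    if flipped != last_head_role:
        # Single collision: the flipped role fits between head and tail.
        return head + [{"role": flipped, "content": summary}] + tail

    # Double collision: neither role fits, so merge the summary and the
    # end marker into the first tail message instead.
    merged = dict(tail[0])
    merged["content"] = f"{summary}\n\n{SUMMARY_END_MARKER}\n\n{tail[0]['content']}"
    return head + [merged] + tail[1:]
```

In the double-collision branch no extra turn is added, which is why the viewer has to split the merged content back into a handoff row and the real reply.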

…paction rollup (NousResearch#29824)

21 cases pinning the new ``_ensure_last_assistant_message_in_tail`` anchor and its interaction with the existing tail-cut path:

* ``TestFindLastAssistantMessageIdx`` (helper contract): prefers a content-bearing assistant message, skips ``tool_calls``-only stubs, multimodal text-block content counts, falls back to "any assistant" when no content-bearing reply exists, honours ``head_end``, returns -1 when there's none.
* ``TestEnsureLastAssistantMessageInTail`` (direct): no-op when already in the tail, walks ``cut_idx`` back when the reply is in the compressed middle, never crosses into the head region, re-aligns through a preceding ``tool_call`` / ``tool_result`` group instead of orphaning it.
* ``TestFindTailCutByTokensAnchorsAssistant`` (integration): reporter repro (long tool-output run after the visible reply) now preserves the reply; user and assistant anchors compose in a single tail-cut call; a soft-ceiling-overrunning oversized tool result no longer strands the prior reply.
* ``TestCompactionRollupReproduction`` (end-to-end through ``compress()`` with a stubbed ``_generate_summary``): the visible reply text survives either as its own standalone assistant message (normal path) or concatenated onto the merged summary tail (double-collision path the WebUI then re-splits). The standalone-summary case is asserted strictly (exactly one summary row, exactly one separate assistant row carrying the reply); that's the dominant path and any drift there reintroduces the original bug.
* ``TestSourceGuardrail`` (static asserts on ``agent/context_compressor.py``): the helper exists, the anchor is wired into ``_find_tail_cut_by_tokens`` AFTER the user-message anchor (so chaining is monotonic), the content-bearing preference is preserved, and the issue number is referenced so future bisects can find this fix.

teknium1 · 2026-06-12T22:42:15Z

Merged via PR #45249 — all three of your commits were cherry-picked onto current main with your authorship preserved in git log (691ff7c, 2fef3e2, 68536d4). Verified your WebUI end-marker constant still byte-matches the compressor's marker after our recent _SUMMARY_END_MARKER hoist. Thanks!


xxxigm added 3 commits May 21, 2026 20:59

alt-glitch added the type/bug, comp/agent, and P2 labels May 21, 2026

alt-glitch mentioned this pull request May 27, 2026

Context compression summary leaks into user-visible chat output #33256

Closed

This was referenced Jun 12, 2026

fix(compressor,webui): keep last visible assistant reply readable after compaction (salvages #29862) #45249

Merged

WebUI can show context compaction block instead of latest assistant response after compression #29824

Closed

teknium1 closed this Jun 12, 2026

xxxigm wants to merge 3 commits into NousResearch:main from xxxigm:fix/29824-webui-compaction-hides-last-message
