fix(compressor,webui): keep last visible assistant reply readable after compaction (#29824) #29862

Closed
xxxigm wants to merge 3 commits into NousResearch:main from xxxigm:fix/29824-webui-compaction-hides-last-message

xxxigm commented May 21, 2026

Contributor

What does this PR do?

Fix for #29824 — after context compaction the WebUI session viewer (and the TUI chat panel) can show the [CONTEXT COMPACTION — REFERENCE ONLY] handoff block in the slot where the user had just been reading the assistant's actual reply. The reporter put it best:

i cant see the output of the last message you sent, i did see it previously, however now see 'context compaction'

The underlying work isn't lost: the assistant's reply is rolled into the compaction summary written to the session DB. But when the operator scrolls back, they see opaque metadata where their answer used to be.

Two-pronged fix:

Agent layer (the real fix — agent/context_compressor.py). _find_tail_cut_by_tokens already had _ensure_last_user_message_in_tail anchoring the most recent user request out of the compressed middle (#10896), but no symmetric anchor on the assistant side. When the conversation has an oversized recent tool result or a long stretch of tool_call / tool_result pairs after the assistant's last visible reply, the token-budget walk can stop with the previously-visible reply on the wrong side of cut_idx and the summariser rolls it into the single handoff blob.

Added _ensure_last_assistant_message_in_tail, a mirror of the user-side anchor. It walks cut_idx back to include the most recent assistant message with non-empty text content. It deliberately skips tool_calls-only stubs: the UI renders those as small "calling tool X" indicators rather than a readable bubble, so they aren't what the reporter means by "the last message you sent". It then re-runs the standard _align_boundary_backward so we don't split a preceding tool group. The two anchors are chained, and each only walks cut_idx backward, so the tail can only grow.
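
For readers who want the shape of the anchor without opening the diff, here is a minimal sketch of the idea. It is not the code in agent/context_compressor.py: the function names, signatures, the message dict layout, and the simplified align_boundary_backward stand-in are all assumptions made for illustration.

```python
from typing import Any

Message = dict[str, Any]  # assumed shape: {"role": ..., "content": ..., "tool_calls": ...}


def has_text_content(msg: Message) -> bool:
    """True when the message carries readable text (plain string or a multimodal text block)."""
    content = msg.get("content")
    if isinstance(content, str):
        return bool(content.strip())
    if isinstance(content, list):
        return any(
            isinstance(block, dict)
            and block.get("type") == "text"
            and bool(str(block.get("text", "")).strip())
            for block in content
        )
    return False


def find_last_assistant_idx(messages: list[Message], head_end: int) -> int:
    """Most recent content-bearing assistant message at or after head_end.

    Falls back to the most recent assistant message of any kind, else -1.
    """
    fallback = -1
    for i in range(len(messages) - 1, head_end - 1, -1):
        if messages[i].get("role") != "assistant":
            continue
        if has_text_content(messages[i]):
            return i
        if fallback == -1:
            fallback = i  # remember the newest tool_calls-only stub
    return fallback


def align_boundary_backward(messages: list[Message], idx: int, head_end: int) -> int:
    """Hypothetical stand-in: if tool results sit directly before idx,
    step back to the assistant message that issued those tool calls."""
    check = idx - 1
    while check >= head_end and messages[check].get("role") == "tool":
        check -= 1
    if (
        head_end <= check < idx - 1
        and messages[check].get("role") == "assistant"
        and messages[check].get("tool_calls")
    ):
        return check
    return idx


def ensure_last_assistant_in_tail(messages: list[Message], cut_idx: int, head_end: int) -> int:
    """Return a cut index no greater than cut_idx so the tail (messages[cut_idx:])
    contains the last readable assistant reply."""
    last = find_last_assistant_idx(messages, head_end)
    if last == -1 or last >= cut_idx:
        return cut_idx  # nothing to anchor, or the reply is already in the tail
    new_cut = align_boundary_backward(messages, last, head_end)
    # Only ever move backward, and never into the protected head.
    return max(head_end, min(new_cut, cut_idx))
```

Because the function returns at most the cut_idx it was given, chaining it after the user-side anchor can only grow the tail, which is the monotonic property described above.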

WebUI layer (clarity + double-collision handling — web/src/pages/SessionsPage.tsx).

  1. Added isCompactionMessage/splitCompactionContent detection of the [CONTEXT COMPACTION — REFERENCE ONLY] (current) and [CONTEXT SUMMARY]: (legacy) prefixes, plus a new compaction entry in MessageBubble's ROLE_STYLES map. Standalone summary messages now render as muted, italicised Context handoff rows instead of pretending to be regular assistant replies.

  2. The compressor's double-collision fallback _merge_summary_into_tail prepends the summary + --- END OF CONTEXT SUMMARY — respond to the message below … --- marker onto the first tail message. With the assistant anchor that tail message is now usually the visible reply, so the persisted assistant turn becomes [SUMMARY] … END MARKER … <real reply>. Without splitting it, the viewer renders one big "Context handoff" bubble with the reply buried inside the metadata — same bug, one layer deeper. MessageBubble now detects the marker via splitCompactionContent and recurses on the two halves, so the summary renders as the muted handoff row and the actual reply gets its normal styling.
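
The detection and split logic from the two items above can be sketched as follows. The real helpers are TypeScript in web/src/pages/SessionsPage.tsx; this Python version is only an illustration, and the function names, the returned tuple shape, and the whitespace trimming are assumptions. The prefix and end-marker strings are the ones quoted in this PR's commit messages.

```python
from typing import NamedTuple, Optional

# Strings quoted in this PR; the source of truth is agent/context_compressor.py.
COMPACTION_PREFIXES = (
    "[CONTEXT COMPACTION — REFERENCE ONLY]",  # current
    "[CONTEXT SUMMARY]:",  # legacy
)
COMPACTION_END_MARKER = (
    "--- END OF CONTEXT SUMMARY — respond to the message below, "
    "not the summary above ---"
)


class CompactionSplit(NamedTuple):
    summary: str
    remainder: Optional[str]  # None for a pure (non-merged) summary message


def is_compaction_message(content: str) -> bool:
    """True when the content starts with one of the known compaction prefixes."""
    return content.lstrip().startswith(COMPACTION_PREFIXES)


def split_compaction_content(content: str) -> Optional[CompactionSplit]:
    """Split a merged message into (summary, real reply); None if not a compaction."""
    if not is_compaction_message(content):
        return None
    summary, marker, rest = content.partition(COMPACTION_END_MARKER)
    if not marker or not rest.strip():
        return CompactionSplit(content.strip(), None)
    return CompactionSplit(summary.strip(), rest.strip())


def render_rows(role: str, content: str) -> list[tuple[str, str]]:
    """Return (style, text) rows, recursing on the remainder like MessageBubble."""
    split = split_compaction_content(content)
    if split is None:
        return [(role, content)]
    rows = [("compaction", split.summary)]
    if split.remainder is not None:
        rows.extend(render_rows(role, split.remainder))
    return rows
```

A merged assistant turn yields two rows, a "compaction" row and an "assistant" row, while a standalone summary still yields a single "compaction" row.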

Related Issue

Fixes #29824.

Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)
  • ✨ New feature (non-breaking change that adds functionality)
  • 🔒 Security fix
  • 📝 Documentation update
  • ✅ Tests (adding or improving test coverage)
  • ♻️ Refactor (no behavior change)
  • 🎯 New skill (bundled or hub)

Changes Made

  • agent/context_compressor.py — added _find_last_assistant_message_idx and _ensure_last_assistant_message_in_tail (mirror of the existing user-side anchor; prefers content-bearing assistant messages so tool_calls-only stubs don't shadow the real reply). Wired the new anchor into _find_tail_cut_by_tokens immediately after the user anchor — chained, monotonic.
  • web/src/pages/SessionsPage.tsx — new COMPACTION_PREFIXES / COMPACTION_END_MARKER constants (cross-referenced to the Python source-of-truth), splitCompactionContent helper, compaction entry in ROLE_STYLES, and a recursive split in MessageBubble so merged-into-tail compactions render as two visually distinct rows.
  • tests/agent/test_compressor_assistant_tail_anchor.py (new, 518 lines) — 21 regression cases across five classes covering the helper contract, the anchor function, integration with the existing tail-cut path, the end-to-end compress() rollup repro, and a source guardrail that fails the build if the anchor is silently removed or re-ordered.

How to Test

  1. Check out this branch and ensure .venv is set up: python3 -m venv .venv && source .venv/bin/activate && pip install -e ".[all,dev]".
  2. Run the new regression suite:
    scripts/run_tests.sh tests/agent/test_compressor_assistant_tail_anchor.py
    Expected: 21 passed.
  3. Run the broader compression-related test sweep to confirm no cross-file regressions:
    scripts/run_tests.sh tests/agent/test_context_compressor.py tests/agent/test_compressor_assistant_tail_anchor.py tests/agent/test_context_compressor_summary_continuity.py tests/agent/test_compressor_image_tokens.py tests/run_agent/test_compression_boundary.py tests/run_agent/test_compression_boundary_hook.py tests/run_agent/test_compression_feasibility.py tests/run_agent/test_compression_persistence.py tests/run_agent/test_413_compression.py tests/cli/test_manual_compress.py
    Expected: 170 passed.
  4. Type-check the WebUI:
    (cd web && npm install --no-audit --no-fund && npx tsc -b)
    Expected: exits 0 with no diagnostics.

Checklist

Code

  • I've read the Contributing Guide
  • My commit messages follow Conventional Commits (fix(compressor): …, fix(webui): …, test(compressor): …)
  • I searched for existing PRs to make sure this isn't a duplicate
  • My PR contains only changes related to this fix (no unrelated commits)
  • I've run scripts/run_tests.sh on the affected suites and all tests pass
  • I've added tests for my changes (21 new cases in tests/agent/test_compressor_assistant_tail_anchor.py, including a source-guardrail class so a refactor can't silently drop the anchor)
  • I've tested on my platform: macOS 15.2 (Darwin 24.6.0), Python 3.12, Node 22

Documentation & Housekeeping

  • I've updated relevant documentation (README, docs/, docstrings) — extensive docstrings on the new helpers describing the WebUI symptom and pointing at web/src/pages/SessionsPage.tsx; the WebUI side has source-of-truth cross-reference comments back to agent/context_compressor.py. No user-facing docs change required (no new config knobs, env vars, or behaviour the operator needs to know about — the fix is invisible until they look for the previously-disappeared reply).
  • I've updated cli-config.yaml.example if I added/changed config keys — N/A (no new config).
  • I've updated CONTRIBUTING.md or AGENTS.md if I changed architecture or workflows — N/A.
  • I've considered cross-platform impact (Windows, macOS) per the compatibility guide — purely Python list ops + TSX rendering, no platform-specific code paths.
  • I've updated tool descriptions/schemas if I changed tool behavior — N/A.

Screenshots / Logs

$ scripts/run_tests.sh tests/agent/test_compressor_assistant_tail_anchor.py
Discovered 1 test files (21 tests) under [...]; running with -j 24
[100.0% |    21/21 | ✓21 | ✗ 0] ✓ tests/agent/test_compressor_assistant_tail_anchor.py (21✓, 0.4s)
=== Summary: 1 files, 21 tests passed, 0 failed (100% complete) in 0.4s (24 workers) ===

$ scripts/run_tests.sh tests/agent/test_context_compressor.py tests/agent/test_compressor_assistant_tail_anchor.py tests/agent/test_context_compressor_summary_continuity.py tests/agent/test_compressor_image_tokens.py tests/run_agent/test_compression_boundary.py tests/run_agent/test_compression_boundary_hook.py tests/run_agent/test_compression_feasibility.py tests/run_agent/test_compression_persistence.py tests/run_agent/test_413_compression.py tests/cli/test_manual_compress.py
Discovered 10 test files (170 tests) under [...]; running with -j 24
[100.0% |   170/170 | ✓170 | ✗  0] ✓ ... (10 files)
=== Summary: 10 files, 170 tests passed, 0 failed (100% complete) in 9.3s (24 workers) ===

$ ./web/node_modules/.bin/tsc -b web
(exit 0, no diagnostics)

xxxigm added 3 commits May 21, 2026 20:59
…summary + label handoffs in WebUI (NousResearch#29824)

Two-pronged fix for the WebUI "context compaction block in place of
last assistant response" regression.

Agent layer (the real fix). ``_find_tail_cut_by_tokens`` already had
``_ensure_last_user_message_in_tail`` to keep the most recent user
request out of the compressed middle (NousResearch#10896), but no symmetric
anchor for the assistant side. When the conversation has an
oversized recent tool result or a long stretch of tool-call/result
pairs *after* the assistant's last visible reply, the token-budget
walk can stop with the previously-visible reply on the wrong side
of ``cut_idx``. The summariser then rolls it into the single
``[CONTEXT COMPACTION — REFERENCE ONLY]`` block persisted as
``role="user"`` or ``role="assistant"``, and from the operator's
perspective the WebUI session viewer
(``web/src/pages/SessionsPage.tsx``) and the TUI chat panel both
suddenly show the opaque "Context compaction" block in the slot
where they were just reading the actual answer:

    User:  "i cant see the output of the last message you sent,
            i did see it previously, however now see 'context
            compaction'"

Added ``_ensure_last_assistant_message_in_tail`` mirror of the
user-side anchor. It looks for the most recent assistant message
with non-empty text content (skipping tool-call-only assistant
"stubs" which the UI renders as small "calling tool X" indicators
rather than a readable bubble) and walks ``cut_idx`` back through
the standard ``_align_boundary_backward`` so we don't split a
tool_call/result group that immediately precedes it. The two
anchors are chained — each only walks ``cut_idx`` backward, so
the tail can only grow.

Falls back to "most recent assistant of any kind" only when no
content-bearing reply exists in the compressible region (fresh
multi-step tool sequence with no prior reply) — in that case the
agent-side fix is effectively a no-op and the existing
user-message anchor carries the load.

WebUI layer (clarity). Added ``isCompactionMessage`` detector that
recognises the ``[CONTEXT COMPACTION — REFERENCE ONLY]`` (current)
and ``[CONTEXT SUMMARY]:`` (legacy) prefixes from
``agent/context_compressor.py``, and a new ``compaction`` entry
in ``MessageBubble``'s ``ROLE_STYLES`` map. Compaction blocks
now render as muted, italicised system-style rows labelled
``Context handoff`` — clearly metadata, not the assistant's
actual reply — so an operator scrolling back through a long
session can't mistake the summary for a real answer.

Keeping the detected prefixes inline (rather than importing them)
because the WebUI bundle has no Python interop. A guardrail comment
points readers at the source-of-truth constants in
``agent/context_compressor.py``.
…own bubble (NousResearch#29824)

The compressor has a "double-collision" fallback path: when the
chosen ``summary_role`` collides with the first tail message AND
the flipped role would collide with the last head message, it can't
emit a standalone summary turn (consecutive same-role messages
break Anthropic and friends). It instead prepends the summary +
end-of-summary marker to the first tail message's content via
``_merge_summary_into_tail``.
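
The role-selection logic described above can be sketched roughly like this. It is a simplified illustration, not the compressor's code: the way the initial role is chosen, the blank-line separators, the function names, and the assumption that message content is a plain string are all hypothetical. The end-marker text is the one quoted in this PR.

```python
from typing import Any

Message = dict[str, Any]

END_MARKER = (
    "--- END OF CONTEXT SUMMARY — respond to the message below, "
    "not the summary above ---"
)


def flip(role: str) -> str:
    return "assistant" if role == "user" else "user"


def place_summary(head: list[Message], tail: list[Message], summary: str) -> list[Message]:
    """Insert the summary between head and tail without creating
    two consecutive messages with the same role."""
    last_head_role = head[-1]["role"] if head else None
    first_tail_role = tail[0]["role"] if tail else None

    # Assumed default: pick a role that does not collide with the head.
    role = "assistant" if last_head_role == "user" else "user"

    if role == first_tail_role:
        flipped = flip(role)
        if flipped == last_head_role:
            # Double collision: no standalone role works, so prepend the
            # summary and the end marker to the first tail message instead.
            merged = dict(tail[0])  # copy so the caller's message is untouched
            merged["content"] = f"{summary}\n\n{END_MARKER}\n\n{tail[0]['content']}"
            return [*head, merged, *tail[1:]]
        role = flipped

    return [*head, {"role": role, "content": summary}, *tail]
```

With a head ending in a user message and a tail starting with the anchored assistant reply, this takes the merge branch, which is the shape the WebUI then has to split again.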

With the matching anchor from the previous commit, that first tail
message is now usually the user's previously-visible assistant
reply — so the persisted assistant turn ends up shaped as
``[CONTEXT COMPACTION ...] ... --- END OF CONTEXT SUMMARY --- ...
THE ACTUAL REPLY``. Without splitting it, the session viewer
renders one big "Context handoff" bubble and the reply text is
buried inside the metadata blob — which is exactly the
"can't see the last reply" experience NousResearch#29824 reports, just one
layer deeper.

Added ``splitCompactionContent`` that detects the merge marker
(kept in sync with ``--- END OF CONTEXT SUMMARY — respond to the
message below, not the summary above ---`` in
``agent/context_compressor.py``) and ``MessageBubble`` now
recurses on the two halves: the prefix half renders as the muted
"Context handoff" row, the remainder half renders with the
original assistant styling. Pure (non-merged) summary messages
hit the no-remainder branch and still render as a single
"Context handoff" row, preserving the original behaviour.
…paction rollup (NousResearch#29824)

21 cases pinning the new ``_ensure_last_assistant_message_in_tail``
anchor and its interaction with the existing tail-cut path:

* ``TestFindLastAssistantMessageIdx`` — helper contract: prefers a
  content-bearing assistant message, skips ``tool_calls``-only
  stubs, multimodal text-block content counts, falls back to
  "any assistant" when no content-bearing reply exists, honours
  ``head_end``, returns -1 when there's none.

* ``TestEnsureLastAssistantMessageInTail`` — direct: no-op when
  already in the tail, walks ``cut_idx`` back when the reply is
  in the compressed middle, never crosses into the head region,
  re-aligns through a preceding ``tool_call`` / ``tool_result``
  group instead of orphaning it.

* ``TestFindTailCutByTokensAnchorsAssistant`` — integration:
  reporter repro (long tool-output run after the visible reply)
  now preserves the reply; user and assistant anchors compose
  in a single tail-cut call; a soft-ceiling-overrunning oversized
  tool result no longer strands the prior reply.

* ``TestCompactionRollupReproduction`` — end-to-end through
  ``compress()`` with a stubbed ``_generate_summary``: the
  visible reply text survives either as its own standalone
  assistant message (normal path) or concatenated onto the
  merged summary tail (double-collision path the WebUI then
  re-splits). The standalone-summary case is asserted strictly
  (exactly one summary row, exactly one separate assistant
  row carrying the reply) — that's the dominant path and any
  drift there reintroduces the original bug.

* ``TestSourceGuardrail`` — static asserts on
  ``agent/context_compressor.py``: the helper exists, the
  anchor is wired into ``_find_tail_cut_by_tokens`` AFTER the
  user-message anchor (so chaining is monotonic), the
  content-bearing preference is preserved, and the issue
  number is referenced so future bisects can find this fix.
@teknium1

Contributor

Merged via PR #45249 — all three of your commits were cherry-picked onto current main with your authorship preserved in git log (691ff7c, 2fef3e2, 68536d4). Verified your WebUI end-marker constant still byte-matches the compressor's marker after our recent _SUMMARY_END_MARKER hoist. Thanks!

teknium1 closed this Jun 12, 2026

Labels

  • comp/agent: Core agent loop, run_agent.py, prompt builder
  • P2: Medium (degraded but workaround exists)
  • type/bug: Something isn't working

Development

Successfully merging this pull request may close these issues.

WebUI can show context compaction block instead of latest assistant response after compression

3 participants