fix(compressor,webui): keep last visible assistant reply readable after compaction (#29824)#29862
Closed
xxxigm wants to merge 3 commits into
Closed
Conversation
…summary + label handoffs in WebUI (NousResearch#29824) Two-pronged fix for the WebUI "context compaction block in place of last assistant response" regression. Agent layer (the real fix). ``_find_tail_cut_by_tokens`` already had ``_ensure_last_user_message_in_tail`` to keep the most recent user request out of the compressed middle (NousResearch#10896), but no symmetric anchor for the assistant side. When the conversation has an oversized recent tool result or a long stretch of tool-call/result pairs *after* the assistant's last visible reply, the token-budget walk can stop with the previously-visible reply on the wrong side of ``cut_idx``. The summariser then rolls it into the single ``[CONTEXT COMPACTION — REFERENCE ONLY]`` block persisted as ``role="user"`` or ``role="assistant"``, and from the operator's perspective the WebUI session viewer (``web/src/pages/SessionsPage.tsx``) and the TUI chat panel both suddenly show the opaque "Context compaction" block in the slot where they were just reading the actual answer: User: "i cant see the output of the last message you sent, i did see it previously, however now see 'context compaction'" Added ``_ensure_last_assistant_message_in_tail`` mirror of the user-side anchor. It looks for the most recent assistant message with non-empty text content (skipping tool-call-only assistant "stubs" which the UI renders as small "calling tool X" indicators rather than a readable bubble) and walks ``cut_idx`` back through the standard ``_align_boundary_backward`` so we don't split a tool_call/result group that immediately precedes it. The two anchors are chained — each only walks ``cut_idx`` backward, so the tail can only grow. Falls back to "most recent assistant of any kind" only when no content-bearing reply exists in the compressible region (fresh multi-step tool sequence with no prior reply) — in that case the agent-side fix is effectively a no-op and the existing user-message anchor carries the load. WebUI layer (clarity). Added ``isCompactionMessage`` detector that recognises the ``[CONTEXT COMPACTION — REFERENCE ONLY]`` (current) and ``[CONTEXT SUMMARY]:`` (legacy) prefixes from ``agent/context_compressor.py``, and a new ``compaction`` entry in ``MessageBubble``'s ``ROLE_STYLES`` map. Compaction blocks now render as muted, italicised system-style rows labelled ``Context handoff`` — clearly metadata, not the assistant's actual reply — so an operator scrolling back through a long session can't mistake the summary for a real answer. Keeping the detected prefixes inline (rather than importing them) because the WebUI bundle has no Python interop. A guardrail comment points readers at the source-of-truth constants in ``agent/context_compressor.py``.
…own bubble (NousResearch#29824) The compressor has a "double-collision" fallback path: when the chosen ``summary_role`` collides with the first tail message AND the flipped role would collide with the last head message, it can't emit a standalone summary turn (consecutive same-role messages break Anthropic and friends). It instead prepends the summary + end-of-summary marker to the first tail message's content via ``_merge_summary_into_tail``. With the matching anchor from the previous commit, that first tail message is now usually the user's previously-visible assistant reply — so the persisted assistant turn ends up shaped as ``[CONTEXT COMPACTION ...] ... --- END OF CONTEXT SUMMARY --- ... THE ACTUAL REPLY``. Without splitting it, the session viewer renders one big "Context handoff" bubble and the reply text is buried inside the metadata blob — which is exactly the "can't see the last reply" experience NousResearch#29824 reports, just one layer deeper. Added ``splitCompactionContent`` that detects the merge marker (kept in sync with ``--- END OF CONTEXT SUMMARY — respond to the message below, not the summary above ---`` in ``agent/context_compressor.py``) and ``MessageBubble`` now recurses on the two halves: the prefix half renders as the muted "Context handoff" row, the remainder half renders with the original assistant styling. Pure (non-merged) summary messages hit the no-remainder branch and still render as a single "Context handoff" row, preserving the original behaviour.
…paction rollup (NousResearch#29824) 21 cases pinning the new ``_ensure_last_assistant_message_in_tail`` anchor and its interaction with the existing tail-cut path: * ``TestFindLastAssistantMessageIdx`` — helper contract: prefers a content-bearing assistant message, skips ``tool_calls``-only stubs, multimodal text-block content counts, falls back to "any assistant" when no content-bearing reply exists, honours ``head_end``, returns -1 when there's none. * ``TestEnsureLastAssistantMessageInTail`` — direct: no-op when already in the tail, walks ``cut_idx`` back when the reply is in the compressed middle, never crosses into the head region, re-aligns through a preceding ``tool_call`` / ``tool_result`` group instead of orphaning it. * ``TestFindTailCutByTokensAnchorsAssistant`` — integration: reporter repro (long tool-output run after the visible reply) now preserves the reply; user and assistant anchors compose in a single tail-cut call; a soft-ceiling-overrunning oversized tool result no longer strands the prior reply. * ``TestCompactionRollupReproduction`` — end-to-end through ``compress()`` with a stubbed ``_generate_summary``: the visible reply text survives either as its own standalone assistant message (normal path) or concatenated onto the merged summary tail (double-collision path the WebUI then re-splits). The standalone-summary case is asserted strictly (exactly one summary row, exactly one separate assistant row carrying the reply) — that's the dominant path and any drift there reintroduces the original bug. * ``TestSourceGuardrail`` — static asserts on ``agent/context_compressor.py``: the helper exists, the anchor is wired into ``_find_tail_cut_by_tokens`` AFTER the user-message anchor (so chaining is monotonic), the content-bearing preference is preserved, and the issue number is referenced so future bisects can find this fix.
Contributor
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What does this PR do?
Fix for #29824 — after context compaction the WebUI session viewer (and the TUI chat panel) can show the
[CONTEXT COMPACTION — REFERENCE ONLY]handoff block in the slot where the user had just been reading the assistant's actual reply. The reporter put it best:The underlying work isn't lost — the assistant's reply is rolled into the compaction summary written into the session DB, so when the operator scrolls back they see opaque metadata where their answer used to be.
Two-pronged fix:
Agent layer (the real fix —
agent/context_compressor.py)._find_tail_cut_by_tokensalready had_ensure_last_user_message_in_tailanchoring the most recent user request out of the compressed middle (#10896), but no symmetric anchor on the assistant side. When the conversation has an oversized recent tool result or a long stretch oftool_call/tool_resultpairs after the assistant's last visible reply, the token-budget walk can stop with the previously-visible reply on the wrong side ofcut_idxand the summariser rolls it into the single handoff blob.Added
_ensure_last_assistant_message_in_tailmirror that walkscut_idxback to include the most recent assistant message with non-empty text content — deliberately skippingtool_calls-only stubs, which the UI renders as small "calling tool X" indicators rather than a readable bubble, so they aren't what the reporter means by "the last message you sent". Re-runs the standard_align_boundary_backwardso we don't split a preceding tool group. The two anchors are chained — each only walkscut_idxbackward, so the tail can only grow.WebUI layer (clarity + double-collision handling —
web/src/pages/SessionsPage.tsx).Added
isCompactionMessage/splitCompactionContentdetection of the[CONTEXT COMPACTION — REFERENCE ONLY](current) and[CONTEXT SUMMARY]:(legacy) prefixes, plus a newcompactionentry inMessageBubble'sROLE_STYLESmap. Standalone summary messages now render as muted, italicisedContext handoffrows instead of pretending to be regular assistant replies.The compressor's double-collision fallback
_merge_summary_into_tailprepends the summary +--- END OF CONTEXT SUMMARY — respond to the message below … ---marker onto the first tail message. With the assistant anchor that tail message is now usually the visible reply, so the persisted assistant turn becomes[SUMMARY] … END MARKER … <real reply>. Without splitting it, the viewer renders one big "Context handoff" bubble with the reply buried inside the metadata — same bug, one layer deeper.MessageBubblenow detects the marker viasplitCompactionContentand recurses on the two halves, so the summary renders as the muted handoff row and the actual reply gets its normal styling.Related Issue
Fixes #29824.
Type of Change
Changes Made
agent/context_compressor.py— added_find_last_assistant_message_idxand_ensure_last_assistant_message_in_tail(mirror of the existing user-side anchor; prefers content-bearing assistant messages sotool_calls-only stubs don't shadow the real reply). Wired the new anchor into_find_tail_cut_by_tokensimmediately after the user anchor — chained, monotonic.web/src/pages/SessionsPage.tsx— newCOMPACTION_PREFIXES/COMPACTION_END_MARKERconstants (cross-referenced to the Python source-of-truth),splitCompactionContenthelper,compactionentry inROLE_STYLES, and a recursive split inMessageBubbleso merged-into-tail compactions render as two visually distinct rows.tests/agent/test_compressor_assistant_tail_anchor.py(new, 518 lines) — 21 regression cases across five classes covering the helper contract, the anchor function, integration with the existing tail-cut path, the end-to-endcompress()rollup repro, and a source guardrail that fails the build if the anchor is silently removed or re-ordered.How to Test
.venvis set up:python3 -m venv .venv && source .venv/bin/activate && pip install -e ".[all,dev]".scripts/run_tests.sh tests/agent/test_compressor_assistant_tail_anchor.pyExpected: 21 passed.
scripts/run_tests.sh tests/agent/test_context_compressor.py tests/agent/test_compressor_assistant_tail_anchor.py tests/agent/test_context_compressor_summary_continuity.py tests/agent/test_compressor_image_tokens.py tests/run_agent/test_compression_boundary.py tests/run_agent/test_compression_boundary_hook.py tests/run_agent/test_compression_feasibility.py tests/run_agent/test_compression_persistence.py tests/run_agent/test_413_compression.py tests/cli/test_manual_compress.pyExpected: 170 passed.
(cd web && npm install --no-audit --no-fund && npx tsc -b)Expected: exits 0 with no diagnostics.
Checklist
Code
fix(compressor): …,fix(webui): …,test(compressor): …)scripts/run_tests.shon the affected suites and all tests passtests/agent/test_compressor_assistant_tail_anchor.py, including a source-guardrail class so a refactor can't silently drop the anchor)Documentation & Housekeeping
docs/, docstrings) — extensive docstrings on the new helpers describing the WebUI symptom and pointing atweb/src/pages/SessionsPage.tsx; the WebUI side has source-of-truth cross-reference comments back toagent/context_compressor.py. No user-facing docs change required (no new config knobs, env vars, or behaviour the operator needs to know about — the fix is invisible until they look for the previously-disappeared reply).cli-config.yaml.exampleif I added/changed config keys — N/A (no new config).CONTRIBUTING.mdorAGENTS.mdif I changed architecture or workflows — N/A.Screenshots / Logs