fix(compressor,webui): keep last visible assistant reply readable after compaction (salvages #29862) #45249

Merged
teknium1 merged 3 commits into main from fix/compaction-tail-assistant-anchor
Jun 12, 2026
Conversation

@teknium1

Summary

The most recent user-visible assistant reply is now anchored into the protected tail during compaction, so it can no longer be rolled into the [CONTEXT COMPACTION — REFERENCE ONLY] block — and the WebUI renders compaction handoffs as muted "Context handoff" rows instead of impersonating real messages (#29824).

Closes #29824.

Changes

  • agent/context_compressor.py: _ensure_last_assistant_message_in_tail — mirror of the existing last-user-message anchor, applied after it (anchors are monotonic, tail only grows); skips tool-call-only assistant messages, re-aligns tool groups so no orphaned tool rows
  • web/src/pages/SessionsPage.tsx: compaction handoff blocks detected by prefix and rendered as muted/labelled rows; merge-into-tail content split on the END marker so the original reply renders as its own bubble
  • tests/agent/test_compressor_assistant_tail_anchor.py: 19 regression tests (518 lines)

All three commits by @xxxigm (#29862), cherry-picked clean — no conflicts with the just-merged #45153 marker hoist; verified the TSX COMPACTION_END_MARKER byte-matches _SUMMARY_END_MARKER.

Validation

| | Before | After |
| --- | --- | --- |
| Long tool stretch after last reply | reply compressed into handoff block, WebUI shows "Context compaction" where reply was | reply anchored in tail, still visible |
| WebUI handoff rows | styled as real user/assistant bubbles | muted "Context handoff" row, merged tail split into its own bubble |
| anchor + compressor + prefix + continuity suites | | 152 passed* |
| web build (tsc -b && vite build) | | clean |

*4 pre-existing SSL-socket env flakes in test_context_compressor.py reproduce identically on clean origin/main (verified in a throwaway worktree); not introduced here.

Attribution

Salvages #29862 by @xxxigm — three commits cherry-picked with authorship preserved; rebase-merge to keep per-commit credit.

Infographic

assistant-tail-anchor

xxxigm added 3 commits June 12, 2026 15:07
…summary + label handoffs in WebUI (#29824)

Two-pronged fix for the WebUI "context compaction block in place of
last assistant response" regression.

Agent layer (the real fix). ``_find_tail_cut_by_tokens`` already had
``_ensure_last_user_message_in_tail`` to keep the most recent user
request out of the compressed middle (#10896), but no symmetric
anchor for the assistant side. When the conversation has an
oversized recent tool result or a long stretch of tool-call/result
pairs *after* the assistant's last visible reply, the token-budget
walk can stop with the previously-visible reply on the wrong side
of ``cut_idx``. The summariser then rolls it into the single
``[CONTEXT COMPACTION — REFERENCE ONLY]`` block persisted as
``role="user"`` or ``role="assistant"``, and from the operator's
perspective the WebUI session viewer
(``web/src/pages/SessionsPage.tsx``) and the TUI chat panel both
suddenly show the opaque "Context compaction" block in the slot
where they were just reading the actual answer:

    User:  "i cant see the output of the last message you sent,
            i did see it previously, however now see 'context
            compaction'"

Added ``_ensure_last_assistant_message_in_tail`` as a mirror of the
user-side anchor. It looks for the most recent assistant message
with non-empty text content (skipping tool-call-only assistant
"stubs" which the UI renders as small "calling tool X" indicators
rather than a readable bubble) and walks ``cut_idx`` back through
the standard ``_align_boundary_backward`` so we don't split a
tool_call/result group that immediately precedes it. The two
anchors are chained — each only walks ``cut_idx`` backward, so
the tail can only grow.
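
A minimal sketch of the anchor described above, not the code from ``agent/context_compressor.py``. The function names, the message shape (dicts with ``role``, ``content``, ``tool_calls``) and the convention that ``messages[:head_end]`` is the protected head are assumptions made for illustration; ``align_boundary_backward`` is a simplified stand-in for the real ``_align_boundary_backward``.

```python
from typing import Any

Message = dict[str, Any]


def find_last_visible_assistant(messages: list[Message], head_end: int) -> int:
    """Index of the newest assistant message with non-empty text, or -1."""
    for idx in range(len(messages) - 1, head_end - 1, -1):
        msg = messages[idx]
        content = msg.get("content")
        if msg.get("role") == "assistant" and isinstance(content, str) and content.strip():
            return idx
    return -1


def align_boundary_backward(messages: list[Message], idx: int) -> int:
    """Simplified stand-in: if tool results sit right before idx, move idx back
    to the assistant message that issued the calls so the group stays whole."""
    check = idx - 1
    while check >= 0 and messages[check].get("role") == "tool":
        check -= 1
    if (
        0 <= check < idx - 1
        and messages[check].get("role") == "assistant"
        and messages[check].get("tool_calls")
    ):
        return check
    return idx


def ensure_last_assistant_in_tail(
    messages: list[Message], cut_idx: int, head_end: int
) -> int:
    """Return a cut index that keeps the last readable reply in messages[cut_idx:].

    The result is never larger than cut_idx, so the tail can only grow.
    """
    last_idx = find_last_visible_assistant(messages, head_end)
    if last_idx < 0 or last_idx >= cut_idx:
        return cut_idx  # nothing to anchor, or the reply is already in the tail
    new_cut = align_boundary_backward(messages, last_idx)
    # Assumption: clamp so the cut never moves into the protected head.
    return max(new_cut, head_end)
```

Because the function only ever returns ``cut_idx`` or something smaller, it can be chained after a user-message anchor with the same property without either one undoing the other.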

Falls back to "most recent assistant of any kind" only when no
content-bearing reply exists in the compressible region (fresh
multi-step tool sequence with no prior reply) — in that case the
agent-side fix is effectively a no-op and the existing
user-message anchor carries the load.
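
The lookup with its fallback could be sketched as follows. This is an illustration under assumptions, not the shipped helper: the names ``has_visible_text`` and ``find_last_assistant_idx`` are hypothetical, and multimodal content is assumed to be a list of blocks where ``{"type": "text", "text": ...}`` carries readable text.

```python
from typing import Any

Message = dict[str, Any]


def has_visible_text(msg: Message) -> bool:
    """True when the message carries text a UI could show as a readable bubble."""
    content = msg.get("content")
    if isinstance(content, str):
        return bool(content.strip())
    if isinstance(content, list):
        # Assumed multimodal shape: a list of typed blocks.
        return any(
            isinstance(block, dict)
            and block.get("type") == "text"
            and bool(str(block.get("text") or "").strip())
            for block in content
        )
    return False


def find_last_assistant_idx(messages: list[Message], head_end: int) -> int:
    """Prefer the newest content-bearing assistant message at or after head_end.

    Falls back to the newest assistant message of any kind (for example a
    tool-call-only stub), and returns -1 when there is no assistant message.
    """
    fallback = -1
    for idx in range(len(messages) - 1, head_end - 1, -1):
        msg = messages[idx]
        if msg.get("role") != "assistant":
            continue
        if has_visible_text(msg):
            return idx
        if fallback < 0:
            fallback = idx  # remember the newest stub in case nothing better exists
    return fallback
```

In the fallback case the returned index points at a stub that is typically already close to the end of the conversation, which is consistent with the anchor being effectively a no-op there.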

WebUI layer (clarity). Added ``isCompactionMessage`` detector that
recognises the ``[CONTEXT COMPACTION — REFERENCE ONLY]`` (current)
and ``[CONTEXT SUMMARY]:`` (legacy) prefixes from
``agent/context_compressor.py``, and a new ``compaction`` entry
in ``MessageBubble``'s ``ROLE_STYLES`` map. Compaction blocks
now render as muted, italicised system-style rows labelled
``Context handoff`` — clearly metadata, not the assistant's
actual reply — so an operator scrolling back through a long
session can't mistake the summary for a real answer.

Keeping the detected prefixes inline (rather than importing them)
because the WebUI bundle has no Python interop. A guardrail comment
points readers at the source-of-truth constants in
``agent/context_compressor.py``.
…own bubble (#29824)

The compressor has a "double-collision" fallback path: when the
chosen ``summary_role`` collides with the first tail message AND
the flipped role would collide with the last head message, it can't
emit a standalone summary turn (consecutive same-role messages
break Anthropic and friends). It instead prepends the summary +
end-of-summary marker to the first tail message's content via
``_merge_summary_into_tail``.
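
The role choice and the merge fallback can be illustrated roughly like this. It is a sketch under assumptions: ``place_summary`` is a hypothetical name, the rule for picking the initial role is assumed, message content is treated as a plain string, and the exact separators around the marker are a guess. Only the marker text itself is taken from this PR.

```python
from typing import Any

Message = dict[str, Any]

# Marker text as quoted in this PR.
SUMMARY_END_MARKER = (
    "--- END OF CONTEXT SUMMARY — respond to the message below, "
    "not the summary above ---"
)


def place_summary(head: list[Message], tail: list[Message], summary: str) -> list[Message]:
    """Insert the summary between head and tail without consecutive same-role turns."""
    last_head_role = head[-1]["role"] if head else None
    first_tail_role = tail[0]["role"] if tail else None

    # Assumed rule: start with a role that does not repeat the end of the head.
    summary_role = "user" if last_head_role in ("assistant", "tool") else "assistant"

    if summary_role == first_tail_role:
        flipped = "assistant" if summary_role == "user" else "user"
        if flipped == last_head_role:
            # Double collision: neither role fits, so fold the summary and the
            # end marker into the first tail message instead.
            original = tail[0].get("content") or ""
            merged = dict(tail[0])
            merged["content"] = f"{summary}\n\n{SUMMARY_END_MARKER}\n\n{original}"
            return [*head, merged, *tail[1:]]
        summary_role = flipped

    return [*head, {"role": summary_role, "content": summary}, *tail]
```

With a head ending in a user message and a tail starting with an assistant reply, both candidate roles collide, so the reply ends up carrying the summary as a prefix.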

With the matching anchor from the previous commit, that first tail
message is now usually the user's previously-visible assistant
reply — so the persisted assistant turn ends up shaped as
``[CONTEXT COMPACTION ...] ... --- END OF CONTEXT SUMMARY --- ...
THE ACTUAL REPLY``. Without splitting it, the session viewer
renders one big "Context handoff" bubble and the reply text is
buried inside the metadata blob — which is exactly the
"can't see the last reply" experience #29824 reports, just one
layer deeper.

Added ``splitCompactionContent`` that detects the merge marker
(kept in sync with ``--- END OF CONTEXT SUMMARY — respond to the
message below, not the summary above ---`` in
``agent/context_compressor.py``) and ``MessageBubble`` now
recurses on the two halves: the prefix half renders as the muted
"Context handoff" row, the remainder half renders with the
original assistant styling. Pure (non-merged) summary messages
hit the no-remainder branch and still render as a single
"Context handoff" row, preserving the original behaviour.
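
The detect-and-split logic is sketched here in Python purely for illustration; the real helpers live in ``web/src/pages/SessionsPage.tsx`` and are TypeScript. The function names below are hypothetical, and the choice to drop the marker from both halves and trim whitespace is an assumption. The prefix and marker strings are the ones quoted in this PR.

```python
# Prefixes and end marker as quoted in this PR.
COMPACTION_PREFIXES = (
    "[CONTEXT COMPACTION — REFERENCE ONLY]",
    "[CONTEXT SUMMARY]:",
)
END_MARKER = (
    "--- END OF CONTEXT SUMMARY — respond to the message below, "
    "not the summary above ---"
)


def is_compaction_message(content: str) -> bool:
    """True when the content starts with a current or legacy compaction prefix."""
    return content.lstrip().startswith(COMPACTION_PREFIXES)


def split_compaction_content(content: str) -> tuple[str, str]:
    """Return (handoff, remainder).

    Ordinary messages come back as ("", content). A pure summary comes back
    as (content, ""). A merged message is split at the end marker so the
    original reply can be rendered on its own.
    """
    if not is_compaction_message(content):
        return "", content
    handoff, marker, remainder = content.partition(END_MARKER)
    if not marker:
        return content, ""  # no-remainder branch: a standalone summary
    return handoff.rstrip(), remainder.strip()
```

A renderer built on this would show the first element as the muted handoff row and, when the second element is non-empty, render it with the message's normal role styling.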
…paction rollup (#29824)

21 cases pinning the new ``_ensure_last_assistant_message_in_tail``
anchor and its interaction with the existing tail-cut path:

* ``TestFindLastAssistantMessageIdx`` — helper contract: prefers a
  content-bearing assistant message, skips ``tool_calls``-only
  stubs, multimodal text-block content counts, falls back to
  "any assistant" when no content-bearing reply exists, honours
  ``head_end``, returns -1 when there's none.

* ``TestEnsureLastAssistantMessageInTail`` — direct: no-op when
  already in the tail, walks ``cut_idx`` back when the reply is
  in the compressed middle, never crosses into the head region,
  re-aligns through a preceding ``tool_call`` / ``tool_result``
  group instead of orphaning it.

* ``TestFindTailCutByTokensAnchorsAssistant`` — integration:
  reporter repro (long tool-output run after the visible reply)
  now preserves the reply; user and assistant anchors compose
  in a single tail-cut call; a soft-ceiling-overrunning oversized
  tool result no longer strands the prior reply.

* ``TestCompactionRollupReproduction`` — end-to-end through
  ``compress()`` with a stubbed ``_generate_summary``: the
  visible reply text survives either as its own standalone
  assistant message (normal path) or concatenated onto the
  merged summary tail (double-collision path the WebUI then
  re-splits). The standalone-summary case is asserted strictly
  (exactly one summary row, exactly one separate assistant
  row carrying the reply) — that's the dominant path and any
  drift there reintroduces the original bug.

* ``TestSourceGuardrail`` — static asserts on
  ``agent/context_compressor.py``: the helper exists, the
  anchor is wired into ``_find_tail_cut_by_tokens`` AFTER the
  user-message anchor (so chaining is monotonic), the
  content-bearing preference is preserved, and the issue
  number is referenced so future bisects can find this fix.
@github-actions

🔎 Lint report: fix/compaction-tail-assistant-anchor vs origin/main

ruff

Total: 0 on HEAD, 0 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

ty (type checker)

Total: 10881 on HEAD, 10878 on base (🆕 +3)

🆕 New issues (3):

| Rule | Count |
| --- | --- |
| unresolved-attribute | 2 |
| unresolved-import | 1 |

First entries
tests/agent/test_compressor_assistant_tail_anchor.py:48: [unresolved-import] unresolved-import: Cannot resolve imported module `pytest`
tests/run_agent/test_credits_notices_toggle.py:76: [unresolved-attribute] unresolved-attribute: Unresolved attribute `_credits_session_start_micros` on type `AIAgent`
run_agent.py:2886: [unresolved-attribute] unresolved-attribute: Object of type `Self@get_credits_spent_micros` has no attribute `_credits_session_start_micros`

✅ Fixed issues (1):

| Rule | Count |
| --- | --- |
| invalid-assignment | 1 |

First entries
tests/run_agent/test_credits_notices_toggle.py:76: [invalid-assignment] invalid-assignment: Object of type `None` is not assignable to attribute `_credits_session_start_micros` of type `int`

Unchanged: 5702 pre-existing issues carried over.

Diagnostics are surfaced as warnings — this check never fails the build.

@teknium1 merged commit 68536d4 into main Jun 12, 2026
28 checks passed
@teknium1 deleted the fix/compaction-tail-assistant-anchor branch June 12, 2026 22:42
@alt-glitch added the type/bug, comp/agent, and P1 labels Jun 12, 2026
@alt-glitch

Related: salvages #29862 by @xxxigm (three commits cherry-picked, authorship preserved). Closes #29824. Builds on the just-merged #45153 marker hoist.

Labels

comp/agent Core agent loop, run_agent.py, prompt builder P1 High — major feature broken, no workaround type/bug Something isn't working

Development

Successfully merging this pull request may close these issues.

WebUI can show context compaction block instead of latest assistant response after compression