fix(compaction): preserve assistant boundary replies#90641
Conversation
|
Codex review: needs real behavior proof before merge. Reviewed June 12, 2026, 9:23 AM ET / 13:23 UTC. Summary PR surface: Source +69, Tests +84. Total +153 across 2 files. Reproducibility: yes. at source level. Put a model_change, thinking_level_change, or session_info entry between a summarized assistant and the retained user; the PR scan stops at that marker although SessionManager does not replay it as a conversational message, so the assistant is removed. Review metrics: none identified. Merge readiness Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch. Rank-up moves:
Proof guidance:
Mantis proof suggestion Risk before merge
Maintainer options:
Next step before merge
Security Review findings
Review detailsBest possible solution: Classify intervening entries by canonical replay effect: skip non-context state and metadata entries while treating appendable custom_message and branch_summary content as hard boundaries, then add the focused state-marker regression and prove a real rotated session plus its next model turn. Do we have a high-confidence way to reproduce the issue? Yes at source level. Put a model_change, thinking_level_change, or session_info entry between a summarized assistant and the retained user; the PR scan stops at that marker although SessionManager does not replay it as a conversational message, so the assistant is removed. Is this the best way to solve the issue? No, not yet. Preserving the assistant boundary is the right fix direction, but the implementation must distinguish non-context state markers from appendable custom context rather than treating every non-message entry identically. Full review comments:
Overall correctness: patch is incorrect AGENTS.md: found and applied where relevant. Codex review notes: model internal, reasoning high; reviewed against 0fc5a57a3440. Label changesLabel changes:
Label justifications:
Evidence reviewedPR surface: Source +69, Tests +84. Total +153 across 2 files. View PR surface stats
Acceptance criteria:
What I checked:
Likely related people:
What the crustacean ranks mean
Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics. How this review workflow works
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ba959f1933
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if (candidate?.type !== "message") { | ||
| continue; |
There was a problem hiding this comment.
Stop re-exposing summarized custom messages
When a surviving user message is preceded by a summarized assistant but has a custom_message or branch_summary between them, this loop skips the intervening non-message entry and still marks the assistant as the replay boundary. Since resolveSuccessorCompactionFirstKeptEntryId then moves firstKeptEntryId back to that assistant, buildSessionContext replays every appendable entry from there to the compaction (packages/agent-core/src/harness/session/session.ts:81-95), so summarized custom/branch content that previously stayed hidden behind the old first-kept user is sent to the model again after transcript rotation.
Useful? React with 👍 / 👎.
Summary
summary -> assistant -> userinstead of hiding the replyFixes #76729
Tests
Real behavior proof
summary -> assistant -> userboundary.ba959f19332c40305f241e15ef483c4a8923cc7on macOS 15.5 with Node v22.22.1 and pnpm 11.2.2.node scripts/run-vitest.mjs src/agents/embedded-agent-runner/compaction-successor-transcript.test.ts;git diff --check upstream/main...HEAD.