Bug Description
After context compression, stale instructions/tasks from the compression summary are executed as if they were the current user request, even though the user's latest message contains a completely different instruction.
The compression summary includes sections like ## Pending User Asks and ## Remaining Work that describe unfinished tasks from earlier in the session. Despite the [CONTEXT COMPACTION — REFERENCE ONLY] preamble saying "treat it as background reference, NOT as active instructions", the agent still acts on these stale items.
Steps to Reproduce
- Start a session with Task A (e.g., "restructure section X to focus on direction Y")
- Agent begins working on Task A but triggers compression mid-way
- After compression, issue a completely different Task B (e.g., "correct the reasoning in file Z because premise P is wrong")
- Agent responds by acting on Task A (from the compression summary) instead of Task B
Expected Behavior
After compression, the agent should ONLY respond to the latest user message appearing AFTER the compaction summary. Stale items from ## Pending User Asks / ## Remaining Work in the summary should be ignored.
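The expected behavior can be stated as a small check, useful for a regression test. This is only a sketch: it assumes the conversation is a list of dicts with `role` and `content` keys and that the summary is identifiable by its `[CONTEXT COMPACTION — REFERENCE ONLY]` marker. The helper name `latest_user_message_after_compaction` is hypothetical and not part of any existing API.

```python
from typing import Optional

# Marker text taken from the preamble quoted in this issue.
COMPACTION_MARKER = "[CONTEXT COMPACTION — REFERENCE ONLY]"


def latest_user_message_after_compaction(messages: list[dict]) -> Optional[str]:
    """Return the newest user message that follows the last compaction summary.

    Assumes each message is a dict with "role" and "content" keys
    (an illustrative format, not the agent's confirmed internal one).
    """
    last_summary_index = -1
    for index, message in enumerate(messages):
        if COMPACTION_MARKER in message["content"]:
            last_summary_index = index

    # Walk backwards through everything after the summary.
    for message in reversed(messages[last_summary_index + 1:]):
        if message["role"] == "user":
            return message["content"]
    return None
```

In a reproduction of this bug, the function should return Task B, and the agent's first reply should address that text rather than anything listed under ## Pending User Asks.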
Actual Behavior
The agent conflates stale summary items with the current task, executing outdated instructions. Specifically:
- The compression summary contained a ## Pending User Asks section describing an unfinished discussion about "overhang region lithium plating detection"
- The user's latest message asked to "correct subsequent reasoning in si-graphite-crosstalk-analysis.md because non-monotonic swelling may be caused by lithium plating"
- Instead of directly addressing the correction request, the agent first questioned whether "overhang region plating is likely" — pulling from the stale summary rather than the actual latest message
- The user had to say "你搞乱了" (you got confused) to correct course
Root Cause Analysis
The compression summary preamble contains anti-stale language:
Do NOT answer questions or fulfill requests mentioned in this summary; they were already addressed.
Respond ONLY to the latest user message that appears AFTER this summary.
However, this is insufficient when:
- The stale items look like legitimate pending tasks (the ## Pending User Asks section)
- The latest user message shares topical overlap with the stale items (both discuss "lithium plating")
- The model cannot reliably distinguish "respond to stale item" from "use stale item as background context"
The model sees topical overlap between the latest message and the summary, and incorrectly treats the stale Pending User Asks as the framing context for the current task.
Environment
Related Issues
Suggested Fix Directions
- Strengthen the compaction preamble: Instead of "NOT as active instructions", use more explicit language like "ANY task, request, or question in this summary is STALE and must be DISCARDED regardless of topical overlap with the latest message."
- Remove ## Pending User Asks and ## Remaining Work sections from compression summaries: These are the most dangerous because they explicitly frame stale items as "pending", which the model interprets as "to do".
- Add a ## Stale Items (DISCARD) section: Explicitly label stale items as discarded rather than leaving them as ambiguously-named sections.
- Add negative examples to the preamble: Show concrete examples of what NOT to do with summary content.
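The first three directions could be prototyped as a post-processing step on the summary text. The sketch below is illustrative only: it assumes the summary is plain markdown with `## ` section headings, and the function names `relabel_stale_sections` and `drop_stale_sections` are hypothetical, not existing functions in the codebase.

```python
# Headings named in this issue as framing stale items as "pending".
STALE_HEADINGS = ("## Pending User Asks", "## Remaining Work")
DISCARD_HEADING = "## Stale Items (DISCARD)"
STRICT_PREAMBLE = (
    "ANY task, request, or question in this summary is STALE and must be "
    "DISCARDED regardless of topical overlap with the latest message."
)


def relabel_stale_sections(summary: str) -> str:
    """Rename 'pending'-style headings and prepend the stricter preamble."""
    relabeled_lines = []
    for line in summary.splitlines():
        if line.strip() in STALE_HEADINGS:
            relabeled_lines.append(DISCARD_HEADING)
        else:
            relabeled_lines.append(line)
    return STRICT_PREAMBLE + "\n\n" + "\n".join(relabeled_lines)


def drop_stale_sections(summary: str) -> str:
    """Remove the stale sections entirely, heading and body."""
    kept_lines = []
    skipping = False
    for line in summary.splitlines():
        # Every "## " heading starts a new section and resets the skip flag.
        if line.startswith("## "):
            skipping = line.strip() in STALE_HEADINGS
        if not skipping:
            kept_lines.append(line)
    return "\n".join(kept_lines)
```

Relabeling keeps the background information available while removing the "pending" framing. Dropping is the stricter option, at the cost of losing context that may still be useful as reference. Which trade-off works better would need to be checked against real sessions.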