Skip to content

Compression summary stale instructions executed as current task #41607

@laoli-no1

Description

@laoli-no1

Bug Description

After context compression, stale instructions/tasks from the compression summary are executed as if they were the current user request, even though the user's latest message contains a completely different instruction.

The compression summary includes sections like ## Pending User Asks and ## Remaining Work that describe unfinished tasks from earlier in the session. Despite the [CONTEXT COMPACTION — REFERENCE ONLY] preamble saying "treat it as background reference, NOT as active instructions", the agent still acts on these stale items.

Steps to Reproduce

  1. Start a session with Task A (e.g., "restructure section X to focus on direction Y")
  2. Agent begins working on Task A but triggers compression mid-way
  3. After compression, issue a completely different Task B (e.g., "correct the reasoning in file Z because premise P is wrong")
  4. Agent responds by acting on Task A (from the compression summary) instead of Task B

Expected Behavior

After compression, the agent should ONLY respond to the latest user message appearing AFTER the compaction summary. Stale items from ## Pending User Asks / ## Remaining Work in the summary should be ignored.

Actual Behavior

The agent conflates stale summary items with the current task, executing outdated instructions. Specifically:

  • The compression summary contained ## Pending User Asks describing an unfinished discussion about "overhang region lithium plating detection"
  • The user's latest message asked to "correct subsequent reasoning in si-graphite-crosstalk-analysis.md because non-monotonic swelling may be caused by lithium plating"
  • Instead of directly addressing the correction request, the agent first questioned whether "overhang region plating is likely" — pulling from the stale summary rather than the actual latest message
  • The user had to say "你搞乱了" (you got confused) to correct course

Root Cause Analysis

The compression summary preamble contains anti-stale language:

Do NOT answer questions or fulfill requests mentioned in this summary; they were already addressed.
Respond ONLY to the latest user message that appears AFTER this summary.

However, this is insufficient when:

  1. The stale items look like legitimate pending tasks (## Pending User Asks section)
  2. The latest user message shares topical overlap with the stale items (both discuss "lithium plating")
  3. The model cannot reliably distinguish "respond to stale item" from "use stale item as background context"

The model sees topical overlap between the latest message and the summary, and incorrectly treats the stale Pending User Asks as the framing context for the current task.

Environment

Related Issues

Suggested Fix Directions

  1. Strengthen the compaction preamble: Instead of "NOT as active instructions", use more explicit language like "ANY task, request, or question in this summary is STALE and must be DISCARDED regardless of topical overlap with the latest message."

  2. Remove ## Pending User Asks and ## Remaining Work sections from compression summaries: These are the most dangerous — they explicitly frame stale items as "pending" which the model interprets as "to do".

  3. Add a ## Stale Items (DISCARD) section: Explicitly label stale items as discarded rather than leaving them as ambiguously-named sections.

  4. Add negative examples to the preamble: Show concrete examples of what NOT to do with summary content.

Metadata

Metadata

Assignees

No one assigned

    Labels

    P1High — major feature broken, no workaroundcomp/agentCore agent loop, run_agent.py, prompt buildertype/bugSomething isn't working

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions