Summary
During long-running sessions with many tool calls, Context Compaction causes historical messages to be erroneously re-injected as new user input, so the agent suddenly executes stale or irrelevant instructions. In the most recent case, 267 turns were removed to free context space. This has been confirmed to occur 3 times.
Environment
- Platform: Weixin (Home Channel)
- Model: kimi-k2.6:cloud
- Hermes Version: v0.7.0+
Expected Behavior
When context compaction occurs, old messages should remain in history. They must never be surfaced as "new user input" to the agent.
Actual Behavior
After compaction, a historical message (e.g. a MiniMax URL from a previous session) appears in the agent's input stream as a fresh user message. The agent then executes it, derailing the current task.
Reproduction Pattern
- Start a long-running session with many tool calls
- Eventually trigger Context Compaction:
Summary generation was unavailable. N conversation turns were removed...
- Agent suddenly receives a phantom message that was part of the removed/compacted history
- Agent executes the phantom message, producing irrelevant output
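To correlate phantom messages with compaction events, it may help to scan session logs for the failure line quoted above. The following is a minimal sketch, not part of Hermes: the function name `find_failed_compactions` is hypothetical, and the pattern assumes the log wording matches the line seen in this report.

```python
import re

# Pattern taken from the log line quoted in this report. Whether other
# Hermes versions use the same wording is an assumption.
COMPACTION_FAILURE = re.compile(
    r"Summary generation was unavailable\. (\d+) conversation turns were removed"
)


def find_failed_compactions(log_lines):
    """Return (line_index, removed_turns) for each failed-summary compaction."""
    hits = []
    for index, line in enumerate(log_lines):
        match = COMPACTION_FAILURE.search(line)
        if match:
            hits.append((index, int(match.group(1))))
    return hits
```

Any user message the agent acts on shortly after a reported index is a candidate phantom message worth checking against what the user actually sent.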
Specific Incident (3rd occurrence, 2026-04-23)
- Current task: Searching for cattle behavioral datasets on Zenodo
- Phantom message injected:
https://platform.minimaxi.com/user-center/payment/token-plan
- User confirmed: This URL was never sent by the user during this session
- Agent impact: Agent navigated to the URL and also attempted to execute an unrelated historical task (the Zenodo supplemental search from a prior session), resulting in a mixed/confused response
- Compaction log:
Summary generation was unavailable. 267 conversation turns were removed to free context space but could not be summarized.
Root Cause Hypothesis
Context Compaction's state reconstruction phase has a bug in message deduplication / rehydration. When summary generation fails (Summary generation was unavailable) and old turns are discarded, the compaction logic may be incorrectly promoting residual message units from the removed turns into the active input stream.
Impact
- 🔴 High severity: Agent executes phantom commands, wasting tokens and derailing user tasks
- Breaks user trust (now on the 3rd occurrence)
- Could lead to harmful actions if the phantom message were an execute command
Suggested Fixes
- Add source validation during compaction rehydration: verify a message's origin before allowing it into the active input stream
- When summary generation fails, fall back to safe mode (discard unrecognizable units rather than guess)
- Physically isolate user-input queue from historical context storage so compaction can never bridge them
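The three suggestions above can be sketched together as follows. This is an illustration under stated assumptions, not Hermes's actual code: `Origin`, `Message`, `InputQueue`, and `rehydrate` are hypothetical names, and the sketch assumes each message unit can carry an origin tag.

```python
from collections import deque
from dataclasses import dataclass
from enum import Enum


class Origin(Enum):
    LIVE_USER = "live_user"  # arrived from the channel during the current turn
    HISTORY = "history"      # restored from stored or compacted context
    UNKNOWN = "unknown"      # origin could not be determined


@dataclass(frozen=True)
class Message:
    text: str
    origin: Origin


class InputQueue:
    """Holds only live user input; rejects anything else at the boundary."""

    def __init__(self):
        self._pending = deque()

    def push(self, message):
        # Source validation: only live user input may enter the queue.
        if message.origin is not Origin.LIVE_USER:
            raise ValueError(f"refusing non-live message: {message.origin.value}")
        self._pending.append(message)

    def pop(self):
        return self._pending.popleft() if self._pending else None


def rehydrate(units, summary_available):
    """Rebuild history after compaction; the result is never queue input."""
    history = []
    for unit in units:
        if unit.origin is Origin.UNKNOWN and not summary_available:
            continue  # safe mode: drop unrecognizable units rather than guess
        # Everything that survives compaction is re-tagged as history.
        history.append(Message(unit.text, Origin.HISTORY))
    return history
```

Because `rehydrate` re-tags every surviving unit as `HISTORY` and `InputQueue.push` accepts only `LIVE_USER`, a compacted message cannot reach the active input stream even if a later code path tries to enqueue it.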
Additional Context
- Previously flagged in agent memory as a recurring issue: "The user criticized me for repeated context confusion ('this is not the first time')"
- Two prior instances occurred under similar long-session conditions (details to be added if reproducible)