Skip to content

Background review notification includes stale tool results from conversation history #14944

@pixelated-salt

Description

@pixelated-salt

Bug Description

The background memory/skill review can repeatedly surface old successful tool actions as if they had just happened.

For example, after creating a one-shot cron reminder, later unrelated background review notifications may include the old cron creation success message again:

💾 Cron job '<reminder name>' created. · User profile updated

This can happen multiple times even though the cron job was not recreated.

Expected Behavior

Background memory/skill review notifications should only summarize actions performed by the background review agent itself.

For example:

💾 User profile updated

They should not include successful tool results that already existed in the conversation history before the background review started.

Actual Behavior

The background review appears to scan successful tool messages from the review agent’s full _session_messages, including tool messages copied from the prior conversation history.

As a result, older tool results such as cron job creation can be included again in a new background notification summary.

Diagnosis

The relevant logic appears to be in run_agent.py, around _spawn_background_review():

review_agent.run_conversation(
    user_message=prompt,
    conversation_history=messages_snapshot,
)

# Scan the review agent's messages for successful tool actions
# and surface a compact summary to the user.
actions = []
for msg in getattr(review_agent, "_session_messages", []):
    if not isinstance(msg, dict) or msg.get("role") != "tool":
        continue
    ...
    message = data.get("message", "")
    ...
    if "created" in message.lower():
        actions.append(message)

Since the review agent is initialized with:

conversation_history=messages_snapshot

its _session_messages may include old tool messages from the main conversation. The summarizer then treats those historical tool results as newly performed actions.

Steps to Reproduce

  1. In a gateway session, perform a tool action that returns a successful message containing "created".

    Example:

    Create a one-shot cron reminder.
    
  2. Continue chatting until background memory/profile review triggers.

  3. Trigger or allow a user profile or memory update.

  4. Observe the background review notification.

Observed Result

The notification may include a stale tool result:

💾 Cron job '<reminder name>' created. · User profile updated

even though the cron job was created earlier and was not recreated.

Suggested Fix

When summarizing actions from the background review agent, ignore tool messages that were already present in messages_snapshot.

Possible approaches:

  1. Record existing tool_call_ids before running the background review, then skip tool messages with those IDs.
  2. Record existing tool message contents as a fallback for messages without tool_call_id.
  3. Alternatively, collect only tool messages appended after review_agent.run_conversation() starts, instead of scanning the full _session_messages.

Example:

existing_tool_call_ids = {
    msg.get("tool_call_id")
    for msg in messages_snapshot
    if isinstance(msg, dict)
    and msg.get("role") == "tool"
    and msg.get("tool_call_id")
}

...

for msg in getattr(review_agent, "_session_messages", []):
    if msg.get("role") != "tool":
        continue
    if msg.get("tool_call_id") in existing_tool_call_ids:
        continue
    ...

Workaround

Setting:

memory:
  nudge_interval: 0

prevents automatic background memory review from triggering, but this is not ideal because it disables useful automatic memory review behavior.

The issue can still occur when automatic memory review is enabled, for example:

memory:
  nudge_interval: 10

Affected Area

  • Gateway sessions
  • Background memory/skill review
  • Background review notification summaries
  • run_agent.py / _spawn_background_review()

Metadata

Metadata

Assignees

No one assigned

    Labels

    P2Medium — degraded but workaround existscomp/agentCore agent loop, run_agent.py, prompt buildertool/memoryMemory tool and memory providerstool/skillsSkills system (list, view, manage)type/bugSomething isn't working

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions