Skip to content

fix(agent): exclude prior-history tool messages from background review summary#14967

Closed
luyao618 wants to merge 1 commit into
NousResearch:mainfrom
luyao618:fix/bg-review-stale-tool-results
Closed

fix(agent): exclude prior-history tool messages from background review summary#14967
luyao618 wants to merge 1 commit into
NousResearch:mainfrom
luyao618:fix/bg-review-stale-tool-results

Conversation

@luyao618

Copy link
Copy Markdown
Contributor

What does this PR do?

Fixes a bug where the background memory/skill review's user-visible summary (💾 ...) re-surfaces stale tool successes from the prior conversation as if they had just happened.

_spawn_background_review forks a new AIAgent initialized with conversation_history=messages_snapshot. The forked agent's _session_messages therefore contains tool messages copied from the prior conversation. The post-review scan that builds the summary walked the entire _session_messages list and reported every successful tool result it found, so historical actions (e.g. an earlier Cron job '...' created.) were re-announced — sometimes repeatedly across unrelated background-review runs.

Related Issue

Fixes #14944

Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)

Changes Made

  • Extracted the scan into a new AIAgent._summarize_background_review_actions staticmethod for testability.
  • Before scanning, collect every tool_call_id already present in messages_snapshot and skip review messages whose tool_call_id matches — those are inherited from the prior conversation, not new actions.
  • For tool messages without a tool_call_id, fall back to content-equality against the prior snapshot's anonymous tool messages.
  • Hardened data handling so a non-dict JSON payload no longer raises in the data.get("success") branch.

How to Test

  1. Repro per the issue: in a gateway session create a one-shot cron reminder, then continue chatting until the background memory/profile review fires. Before this fix, the next review's 💾 notification could include Cron job '<reminder>' created. even though no cron was created during that review. After the fix it doesn't.
  2. Run the targeted tests: pytest tests/run_agent/test_background_review_summary.py -v
  3. Run the broader run_agent suite: pytest tests/run_agent/test_run_agent.py -q

Checklist

Code

  • I've read the Contributing Guide
  • My commit messages follow Conventional Commits
  • I searched for existing PRs to make sure this isn't a duplicate
  • My PR contains only changes related to this fix
  • I've run pytest tests/ -q and all tests pass
  • I've added tests for my changes
  • I've tested on my platform: macOS (Darwin 25.4.0, Apple Silicon), Python 3.11

Documentation & Housekeeping

  • Updated relevant documentation — N/A
  • Updated cli-config.yaml.example — N/A
  • Updated contributing / agents docs — N/A
  • Considered cross-platform impact — N/A (logic-only, no platform-specific paths)
  • Updated tool descriptions/schemas — N/A

…w summary

The background memory/skill review forks a new AIAgent with
conversation_history=messages_snapshot. The forked agent's _session_messages
therefore contains tool messages copied from the prior conversation. The
post-review scan that builds the user-visible 💾 summary walked the entire
_session_messages list, so historical successes (e.g. 'Cron job '...'
created.') were re-surfaced as if they had just happened — sometimes
multiple times across unrelated background-review runs.

Extract the scan into a staticmethod and skip any tool message whose
tool_call_id was already present in messages_snapshot, with a
content-equality fallback for tool messages that lack one.

Fixes NousResearch#14944
@alt-glitch alt-glitch added type/bug Something isn't working P2 Medium — degraded but workaround exists comp/agent Core agent loop, run_agent.py, prompt builder labels Apr 24, 2026
@alt-glitch

Copy link
Copy Markdown
Collaborator

Competing fix for #14944 alongside #14969 and #9696 — all three PRs address the same stale-tool-result bug in background review. Recommend maintainer pick one.

teknium1 pushed a commit that referenced this pull request Apr 24, 2026
…w summary

Cherry-pick-of: 27b6a21 (PR #14967 by @luyao618)

Co-authored-by: luyao618 <364939526@qq.com>
@teknium1

Copy link
Copy Markdown
Contributor

Merged via #15057 — your commit was cherry-picked onto current main with your authorship preserved in git log.

Chose this implementation over the parallel #14969 because:

  1. ID-based matching is robust to any future change to how _session_messages gets populated during agent init
  2. The extracted _summarize_background_review_actions staticmethod is directly testable without mocking the spawn path
  3. Matches the issue author's explicit suggested approach verbatim
  4. Bonus: hardens the data.get('success') branch against non-dict JSON payloads (latent crash)

Thanks for the clean refactor and comprehensive tests.

@teknium1 teknium1 closed this Apr 24, 2026
nekorytaylor666 pushed a commit to nekorytaylor666/hermes-agent that referenced this pull request Apr 24, 2026
…w summary

Cherry-pick-of: 27b6a21 (PR NousResearch#14967 by @luyao618)

Co-authored-by: luyao618 <364939526@qq.com>
justrhoto pushed a commit to justrhoto/hermes-agent that referenced this pull request Apr 24, 2026
…w summary

Cherry-pick-of: 27b6a21 (PR NousResearch#14967 by @luyao618)

Co-authored-by: luyao618 <364939526@qq.com>
ulasbilgen pushed a commit to ulasbilgen/hermes-adhd-agent that referenced this pull request May 1, 2026
…w summary

Cherry-pick-of: 27b6a21 (PR NousResearch#14967 by @luyao618)

Co-authored-by: luyao618 <364939526@qq.com>
aj-nt pushed a commit to aj-nt/hermes-agent that referenced this pull request May 1, 2026
…w summary

Cherry-pick-of: 27b6a21 (PR NousResearch#14967 by @luyao618)

Co-authored-by: luyao618 <364939526@qq.com>
donald131 pushed a commit to donald131/hermes-agent that referenced this pull request May 2, 2026
…w summary

Cherry-pick-of: 27b6a21 (PR NousResearch#14967 by @luyao618)

Co-authored-by: luyao618 <364939526@qq.com>
02356abc pushed a commit to 02356abc/hermes-agent that referenced this pull request May 14, 2026
…w summary

Cherry-pick-of: 27b6a21 (PR NousResearch#14967 by @luyao618)

Co-authored-by: luyao618 <364939526@qq.com>
gweeteve pushed a commit to gweeteve/hermes-agent that referenced this pull request Jun 2, 2026
…w summary

Cherry-pick-of: 27b6a21 (PR NousResearch#14967 by @luyao618)

Co-authored-by: luyao618 <364939526@qq.com>
Egavasyug pushed a commit to Egavasyug/hermes-agent that referenced this pull request Jun 10, 2026
…w summary

Cherry-pick-of: 27b6a21 (PR NousResearch#14967 by @luyao618)

Co-authored-by: luyao618 <364939526@qq.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/agent Core agent loop, run_agent.py, prompt builder P2 Medium — degraded but workaround exists type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Background review notification includes stale tool results from conversation history

3 participants