fix(agent): exclude prior-history tool messages from background review summary (salvage #14967)#15057
Merged
Merged
Conversation
This was referenced Apr 24, 2026
5 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Salvage of #14967 by @luyao618 onto current main. Chosen over the parallel #14969 for robustness.
Closes #14944.
What this PR does
Stops the background memory/skill review from re-surfacing stale tool results from the prior conversation as if they just happened. After e.g. creating a cron reminder, subsequent
💾background-review notifications would includeCron job '<name>' created.again on every run, even though cron wasn't touched.How
The review agent forks with
conversation_history=messages_snapshot, so its_session_messagescontains inherited tool messages. The scan that builds the💾summary walked the whole list and treated historical tool successes as new review actions.@luyao618 extracts the scan into a testable
AIAgent._summarize_background_review_actionsstaticmethod that:tool_call_idin the snapshot and skips review messages whosetool_call_idmatchestool_call_iddata.get('success')branch against non-dict JSON payloads (latent bug — bare-string/list content previously raised)Why this over #14969
#14969 used a slice approach (
_session_messages[len(snapshot):]) which is smaller but brittle: if any future init step reorders, filters, or deduplicates the history (compression, prefix-cache replay, future hydration logic), the slice boundary silently drifts and stale results leak through again. ID-based matching is immune. #14967 also matches the issue author's explicit suggested approach verbatim and fixes the non-dict JSON crash.Validation
tests/run_agent/test_background_review_summary.py— 8/8 pass (new file)tests/run_agent/— 940/940 pass (the 2 other failures are pre-existing on current main, unrelated)Co-authored-by: @luyao618