fix(agent): persist repaired current turn history#24211
Open
qWaitCrypto wants to merge 1 commit into
Open
Conversation
Collaborator
This was referenced Jun 11, 2026
This was referenced Jun 12, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What does this PR do?
Fixes a SessionDB persistence bug where the current turn can be silently skipped after message-sequence repair shortens the live conversation history.
AIAgent._repair_message_sequence()mutates the livemessageslist in place before the API call. That repair can drop orphan tool messages or merge consecutive user messages._flush_messages_to_session_db()then used the originallen(conversation_history)and_last_flushed_db_idxas append-only boundaries. If repair shortened the list, persistence could slice from the old boundary and write nothing.The important edge case is consecutive user-message repair: the current user turn can be merged into a prior user message that was already persisted in SQLite. In that case, appending from a later index is not enough because the existing DB row needs to be rewritten.
This PR marks the SessionDB transcript for rewrite when
_repair_message_sequence()changes the live history. The next flush uses the existingSessionDB.replace_messages()path to atomically resync the stored transcript from the repaired in-memory history, then updates_last_flushed_db_idx.Related Issue
Fixes #24187
Type of Change
Changes Made
run_agent.pyto track when message-sequence repair requires a SessionDB transcript rewrite_repair_message_sequence()to set the rewrite flag after it mutatesmessages_flush_messages_to_session_db()to callSessionDB.replace_messages()when the repaired live transcript must replace the stored transcripttests/run_agent/test_message_sequence_repair.pyfor the case where repair merges the current user turn into an already persisted user messageHow to Test
HOME=/tmp/hermes-agent-pytest HERMES_HOME=/tmp/hermes-agent-pytest/.hermes XDG_CACHE_HOME=/tmp/hermes-agent-pytest/.cache PYTHONPATH=. /home/cyt/miniconda3/envs/meta_workflow_py312/bin/python -m pytest --override-ini=addopts='' -p no:cacheprovider tests/run_agent/test_message_sequence_repair.py tests/run_agent/test_860_dedup.py -qChecklist
Code
fix(scope):,feat(scope):, etc.)pytest tests/ -qand all tests passpytestvia/home/cyt/miniconda3/envs/meta_workflow_py312/bin/pythonDocumentation & Housekeeping
docs/, docstrings) — or N/Acli-config.yaml.exampleif I added/changed config keys — or N/ACONTRIBUTING.mdorAGENTS.mdif I changed architecture or workflows — or N/AScreenshots / Logs
Before fix, local minimal reproduction showed that repair merged the current user turn into the live message list, but SessionDB append was skipped:
After fix, the same reproduction rewrites the stored SessionDB transcript and preserves the current turn:
Real local pytest results:
What does this PR do?
Fixes a SessionDB persistence bug where the current turn can be silently skipped after message-sequence repair shortens the live conversation history.
AIAgent._repair_message_sequence()mutates the livemessageslist in place before the API call. That repair can drop orphan tool messages or merge consecutive user messages._flush_messages_to_session_db()then used the originallen(conversation_history)and_last_flushed_db_idxas append-only boundaries. If repair shortened the list, persistence could slice from the old boundary and write nothing.The important edge case is consecutive user-message repair: the current user turn can be merged into a prior user message that was already persisted in SQLite. In that case, appending from a later index is not enough because the existing DB row needs to be rewritten.
This PR marks the SessionDB transcript for rewrite when
_repair_message_sequence()changes the live history. The next flush uses the existingSessionDB.replace_messages()path to atomically resync the stored transcript from the repaired in-memory history, then updates_last_flushed_db_idx.Related Issue
Fixes #24187
Type of Change
Changes Made
run_agent.pyto track when message-sequence repair requires a SessionDB transcript rewrite_repair_message_sequence()to set the rewrite flag after it mutatesmessages_flush_messages_to_session_db()to callSessionDB.replace_messages()when the repaired live transcript must replace the stored transcripttests/run_agent/test_message_sequence_repair.pyfor the case where repair merges the current user turn into an already persisted user messageHow to Test
HOME=/tmp/hermes-agent-pytest HERMES_HOME=/tmp/hermes-agent-pytest/.hermes XDG_CACHE_HOME=/tmp/hermes-agent-pytest/.cache PYTHONPATH=. /home/cyt/miniconda3/envs/meta_workflow_py312/bin/python -m pytest --override-ini=addopts='' -p no:cacheprovider tests/run_agent/test_message_sequence_repair.py tests/run_agent/test_860_dedup.py -qChecklist
Code
fix(scope):,feat(scope):, etc.)pytest tests/ -qand all tests passpytestvia/home/cyt/miniconda3/envs/meta_workflow_py312/bin/pythonDocumentation & Housekeeping
docs/, docstrings) — or N/Acli-config.yaml.exampleif I added/changed config keys — or N/ACONTRIBUTING.mdorAGENTS.mdif I changed architecture or workflows — or N/AScreenshots / Logs
Before fix, local minimal reproduction showed that repair merged the current user turn into the live message list, but SessionDB append was skipped:
After fix, the same reproduction rewrites the stored SessionDB transcript and preserves the current turn:
Real local pytest results: