fix(agent): prevent silent data loss when message repair shortens history#24196
fix(agent): prevent silent data loss when message repair shortens history#24196liuhao1024 wants to merge 1 commit into
Conversation
|
CI failure is from the repository-wide Windows footgun checker, not ruff/ty and not this PR's session-db repair logic: This is a known upstream-main false positive around an intentionally POSIX-only branch in An existing upstream PR already addresses the shared blocker: #23816 ( So this branch likely should not carry a duplicate patch unless maintainers want each open PR to unblock itself before #23816 lands. |
71ecff2 to
4135350
Compare
|
🔄 Rebased onto upstream/main This branch has been rebased onto the latest (dd0923b) to remove unrelated mixed-in changes from the previous stale fork base. The diff now contains only the intended fix. Please re-review. |
…tory When _repair_message_sequence() or _drop_trailing_empty_response_scaffolding() mutates messages in-place, the list may become shorter than the original conversation_history. _flush_messages_to_session_db() computed flush_from = max(len(conversation_history), _last_flushed_db_idx) which could exceed len(messages), causing messages[flush_from:] to return an empty list and silently skip persisting the current turn. Fix: detect when start_idx > len(messages), cap _last_flushed_db_idx to the current list length, and fall back to _last_flushed_db_idx as the start boundary. This ensures the current turn is always persisted while preserving the dedup logic for subsequent calls. Fixes SessionDB silently skips current turn when message repair shortens conversation history NousResearch#24187
4135350 to
9a0de64
Compare
What does this PR do?
_flush_messages_to_session_db()silently skips persisting the current turn when_repair_message_sequence()or_drop_trailing_empty_response_scaffolding()shortensmessagesbelow the originalconversation_historylength.Root Cause
_flush_messages_to_session_db()computes:When in-place repair (dropping orphan tool messages, merging consecutive user messages) shortens
messagesfrom 122 to 116,flush_from = 120 > 116 = len(messages), somessages[120:]returns[]. The current user turn and assistant response are silently not persisted.Gateway integrations that create a fresh
AIAgentper inbound message rely on SessionDB for continuity. Once this happens, the session loads stale history and follow-up messages resolve against old context.Related Issue
N/A
Type of Change
Changes Made
How to Test
pytest tests/ -q— all tests should passChecklist
Code
fix(scope):,feat(scope):, etc.)pytest tests/ -qand all tests passDocumentation & Housekeeping
docs/, docstrings) — or N/Acli-config.yaml.exampleif I added/changed config keys — or N/ACONTRIBUTING.mdorAGENTS.mdif I changed architecture and workflows — or N/A