Skip to content

fix(agent): rewrite SessionDB transcript after message-sequence repair#24419

Open
luoxiao6645 wants to merge 1 commit into
NousResearch:mainfrom
luoxiao6645:fix/24187-sessiondb-repair-persistence
Open

fix(agent): rewrite SessionDB transcript after message-sequence repair#24419
luoxiao6645 wants to merge 1 commit into
NousResearch:mainfrom
luoxiao6645:fix/24187-sessiondb-repair-persistence

Conversation

@luoxiao6645

Copy link
Copy Markdown

Summary

Fixes a SessionDB persistence bug where the current turn can be silently skipped after _repair_message_sequence() mutates the in-memory transcript before persistence.

The previous append-only flush logic used len(conversation_history) and _last_flushed_db_idx as boundaries. That breaks when repair shortens or rewrites the live
messages list:

  • stray/orphan tool messages may be dropped
  • consecutive user messages may be merged
  • the current turn can be merged into an already-persisted historical row

In those cases, append-only persistence is not sufficient. The SQLite transcript may need to be rewritten, not just appended.

Fixes #24187.

What changed

  • added self._session_db_needs_rewrite to AIAgent
  • when _repair_message_sequence() actually mutates messages, mark the SessionDB transcript dirty
  • in _flush_messages_to_session_db():
    • if rewrite is needed, call SessionDB.replace_messages(...)
    • update _last_flushed_db_idx
    • clear the rewrite flag
  • kept the normal append-only path for non-repair cases
  • added a defensive clamp on flush_from so stale offsets cannot overflow the slice
  • factored SessionDB message normalization into a helper so rewrite and append paths persist the same normalized shape

Why this approach

A simple clamp like:

python
flush_from = min(max(start_idx, self._last_flushed_db_idx), len(messages))

prevents the empty-slice failure, but it does not fix the more important case where repair merges the current turn into an already-persisted historical user row. That
case requires rewriting the stored transcript, not appending.

This patch uses the existing SessionDB.replace_messages() path, which already matches transcript-rewrite flows like /retry, /undo, and /compress.

Tests

Added regression coverage for the key failure shape:

  • historical transcript already persisted in SessionDB
  • current turn gets merged into a prior user message by _repair_message_sequence()
  • _flush_messages_to_session_db() must rewrite SQLite so the merged current turn is preserved

@alt-glitch alt-glitch added type/bug Something isn't working P1 High — major feature broken, no workaround comp/agent Core agent loop, run_agent.py, prompt builder labels May 12, 2026
@alt-glitch

Copy link
Copy Markdown
Collaborator

Competing fix for #24187, alongside #24211 and #24326. All three address the same SessionDB persistence bug where _repair_message_sequence() shortens the messages list but the flush cursor becomes stale. This PR uses the full replace_messages() rewrite approach (same as #24211), while #24326 uses a simpler clamp.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/agent Core agent loop, run_agent.py, prompt builder P1 High — major feature broken, no workaround type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: SessionDB silently skips current turn when message repair shortens conversation history

2 participants