Skip to content

fix(agent): clamp flush_from to len(messages) after repair shortens the list#24326

Closed
ryptotalent wants to merge 1 commit into
NousResearch:mainfrom
ryptotalent:fix/sessiondb-repair-flush-offset
Closed

fix(agent): clamp flush_from to len(messages) after repair shortens the list#24326
ryptotalent wants to merge 1 commit into
NousResearch:mainfrom
ryptotalent:fix/sessiondb-repair-flush-offset

Conversation

@ryptotalent

Copy link
Copy Markdown
Contributor

Problem

_repair_message_sequence() shortens the in-memory messages list in-place (merging consecutive user messages, dropping stray tool results). However, _flush_messages_to_session_db() still uses len(conversation_history) — the pre-repair length — as the starting offset (start_idx).

When repair removes N messages from the historical portion:

  1. start_idx = len(conversation_history) reflects the old, longer length
  2. flush_from = max(start_idx, self._last_flushed_db_idx) can exceed len(messages)
  3. messages[flush_from:] returns [] — the current user turn and assistant response are silently not persisted to SessionDB

Gateway integrations that create a fresh AIAgent per inbound message (Telegram, Discord, etc.) rely on SessionDB for conversation continuity across requests. A dropped turn means the agent "forgets" what just happened.

Fix

Clamp flush_from with min(..., len(messages)) so the slice never overflows:

flush_from = min(max(start_idx, self._last_flushed_db_idx), len(messages))

This also correctly handles stale _last_flushed_db_idx values after repair shortens the list.

Testing

  • Syntax validation: ast.parse() passes
  • No hardcoded secrets in diff
  • Reproduction: construct a conversation where _repair_message_sequence merges messages in the historical portion, then observe that _flush_messages_to_session_db now correctly persists the current turn

Closes #24187

…he list

_repair_message_sequence() shortens the in-memory messages list in-place
(merging consecutive user messages, dropping stray tool results), but
_flush_messages_to_session_db() still uses len(conversation_history) —
the pre-repair length — as the starting offset. When repair removes N
messages, flush_from can exceed len(messages), so messages[flush_from:]
returns an empty list and the current turn is silently dropped from
SessionDB.

Gateway integrations that create a fresh AIAgent per inbound message
are most affected, because they rely on SessionDB for conversation
continuity across requests.

Fix: clamp flush_from with min(..., len(messages)) so the slice never
overflows. This also handles stale _last_flushed_db_idx values after
repair.

Closes NousResearch#24187
@teknium1

Copy link
Copy Markdown
Contributor

Resolved by #46071. This PR targets the same overflow of flush_from after message repair shortens the live list; the merged fix uses message identity tracking instead of a clamp, so the repaired turn persists without relying on list positions.

@teknium1 teknium1 closed this Jun 14, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/agent Core agent loop, run_agent.py, prompt builder P1 High — major feature broken, no workaround type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: SessionDB silently skips current turn when message repair shortens conversation history

3 participants