Skip to content

Bug: assistant responses silently dropped after repair_message_sequence (conversation_history vs messages mismatch) #46053

@saved-j

Description

@saved-j

Description

In long-running sessions, assistant responses are generated by the model (response_len > 0, finish_reason=stop) but never persisted to state.db. User messages are persisted (by the gateway), but assistant responses silently disappear. The model doesn't see its own previous responses and repeats the same answer indefinitely.

Root Cause

In _flush_messages_to_session_db (run_agent.py:~1562):

start_idx = len(conversation_history) if conversation_history else 0
flush_from = max(start_idx, self._last_flushed_db_idx)
for msg in messages[flush_from:]:

conversation_history is loaded from state.db by the gateway and contains all stored messages (e.g., 347).

messages starts as a copy of conversation_history, then repair_message_sequence removes stray tool results and merges consecutive user messages. After repair, messages is shorter (e.g., 303).

The bug: start_idx = len(conversation_history) = 347, but len(messages) = 306 (after repair + new assistant + tool). messages[347:] is an empty slice. Nothing is flushed.

This happens every turn — assistant responses are never persisted, the model never sees its own answers, and it repeats the same content indefinitely.

Reproduction

  1. Start a long session where consecutive user messages accumulate (background process notifications, system notes, user multi-messages)
  2. Let the agent run several turns
  3. Check state.db: user messages will be present, but assistant responses after a certain point will be missing
  4. The model will start repeating itself because it doesn't see its previous responses

Evidence

Session 20260612_050850_378fa296:

  • Last assistant in DB: feat(gateway): Microsoft Teams platform adapter #13753 at 03:34:05 (assistant count: 161)
  • 9 subsequent turns (03:35–04:48) all completed with response_len > 0
  • Assistant count remained at 161 — none were persisted
  • Violations grew 43→51 (consecutive user messages from background notifications)

Fix

Remove conversation_history from the flush cursor calculation. Use _last_flushed_db_idx as the sole cursor:

# Before (buggy):
start_idx = len(conversation_history) if conversation_history else 0
flush_from = max(start_idx, self._last_flushed_db_idx)

# After (fixed):
flush_from = self._last_flushed_db_idx

_last_flushed_db_idx is always ≤ len(messages) because it's set to len(messages) after each flush. It correctly tracks what's been persisted regardless of repair-induced length changes.

Additional context

  • Not related to LCM compression. Consecutive user messages that trigger the repair come from background process notifications, system notes, and user multi-messages — not from LCM.
  • Debug logging added: FLUSH SKIP / FLUSH log lines in _flush_messages_to_session_db to diagnose future persistence issues.
  • Severity: Critical. In affected sessions, the agent becomes unusable — repeats walls of text, ignores new user requests, because it has no memory of its own responses.

Metadata

Metadata

Assignees

No one assigned

    Labels

    P1High — major feature broken, no workaroundcomp/agentCore agent loop, run_agent.py, prompt builderduplicateThis issue or pull request already existstype/bugSomething isn't working

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions