Skip to content

Gateway cached-agent reuse can leak _last_flushed_db_idx across turns and skip assistant transcript rows #44327

@boweic

Description

@boweic

Summary

When the gateway reuses a cached AIAgent, the per-agent SessionDB flush cursor can leak across turns.

GatewayRunner._init_cached_agent_for_turn() resets some per-turn state, but it does not realign agent._last_flushed_db_idx to the history actually passed into the new turn.

As a result, a reused agent can compute a flush_from offset that is too large for the current turn and silently skip persisting the assistant reply into state.db.

This leaves a transcript with multiple consecutive user rows and missing assistant rows, which then causes later turns to replay stale questions and produce "repeated" or blended answers.

This looks related to, but distinct from:

Observed Behavior

In a live gateway session, the user reported that each new reply kept dragging prior questions into the current answer.

Inspecting the session transcript showed a pattern like:

  • assistant
  • user
  • user
  • user
  • user
  • user
  • user
  • assistant

So the platform visibly delivered multiple assistant replies over time, but the durable SQLite transcript only retained the final one. On subsequent turns, Hermes loaded this broken history, triggered consecutive-user repair, and effectively merged several old user turns into the next prompt.

Root Cause Hypothesis

The critical pieces are:

  1. gateway/run.py reuses cached agents:

    if cached and cached[1] == _sig:
        agent = cached[0]
        self._init_cached_agent_for_turn(agent, _interrupt_depth)
  2. _init_cached_agent_for_turn() currently resets only:

    • _last_activity_ts
    • _last_activity_desc
    • _api_call_count

    It does not reset or realign _last_flushed_db_idx.

  3. run_agent.py::_flush_messages_to_session_db() later computes:

    start_idx = len(conversation_history) if conversation_history else 0
    flush_from = max(start_idx, self._last_flushed_db_idx)
    for msg in messages[flush_from:]:
        ... append_message(...)

If the cached agent still carries _last_flushed_db_idx from the previous turn, the new turn can start flushing from a later index than the current conversation_history boundary. Then the assistant message for this turn is silently skipped.

Why This Causes Repeated Answers

On the gateway success path, transcript writes assume the agent already persisted the DB rows:

agent_persisted = self._session_db is not None
append_to_transcript(..., skip_db=agent_persisted)

So if the agent-side flush skips the assistant row, the gateway does not backfill it. The next inbound message then reloads a transcript containing several consecutive user rows with the assistant rows missing.

That broken replay state matches the repeated-answer symptom exactly:

  • Hermes repairs/merges consecutive user messages
  • old unanswered-looking questions get folded into the next prompt
  • the new reply appears to repeat or drag in previous topics

Minimal Regression Shape

A focused test should simulate:

  1. Create a cached AIAgent for a gateway session
  2. Run one turn so _last_flushed_db_idx becomes non-zero
  3. Reuse the same cached agent for a second turn with freshly loaded history
  4. Do not reset _last_flushed_db_idx
  5. Persist the second turn
  6. Assert that the second turn's assistant row is missing from SessionDB

Then apply the fix and assert the assistant row is present.

Suggested Fix

When reusing a cached agent, realign the flush cursor to the history actually being replayed for this turn.

Two plausible fixes:

  1. In the gateway path, after agent_history is built for the current turn, set:

    agent._last_flushed_db_idx = len(agent_history)
  2. Or more defensively, inside persistence, clamp / recompute flush_from so stale cached-agent state cannot skip the current turn.

The first option seems the most direct because _last_flushed_db_idx is turn-local persistence state, and cached-agent reuse is precisely where the stale value crosses turn boundaries.

Expected Invariant

For every successful gateway turn:

If a visible assistant response is produced, the session transcript for that turn must contain the corresponding assistant row.

Environment

  • Hermes gateway on macOS
  • Profile-scoped gateway session
  • Cached-agent reuse enabled in gateway
  • SessionDB (state.db) is the canonical transcript store

Metadata

Metadata

Assignees

No one assigned

    Labels

    P2Medium — degraded but workaround existscomp/gatewayGateway runner, session dispatch, deliverytype/bugSomething isn't working

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions