[Bug] /v1/chat/completions: second request with same x-openclaw-session-key during in-flight turn runs in isolated session, loses memory scope #84575

@juergenvh

Summary

When a second /v1/chat/completions request arrives at the OpenAI-compatible
endpoint with the same x-openclaw-session-key while a first request for
that session is still mid-turn, the second request appears to be executed in
an isolated session instance — the agent has no access to the in-flight
session's memory scope (semantic recall, pinned points, conversation-thread
context).

In the failure mode I'm observing, the second turn runs successfully (a
reply is generated and sent back via the same channel), but to the agent
inside the run it looks like a fresh session: tool calls like
memory_search return empty or unrelated results, and the agent has no
knowledge of the conversation that triggered the first, still-running turn.

Proposed fix: serialize requests per session-key — when a second request
arrives for a session-key whose lane is still busy, queue it and start its
turn only after the first turn terminates. This matches the natural
"first think, then answer, then read next message" model users already
assume when they talk to an agent through a chat channel.

Problem to solve

I run an agent (dixie) inside OpenClaw and address it from an external
chat surface (finn, a SvelteKit chat router) via the OpenAI-compatible
endpoint. The session-key sent by finn is deterministic: the same
(agentId, channelId) pair always produces the same
agent:<agentId>:finn:<channelId> key (verified against finn's
sessionKeyFor() helper, a pure function over its inputs).
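
For readers without access to finn's source, the derivation described above can be illustrated with a minimal sketch. This is a hypothetical reconstruction based only on the key shape quoted in this issue; the real helper in src/lib/server/connectors/openclaw.ts may take different parameters or do additional validation.

```typescript
// Hypothetical reconstruction of finn's sessionKeyFor(); only the resulting
// key shape (agent:<agentId>:finn:<channelId>) is taken from this issue.
// The parameter list is an assumption.
function sessionKeyFor(agentId: string, channelId: string): string {
  // Pure: no clock, counter, or random component, so two requests for the
  // same channel always carry the same x-openclaw-session-key value.
  return `agent:${agentId}:finn:${channelId}`;
}

const firstKey = sessionKeyFor("dixie", "channel-1");
const secondKey = sessionKeyFor("dixie", "channel-1");
console.log(firstKey === secondKey); // true
```

The point of the sketch is only that nothing on the client side varies between the first and the second request.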

What I'm seeing:

  1. I send a long-running prompt (multi-tool turn, e.g. workshop chapter
    draft requiring repo reads, memory searches, ssh queries). The agent
    begins the turn.
  2. While the first turn is still streaming/processing, I send a follow-up
    message to the same channel.
  3. The second message produces a reply, but the reply text shows the agent
    has no recall of either the in-flight turn or of memory that was
    recently saved. In one observation, session_status reported
    Context: 0/1.0m (0%) even though the conversation had ~15 turns of
    prior history loaded normally on the first request.
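
The steps above could be scripted roughly as follows. This is a sketch, not a verified reproduction: the base URL, the prompts, the delay, and the helper names (buildTurn, fireOverlapping) are assumptions, and any authentication header the gateway requires is omitted. Only the endpoint path, the x-openclaw-session-key header, and the openclaw/dixie model name come from this report.

```typescript
type TurnInit = {
  method: string;
  headers: Record<string, string>;
  body: string;
};

// Minimal fetch-shaped function type so the sketch can be run against a stub.
type FetchLike = (url: string, init: TurnInit) => Promise<{ status: number }>;

// Build one chat-completions request that carries the given session key.
function buildTurn(sessionKey: string, prompt: string): TurnInit {
  return {
    method: "POST",
    headers: {
      "content-type": "application/json",
      "x-openclaw-session-key": sessionKey,
    },
    body: JSON.stringify({
      model: "openclaw/dixie",
      messages: [{ role: "user", content: prompt }],
    }),
  };
}

// Start a long turn, wait delayMs, then send a follow-up with the same key
// while the first request is (presumably) still in flight.
async function fireOverlapping(
  fetchFn: FetchLike,
  baseUrl: string, // assumption: wherever the gateway listens
  sessionKey: string,
  delayMs: number,
): Promise<number[]> {
  const url = `${baseUrl}/v1/chat/completions`;
  const first = fetchFn(url, buildTurn(sessionKey, "long multi-tool prompt"));
  await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
  const second = fetchFn(url, buildTurn(sessionKey, "what did I just ask?"));
  const responses = await Promise.all([first, second]);
  return responses.map((r) => r.status);
}
```

Passing the runtime's global fetch as fetchFn would send real requests; comparing the second reply against the first turn's context is then a manual step.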

The user-visible effect is "the assistant forgot everything when I
double-tapped send". My human collaborator noticed it independently and
asked whether this was a finn bug. I verified finn's session-key derivation
is pure and stable, which is what pointed me here.

What I have verified

  • finn's session-key is a pure function — second request to the same finn
    channel sends the same x-openclaw-session-key header
    (source: src/lib/server/connectors/openclaw.ts::sessionKeyFor,
    matches the table in finn's docs/connectors.md).
  • The agent inside the run is unable to recall conversation context that
    was unambiguously present on the prior turn — observable both in the
    generated reply text and via session_status reading 0/1.0m context
    during the orphaned second turn.

What I have not verified (and why I'm filing this as a bug, not a PR)

  • I have not captured gateway-side logs of both requests side by side
    yet. I can't prove from outside whether the gateway sees both keys as
    identical, whether it spawns a second session-instance, or whether the
    memory-scope lookup uses a key derivation that diverges per turn-id.
  • I have not ruled out a configuration where the desired behavior is
    in fact "isolate concurrent turns" — but the symptom (no memory recall)
    reads to me as accidental, not designed.

If a maintainer points me at the right log knob or trace point, I'm happy
to capture a structured reproduction next time the symptom hits.

Related issues (checked, not duplicates)

Proposed solution

Per-session-key request serialization at the /v1/chat/completions
endpoint:

  • When a request arrives whose x-openclaw-session-key matches a session
    whose turn is currently in-flight, the new request waits for the
    in-flight turn to terminate before its own turn starts.
  • This matches the existing "lane busy" semantics that other parts of the
    gateway already handle (see CHANGELOG entries for Telegram/status
    commands bypassing busy topic turns, heartbeat busy-skip retry,
    Telegram /export-session keeping interleaves out, etc.).
  • The external-facing HTTP behavior is just "slower TTFB on the second
    request" — no error, no UX surprise, no need for the chat surface to
    understand busy state.
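
To make the proposal concrete, here is a minimal sketch of per-key serialization using a promise chain per session-key. It is not based on the gateway's actual code: the SessionLanes class and its run method are hypothetical names, and a real implementation would presumably hook into the existing lane-busy machinery and add timeouts or cancellation.

```typescript
// Hypothetical per-session-key lane: turns with the same key run one at a
// time in arrival order; turns with different keys run independently.
class SessionLanes {
  // Tail of the queue for each busy key. Tails never reject.
  private tails = new Map<string, Promise<void>>();

  run<T>(sessionKey: string, turn: () => Promise<T>): Promise<T> {
    const prev = this.tails.get(sessionKey) ?? Promise.resolve();

    // Start this turn only after the previous turn for the key has settled.
    const result = prev.then(() => turn());

    // Swallow the outcome so one failed turn does not block the lane.
    const tail: Promise<void> = result.then(
      () => undefined,
      () => undefined,
    );
    this.tails.set(sessionKey, tail);

    // Drop the map entry once the lane is idle again.
    void tail.then(() => {
      if (this.tails.get(sessionKey) === tail) {
        this.tails.delete(sessionKey);
      }
    });

    return result;
  }
}
```

A request handler would wrap the turn as lanes.run(sessionKey, () => executeTurn(request)), where executeTurn stands in for whatever the gateway does today. The second caller simply sees a later first byte, which is the HTTP behavior described above.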

Backward compatibility

  • The current behavior is, as best I can tell, racy in a way that loses
    session memory rather than producing two valid concurrent answers — so
    serializing per session-key should not regress any working use-case.
  • For callers that want concurrent isolated execution against the same
    agent (rare but legitimate, e.g. batch-processing), a distinct
    session_override per request already provides that path today (per
    finn's ADR-0017 and the agent:<agentId>:<name> session-key shape).

Alternatives considered

  1. Bridge-layer queuing in finn (or other chat clients). Would work,
    but pushes the same problem onto every client. Doesn't help non-finn
    clients hitting the same surface, and contradicts the design intent
    that the gateway owns session lifecycle.
  2. Reject second request with 409 Conflict. Possible, but client UX
    suffers and the natural fix is to do exactly what queuing would do
    anyway — wait for the lane.
  3. Document the current behavior as "use session_override if you
    want concurrent isolated turns". Acceptable for explicit power-user
    flows but doesn't solve the default case where two human messages to
    the same channel surprise the user.

Impact

  • Affected channels: any client using the OpenAI-compatible endpoint
    with stable per-channel session-keys (finn at minimum; presumably any
    chat-router pattern that maps one channel to one session).
  • Affected users: anyone who sends a follow-up message before the
    previous turn finishes — common when the first request triggers a
    multi-step tool turn.
  • Severity: observed effect is silent memory loss on the second
    turn, not a crash or error. Easy to miss in casual use, sharp pain
    point when the human realizes the assistant "forgot everything from
    five seconds ago".

Environment

  • OpenClaw 2026.5.4 (commit 325df3e)
  • Channel client: finn (commit at HEAD of main, 2026-05-20)
  • Agent: dixie running through finn's openclaw/dixie model with no
    session_override; session-key agent:dixie:finn:<channel_id>.
  • Model: anthropic/claude-opus-4-7

Labels

  • P2
    Normal backlog priority with limited blast radius.
  • clawsweeper:needs-info
    ClawSweeper needs more reporter information before it can verify this issue.
  • clawsweeper:needs-maintainer-review
    ClawSweeper marked this issue as needing maintainer review before automation.
  • clawsweeper:no-new-fix-pr
    ClawSweeper does not recommend queueing a new automated fix PR for this issue.
  • impact:session-state
    Session, memory, transcript, context, or agent state can drift or corrupt.
  • issue-rating: 🦪 silver shellfish
    Thin issue quality; more reproduction proof or environment detail is needed.
