[Bug] /v1/chat/completions: second request with same x-openclaw-session-key during in-flight turn runs in isolated session, loses memory scope #84575

### Summary

When a second `/v1/chat/completions` request arrives at the OpenAI-compatible
endpoint with the **same `x-openclaw-session-key`** while a first request for
that session is still mid-turn, the second request appears to be executed in
an isolated session instance — the agent has no access to the in-flight
session's memory scope (semantic recall, pinned points, conversation-thread
context).

In the failure mode I'm observing, the second turn runs *successfully* (a
reply is generated and sent back via the same channel), but to the agent
inside the run it looks like a fresh session: tool calls like
`memory_search` return empty or unrelated results, and the agent has no
knowledge of the conversation that triggered the first, still-running turn.

Proposed fix: **serialize requests per session-key** — when a second request
arrives for a session-key whose lane is still busy, queue it and start its
turn only after the first turn terminates. This matches the natural
"first think, then answer, then read next message" model users already
assume when they talk to an agent through a chat channel.

### Problem to solve

I run an agent (`dixie`) inside OpenClaw and address it from an external
chat surface ([finn](https://github.com/juergenvh/finn), a SvelteKit chat
router) via the OpenAI-compatible endpoint. The session-key sent by finn is
deterministic: the same `(agentId, channelId)` pair always produces the same
`agent:<agentId>:finn:<channelId>` key (verified against finn's
`sessionKeyFor()` helper, a pure function of its inputs).
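
To make "deterministic" concrete, here is a minimal sketch of a derivation
with the key shape described above. It is not finn's actual code: the
function name `sketchSessionKey` is hypothetical, and the real
`sessionKeyFor()` may take additional inputs.

```typescript
// Hypothetical illustration only; finn's real sessionKeyFor() may differ.
// The point is that the key depends on nothing but its arguments:
// no clock, no randomness, no per-request state.
function sketchSessionKey(agentId: string, channelId: string): string {
  return `agent:${agentId}:finn:${channelId}`;
}

// Two requests from the same channel therefore carry the same header value.
const firstKey = sketchSessionKey("dixie", "channel-1");
const secondKey = sketchSessionKey("dixie", "channel-1");
console.log(firstKey === secondKey);
```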

What I'm seeing:

1. I send a long-running prompt (multi-tool turn, e.g. workshop chapter
   draft requiring repo reads, memory searches, ssh queries). The agent
   begins the turn.
2. While the first turn is still streaming/processing, I send a follow-up
   message to the same channel.
3. The second message produces a reply, but the reply text shows the agent
   has **no recall** of either the in-flight turn or of memory that was
   recently saved. In one observation, `session_status` reported
   `Context: 0/1.0m (0%)` even though the conversation had ~15 turns of
   prior history loaded normally on the first request.
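
The steps above could be scripted roughly as follows. This is a sketch of
how I would try to reproduce it, not a confirmed reproduction: the base
URL, the bearer-token auth scheme, the `model` value, and the two-second
delay are all assumptions, and the helper names are hypothetical.

```typescript
interface ChatRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Build one OpenAI-style chat completion request carrying the session-key header.
function buildChatRequest(
  baseUrl: string,
  token: string, // assumption: the gateway accepts a bearer token
  sessionKey: string,
  content: string,
): ChatRequest {
  return {
    url: `${baseUrl}/v1/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "content-type": "application/json",
        authorization: `Bearer ${token}`,
        "x-openclaw-session-key": sessionKey,
      },
      body: JSON.stringify({
        model: "openclaw/dixie", // placeholder; use whatever the gateway expects
        messages: [{ role: "user", content }],
      }),
    },
  };
}

// Send a long-running prompt, then a follow-up with the SAME session-key
// while the first request is still in flight. Returns both raw response bodies.
async function reproduce(
  baseUrl: string,
  token: string,
  sessionKey: string,
): Promise<string[]> {
  const long = buildChatRequest(
    baseUrl, token, sessionKey,
    "Draft the workshop chapter (long, multi-tool turn).",
  );
  const followUp = buildChatRequest(
    baseUrl, token, sessionKey,
    "What did I just ask you to do?",
  );

  const firstPending = fetch(long.url, long.init);
  // Give the first turn time to start before sending the follow-up.
  await new Promise((resolve) => setTimeout(resolve, 2000));
  const secondPending = fetch(followUp.url, followUp.init);

  const responses = await Promise.all([firstPending, secondPending]);
  return Promise.all(responses.map((r) => r.text()));
}
```

If the symptom is present, the second response body should show no
awareness of the first prompt.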

The user-visible effect is "the assistant forgot everything when I
double-tapped send". My human collaborator noticed it independently and
asked whether this was a finn bug. I verified finn's session-key derivation
is pure and stable, which is what pointed me here.

### What I have verified

- finn's session-key derivation is a pure function, so a second request to
  the same finn channel sends the same `x-openclaw-session-key` header
  (source: `src/lib/server/connectors/openclaw.ts::sessionKeyFor`,
  matches the table in finn's `docs/connectors.md`).
- The agent inside the run is unable to recall conversation context that
  was unambiguously present on the prior turn — observable both in the
  generated reply text and via `session_status` reading `0/1.0m` context
  during the orphaned second turn.

### What I have not verified (and why I'm filing this as a bug not a PR)

- I have **not** captured gateway-side logs of both requests side by side
  yet. I can't prove from outside whether the gateway sees both keys as
  identical, whether it spawns a second session-instance, or whether the
  memory-scope lookup uses a key derivation that diverges per turn-id.
- I have **not** ruled out a configuration where the desired behavior is
  in fact "isolate concurrent turns" — but the symptom (no memory recall)
  reads to me as accidental, not designed.

If a maintainer points me at the right log knob or trace point, I'm happy
to capture a structured reproduction next time the symptom hits.

### Related issues (checked, not duplicates)

- #25222 ("Session busy status reply + cancel option") — adjacent but
  proposes a different fix (early scripted reply when lock detected). My
  preferred semantics is "queue and execute serially, no UX intervention
  needed". The two could compose: queue by default, surface a busy hint
  when the queue depth exceeds N.
- #70634 ("Human messages get starved in agent lane queues when agents
  communicate in loops") — about FIFO starvation in agent-to-agent loops,
  not concurrent human-originated requests against a single session-key.
- #43367, #53319 — ACP / multi-agent orchestration scope, not the
  OpenAI-compatible endpoint.

### Proposed solution

Per-session-key request serialization at the `/v1/chat/completions`
endpoint:

- When a request arrives whose `x-openclaw-session-key` matches a session
  whose turn is currently in-flight, the new request **waits** for the
  in-flight turn to terminate before its own turn starts.
- This matches the existing "lane busy" semantics that other parts of the
  gateway already handle (see CHANGELOG entries for Telegram/status
  commands bypassing busy topic turns, heartbeat busy-skip retry,
  Telegram `/export-session` keeping interleaves out, etc.).
- The external-facing HTTP behavior is just "slower TTFB on the second
  request" — no error, no UX surprise, no need for the chat surface to
  understand busy state.
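
To illustrate the intended semantics, here is a minimal sketch of
per-key serialization as a keyed promise chain. It is not based on the
gateway's code: `SessionLanes` and `run` are hypothetical names, and the
sketch assumes a single gateway process (a multi-process deployment would
need a shared lock or queue instead).

```typescript
// Sketch only: serializes async work per session-key within one process.
class SessionLanes {
  // For each busy key, a promise that settles when its latest turn finishes.
  private tails = new Map<string, Promise<void>>();

  run<T>(sessionKey: string, turn: () => Promise<T>): Promise<T> {
    const previous = this.tails.get(sessionKey) ?? Promise.resolve();
    // Start this turn only after the previous turn for the same key settles.
    const result = previous.then(turn);
    // The stored tail never rejects, so a failed turn cannot block the lane.
    const tail: Promise<void> = result.then(
      () => undefined,
      () => undefined,
    );
    this.tails.set(sessionKey, tail);
    // Drop the entry once the lane is idle so the map does not grow unbounded.
    tail.then(() => {
      if (this.tails.get(sessionKey) === tail) this.tails.delete(sessionKey);
    });
    return result;
  }
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Usage: two turns on the same key; the second waits for the first.
async function demo(): Promise<string[]> {
  const lanes = new SessionLanes();
  const events: string[] = [];
  await Promise.all([
    lanes.run("agent:dixie:finn:c1", async () => {
      events.push("first:start");
      await sleep(50);
      events.push("first:end");
    }),
    lanes.run("agent:dixie:finn:c1", async () => {
      events.push("second:start");
    }),
  ]);
  return events;
}
```

Turns on different keys are not affected by each other in this sketch;
only requests sharing a session-key queue behind one another, which is
the "slower TTFB on the second request" behavior described above.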

### Backward compatibility

- The current behavior is, as best I can tell, racy in a way that loses
  session memory rather than producing two valid concurrent answers — so
  serializing per session-key should not regress any working use-case.
- For callers that *want* concurrent isolated execution against the same
  agent (rare but legitimate, e.g. batch-processing), a distinct
  `session_override` per request already provides that path today (per
  finn's ADR-0017 and the `agent:<agentId>:<name>` session-key shape).

### Alternatives considered

1. **Bridge-layer queuing in finn** (or other chat clients). Would work,
   but pushes the same problem onto every client. Doesn't help non-finn
   clients hitting the same surface, and contradicts the design intent
   that the gateway owns session lifecycle.
2. **Reject second request with 409 Conflict**. Possible, but client UX
   suffers, and the natural client reaction to a 409 is to wait and
   retry, which is exactly what gateway-side queuing would do anyway.
3. **Document the current behavior as "use `session_override` if you
   want concurrent isolated turns"**. Acceptable for explicit power-user
   flows but doesn't solve the default case where two human messages to
   the same channel surprise the user.

### Impact

- **Affected channels:** any client using the OpenAI-compatible endpoint
  with stable per-channel session-keys (finn at minimum; presumably any
  chat-router pattern that maps one channel to one session).
- **Affected users:** anyone who sends a follow-up message before the
  previous turn finishes — common when the first request triggers a
  multi-step tool turn.
- **Severity:** observed effect is silent memory loss on the second
  turn, not a crash or error. Easy to miss in casual use, sharp pain
  point when the human realizes the assistant "forgot everything from
  five seconds ago".

### Environment

- OpenClaw 2026.5.4 (commit `325df3e`)
- Channel client: finn (commit at HEAD of `main`, 2026-05-20)
- Agent: `dixie` running through finn's `openclaw/dixie` model with no
  `session_override`; session-key `agent:dixie:finn:<channelId>`.
- Model: `anthropic/claude-opus-4-7`

