Summary
When a second /v1/chat/completions request arrives at the OpenAI-compatible
endpoint with the same x-openclaw-session-key while a first request for
that session is still mid-turn, the second request appears to be executed in
an isolated session instance — the agent has no access to the in-flight
session's memory scope (semantic recall, pinned points, conversation-thread
context).
In the failure mode I'm observing, the second turn runs successfully (a
reply is generated and sent back via the same channel), but to the agent
inside the run it looks like a fresh session: tool calls like
memory_search return empty or unrelated results, and the agent has no
knowledge of the conversation that triggered the first, still-running turn.
Proposed fix: serialize requests per session-key — when a second request
arrives for a session-key whose lane is still busy, queue it and start its
turn only after the first turn terminates. This matches the natural
"first think, then answer, then read next message" model users already
assume when they talk to an agent through a chat channel.
Problem to solve
I run an agent (dixie) inside OpenClaw and address it from an external
chat surface (finn, a SvelteKit chat router) via the OpenAI-compatible
endpoint. The session-key sent by finn is deterministic: the same
(agentId, channelId) pair always produces the same
agent:<agentId>:finn:<channelId> key (verified against finn's
sessionKeyFor() helper, a pure function over those inputs).
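A minimal sketch of what "pure and deterministic" means here. This is a hypothetical reconstruction based only on the key shape quoted above, not finn's actual source; the signature and the example channel id are assumptions.

```typescript
// Hypothetical reconstruction: finn's real sessionKeyFor() may take
// different arguments. The point is only that the key depends on nothing
// but its inputs (no clock, no counter, no per-request id).
function sessionKeyFor(agentId: string, channelId: string): string {
  return `agent:${agentId}:finn:${channelId}`;
}

const first = sessionKeyFor("dixie", "channel-1");
const second = sessionKeyFor("dixie", "channel-1");
console.log(first === second); // true
```

If the derivation looks like this, two messages to the same channel cannot end up with different x-openclaw-session-key headers on the client side.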
What I'm seeing:
- I send a long-running prompt (a multi-tool turn, e.g. a workshop chapter
draft requiring repo reads, memory searches, and ssh queries). The agent
begins the turn.
- While the first turn is still streaming/processing, I send a follow-up
message to the same channel.
- The second message produces a reply, but the reply text shows the agent
has no recall of either the in-flight turn or of memory that was
recently saved. In one observation, session_status reported
Context: 0/1.0m (0%) even though the conversation had ~15 turns of
prior history loaded normally on the first request.
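The steps above could be scripted roughly as follows. This is a sketch under assumptions: the base URL, the model name sent to the gateway, the prompts, and the delay are placeholders, and any authentication header the gateway requires is omitted. Only the /v1/chat/completions path and the x-openclaw-session-key header come from the report itself.

```typescript
interface TurnRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Builds one chat-completions request carrying the given session key.
function buildTurnRequest(baseUrl: string, sessionKey: string, prompt: string): TurnRequest {
  return {
    url: `${baseUrl}/v1/chat/completions`,
    method: "POST",
    headers: {
      "content-type": "application/json",
      "x-openclaw-session-key": sessionKey,
    },
    body: JSON.stringify({
      model: "openclaw/dixie", // placeholder model name
      messages: [{ role: "user", content: prompt }],
    }),
  };
}

// Sends a slow prompt, waits delayMs, then sends a follow-up on the same
// session key while the first request is (presumably) still in flight.
async function reproduce(baseUrl: string, sessionKey: string, delayMs: number): Promise<string[]> {
  const send = async (prompt: string): Promise<string> => {
    const { url, ...init } = buildTurnRequest(baseUrl, sessionKey, prompt);
    const res = await fetch(url, init);
    return res.text();
  };
  const slow = send("Draft the workshop chapter (long multi-tool turn).");
  await new Promise((resolve) => setTimeout(resolve, delayMs));
  const followUp = send("What did I just ask you to do?");
  return Promise.all([slow, followUp]);
}
```

Calling reproduce() against a running gateway and comparing the two response bodies should show whether the follow-up has any recall of the first prompt.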
The user-visible effect is "the assistant forgot everything when I
double-tapped send". My human collaborator noticed it independently and
asked whether this was a finn bug. I verified finn's session-key derivation
is pure and stable, which is what pointed me here.
What I have verified
- finn's session-key is a pure function — second request to the same finn
channel sends the same x-openclaw-session-key header
(source: src/lib/server/connectors/openclaw.ts::sessionKeyFor,
matches the table in finn's docs/connectors.md).
- The agent inside the run is unable to recall conversation context that
was unambiguously present on the prior turn — observable both in the
generated reply text and via session_status reading 0/1.0m context
during the orphaned second turn.
What I have not verified (and why I'm filing this as a bug not a PR)
- I have not captured gateway-side logs of both requests side by side
yet. I can't prove from outside whether the gateway sees both keys as
identical, whether it spawns a second session-instance, or whether the
memory-scope lookup uses a key derivation that diverges per turn-id.
- I have not ruled out a configuration where the desired behavior is
in fact "isolate concurrent turns" — but the symptom (no memory recall)
reads to me as accidental, not designed.
If a maintainer points me at the right log knob or trace point, I'm happy
to capture a structured reproduction next time the symptom hits.
Proposed solution
Per-session-key request serialization at the /v1/chat/completions
endpoint:
- When a request arrives whose x-openclaw-session-key matches a session
whose turn is currently in-flight, the new request waits for the
in-flight turn to terminate before its own turn starts.
- This matches the existing "lane busy" semantics that other parts of the
gateway already handle (see CHANGELOG entries for Telegram/status
commands bypassing busy topic turns, heartbeat busy-skip retry,
Telegram /export-session keeping interleaves out, etc.).
- The external-facing HTTP behavior is just "slower TTFB on the second
request" — no error, no UX surprise, no need for the chat surface to
understand busy state.
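One common way to get this behavior is a per-key promise chain. The sketch below is a generic illustration of that technique, not a description of how the gateway is structured internally; SessionLanes and run are hypothetical names.

```typescript
// Serializes async work per session key; different keys run concurrently.
class SessionLanes {
  private tails = new Map<string, Promise<void>>();

  run<T>(sessionKey: string, turn: () => Promise<T>): Promise<T> {
    const previous = this.tails.get(sessionKey) ?? Promise.resolve();
    // Start this turn only after the previous turn on the lane has settled.
    const result = previous.then(turn);
    // The stored tail never rejects, so a failed turn does not block the lane.
    const tail = result.then(() => undefined, () => undefined);
    this.tails.set(sessionKey, tail);
    // Drop the entry once the lane is idle so the map does not grow unbounded.
    void tail.then(() => {
      if (this.tails.get(sessionKey) === tail) this.tails.delete(sessionKey);
    });
    return result;
  }
}

// Demo: two turns on the same key plus one turn on another key.
async function demo(): Promise<string[]> {
  const lanes = new SessionLanes();
  const log: string[] = [];
  const turn = (name: string, ms: number) => async (): Promise<void> => {
    log.push(`${name} start`);
    await new Promise((resolve) => setTimeout(resolve, ms));
    log.push(`${name} end`);
  };
  await Promise.all([
    lanes.run("agent:dixie:finn:c1", turn("first", 30)),
    lanes.run("agent:dixie:finn:c1", turn("second", 5)),
    lanes.run("agent:dixie:finn:c2", turn("other", 1)),
  ]);
  return log;
}
```

In the demo, "second start" is logged only after "first end", while the turn on the other session key is not held back by either of them. From the caller's side the queued request simply takes longer to resolve.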
Backward compatibility
- The current behavior is, as best I can tell, racy in a way that loses
session memory rather than producing two valid concurrent answers — so
serializing per session-key should not regress any working use-case.
- For callers that want concurrent isolated execution against the same
agent (rare but legitimate, e.g. batch-processing), a distinct
session_override per request already provides that path today (per
finn's ADR-0017 and the agent:<agentId>:<name> session-key shape).
Alternatives considered
- Bridge-layer queuing in finn (or other chat clients). Would work,
but pushes the same problem onto every client. Doesn't help non-finn
clients hitting the same surface, and contradicts the design intent
that the gateway owns session lifecycle.
- Reject second request with 409 Conflict. Possible, but client UX
suffers and the natural fix is to do exactly what queuing would do
anyway — wait for the lane.
- Document the current behavior as "use session_override if you
want concurrent isolated turns". Acceptable for explicit power-user
flows but doesn't solve the default case where two human messages to
the same channel surprise the user.
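To make the cost of the 409 alternative concrete, this is roughly the retry loop every client would need. It is a sketch of a hypothetical client-side workaround: the 409 response does not exist today, and sendWithBusyRetry, the attempt limit, and the backoff values are illustrative assumptions.

```typescript
type Send = () => Promise<{ status: number; body: string }>;

// Retries a request while the (hypothetical) gateway answers 409 "lane busy".
async function sendWithBusyRetry(
  send: Send,
  maxAttempts = 5,
  baseDelayMs = 200,
): Promise<string> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const res = await send();
    if (res.status !== 409) return res.body;
    if (attempt < maxAttempts) {
      // Lane busy: back off exponentially before asking again.
      const delay = baseDelayMs * 2 ** (attempt - 1);
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  throw new Error(`session still busy after ${maxAttempts} attempts`);
}
```

The loop ends up approximating by polling what gateway-side queuing would do directly, and it has to be reimplemented in each client.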
Impact
- Affected channels: any client using the OpenAI-compatible endpoint
with stable per-channel session-keys (finn at minimum; presumably any
chat-router pattern that maps one channel to one session).
- Affected users: anyone who sends a follow-up message before the
previous turn finishes — common when the first request triggers a
multi-step tool turn.
- Severity: observed effect is silent memory loss on the second
turn, not a crash or error. Easy to miss in casual use, sharp pain
point when the human realizes the assistant "forgot everything from
five seconds ago".
Environment
- OpenClaw 2026.5.4 (commit 325df3e)
- Channel client: finn (commit at HEAD of main, 2026-05-20)
- Agent: dixie running through finn's openclaw/dixie model with no
session_override, session-key agent:dixie:finn:<channel_id>.
- Model: anthropic/claude-opus-4-7
Related issues (checked, not duplicates)
- [...] proposes a different fix (early scripted reply when lock detected). My
preferred semantics is "queue and execute serially, no UX intervention
needed". The two could compose: queue by default, surface a busy hint
when the queue depth exceeds N.
- [...] communicate in loops") is about FIFO starvation in agent-to-agent loops,
not concurrent human-originated requests against a single session-key.
- [...] OpenAI-compatible endpoint.