Claude CLI sessions reset on every turn in group channels due to groupIntro drift in extraSystemPromptHash #69118

@Zordon7-pixel

Summary

Companion to #64386 — that issue covers mcpConfigHash drift on gateway restart. This issue reports the second failure mode of the same function: extraSystemPromptHash drifts on every turn transition in any group-style channel (Discord channels, Telegram groups, etc.), completely independent of restarts, causing every turn-2 reply to be generated against a fresh claude -p with no memory of turn 1.

Affects any deployment using claude-cli backend where the agent replies in more than one group channel. With default settings this is "all Discord/Telegram group users."

Users experience this as "the agent has amnesia within seconds" — which they often misattribute to model behaviour or config rather than a session-plumbing bug.

Repro (does not require a restart)

  1. Configure any agent on claude-cli backend in a Discord guild channel (channel_type=text).
  2. Mention the agent twice in succession — ~10 seconds apart, within the 20-minute idle window.
  3. Watch ~/.openclaw/logs/gateway.log:
     [HH:MM:00] cli session reset: provider=claude-cli reason=system-prompt
  4. Turn 2's reply reads as if it never saw turn 1. Transcript .jsonl on disk shows two separate session_ids, not one resumed session.

Expected: turn 2 reuses the session and references turn 1.

Root cause

In src/auto-reply/reply/route-reply.ts (bundled as get-reply-*.js in releases), runPreparedReply assembles extraSystemPromptParts:

const shouldInjectGroupIntro = Boolean(
  isGroupChat && (isFirstTurnInSession || sessionEntry?.groupActivationNeedsSystemIntro)
);
const groupIntro = shouldInjectGroupIntro
  ? buildGroupIntro({ cfg, sessionCtx, sessionEntry, defaultActivation, silentToken: SILENT_REPLY_TOKEN })
  : "";
// ...
const extraSystemPromptParts = [
  buildInboundMetaSystemPrompt(...),
  groupChatContext,
  groupIntro,           // <-- non-empty only on first turn / re-intro
  groupSystemPrompt,
  buildExecOverridePromptHint(...)
].filter(Boolean);

groupIntro (src/auto-reply/reply/groups.ts:buildGroupIntro) emits a ~200–400 char block describing activation mode ("Activation: always-on (you receive every group message)..." etc). By design it is present on turn 1 and absent on turn 2+.

That assembled text is joined and hashed in src/agents/cli-runner/prepare.ts:

const extraSystemPrompt = params.extraSystemPrompt?.trim() ?? "";
const extraSystemPromptHash = hashCliSessionText(extraSystemPrompt);

Turn 1 hash: sha256(inboundMeta + groupChatContext + groupIntro + groupSystemPrompt + execHint)
Turn 2 hash: sha256(inboundMeta + groupChatContext + "" + groupSystemPrompt + execHint)
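The drift can be reproduced in isolation. This is a minimal sketch, not the project's code: hashText stands in for hashCliSessionText (assumed here to be hex-encoded SHA-256), assembleExtraSystemPrompt is a hypothetical helper, the "\n\n" join separator is an assumption, and every part except the groupIntro excerpt is placeholder text.

```typescript
import { createHash } from "node:crypto";

// Hypothetical stand-in for hashCliSessionText: hex-encoded SHA-256 of the text.
function hashText(text: string): string {
  return createHash("sha256").update(text, "utf8").digest("hex");
}

// Hypothetical assembly step mirroring .filter(Boolean), a join, and .trim().
// The "\n\n" separator is an assumption; the issue only says the parts are joined.
function assembleExtraSystemPrompt(parts: string[]): string {
  return parts.filter(Boolean).join("\n\n").trim();
}

// Placeholder parts; only groupIntro differs between the two turns.
const inboundMeta = "inbound meta (placeholder)";
const groupChatContext = "group chat context (placeholder)";
const groupIntro = "Activation: always-on (you receive every group message)...";
const groupSystemPrompt = "group system prompt (placeholder)";
const execHint = "exec override hint (placeholder)";

const turn1Hash = hashText(
  assembleExtraSystemPrompt([inboundMeta, groupChatContext, groupIntro, groupSystemPrompt, execHint]),
);
const turn2Hash = hashText(
  assembleExtraSystemPrompt([inboundMeta, groupChatContext, "", groupSystemPrompt, execHint]),
);

// The two hashes differ even though nothing about the session itself changed.
const hashesMatch = turn1Hash === turn2Hash;
console.log({ turn1Hash, turn2Hash, hashesMatch });
```

Any deterministic hash would show the same effect; the point is only that the hashed bytes change between turn 1 and turn 2.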

Different bytes → different hash. On turn 2, resolveCliSessionReuse (src/agents/cli-session.ts) hits:

if (normalizeOptionalString(binding?.extraSystemPromptHash) !== currentExtraSystemPromptHash)
  return { invalidatedReason: "system-prompt" };

runCliWithSession(undefined) → claude -p (fresh) → amnesia.

Why this is the wrong invalidation key

This is the same category of bug as #64386, and the architectural mistake the two share is worth calling out:

The system prompt is re-sent to claude-cli on every invocation via --system-prompt / --append-system-prompt. It does not live inside the --resume transcript. A different system prompt on turn 2 is not a corruption — it's the normal case (context drift, new tools, different user flags).

By the same token, mcpConfigHash (per #64386) hashes --mcp-config content that is also re-read per invocation.

So both extraSystemPromptHash and mcpConfigHash, used as session-reuse keys, guard against an imagined failure mode (resume-with-stale-env corrupts the transcript) that does not exist in the CLI runtime. They only produce false-positive invalidations.

The only legitimate invalidation keys are the two that remain: authProfileId and authEpoch — a genuine auth rotation means the stored sessionId likely belongs to a different account and shouldn't be resumed.

Additional aggravator in the writer

setCliSessionBinding (src/agents/cli-session.ts) stores optional fields via spread-conditional:

...normalizeOptionalString(binding.extraSystemPromptHash)
  ? { extraSystemPromptHash: normalizeOptionalString(binding.extraSystemPromptHash) }
  : {}

If a binding was first written under a dist that didn't populate the field, or by a turn that happened to produce an empty prompt, the stored binding lacks the field. The comparator then treats undefined !== <hex> as a mismatch, invalidating every subsequent turn for the lifetime of that binding.
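The interaction between the writer and the strict comparator can be illustrated with a simplified model. The types and function names below (CliSessionBinding, writeBinding, isInvalidated) are hypothetical reductions for illustration; normalizeOptionalString is re-implemented here under the assumption that it returns a trimmed string or undefined.

```typescript
// Hypothetical, simplified shape; the real type in src/agents/cli-session.ts may differ.
interface CliSessionBinding {
  sessionId: string;
  extraSystemPromptHash?: string;
}

// Assumed behaviour of normalizeOptionalString: trimmed string, or undefined when empty.
function normalizeOptionalString(value?: string): string | undefined {
  const trimmed = value?.trim();
  return trimmed ? trimmed : undefined;
}

// Writer: mirrors the spread-conditional, so an empty hash is never stored.
function writeBinding(sessionId: string, hash?: string): CliSessionBinding {
  const normalized = normalizeOptionalString(hash);
  return { sessionId, ...(normalized ? { extraSystemPromptHash: normalized } : {}) };
}

// Strict comparator: mirrors the check quoted from resolveCliSessionReuse.
function isInvalidated(binding: CliSessionBinding, currentHash: string): boolean {
  return normalizeOptionalString(binding.extraSystemPromptHash) !== currentHash;
}

// A binding written without a hash lacks the field entirely...
const legacyBinding = writeBinding("session-a", "");
const fieldStored = "extraSystemPromptHash" in legacyBinding;

// ...so any non-empty current hash mismatches, whatever its value.
const invalidations = ["aaa", "bbb", "aaa"].map((h) => isInvalidated(legacyBinding, h));
console.log({ fieldStored, invalidations });
```

In this model the mismatch is unconditional: no current hash can ever equal undefined, which matches the every-turn resets described above.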

In this environment I observed two live Discord channel bindings in sessions.json with exactly this shape:

"cliSessionBindings": {
  "claude-cli": {
    "sessionId": "ab533298-...",
    "mcpConfigHash": "452116cfc1..."
    // no authProfileId, no extraSystemPromptHash
  }
}

Every turn on those channels produced reason=system-prompt until the fix was applied.

Evidence from a live gateway

2026-04-19T11:38:10 [agent] cli session reset: provider=claude-cli reason=system-prompt  <- turn-2 after fresh session
2026-04-19T11:51:02 [agent] cli session reset: provider=claude-cli reason=system-prompt  <- turn-3 in a separate channel
2026-04-19T13:15:00 [agent] cli session reset: provider=claude-cli reason=mcp              <- #64386 after restart
2026-04-19T18:18:55 [agent] cli session reset: provider=claude-cli reason=mcp              <- #64386 after another restart

With the comparator change described below applied locally, zero reason=system-prompt events in 24h across five active Discord channels.
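To check a gateway log for the same pattern, reset events can be tallied by reason. This is a rough sketch that assumes the log line format shown in the excerpts above; tallyResetReasons is a hypothetical helper, not part of openclaw.

```typescript
// Counts "cli session reset" events per reason, assuming the line format
// seen in the log excerpts above (provider=<name> reason=<name>).
function tallyResetReasons(logText: string): Map<string, number> {
  const counts = new Map<string, number>();
  const pattern = /cli session reset: provider=(\S+) reason=(\S+)/;
  for (const line of logText.split("\n")) {
    const match = pattern.exec(line);
    if (!match) continue;
    const reason = match[2];
    if (reason === undefined) continue;
    counts.set(reason, (counts.get(reason) ?? 0) + 1);
  }
  return counts;
}

// Sample input copied from the evidence section of this issue.
const sampleLog = [
  "2026-04-19T11:38:10 [agent] cli session reset: provider=claude-cli reason=system-prompt  <- turn-2 after fresh session",
  "2026-04-19T11:51:02 [agent] cli session reset: provider=claude-cli reason=system-prompt  <- turn-3 in a separate channel",
  "2026-04-19T13:15:00 [agent] cli session reset: provider=claude-cli reason=mcp              <- #64386 after restart",
  "2026-04-19T18:18:55 [agent] cli session reset: provider=claude-cli reason=mcp              <- #64386 after another restart",
].join("\n");

const resetCounts = tallyResetReasons(sampleLog);
console.log(Object.fromEntries(resetCounts));
```

A non-zero system-prompt count with no restart in the same window points at this issue rather than #64386.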

Why tests did not catch it

src/agents/cli-session.test.ts tests resolveCliSessionReuse with hand-crafted binding/params pairs, but has no end-to-end test that computes extraSystemPromptHash across two consecutive turn transitions in a group context. A test as narrow as:

it("reuses session across turn-1 → turn-2 in a group channel", async () => {
  const turn1Hash = await runTurnAndReturnHash({ isFirstTurn: true, isGroup: true });
  const turn2Hash = await runTurnAndReturnHash({ isFirstTurn: false, isGroup: true });
  expect(turn2Hash).toBe(turn1Hash); // would fail today
});

would have caught this.

Suggested fix

Two options, in order of preference:

1. Drop extraSystemPromptHash and mcpConfigHash from resolveCliSessionReuse entirely. Keep only authProfileId and authEpoch. This fixes both this issue and #64386 in one patch, and removes a class of future regressions when anyone adds a new part to extraSystemPromptParts or merges additional ephemeral state into mergedConfig.

2. If the hashes must be retained for some reason not yet documented, normalize the hashed inputs to strip turn-variable content:
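One possible shape for that normalization, as a sketch only: build the text sent to the CLI and the text fed to the hash separately, and leave turn-variable parts out of the latter. The ExtraSystemPromptParts interface and both functions are hypothetical, the join separator is an assumption, and only groupIntro is confirmed by this issue as turn-variable; the other parts are assumed stable here.

```typescript
// Hypothetical named-parts shape; field names follow the variables in runPreparedReply.
interface ExtraSystemPromptParts {
  inboundMeta: string;
  groupChatContext: string;
  groupIntro: string; // turn-variable: non-empty on turn 1, empty afterwards
  groupSystemPrompt: string;
  execHint: string;
}

// Text actually sent to the CLI: every non-empty part, groupIntro included.
function promptText(p: ExtraSystemPromptParts): string {
  return [p.inboundMeta, p.groupChatContext, p.groupIntro, p.groupSystemPrompt, p.execHint]
    .filter(Boolean)
    .join("\n\n");
}

// Text fed to the session-reuse hash: same parts minus the turn-variable groupIntro.
function hashInputText(p: ExtraSystemPromptParts): string {
  return [p.inboundMeta, p.groupChatContext, p.groupSystemPrompt, p.execHint]
    .filter(Boolean)
    .join("\n\n");
}

const base = {
  inboundMeta: "inbound meta (placeholder)",
  groupChatContext: "group chat context (placeholder)",
  groupSystemPrompt: "group system prompt (placeholder)",
  execHint: "exec override hint (placeholder)",
};
const turn1: ExtraSystemPromptParts = { ...base, groupIntro: "Activation: always-on ..." };
const turn2: ExtraSystemPromptParts = { ...base, groupIntro: "" };

// The prompt still differs between turns, but the hash input does not.
const promptDiffers = promptText(turn1) !== promptText(turn2);
const hashInputStable = hashInputText(turn1) === hashInputText(turn2);
console.log({ promptDiffers, hashInputStable });
```

The cost of this option is that every future addition to extraSystemPromptParts has to be classified as stable or turn-variable, which is the regression class option 1 avoids.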

Either way, make the comparator tolerant: if binding[field] is undefined (legacy binding), skip that axis rather than invalidating.
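A sketch of option 1 combined with the tolerant comparison might look like the following. It is not the actual patch: StoredBinding, CurrentAuth, resolveReuse, and the reason strings "auth-profile" and "auth-epoch" are hypothetical, and authEpoch is assumed to be a string.

```typescript
// Hypothetical, simplified shapes; real field types may differ.
interface StoredBinding {
  sessionId: string;
  authProfileId?: string;
  authEpoch?: string;
}

interface CurrentAuth {
  authProfileId?: string;
  authEpoch?: string;
}

type ReuseDecision =
  | { sessionId: string }
  | { invalidatedReason: "auth-profile" | "auth-epoch" };

// Tolerant axis check: a field missing from a legacy binding is skipped,
// not treated as a mismatch.
function axisMismatch(stored: string | undefined, current: string | undefined): boolean {
  if (stored === undefined) return false;
  return stored !== current;
}

// Only auth axes are compared; prompt and MCP config hashes are not consulted.
function resolveReuse(binding: StoredBinding, current: CurrentAuth): ReuseDecision {
  if (axisMismatch(binding.authProfileId, current.authProfileId)) {
    return { invalidatedReason: "auth-profile" };
  }
  if (axisMismatch(binding.authEpoch, current.authEpoch)) {
    return { invalidatedReason: "auth-epoch" };
  }
  return { sessionId: binding.sessionId };
}

// Legacy binding with no auth fields: the session is reused.
const legacyDecision = resolveReuse({ sessionId: "s-1" }, { authProfileId: "profile-a" });

// Genuine auth rotation: the session is invalidated.
const rotatedDecision = resolveReuse(
  { sessionId: "s-2", authProfileId: "profile-a" },
  { authProfileId: "profile-b" },
);
console.log({ legacyDecision, rotatedDecision });
```

Skipping an undefined axis trades a small risk (resuming a legacy session after an unrecorded auth change) for not invalidating every turn on bindings written before the field existed.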

Workaround in use

A local patch replaces the resolveCliSessionReuse body with option (1) plus the tolerant auth comparison. It is applied via a gateway-launcher hook (~/.openclaw/bin/apply-hermes-dist-patches.sh) that re-applies the patch after auto-updates and, when an upstream refactor prevents it from applying, fails loudly in a log rather than silently.

Happy to open a PR against src/agents/cli-session.ts with the comparator change + a regression test covering the turn-1 → turn-2 group-channel scenario if maintainers would accept it.

Environment

  • openclaw 2026.4.15 (install via npm i -g openclaw)
  • claude-cli backend, OAuth auth profile (Claude Pro)
  • macOS 14, Node 22, launchd-managed gateway on port 18789
  • Discord channels (guild text, not DM)

Labels: P1, clawsweeper:fix-shape-clear, clawsweeper:queueable-fix, clawsweeper:source-repro, impact:session-state, issue-rating: 🦞 diamond lobster
