fix(auto-reply): clear runtime model cache on reset by mjamiv · Pull Request #77339 · openclaw/openclaw

mjamiv · 2026-05-04T13:41:59Z

Summary

clear cached runtime model/provider fields when /new or /reset creates a fresh auto-reply channel session
keep explicit preserved overrides and unrelated behavior overrides intact
add a persisted-store regression test for #77322 ([Bug]: session.model cache survives /new and ignores agents.defaults.model.primary changes, scope distinct from PR #69419), plus a changelog entry

Why

the transient reset entry was clean, but the persisted-store merge kept the old model/modelProvider fields, so the session could keep using the prior model after defaults changed
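
To illustrate the failure mode, here is a minimal sketch that assumes a spread-style merge. The real persisted-store merge may work differently; mergeEntry and the trimmed SessionEntry shape are hypothetical and exist only for this example.

```typescript
// Hypothetical, trimmed entry shape; the real session entry has more fields.
interface SessionEntry {
  sessionId: string;
  modelProvider?: string;
  model?: string;
  verboseLevel?: string;
}

// Hypothetical spread-style merge: keys missing from `patch` fall through from `stored`.
function mergeEntry(stored: SessionEntry, patch: Partial<SessionEntry>): SessionEntry {
  return { ...stored, ...patch };
}

const stored: SessionEntry = {
  sessionId: "session-before-reset",
  modelProvider: "openai",
  model: "gpt-5.4-mini",
  verboseLevel: "on",
};

// The fresh entry simply omits the model fields, so the merge keeps the old ones.
const omitted = mergeEntry(stored, { sessionId: "session-after-reset" });

// Explicitly setting the keys to undefined overwrites the stored values,
// and JSON.stringify then drops them from the serialized store.
const cleared = mergeEntry(stored, {
  sessionId: "session-after-reset",
  modelProvider: undefined,
  model: undefined,
});

console.log(omitted.model); // prints "gpt-5.4-mini"
console.log(JSON.stringify(cleared));
```

Under this assumption, a reset entry that is "clean" only because it lacks the keys cannot clear them on disk; the keys have to be overwritten or deleted explicitly.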

Real behavior proof

Behavior or issue addressed: /new and /reset should start a fresh channel session without carrying stale runtime model cache fields from the previous run, so a defaults change or a model retirement actually takes effect on the next turn.

Real environment tested: rebased patched OpenClaw source checkout at /tmp/openclaw-77339, Node v24.14.0, running a standalone tsx driver script that imports the production initSessionState from src/auto-reply/reply/session.ts and exercises a real on-disk persisted session store. No vitest, no mocks — just the real auto-reply session-state code path against a real sessions.json file in /tmp.

Exact steps or command run after this patch:

Build a small driver script that imports initSessionState from the patched src/auto-reply/reply/session.ts, seeds a real sessions.json with a session entry that has modelProvider: "openai", model: "gpt-5.4-mini", contextTokens: 400_000, and verboseLevel: "on", calls initSessionState with Body: "/new" (and again with Body: "/reset"), then prints the live persisted store contents and the returned sessionEntry fields. The driver is not part of the diff.
pnpm exec tsx scratch-77322-demo.mts

Driver script:

import * as fs from "node:fs/promises";
import * as os from "node:os";
import * as path from "node:path";
import { initSessionState } from "./src/auto-reply/reply/session.js";

async function main() {
  const tmpRoot = await fs.mkdtemp(path.join(os.tmpdir(), "openclaw-77322-demo-"));
  const storePath = path.join(tmpRoot, "sessions.json");
  const sessionKey = "agent:main:telegram:direct:demo";
  const seedEntry = {
    sessionId: "session-before-reset",
    updatedAt: Date.now(),
    modelProvider: "openai",
    model: "gpt-5.4-mini",
    contextTokens: 400_000,
    verboseLevel: "on",
  };
  await fs.writeFile(storePath, JSON.stringify({ [sessionKey]: seedEntry }, null, 2));
  console.log("[before /new] sessions.json:");
  console.log((await fs.readFile(storePath, "utf-8")).trim());

  const cfg = { session: { store: storePath, idleMinutes: 999 } } as any;
  const r1 = await initSessionState({
    ctx: { Body: "/new", RawBody: "/new", CommandBody: "/new",
           From: "user", To: "bot", ChatType: "direct",
           SessionKey: sessionKey, Provider: "telegram", Surface: "telegram" } as any,
    cfg, commandAuthorized: true,
  });
  console.log("\n[after /new] sessionEntry.modelProvider:", r1.sessionEntry.modelProvider);
  console.log("[after /new] sessionEntry.model:", r1.sessionEntry.model);
  console.log("[after /new] sessionEntry.verboseLevel:", r1.sessionEntry.verboseLevel);
  console.log("[after /new] sessions.json:");
  console.log((await fs.readFile(storePath, "utf-8")).trim());

  await fs.writeFile(storePath, JSON.stringify({ [sessionKey]: seedEntry }, null, 2));
  const r2 = await initSessionState({
    ctx: { Body: "/reset", RawBody: "/reset", CommandBody: "/reset",
           From: "user", To: "bot", ChatType: "direct",
           SessionKey: sessionKey, Provider: "telegram", Surface: "telegram" } as any,
    cfg, commandAuthorized: true,
  });
  console.log("\n[after /reset] sessionEntry.modelProvider:", r2.sessionEntry.modelProvider);
  console.log("[after /reset] sessionEntry.model:", r2.sessionEntry.model);
  console.log("[after /reset] sessionEntry.verboseLevel:", r2.sessionEntry.verboseLevel);
}
main().catch((e) => { console.error(e); process.exit(1); });

Evidence after fix: copied live stdout from the tsx driver, with uuids/timestamps as emitted by the runtime (anonymized identifiers only):

[before /new] sessions.json:
{
  "agent:main:telegram:direct:demo": {
    "sessionId": "session-before-reset",
    "updatedAt": 1778436597524,
    "modelProvider": "openai",
    "model": "gpt-5.4-mini",
    "contextTokens": 400000,
    "verboseLevel": "on"
  }
}

[after /new] sessionEntry.modelProvider: undefined
[after /new] sessionEntry.model: undefined
[after /new] sessionEntry.verboseLevel: on
[after /new] sessions.json:
{
  "agent:main:telegram:direct:demo": {
    "sessionId": "fa90f97a-7512-41fb-a75f-74d4a0f4aa2f",
    "updatedAt": 1778436649018,
    "verboseLevel": "on",
    "sessionStartedAt": 1778436649016,
    "lastInteractionAt": 1778436649016,
    "systemSent": false,
    "abortedLastRun": false,
    "usageFamilyKey": "agent:main:telegram:direct:demo",
    "usageFamilySessionIds": [
      "session-before-reset",
      "fa90f97a-7512-41fb-a75f-74d4a0f4aa2f"
    ],
    "chatType": "direct",
    "deliveryContext": { "channel": "telegram", "to": "bot" },
    "lastChannel": "telegram",
    "lastTo": "bot",
    "origin": { "label": "user", "provider": "telegram", "surface": "telegram",
                "chatType": "direct", "from": "user", "to": "bot" },
    "sessionFile": "<state-dir>/agents/main/sessions/fa90f97a-7512-41fb-a75f-74d4a0f4aa2f.jsonl",
    "compactionCount": 0
  }
}

[after /reset] sessionEntry.modelProvider: undefined
[after /reset] sessionEntry.model: undefined
[after /reset] sessionEntry.verboseLevel: on

Observed result after fix:

sessionId rotates to a fresh uuid (true reset; usageFamilySessionIds keeps both entries for cost tracking).
modelProvider, model, and contextTokens are absent from the persisted entry — the stale runtime cache is gone, so the next turn resolves from current defaults or explicit preserved overrides.
verboseLevel: "on" (an unrelated behavior override) is preserved.
Same shape on both /new and /reset.
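
The observations above can be written as a small check over the persisted entry. This is a sketch rather than part of the diff: checkResetEntry is a hypothetical helper, and the sample entries are trimmed copies of the driver stdout.

```typescript
// Hypothetical checker; the field names come from the stdout above.
type PersistedEntry = Record<string, unknown>;

function checkResetEntry(prev: PersistedEntry, next: PersistedEntry): string[] {
  const problems: string[] = [];
  if (next.sessionId === prev.sessionId) {
    problems.push("sessionId did not rotate");
  }
  // Runtime cache fields must be absent from the persisted entry.
  for (const key of ["modelProvider", "model", "contextTokens"]) {
    if (key in next) problems.push(`stale ${key} survived the reset`);
  }
  // Unrelated behavior overrides must carry over unchanged.
  if (next.verboseLevel !== prev.verboseLevel) {
    problems.push("verboseLevel was not preserved");
  }
  // The usage family should still list the pre-reset session.
  const family = next.usageFamilySessionIds;
  if (!Array.isArray(family) || !family.includes(prev.sessionId)) {
    problems.push("usageFamilySessionIds is missing the previous session");
  }
  return problems;
}

// Trimmed copies of the entries printed by the driver before and after /new.
const seeded: PersistedEntry = {
  sessionId: "session-before-reset",
  modelProvider: "openai",
  model: "gpt-5.4-mini",
  contextTokens: 400000,
  verboseLevel: "on",
};
const persisted: PersistedEntry = {
  sessionId: "fa90f97a-7512-41fb-a75f-74d4a0f4aa2f",
  verboseLevel: "on",
  usageFamilySessionIds: ["session-before-reset", "fa90f97a-7512-41fb-a75f-74d4a0f4aa2f"],
};

console.log(checkResetEntry(seeded, persisted)); // prints []
```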

What was not tested: full Telegram network round-trip; the proof exercises the production auto-reply session-state code path that channel commands route through (the same initSessionState call that runs in production for /new and /reset).

Validation

pnpm install
pnpm test -- src/auto-reply/reply/session.test.ts
pnpm test src/gateway/sessions-patch.test.ts src/gateway/server.sessions.reset-models.test.ts
pnpm exec oxfmt --check --threads=1 src/auto-reply/reply/session.ts src/auto-reply/reply/session.test.ts CHANGELOG.md
git diff --check
pnpm check:changed -- --base upstream/main --head HEAD
pnpm exec tsx scratch-77322-demo.mts (the live driver above)

Fixes #77322

clawsweeper · 2026-05-04T13:45:14Z

Codex review: needs maintainer review before merge. Reviewed May 28, 2026, 11:46 PM ET / 03:46 UTC.

Summary
The PR clears cached auto-reply modelProvider/model fields during /new and /reset, adds persisted-store regression coverage, and adds one changelog entry.

PR surface: Source +5, Tests +59, Docs +1. Total +65 across 3 files.

Reproducibility: yes. Source-level reproduction is clear: current main resets the session but does not clear stale modelProvider/model, and the PR body supplies a real on-disk session-store driver showing the fields gone after /new and /reset. I did not run tests because this review is read-only.

Review metrics: 1 noteworthy metric.

Release-owned changelog touch: 1 added entry. CHANGELOG.md is release-owned in this repo, so maintainers may choose to drop or keep that release note during landing without changing the code verdict.

Merge readiness
Overall: 🦞 diamond lobster
Proof: 🦞 diamond lobster
Patch quality: 🦞 diamond lobster
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.
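
As a rough sketch of that rule, assuming the rank legend later in this comment is ordered from strongest to weakest (the source does not state the ordering explicitly, and overallRank is a hypothetical name):

```typescript
// Ranks in legend order, assumed to run from strongest to weakest.
// "off-meta tidepool" is left out because it means the rating does not apply.
const RANKS = [
  "challenger crab",
  "diamond lobster",
  "platinum hermit",
  "gold shrimp",
  "silver shellfish",
  "unranked krab",
] as const;
type Rank = (typeof RANKS)[number];

// Overall readiness is the weaker of the two inputs, i.e. the one later in the list.
function overallRank(proof: Rank, patchQuality: Rank): Rank {
  const weakerIndex = Math.max(RANKS.indexOf(proof), RANKS.indexOf(patchQuality));
  return RANKS[weakerIndex];
}

console.log(overallRank("diamond lobster", "diamond lobster")); // prints "diamond lobster"
console.log(overallRank("unranked krab", "challenger crab")); // prints "unranked krab"
```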

Next step before merge

[P2] No repair lane is needed; the branch already contains the focused code change, regression coverage, and sufficient proof, leaving ordinary maintainer review and landing.

Security
Cleared: The diff touches only auto-reply session code, colocated tests, and a changelog entry; it does not change dependencies, workflows, secrets, permissions, or code download/execution paths.

Review details

Best possible solution:

Land the focused auto-reply reset cleanup after maintainer review, with release-note handling left to the repository release flow.

Do we have a high-confidence way to reproduce the issue?

Yes. Source-level reproduction is clear: current main resets the session but does not clear stale modelProvider/model, and the PR body supplies a real on-disk session-store driver showing the fields gone after /new and /reset. I did not run tests because this review is read-only.

Is this the best way to solve the issue?

Yes, this is the narrow maintainable fix: clear only runtime cache fields in the auto-reply reset path while existing preserved-selection logic continues to protect explicit user overrides.

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against e7fb8cabb681.

Label changes

Label changes:

add proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes after-fix live terminal output from production initSessionState against a real disk-backed session store, which directly exercises the changed behavior.
add rating: 🦞 diamond lobster: Overall readiness is 🦞 diamond lobster; proof is 🦞 diamond lobster and patch quality is 🦞 diamond lobster.
remove rating: 🐚 platinum hermit: Current PR rating is rating: 🦞 diamond lobster, so this older rating label is no longer current.

Label justifications:

P2: This is a normal-priority channel/session model-selection bug fix with limited but real impact when users change defaults and reset a session.
rating: 🦞 diamond lobster: Overall readiness is 🦞 diamond lobster; proof is 🦞 diamond lobster and patch quality is 🦞 diamond lobster.
status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (live_output): The PR body includes after-fix live terminal output from production initSessionState against a real disk-backed session store, which directly exercises the changed behavior.
proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes after-fix live terminal output from production initSessionState against a real disk-backed session store, which directly exercises the changed behavior.

Evidence reviewed

PR surface:

Source +5, Tests +59, Docs +1. Total +65 across 3 files.

Area	Files	Added	Net
Source	1	5	+5
Tests	1	59	+59
Docs	1	1	+1
Config	0	0	0
Generated	0	0	0
Other	0	0	0
Total	3	65	+65

What I checked:

Current main still carries runtime model fields: Current main builds the reset session entry from prior state and clears reset/token/cache fields, but it does not clear modelProvider or model before merging the entry back into the session store. (src/auto-reply/reply/session.ts:781, e7fb8cabb681)
PR clears the implicated cache fields: The PR head clears sessionEntry.modelProvider and sessionEntry.model inside the isNewSession reset cleanup before updateSessionStore persists the entry. (src/auto-reply/reply/session.ts:785, e87d2a84430f)
Regression coverage matches the reported bug: The new test seeds stale modelProvider, model, and contextTokens, then verifies both /new and /reset rotate the session, clear the runtime model cache in memory and on disk, and preserve unrelated verboseLevel. (src/auto-reply/reply/session.test.ts:2540, e87d2a84430f)
Explicit override contract remains separate: Current reset-selection code preserves user-driven model/provider/auth overrides and clears auto-created fallback selections, so this PR does not need a new config or product policy to distinguish explicit user choices from runtime cache fields. (src/config/sessions/reset-preserved-selection.ts:24, e7fb8cabb681)
Sibling reset surface already recomputes defaults: Gateway sessions.reset coverage expects reset to use the configured default model instead of stale runtime identity and to clear stale context-token state, which supports aligning auto-reply reset behavior with the same model-selection boundary. (src/gateway/server.sessions.reset-models.test.ts:64, e7fb8cabb681)
PR merge shape is clean against current main: A read-only merge-tree check reported merged hunks for the three touched files without conflicts against current main. (e87d2a84430f)
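
A minimal sketch of the cleanup shape described in these checks, not the actual diff: applyResetCleanup is a hypothetical helper, and modelOverride is an invented stand-in for an explicit user selection (the real override fields live in the reset-preserved-selection code and are not shown in this review).

```typescript
// Hypothetical entry shape: modelProvider, model, and verboseLevel appear in the PR;
// modelOverride is invented here to represent an explicit user choice.
interface SessionEntry {
  sessionId: string;
  modelProvider?: string;
  model?: string;
  modelOverride?: string;
  verboseLevel?: string;
}

function applyResetCleanup(entry: SessionEntry, isNewSession: boolean): SessionEntry {
  // Only a /new or /reset session gets the cleanup; ongoing sessions keep their cache.
  if (!isNewSession) return entry;
  // Work on a copy and delete only the runtime cache fields, so overrides
  // and unrelated behavior settings stay intact.
  const next: SessionEntry = { ...entry };
  delete next.modelProvider;
  delete next.model;
  return next;
}

const previous: SessionEntry = {
  sessionId: "session-after-reset",
  modelProvider: "openai",
  model: "gpt-5.4-mini",
  modelOverride: "user-picked-model",
  verboseLevel: "on",
};

console.log(JSON.stringify(applyResetCleanup(previous, true)));
```

Deleting the keys, rather than leaving them out of a fresh object that is merged later, is what keeps the stale values from reappearing when the entry is persisted.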

Likely related people:

steipete: Current main blame and path history for src/auto-reply/reply/session.ts point to the recent session-entry refactor commit that owns the current reset/session construction shape. (role: recent area contributor; confidence: medium; commits: d5bbf3033c9f; files: src/auto-reply/reply/session.ts, src/auto-reply/reply/session.test.ts)
hclsys: The related closed PR and review comment identified the same stale implicit model-cache failure mode and supported clearing stale runtime fields on reset. (role: related root-cause investigator; confidence: medium; commits: aa149597afd6; files: src/gateway/sessions-patch.test.ts, src/gateway/sessions-patch.ts)

What the crustacean ranks mean

🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works

ClawSweeper keeps one durable marker-backed review comment per issue or PR.
Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
Maintainers can also comment @clawsweeper review to request a fresh review only.
Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

mjamiv · 2026-05-08T04:25:25Z

Rebased this branch onto current upstream/main and resolved the CHANGELOG.md conflict by keeping the current upstream Unreleased notes plus this PR's Auto-reply fix entry.

Validation:

pnpm install
pnpm test -- src/auto-reply/reply/session.test.ts
git diff --check
pnpm check:changed -- --base upstream/main --head HEAD

mjamiv · 2026-05-16T03:59:06Z

Refreshed this branch onto current upstream/main and pushed as 6a39c911e3.

Conflict handling:

CHANGELOG.md: kept current upstream Unreleased notes and re-added this PR's Auto-reply reset-cache entry.
src/auto-reply/reply/session.test.ts: kept upstream's recovered auto-fallback override coverage and this PR's runtime model-cache regression as separate tests.

Local validation on the rebased head:

pnpm install --offline --frozen-lockfile
pnpm test -- src/auto-reply/reply/session.test.ts (84 tests passed)
./node_modules/.bin/oxfmt --check --threads=1 src/auto-reply/reply/session.ts src/auto-reply/reply/session.test.ts
git diff --check
pnpm check:changed -- --base refs/remotes/upstream/main --head HEAD

GitHub now reports head 6a39c911e3, mergeable=true, mergeable_state=clean; all reported checks are pass or skip.

clawsweeper · 2026-05-26T23:25:45Z

ClawSweeper PR egg

✨ Hatched: 🥚 common Gilded Proofling

Hatch command

Comment @clawsweeper hatch when this PR is hatchable.

Hatchability rules:

Merged PRs are hatchable.
Open PRs are hatchable when they are status: 👀 ready for maintainer look, status: 🚀 automerge armed, or labeled clawsweeper:automerge.
Closed unmerged PRs are hatchable only when one of those hatchable labels is still present in the durable record.

Rarity: 🥚 common.
Trait: sniffs out flaky tests.
Image traits: location green-check meadow; accessory proof snapshot camera; palette charcoal, cyan, and signal green; mood focused; pose holding its accessory up for inspection; shell frosted glass shell; lighting calm overcast light; background gentle dashboard dots.
Share on X: post this hatch
Copy: My PR egg hatched a 🥚 common Gilded Proofling in ClawSweeper.

What is this egg doing here?

Eggs appear after the PR passes real-behavior proof. It is here for vibes, not verdicts: it does not change labels, ratings, merge decisions, or automation.
The shell reacts to review momentum: open follow-up work warms it up, re-review makes it wobble, and a clean final review lets it hatch.
Hatchability usually comes from sufficient real-behavior proof, no blocking P0/P1/P2 findings, no security attention needed, and clean correctness. A merged PR is already final, so merge makes the egg hatchable independently.
The hatch is seeded from this repository and PR number, so the same PR keeps the same creature; the reviewed head SHA can only change safe visual details.
Rarity is just collectible sparkle: 🥚 common, 🌱 uncommon, 💎 rare, ✨ glimmer, and 🌈 legendary.

openclaw-barnacle Bot added the size: S label May 4, 2026

This was referenced May 4, 2026

fix(sessions): clear stale implicit model cache on /new (#77322) #77326

Closed

[Bug]: session.model cache survives /new and ignores agents.defaults.model.primary changes, scope distinct from PR #69419 #77322

Open


clawsweeper Bot mentioned this pull request May 7, 2026

[Bug]: /status reports gpt-5.5 fallback even when agents.defaults.model.primary is set AND there is a partial agents.list['main'] #78859

Closed

mjamiv force-pushed the fix/77322-auto-reply-reset-model-cache branch from 6ef6a51 to e2c2bba Compare May 8, 2026 04:25

mjamiv force-pushed the fix/77322-auto-reply-reset-model-cache branch from e2c2bba to 8e63520 Compare May 10, 2026 18:28

clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 10, 2026


This was referenced May 11, 2026

Codex agent on Telegram fails with stale openai:default API key despite all corrective actions #78484

Closed

Model switch experience: 5 issues when switching from qwen3.6-plus to deepseek-v4-pro #73144

Closed

mjamiv force-pushed the fix/77322-auto-reply-reset-model-cache branch from 8e63520 to 6a39c91 Compare May 16, 2026 03:50

openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 16, 2026

clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 16, 2026

mjamiv force-pushed the fix/77322-auto-reply-reset-model-cache branch from 6a39c91 to 71a0ed6 Compare May 26, 2026 23:17

openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 26, 2026

mjamiv force-pushed the fix/77322-auto-reply-reset-model-cache branch from 71a0ed6 to f66e2d7 Compare May 26, 2026 23:18

clawsweeper Bot added the labels proof: sufficient (ClawSweeper judged the real behavior proof convincing.) and rating: 🐚 platinum hermit (Good normal PR readiness with ordinary maintainer review expected.) May 26, 2026

clawsweeper Bot added the labels status: 👀 ready for maintainer look (ClawSweeper has no concrete contributor-facing blocker left for this PR.) and P2 (Normal backlog priority with limited blast radius.) May 26, 2026

mjamiv force-pushed the fix/77322-auto-reply-reset-model-cache branch from f66e2d7 to 4972490 Compare May 29, 2026 03:37

openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 29, 2026

fix(auto-reply): clear runtime model cache on reset

e87d2a8

mjamiv force-pushed the fix/77322-auto-reply-reset-model-cache branch from 4972490 to e87d2a8 Compare May 29, 2026 03:39

mjamiv wants to merge 1 commit into openclaw:main from mjamiv:fix/77322-auto-reply-reset-model-cache

3 participants
