fix(auto-reply): clear runtime model cache on reset #77339
Codex review: needs maintainer review before merge. Reviewed May 28, 2026, 11:46 PM ET / 03:46 UTC.

Summary
PR surface: Source +5, Tests +59, Docs +1. Total +65 across 3 files.
Reproducibility: yes. Source-level reproduction is clear: current main resets the session but does not clear stale
Review metrics: 1 noteworthy metric.

Merge readiness
Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Review details
Best possible solution: Land the focused auto-reply reset cleanup after maintainer review, with release-note handling left to the repository release flow.
Do we have a high-confidence way to reproduce the issue? Yes, source-level reproduction is clear: current main resets the session but does not clear stale
Is this the best way to solve the issue? Yes, this is the narrow maintainable fix: clear only runtime cache fields in the auto-reply reset path while existing preserved-selection logic continues to protect explicit user overrides.
AGENTS.md: found and applied where relevant.
Codex review notes: model gpt-5.5, reasoning high; reviewed against e7fb8cabb681.
6ef6a51 to e2c2bba
Rebased this branch onto current Validation:
e2c2bba to 8e63520
8e63520 to 6a39c91
Refreshed this branch onto current Conflict handling:
Local validation on the rebased head:
GitHub now reports head
6a39c91 to 71a0ed6
71a0ed6 to f66e2d7
f66e2d7 to 4972490
4972490 to e87d2a8
Summary
Why
Real behavior proof
Behavior or issue addressed: `/new` and `/reset` should start a fresh channel session without carrying stale runtime model cache fields from the previous run, so a defaults change or a model retirement actually takes effect on the next turn.

Real environment tested: rebased patched OpenClaw source checkout at `/tmp/openclaw-77339`, Node v24.14.0, running a standalone `tsx` driver script that imports the production `initSessionState` from `src/auto-reply/reply/session.ts` and exercises a real on-disk persisted session store. No vitest, no mocks: just the real auto-reply session-state code path against a real `sessions.json` file in `/tmp`.

Exact steps or command run after this patch: the driver imports `initSessionState` from the patched `src/auto-reply/reply/session.ts`, seeds a real `sessions.json` with a session entry that has `modelProvider: "openai"`, `model: "gpt-5.4-mini"`, `contextTokens: 400_000`, and `verboseLevel: "on"`, calls `initSessionState` with `Body: "/new"` (and again with `Body: "/reset"`), then prints the live persisted store contents and the returned `sessionEntry` fields. The driver is not part of the diff.

`pnpm exec tsx scratch-77322-demo.mts`

Driver script:

Evidence after fix: copied live stdout from the `tsx` driver, with uuids/timestamps as emitted by the runtime (anonymized identifiers only):

Observed result after fix:
- `sessionId` rotates to a fresh uuid (true reset; `usageFamilySessionIds` keeps both entries for cost tracking).
- `modelProvider`, `model`, and `contextTokens` are absent from the persisted entry. The stale runtime cache is gone, so the next turn resolves from current defaults or explicit preserved overrides.
- `verboseLevel: "on"` (an unrelated behavior override) is preserved.
- Checked for both `/new` and `/reset`.

What was not tested: full Telegram network round-trip. The proof exercises the production auto-reply session-state code path that channel commands route through (the same `initSessionState` call that runs in production for `/new` and `/reset`).

Validation
Fixes #77322
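
For readers skimming the thread, the observed result described under "Real behavior proof" can be sketched as a small pure function. This is only an illustration under stated assumptions, not the patched `initSessionState`: the `SessionEntry` shape, the `resetSessionEntry` helper, and the way `usageFamilySessionIds` is extended are hypothetical, and only the field names are taken from the description.

```typescript
// Hypothetical entry shape: models only the fields named in the PR description.
interface SessionEntry {
  sessionId: string;
  usageFamilySessionIds?: string[];
  modelProvider?: string;
  model?: string;
  contextTokens?: number;
  verboseLevel?: string;
}

// Hypothetical helper (not the real initSessionState): rotates the session id,
// drops the runtime model cache fields, and leaves unrelated overrides alone.
function resetSessionEntry(previous: SessionEntry, freshSessionId: string): SessionEntry {
  // Assumption: the usage family is the previous family (or just the previous
  // id when none is recorded) plus the fresh id.
  const family = previous.usageFamilySessionIds ?? [previous.sessionId];

  const next: SessionEntry = {
    ...previous,
    sessionId: freshSessionId,
    usageFamilySessionIds: [...family, freshSessionId],
  };

  // Clear only the runtime cache fields so the next turn would resolve the
  // model from current defaults instead of the previous run.
  delete next.modelProvider;
  delete next.model;
  delete next.contextTokens;

  return next;
}

// Usage: seed values mirror the ones listed in the proof steps.
const before: SessionEntry = {
  sessionId: "old-session",
  modelProvider: "openai",
  model: "gpt-5.4-mini",
  contextTokens: 400_000,
  verboseLevel: "on",
};
const after = resetSessionEntry(before, "new-session");
console.log(JSON.stringify(after, null, 2));
```

The sketch copies the entry and deletes the three cache fields instead of rebuilding it from an allow-list, so any other override on the entry (here `verboseLevel`) survives by default. Whether the real patch takes the same approach is not shown in this thread.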