fix(sessions): avoid per-row model resolution when selected model metadata is persisted#77650
Conversation
|
Codex review: needs maintainer review before merge. Summary Reproducibility: yes. at source level: current main calls Real behavior proof Next step before merge Security Review detailsBest possible solution: Land the scoped list-row cache after normal maintainer review, keeping the resolver/storage/config contract unchanged and preserving explicit override display semantics. Do we have a high-confidence way to reproduce the issue? Yes at source level: current main calls Is this the best way to solve the issue? Yes: the per-list-call row-context cache is a narrow maintainable fix that avoids changing config, storage, or the resolver contract while preserving override semantics. What I checked:
Likely related people:
Remaining risk / open question:
Codex review notes: model gpt-5.5, reasoning high; reviewed against 1c924c3c126d. |
There was a problem hiding this comment.
Pull request overview
Optimizes the sessions.list control-plane path by trying to avoid repeated selected-model resolution during session row construction, and adds a reproducible benchmark harness to seed and measure large session stores.
Changes:
- Adds a selected-model cache/fast path in
buildGatewaySessionRowfor persisted session model metadata. - Introduces benchmark + seeding scripts for cloning real session stores/transcripts and measuring
sessions.list. - Wires the benchmark into
package.jsonas a runnable script.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
src/gateway/session-utils.ts |
Adds selected-model cache helpers and switches session row building to the new fast path. |
scripts/bench-sessions-list.ts |
New CLI benchmark runner for invoking sessions.list repeatedly and reporting timings. |
scripts/bench-sessions-list-seed.ts |
New session-store seeder that clones real transcript-backed sessions into a temp benchmark profile. |
package.json |
Adds a test:sessions:list:bench script entry for the new benchmark. |
01730c8 to
e87f088
Compare
Behavior/performance proof for PR #77650Latest PR head after the changelog fix: This change removes repeated selected-model resolution from the Benchmark command shapeThe benchmark harness seeded synthetic rows from a local real transcript-backed session store, then measured cold and warm pnpm test:sessions:list:bench -- \
--sessions 10000 \
--source-store <local-real-sessions.json> \
--cold-runs 3 \
--runs 6For the unpatched 2026.5.3 baseline, lower session counts were required because 1k/10k unpatched runs OOM-killed under the same test profile. The 500-session unpatched run completed only with a larger heap: NODE_OPTIONS=--max-old-space-size=8192 pnpm test:sessions:list:bench -- \
--sessions 500 \
--source-store <local-real-sessions.json> \
--cold-runs 3 \
--runs 6Unpatched 2026.5.3 baseline, 500 sessionsCommit/tag context: Result: unpatched 2026.5.3 needed about 17s warm avg to return 500 rows, even with an 8 GiB old-space cap. Unpatched 1k and 10k runs OOM-killed in the same test profile. Fixed branch, 10,000 sessionsBranch/head: Result: the fixed branch lists 10,000 rows in about 170ms warm avg. That is the direct behavior contrast: unpatched 2026.5.3 took about 17s for 500 rows, while the fixed branch returns 10k rows in about 170ms warm avg. Correctness/CI proof for latest headI also re-ran the targeted gates on the latest PR head after adding the changelog entry: The existing selected-model override regression coverage still passes, including the case where explicit |
6397d26 to
b07c51a
Compare
b07c51a to
36b277b
Compare
36b277b to
1c3743c
Compare
1c3743c to
012cd7f
Compare
obviyus
left a comment
There was a problem hiding this comment.
Verified the sessions list selected-model path: rows with explicit session overrides still use resolveSessionModelRef, while repeated override resolution is cached per list build.
Maintainer follow-up: reduced the PR to the runtime fix plus changelog, dropped benchmark script noise, rebased onto current main, and resolved the review threads.
Local gate: pnpm test src/gateway/session-utils.test.ts and pnpm check:changed.
…ainer-hardening * origin/main: (843 commits) docs(changelog): relocate openclaw#77046 and openclaw#77280 entries from 2026.5.3 to Unreleased (openclaw#77728) docs: reorder unreleased changelog fix: expose ollama thinking profile before activation (openclaw#77617) (thanks @yfge) fix: expose ollama thinking profile before activation test(gateway): preserve dispatch timers in waiter test(gateway): keep startup context timer live docs: document cache-friendly activity helper ci: install ffmpeg for Mantis media previews fix: avoid impossible device token rotation advice (openclaw#77688) (thanks @Conan-Scott) docs(changelog): note doctor device pairing advice fix fix(doctor): avoid impossible device token rotation advice ci: use Crabbox media previews for Mantis docs: filter maintainer-owned triage noise test: cover GitHub activity helper fix(session-file-repair): drop null-role message entries instead of preserving them (openclaw#77288) fix: prune orphan session artifacts perf: reduce GitHub activity cache misses fix: cache session list model resolution (openclaw#77650) (thanks @ragesaq) ci: embed Mantis desktop previews fix(replay-history): drop trailing stream-error placeholder before provider send (openclaw#77287) ... # Conflicts: # CHANGELOG.md
Summary
listSessionsFromStore()callsresolveSessionModelRef()for every row when sessions have model/provider overrides, doing full config resolution per row — expensive at scale (hundreds of sessions).openclaw sessions/ Control UI list loads, especially when sessions have per-session model overrides.resolveSessionModelRef()itself is untouched. No config schema, API, or storage changes.🤖 AI-assisted (OpenClaw agent, Claude Opus 4). Fully tested. Author understands all changes.
Change Type (select all)
Scope (select all touched areas)
Linked Issue/PR
Real behavior proof
Behavior or issue addressed:
openclaw sessions/ Control UI session lists were spending seconds in repeated selected-model resolution on large stores with model metadata; this patch keeps explicit override behavior correct while avoiding repeated resolver work.Real environment tested: Linux aarch64 Raspberry Pi-style OpenClaw setup, Node.js v22.22.2, local real transcript-backed session store cloned into benchmark profiles.
Exact steps or command run after this patch:
Evidence after fix: Terminal output from the fixed branch benchmark:
Baseline terminal output from unpatched 2026.5.3 for comparison, limited to 500 sessions because larger 1k/10k unpatched runs OOM-killed in the same test profile:
Observed result after fix: The fixed branch returns 10,000 rows in about 170ms warm average, while the unpatched 2026.5.3 path took about 17s warm average for only 500 rows. The selected-model override regression test also still passes, confirming explicit override rows display the override model rather than a fallback runtime model.
What was not tested: Full
pnpm testsuite on this host; targeted gateway tests, benchmark help, diff check, andpnpm check:changedwere run instead.Root Cause (if applicable)
resolveSessionModelRef()performs full config resolution (agent defaults, provider lookup, model normalization) on every call. When called per-row for hundreds of sessions, this dominates list-build time. Additionally, the initial fast-path optimization incorrectly returned persisted runtime metadata even when explicit overrides (modelOverride/providerOverride) were present.Regression Test Plan (if applicable)
src/config/__tests__/session-utils.test.ts— the"shows the selected override model even when a fallback runtime model exists"test caseproviderOverride=anthropic+modelOverride=claude-opus-4-6but persistedmodelProvider=openai-codex+model=gpt-5.4, the list must show the override values, not the persisted runtime values.User-visible / Behavior Changes
openclaw sessionsand Control UI session list load faster when many sessions have model overrides (cache eliminates redundant resolution).Diagram (if applicable)
Security Impact (required)
Repro + Verification
Envclawment
Steps
providerOverride/modelOverrideopenclaw sessionsor load Control UI session listExpected
Actual
resolveSessionModelRef()and override sessions showed wrong modelEvidence
Benchmark scripts (
scripts/bench-sessions-list.ts,scripts/bench-sessions-list-seed.ts) included in PR for reproducible profiling. Unit tests pass including the override correctness case.Human Verification (required)
pnpm testsuite (OOM on Pi for large test files — ran targeted tests only). CI will cover full suite.Review Conversations
Compatibility / Migration
Risks and Mitigations
modelProvider/modelare updated on session save. The fast path only uses them when no explicit overrides exist, which is the same semantic as the resolver's fallback path.