fix(sessions): avoid per-row model resolution when selected model metadata is persisted #77650

Merged
obviyus merged 1 commit into openclaw:main from ragesaq:fix/sessions-list-model-override-cache
May 5, 2026

Conversation

@ragesaq ragesaq commented May 5, 2026

Summary

  • Problem: listSessionsFromStore() calls resolveSessionModelRef() for every row when sessions have model/provider overrides, doing full config resolution per row — expensive at scale (hundreds of sessions).
  • Why it matters: Users with many sessions see slow openclaw sessions / Control UI list loads, especially when sessions have per-session model overrides.
  • What changed: Added a fast path for no-override sessions (return persisted runtime metadata directly) and a per-list-call cache for override sessions keyed by the normalized resolver inputs. The fast path only activates when no overrides are set — sessions with explicit overrides always resolve through the full resolver with caching.
  • What did NOT change: resolveSessionModelRef() itself is untouched. No config schema, API, or storage changes.
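
The fast path and per-list-call cache described in the summary can be sketched roughly as follows. This is an illustration under stated assumptions, not the merged code: `ModelRef`, `SessionEntry`, `resolveStub`, `norm`, and `selectModel` are hypothetical names, and the stub only stands in for the real `resolveSessionModelRef()`, whose config walk is not reproduced here.

```typescript
// Hypothetical shapes for illustration only; not the real gateway types.
type ModelRef = { provider: string; model: string };
type SessionEntry = {
  providerOverride?: string;
  modelOverride?: string;
  modelProvider?: string; // persisted runtime metadata
  model?: string; // persisted runtime metadata
};

let resolverCalls = 0;

// Trim a field and treat empty strings as missing.
const norm = (value?: string): string | undefined => value?.trim() || undefined;

// Stand-in for the expensive resolver (assumption: overrides win over runtime metadata).
function resolveStub(entry: SessionEntry, agentId: string): ModelRef {
  resolverCalls += 1;
  return {
    provider: norm(entry.providerOverride) ?? norm(entry.modelProvider) ?? `${agentId}-default-provider`,
    model: norm(entry.modelOverride) ?? norm(entry.model) ?? `${agentId}-default-model`,
  };
}

function selectModel(entry: SessionEntry, agentId: string, cache: Map<string, ModelRef>): ModelRef {
  const providerOverride = norm(entry.providerOverride);
  const modelOverride = norm(entry.modelOverride);
  const modelProvider = norm(entry.modelProvider);
  const model = norm(entry.model);

  // Fast path: no explicit override and persisted metadata is complete.
  if (!providerOverride && !modelOverride && modelProvider && model) {
    return { provider: modelProvider, model };
  }

  // Slow path: resolve once per distinct set of normalized inputs, then reuse.
  const key = JSON.stringify([
    agentId,
    providerOverride ?? null,
    modelOverride ?? null,
    modelProvider ?? null,
    model ?? null,
  ]);
  const cached = cache.get(key);
  if (cached) return cached;
  const resolved = resolveStub(entry, agentId);
  cache.set(key, resolved);
  return resolved;
}

// One cache per list call, so entries never outlive a single list build.
const cache = new Map<string, ModelRef>();
const overrideRow: SessionEntry = {
  providerOverride: "anthropic",
  modelOverride: "claude-opus-4-6",
  modelProvider: "openai-codex",
  model: "gpt-5.4",
};
const rows: SessionEntry[] = [
  { modelProvider: "openai-codex", model: "gpt-5.4" }, // no override: fast path
  overrideRow, // cache miss: resolver runs
  { ...overrideRow }, // same inputs: cache hit
];
const selected = rows.map((row) => selectModel(row, "main", cache));
```

Scoping the cache to one list call keeps invalidation trivial: a config change between two list builds cannot be masked by a stale entry.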

🤖 AI-assisted (OpenClaw agent, Claude Opus 4). Fully tested. Author understands all changes.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

  • Related: performance regression in session list model resolution at scale
  • This PR fixes a bug or regression

Real behavior proof

  • Behavior or issue addressed: openclaw sessions / Control UI session lists were spending seconds in repeated selected-model resolution on large stores with model metadata; this patch keeps explicit override behavior correct while avoiding repeated resolver work.

  • Real environment tested: Linux aarch64 Raspberry Pi-style OpenClaw setup, Node.js v22.22.2, local real transcript-backed session store cloned into benchmark profiles.

  • Exact steps or command run after this patch:

    pnpm test:sessions:list:bench -- \
      --sessions 10000 \
      --source-store <local-real-sessions.json> \
      --cold-runs 3 \
      --runs 6
  • Evidence after fix: Terminal output from the fixed branch benchmark:

    branch/head: ragesaq/fix-sessions-list-model-override-cache @ 6397d265275a9e15461000065af2ea1f0eee7862
    rows returned: 10000
    cold avg: 497.5ms
    warm avg: 170.5ms
    p50: 167.6ms
    p95: 202.1ms
    

    Baseline terminal output from unpatched 2026.5.3 for comparison, limited to 500 sessions because larger 1k/10k unpatched runs OOM-killed in the same test profile:

    [sessions-list-bench] summary: sessions=500 rows=500 min=16653.4ms p50=17000.3ms p95=17331.8ms max=17331.8ms avg=17067.1ms
    
  • Observed result after fix: The fixed branch returns 10,000 rows in about 170ms warm average, while the unpatched 2026.5.3 path took about 17s warm average for only 500 rows. The selected-model override regression test also still passes, confirming explicit override rows display the override model rather than a fallback runtime model.

  • What was not tested: Full pnpm test suite on this host; targeted gateway tests, benchmark help, diff check, and pnpm check:changed were run instead.

Root Cause (if applicable)

  • Root cause: resolveSessionModelRef() performs full config resolution (agent defaults, provider lookup, model normalization) on every call. When called per-row for hundreds of sessions, this dominates list-build time. Additionally, the initial fast-path optimization incorrectly returned persisted runtime metadata even when explicit overrides (modelOverride/providerOverride) were present.
  • Missing detection / guardrail: No benchmark or profile test existed for session list rendering at scale. No test covered the interaction between the fast path and explicit overrides.
  • Contributing context: Sessions accumulate over time; the cost is negligible at small counts but grows linearly with the number of sessions.

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file: src/config/__tests__/session-utils.test.ts — the "shows the selected override model even when a fallback runtime model exists" test case
  • Scenario the test should lock in: When a session has providerOverride=anthropic + modelOverride=claude-opus-4-6 but persisted modelProvider=openai-codex + model=gpt-5.4, the list must show the override values, not the persisted runtime values.
  • Why this is the smallest reliable guardrail: This is a pure unit test on the resolution function — no config loading, no I/O, deterministic.
  • Existing test that already covers this: The existing test suite already had this case and caught the regression during development.
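
The failure mode this test guards against can be illustrated with a small sketch. The function names below (`unguardedFastPath`, `guardedFastPath`) and the `SessionEntry` shape are hypothetical; the real test in the file named above is not reproduced here. Only the field values come from the scenario described in this section.

```typescript
// Hypothetical shapes for illustration only.
type ModelRef = { provider: string; model: string };
type SessionEntry = {
  providerOverride?: string;
  modelOverride?: string;
  modelProvider?: string; // persisted runtime metadata
  model?: string; // persisted runtime metadata
};

// Incorrect variant: trusts persisted runtime metadata even when overrides exist.
function unguardedFastPath(entry: SessionEntry): ModelRef | undefined {
  if (entry.modelProvider && entry.model) {
    return { provider: entry.modelProvider, model: entry.model };
  }
  return undefined; // caller would fall back to the full resolver
}

// Guarded variant: any explicit override disables the fast path.
function guardedFastPath(entry: SessionEntry): ModelRef | undefined {
  if (entry.providerOverride || entry.modelOverride) return undefined;
  return unguardedFastPath(entry);
}

const scenario: SessionEntry = {
  providerOverride: "anthropic",
  modelOverride: "claude-opus-4-6",
  modelProvider: "openai-codex",
  model: "gpt-5.4",
};

// The unguarded variant reports the stale runtime model.
const stale = unguardedFastPath(scenario);
// The guarded variant declines, so the row goes through the resolver.
const deferred = guardedFastPath(scenario);
```

In the guarded variant, a return value of `undefined` means "do not shortcut this row", which leaves override precedence entirely to the resolver.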

User-visible / Behavior Changes

  • openclaw sessions and Control UI session list load faster when many sessions have model overrides (cache eliminates redundant resolution).
  • Sessions with explicit model overrides now correctly display the override model instead of the last runtime fallback model.

Diagram (if applicable)

Before (no-override sessions):
  [each row] -> resolveSessionModelRef(cfg, entry, agentId) -> full config walk

After (no-override sessions):
  [each row] -> persisted modelProvider/model available? -> return directly (skip resolver)

Before (override sessions):
  [each row] -> resolveSessionModelRef(cfg, entry, agentId) -> full config walk (repeated)

After (override sessions):
  [first row with combo] -> resolveSessionModelRef -> cache by (agentId, overrides)
  [subsequent rows with same combo] -> cache hit -> return cached

Security Impact (required)

  • New permissions/capabilities? No
  • Secrets/tokens handling changed? No
  • New/changed network calls? No
  • Command/tool execution surface changed? No
  • Data access scope changed? No

Repro + Verification

Environment

  • OS: Linux (aarch64, Raspberry Pi)
  • Runtime: Node.js v22.22.2
  • Model/provider: N/A (config resolution logic)

Steps

  1. Create a session store with 500+ sessions, many with providerOverride/modelOverride
  2. Run openclaw sessions or load Control UI session list
  3. Observe time spent in model resolution

Expected

  • Fast path returns persisted metadata for no-override sessions
  • Override sessions resolve once per unique combo, then cache
  • Sessions with overrides display the override model, not the runtime fallback

Actual

  • Before: every row called resolveSessionModelRef(), and override sessions showed the wrong model
  • After: fast path + cache eliminate redundant resolution, overrides display correctly

Evidence

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Benchmark scripts (scripts/bench-sessions-list.ts, scripts/bench-sessions-list-seed.ts) included in PR for reproducible profiling. Unit tests pass including the override correctness case.

Human Verification (required)

  • Verified scenarios: Build clean, lint clean, override test passes, fast path returns correct values for no-override sessions
  • Edge cases checked: Sessions with overrides + stale runtime metadata, sessions with no overrides + no persisted metadata (falls through to resolver), mixed agent IDs with same override combo
  • What you did NOT verify: Full pnpm test suite (OOM on Pi for large test files — ran targeted tests only). CI will cover full suite.

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? Yes
  • Config/env changes? No
  • Migration needed? No

Risks and Mitigations

  • Risk: Fast path returns stale persisted metadata if a session's runtime model changes but the store isn't updated.
    • Mitigation: This is the existing behavior for the values we read — modelProvider/model are updated on session save. The fast path only uses them when no explicit overrides exist, which is the same semantic as the resolver's fallback path.

Copilot AI review requested due to automatic review settings May 5, 2026 03:07
@openclaw-barnacle openclaw-barnacle Bot added gateway Gateway runtime scripts Repository scripts size: L labels May 5, 2026

clawsweeper Bot commented May 5, 2026

Codex review: needs maintainer review before merge.

Summary
The PR adds a per-list-call selected-model cache in gateway session-list row building, local benchmark scripts with a package entry, and an active changelog note.

Reproducibility: yes, at source level. Current main calls resolveSessionModelRef() for each modelOverride row, and the PR supplies before/after terminal benchmark output for a large real store. I did not execute the benchmark locally under the read-only constraint.

Real behavior proof
Sufficient (terminal): The PR body and follow-up comment include after-fix terminal benchmark output, baseline comparison, and targeted gate results.

Next step before merge
No repair job is needed; the remaining action is ordinary maintainer landing review for a currently correct PR.

Security
Cleared: The diff adds local benchmark scripts, a package script, a changelog line, and an in-process cache without new dependencies, workflows, permissions, secrets handling, or external code execution.

Review details

Best possible solution:

Land the scoped list-row cache after normal maintainer review, keeping the resolver/storage/config contract unchanged and preserving explicit override display semantics.

Do we have a high-confidence way to reproduce the issue?

Yes at source level: current main calls resolveSessionModelRef() for each modelOverride row, and the PR supplies before/after terminal benchmark output for a large real store. I did not execute the benchmark locally under the read-only constraint.

Is this the best way to solve the issue?

Yes: the per-list-call row-context cache is a narrow maintainable fix that avoids changing config, storage, or the resolver contract while preserving override semantics.

What I checked:

  • Current main hot path: buildGatewaySessionRow() on current main calls resolveSessionModelRef() for rows with modelOverride, so model-heavy session lists still repeat selected-model resolution per row. (src/gateway/session-utils.ts:1543, 1c924c3c126d)
  • Resolver work being repeated: resolveSessionModelRef() resolves the agent/default model and then calls persisted selected-model resolution using runtime and override fields. (src/gateway/session-utils.ts:1251, 1c924c3c126d)
  • Existing correctness guard: Current tests assert that explicit providerOverride/modelOverride wins over fallback runtime metadata, matching the key semantic risk in the PR discussion. (src/gateway/session-utils.test.ts:1364, 1c924c3c126d)
  • PR cache implementation: The PR head adds selectedModelByOverrideRef to the list row context and caches resolveSessionModelRef() results by normalized agent/model fields. (src/gateway/session-utils.ts:372, b07c51a3f016)
  • Override semantics preserved: The PR fast path returns runtime metadata only when neither modelOverride nor providerOverride is present; override rows still go through the resolver, with caching. (src/gateway/session-utils.ts:521, b07c51a3f016)
  • Behavior proof supplied: The PR body and follow-up comment include terminal benchmark output: unpatched v2026.5.3 took about 17s warm average for 500 rows, while the fixed branch returned 10,000 rows in about 170ms warm average. (b07c51a3f016)

Likely related people:

  • steipete: Recent sessions-list performance, bounding, and docs commits touched the same list-row hot path and public openclaw sessions surface. (role: recent maintainer; confidence: high; commits: a224810a7f96, 18bd7b60e4fe, 3aaf30ffa600; files: src/gateway/session-utils.ts, src/gateway/session-utils.test.ts, docs/cli/sessions.md)
  • vincentkoc: Recent work kept runtime metadata on session rows and maintained session-list/subagent-row performance paths adjacent to this resolver cache. (role: adjacent owner; confidence: medium; commits: f696be950bdd, b1d7901a7996, a1dc8c066347; files: src/gateway/session-utils.ts, src/gateway/session-utils.test.ts)
  • dndodson: Authored the stored-provider preservation fix in resolveSessionModelRef, which is the resolver this PR caches for session-list rows. (role: introduced related behavior; confidence: medium; commits: 4cad67438784; files: src/gateway/session-utils.ts, src/gateway/session-utils.test.ts)
  • byungsker: Authored a provider-prefix/session model resolution fix in the same persisted session model display area. (role: adjacent model-resolution owner; confidence: medium; commits: 177386ed7318; files: src/agents/model-selection.ts, src/gateway/session-utils.ts, src/gateway/session-utils.test.ts)

Remaining risk / open question:

  • I did not run the benchmark locally because this review was read-only; the assessment relies on source inspection, contributor terminal proof, and exact-head CI.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 1c924c3c126d.

Copilot AI left a comment

Pull request overview

Optimizes the sessions.list control-plane path by trying to avoid repeated selected-model resolution during session row construction, and adds a reproducible benchmark harness to seed and measure large session stores.

Changes:

  • Adds a selected-model cache/fast path in buildGatewaySessionRow for persisted session model metadata.
  • Introduces benchmark + seeding scripts for cloning real session stores/transcripts and measuring sessions.list.
  • Wires the benchmark into package.json as a runnable script.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

File | Description
src/gateway/session-utils.ts | Adds selected-model cache helpers and switches session row building to the new fast path.
scripts/bench-sessions-list.ts | New CLI benchmark runner for invoking sessions.list repeatedly and reporting timings.
scripts/bench-sessions-list-seed.ts | New session-store seeder that clones real transcript-backed sessions into a temp benchmark profile.
package.json | Adds a test:sessions:list:bench script entry for the new benchmark.

Comment thread src/gateway/session-utils.ts Outdated
Comment thread scripts/bench-sessions-list-seed.ts Outdated
@ragesaq ragesaq force-pushed the fix/sessions-list-model-override-cache branch from 01730c8 to e87f088 Compare May 5, 2026 03:57
@openclaw-barnacle openclaw-barnacle Bot added triage: refactor-only Candidate: refactor/cleanup-only PR without maintainer context. triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels May 5, 2026

ragesaq commented May 5, 2026

Behavior/performance proof for PR #77650

Latest PR head after the changelog fix: 6397d265275a9e15461000065af2ea1f0eee7862.

This change removes repeated selected-model resolution from the sessions.list row-building hot path when session rows already carry persisted model metadata, while preserving explicit override semantics through resolveSessionModelRef() and a per-list-call cache.

Benchmark command shape

The benchmark harness seeded synthetic rows from a local real transcript-backed session store, then measured cold and warm sessions.list calls.

pnpm test:sessions:list:bench -- \
  --sessions 10000 \
  --source-store <local-real-sessions.json> \
  --cold-runs 3 \
  --runs 6

For the unpatched 2026.5.3 baseline, lower session counts were required because 1k/10k unpatched runs OOM-killed under the same test profile. The 500-session unpatched run completed only with a larger heap:

NODE_OPTIONS=--max-old-space-size=8192 pnpm test:sessions:list:bench -- \
  --sessions 500 \
  --source-store <local-real-sessions.json> \
  --cold-runs 3 \
  --runs 6

Unpatched 2026.5.3 baseline, 500 sessions

Commit/tag context: v2026.5.3 / 06d46f7cf6.

[sessions-list-bench] cold 1/3: rows=500 wall=17626.0ms eventLoopDelayMax=453.8ms
[sessions-list-bench] cold 2/3: rows=500 wall=17039.7ms eventLoopDelayMax=379.8ms
[sessions-list-bench] cold 3/3: rows=500 wall=17435.5ms eventLoopDelayMax=370.9ms
[sessions-list-bench] warmup 1/1: rows=500 wall=17354.1ms eventLoopDelayMax=415.8ms
[sessions-list-bench] run 1/6: rows=500 wall=16968.2ms eventLoopDelayMax=394.8ms
[sessions-list-bench] run 2/6: rows=500 wall=17331.8ms eventLoopDelayMax=408.2ms
[sessions-list-bench] run 3/6: rows=500 wall=17230.7ms eventLoopDelayMax=407.4ms
[sessions-list-bench] run 4/6: rows=500 wall=17218.6ms eventLoopDelayMax=459.8ms
[sessions-list-bench] run 5/6: rows=500 wall=16653.4ms eventLoopDelayMax=348.1ms
[sessions-list-bench] run 6/6: rows=500 wall=17000.3ms eventLoopDelayMax=388.8ms
[sessions-list-bench] cold summary: sessions=500 min=17039.7ms p50=17435.5ms p95=17626.0ms max=17626.0ms avg=17367.0ms
[sessions-list-bench] summary: sessions=500 rows=500 min=16653.4ms p50=17000.3ms p95=17331.8ms max=17331.8ms avg=17067.1ms

Result: unpatched 2026.5.3 needed about 17s warm avg to return 500 rows, even with an 8 GiB old-space cap. Unpatched 1k and 10k runs OOM-killed in the same test profile.
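
For readers checking the summary line against the individual runs: the reported min, p50, p95, and max are consistent with a nearest-rank percentile over the six warm runs. The harness's actual method is not shown in this thread, so the `nearestRank` helper below is an assumption used only to cross-check the numbers.

```typescript
// Warm-run wall times (ms) copied from the unpatched 2026.5.3 log above.
const warmRunsMs: number[] = [16968.2, 17331.8, 17230.7, 17218.6, 16653.4, 17000.3];

// Nearest-rank percentile: the value at 1-based rank ceil(p * n) in sorted order.
// Hypothetical helper; not taken from the benchmark script.
function nearestRank(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.min(sorted.length, Math.max(1, Math.ceil(p * sorted.length)));
  return sorted[rank - 1];
}

const summary = {
  min: Math.min(...warmRunsMs),
  p50: nearestRank(warmRunsMs, 0.5),
  p95: nearestRank(warmRunsMs, 0.95),
  max: Math.max(...warmRunsMs),
};

console.log(summary);
```

With only six samples, p95 lands on the slowest run, which is why p95 and max are identical in the summary line.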

Fixed branch, 10,000 sessions

Branch/head: ragesaq/fix-sessions-list-model-override-cache at 6397d265275a9e15461000065af2ea1f0eee7862.

cold avg: 497.5ms
warm avg: 170.5ms
p50: 167.6ms
p95: 202.1ms
rows returned: 10000

Result: the fixed branch lists 10,000 rows in about 170ms warm avg. That is the direct behavior contrast: unpatched 2026.5.3 took about 17s for 500 rows, while the fixed branch returns 10k rows in about 170ms warm avg.
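
Because the two runs use different session counts, a per-row figure makes the comparison easier to read. The sketch below only divides the warm averages reported above by their row counts. It assumes cost scales roughly linearly with rows, and it ignores the heap difference between the two runs, so treat the ratio as a rough indication, not a measured speedup.

```typescript
// Figures copied from the benchmark output in this comment.
type Run = { rows: number; warmAvgMs: number };
const unpatched: Run = { rows: 500, warmAvgMs: 17067.1 };
const fixed: Run = { rows: 10000, warmAvgMs: 170.5 };

// Average wall time per returned row, in milliseconds.
const perRowMs = (run: Run): number => run.warmAvgMs / run.rows;

const unpatchedPerRow = perRowMs(unpatched); // about 34 ms per row
const fixedPerRow = perRowMs(fixed); // about 0.017 ms per row
const perRowRatio = unpatchedPerRow / fixedPerRow; // on the order of 2000x

console.log({
  unpatchedPerRow: unpatchedPerRow.toFixed(2),
  fixedPerRow: fixedPerRow.toFixed(4),
  perRowRatio: Math.round(perRowRatio),
});
```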

Correctness/CI proof for latest head

I also re-ran the targeted gates on the latest PR head after adding the changelog entry:

git diff --check                                      passed
pnpm test src/gateway/session-utils.test.ts           passed, 81 tests
pnpm test:sessions:list:bench -- --help               passed
pnpm check:changed                                    passed

The existing selected-model override regression coverage still passes, including the case where explicit providerOverride/modelOverride must win over fallback runtime modelProvider/model.

@openclaw-barnacle openclaw-barnacle Bot removed the triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. label May 5, 2026
@ragesaq ragesaq force-pushed the fix/sessions-list-model-override-cache branch from 6397d26 to b07c51a Compare May 5, 2026 05:28
@obviyus obviyus force-pushed the fix/sessions-list-model-override-cache branch from b07c51a to 36b277b Compare May 5, 2026 06:15
@openclaw-barnacle openclaw-barnacle Bot added size: XS and removed scripts Repository scripts size: L labels May 5, 2026
@obviyus obviyus force-pushed the fix/sessions-list-model-override-cache branch from 36b277b to 1c3743c Compare May 5, 2026 06:17
@obviyus obviyus self-assigned this May 5, 2026
@obviyus obviyus force-pushed the fix/sessions-list-model-override-cache branch from 1c3743c to 012cd7f Compare May 5, 2026 06:21

@obviyus obviyus left a comment

Verified the sessions list selected-model path: rows with explicit session overrides still use resolveSessionModelRef, while repeated override resolution is cached per list build.

Maintainer follow-up: reduced the PR to the runtime fix plus changelog, dropped benchmark script noise, rebased onto current main, and resolved the review threads.

Local gate: pnpm test src/gateway/session-utils.test.ts and pnpm check:changed.

@obviyus obviyus merged commit eab494c into openclaw:main May 5, 2026
95 checks passed

obviyus commented May 5, 2026

Landed on main.

Thanks @ragesaq.

@clawsweeper clawsweeper Bot mentioned this pull request May 5, 2026
25 tasks
vincentkoc added a commit to VintageAyu/openclaw that referenced this pull request May 5, 2026
…ainer-hardening

* origin/main: (843 commits)
  docs(changelog): relocate openclaw#77046 and openclaw#77280 entries from 2026.5.3 to Unreleased (openclaw#77728)
  docs: reorder unreleased changelog
  fix: expose ollama thinking profile before activation (openclaw#77617) (thanks @yfge)
  fix: expose ollama thinking profile before activation
  test(gateway): preserve dispatch timers in waiter
  test(gateway): keep startup context timer live
  docs: document cache-friendly activity helper
  ci: install ffmpeg for Mantis media previews
  fix: avoid impossible device token rotation advice (openclaw#77688) (thanks @Conan-Scott)
  docs(changelog): note doctor device pairing advice fix
  fix(doctor): avoid impossible device token rotation advice
  ci: use Crabbox media previews for Mantis
  docs: filter maintainer-owned triage noise
  test: cover GitHub activity helper
  fix(session-file-repair): drop null-role message entries instead of preserving them (openclaw#77288)
  fix: prune orphan session artifacts
  perf: reduce GitHub activity cache misses
  fix: cache session list model resolution (openclaw#77650) (thanks @ragesaq)
  ci: embed Mantis desktop previews
  fix(replay-history): drop trailing stream-error placeholder before provider send (openclaw#77287)
  ...

# Conflicts:
#	CHANGELOG.md
@clawsweeper clawsweeper Bot mentioned this pull request May 5, 2026
25 tasks
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 9, 2026

Labels

gateway Gateway runtime size: XS triage: refactor-only Candidate: refactor/cleanup-only PR without maintainer context.

3 participants