Summary
Follow-up to #5369 (now closed/locked). The underlying root cause — a stale-cache race condition in the agent handler's session entry construction — appears to still be present on main.
The issue is intermittent (~3% reproduction rate) because it depends on timing between sessions.patch writing modelOverride to the store and the agent handler reading a cached (potentially stale) entry.
Root cause
When the agent handler runs for a spawned sub-agent session:
1. loadSessionEntry(key) returns a cached entry, which may be stale (no modelOverride).
2. nextEntry is built from that stale entry, including modelOverride: entry?.modelOverride, which evaluates to undefined.
3. updateSessionStore() acquires a lock and loads the store with skipCache: true (fresh from disk, so it has the modelOverride written by sessions.patch).
4. The mutator ignores the fresh store data and writes store[key] = nextEntry, overwriting the fresh modelOverride with the stale undefined.
Relevant lines on current main in src/gateway/server-methods/agent.ts:
```ts
// Line 214: stale cached read
const { cfg, storePath, entry, canonicalKey } = loadSessionEntry(requestedSessionKey);

// Line 252: nextEntry built from stale entry
modelOverride: entry?.modelOverride,

// Lines 284-286: mutator ignores fresh store, writes stale nextEntry
await updateSessionStore(storePath, (store) => {
  store[canonicalSessionKey] = nextEntry;
});
```
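The sequence above can be sketched with a small in-memory model. This is only an illustration under simplifying assumptions: FakeSessionStore, seed, patchDiskOnly, loadCached, update, and vulnerableHandler are hypothetical stand-ins rather than the project's real code, and everything is synchronous so the ordering is deterministic instead of timing-dependent.

```typescript
// Hypothetical in-memory model of the race; not the project's real store code.
type SessionEntry = { sessionId: string; modelOverride?: string };
type SessionStore = Record<string, SessionEntry>;

function cloneStore(store: SessionStore): SessionStore {
  const copy: SessionStore = {};
  for (const key of Object.keys(store)) {
    const entry = store[key];
    if (entry) copy[key] = { ...entry };
  }
  return copy;
}

class FakeSessionStore {
  private disk: SessionStore = {};
  private cache: SessionStore = {};

  // Write an entry to both disk and cache (the state before the race).
  seed(key: string, entry: SessionEntry): void {
    this.disk[key] = { ...entry };
    this.cache[key] = { ...entry };
  }

  // Simulate sessions.patch reaching disk while the cache still holds the old entry.
  patchDiskOnly(key: string, modelOverride: string): void {
    const current = this.disk[key] ?? { sessionId: key };
    this.disk[key] = { ...current, modelOverride };
  }

  // Cached read: may be stale.
  loadCached(key: string): SessionEntry | undefined {
    const entry = this.cache[key];
    return entry ? { ...entry } : undefined;
  }

  // Stand-in for a locked read-modify-write: the mutator gets a fresh copy of disk.
  update(mutator: (store: SessionStore) => void): void {
    const fresh = cloneStore(this.disk);
    mutator(fresh);
    this.disk = fresh;
    this.cache = cloneStore(fresh);
  }

  readDisk(key: string): SessionEntry | undefined {
    return this.disk[key];
  }
}

// Mirrors the pattern described for main: build nextEntry outside the mutator.
function vulnerableHandler(store: FakeSessionStore, key: string): void {
  const entry = store.loadCached(key); // stale cached read
  const nextEntry: SessionEntry = {
    sessionId: entry?.sessionId ?? key,
    modelOverride: entry?.modelOverride, // undefined when the cache is stale
  };
  store.update((fresh) => {
    fresh[key] = nextEntry; // ignores whatever the fresh store holds
  });
}

const raceStore = new FakeSessionStore();
raceStore.seed("sub-agent", { sessionId: "s1" });
raceStore.patchDiskOnly("sub-agent", "model-b");
const overrideBefore = raceStore.readDisk("sub-agent")?.modelOverride;
vulnerableHandler(raceStore, "sub-agent");
const overrideAfter = raceStore.readDisk("sub-agent")?.modelOverride;
console.log({ overrideBefore, overrideAfter });
```

In this model the override is on disk before the handler runs and gone afterwards, which matches the symptom described in this issue.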
Fix
PR #19328 moves session entry construction inside the updateSessionStore mutator callback, reading from freshEntry = store[primaryKey] (which was loaded with skipCache: true) instead of the stale entry. This guarantees modelOverride, providerOverride, and all other fields reflect the latest writes from sessions.patch.
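A minimal sketch of the shape of that change, assuming a simplified synchronous store. updateStoreFresh and fixedHandler are hypothetical names used only for illustration; the actual diff is in PR #19328 and may differ in detail.

```typescript
// Hypothetical sketch of the fixed pattern; not the code from the PR.
type SessionEntry = {
  sessionId: string;
  modelOverride?: string;
  providerOverride?: string;
};
type SessionStore = Record<string, SessionEntry>;

// Stand-in for updateSessionStore: the mutator receives a fresh copy of "disk".
function updateStoreFresh(
  disk: SessionStore,
  mutator: (store: SessionStore) => void,
): SessionStore {
  const fresh: SessionStore = {};
  for (const key of Object.keys(disk)) {
    const entry = disk[key];
    if (entry) fresh[key] = { ...entry };
  }
  mutator(fresh);
  return fresh;
}

// The entry is constructed inside the mutator, from the freshly loaded store.
function fixedHandler(disk: SessionStore, key: string): SessionStore {
  return updateStoreFresh(disk, (store) => {
    const freshEntry: SessionEntry | undefined = store[key];
    store[key] = {
      sessionId: freshEntry?.sessionId ?? key,
      modelOverride: freshEntry?.modelOverride,
      providerOverride: freshEntry?.providerOverride,
    };
  });
}

// Disk already holds the override written by sessions.patch.
const diskState: SessionStore = {
  "sub-agent": { sessionId: "s1", modelOverride: "model-b" },
};
const afterFix = fixedHandler(diskState, "sub-agent");
console.log(afterFix["sub-agent"]?.modelOverride);
```

Because nothing read before the lock feeds into the written entry, a stale cache has no way to influence the result in this sketch.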
Reproduction
A standalone test (agent-stale-cache-race.test.ts in PR #19328) reproduces both the vulnerable and fixed code patterns without requiring the full module chain. It confirms:
- main's pattern drops modelOverride when the cache is stale
- The fix pattern preserves modelOverride from the fresh store
- When cache and store happen to agree, both patterns work identically (explains the intermittent nature)
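The three observations above can be reduced to a comparison of where each pattern takes modelOverride from. This is a simplified sketch, not the test in the PR; staleWrite, freshWrite, and the scenario data are hypothetical.

```typescript
type Entry = { modelOverride?: string };

// Pattern on main: the value comes from the cached entry read before the lock.
function staleWrite(cached: Entry | undefined, _onDisk: Entry | undefined): Entry {
  return { modelOverride: cached?.modelOverride };
}

// Pattern in the fix: the value comes from the entry loaded inside the mutator.
function freshWrite(_cached: Entry | undefined, onDisk: Entry | undefined): Entry {
  return { modelOverride: onDisk?.modelOverride };
}

const scenarios: Array<{ name: string; cached: Entry; onDisk: Entry }> = [
  { name: "cache stale", cached: {}, onDisk: { modelOverride: "model-b" } },
  {
    name: "cache agrees",
    cached: { modelOverride: "model-b" },
    onDisk: { modelOverride: "model-b" },
  },
];

const outcomes = scenarios.map((s) => ({
  name: s.name,
  stale: staleWrite(s.cached, s.onDisk).modelOverride,
  fresh: freshWrite(s.cached, s.onDisk).modelOverride,
}));

console.log(outcomes);
```

The two patterns only diverge in the "cache stale" scenario, which is consistent with the bug showing up in a small fraction of runs.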
Environment
Same as #5369 — affects any version where the agent handler builds nextEntry outside the updateSessionStore mutator.