fix: handle LiveSessionModelSwitchError in cron isolated sessions by issaba1 · Pull Request #57972 · openclaw/openclaw

issaba1 · 2026-03-30T22:43:09Z

Problem

When a cron job with sessionTarget: 'isolated' specifies a model different from the agent's primary, the session throws LiveSessionModelSwitchError and fails immediately.

The embedded runner initializes the isolated session using the gateway's default agent model. When the cron payload carries a different model override, the runner detects the mismatch mid-execution and throws LiveSessionModelSwitchError. The main agent runner (agent-runner-execution.ts) already catches this error in a retry loop and re-runs with the requested model — but the cron isolated session runner (cron/isolated-agent/run.ts) does not, so every affected cron job fails.

Fix

Wrap the initial runPrompt(commandBody) call in a while (true) retry loop that catches LiveSessionModelSwitchError, updates the provider/model state (and persists it to the session store), and continues — mirroring the existing retry pattern in agent-runner-execution.ts exactly.

Changed files:

src/cron/isolated-agent/run.ts — adds LiveSessionModelSwitchError import + retry loop around the initial runPrompt call
src/cron/isolated-agent/run.live-session-model-switch.test.ts — new test suite covering retry success, session entry update, non-infinite-loop on repeated errors, and no-retry on other errors
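
The retry described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from `run.ts`: the error class is a local stand-in for the one in `src/agents/live-model-switch.ts`, and `makeFakeRunner`, `runWithModelSwitchRetry`, and the `"primary-model"` name are hypothetical.

```typescript
// Local stand-in for the real error class; the field names follow the diff in this PR.
class LiveSessionModelSwitchError extends Error {
  provider: string;
  model: string;
  constructor(provider: string, model: string) {
    super(`live session requested ${provider}/${model}`);
    this.name = "LiveSessionModelSwitchError";
    this.provider = provider;
    this.model = model;
  }
}

type RunResult = { provider: string; model: string; text: string };
type Runner = (provider: string, model: string, prompt: string) => Promise<RunResult>;

// Hypothetical runner: throws until it is invoked with the requested model.
function makeFakeRunner(wanted: { provider: string; model: string }): Runner {
  return async (provider, model, prompt) => {
    if (provider !== wanted.provider || model !== wanted.model) {
      throw new LiveSessionModelSwitchError(wanted.provider, wanted.model);
    }
    return { provider, model, text: `ran: ${prompt}` };
  };
}

async function runWithModelSwitchRetry(
  run: Runner,
  initial: { provider: string; model: string },
  prompt: string,
): Promise<RunResult> {
  // `let` bindings so the catch block can redirect the next attempt.
  let provider = initial.provider;
  let model = initial.model;
  while (true) {
    try {
      return await run(provider, model, prompt);
    } catch (err) {
      if (err instanceof LiveSessionModelSwitchError) {
        provider = err.provider;
        model = err.model;
        continue; // retry with the model the session asked for
      }
      throw err; // any other error propagates unchanged
    }
  }
}
```

Calling `runWithModelSwitchRetry(makeFakeRunner({ provider: "anthropic", model: "claude-sonnet-4-6" }), { provider: "anthropic", model: "primary-model" }, "hello")` fails once with the switch error and succeeds on the second attempt. The sketch omits the session-store persistence step that the real change performs before retrying.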

Impact

All cron jobs that specify a model different from the agent primary fail with this bug
Affects every user running multi-model setups (e.g. opus as primary, sonnet/minimax for cost-optimized crons)
Related issues: Isolated cron sessions ignore model override (LiveSessionModelSwitchError) #57206, LiveSessionModelSwitchError crashes all isolated cron sessions #57862, Cron payload.model field ignored when agent has model pin — all crons cascade to expensive models #57006, Isolated cron jobs fail with LiveSessionModelSwitchError on 2026.3.28 #57134, LiveSessionModelSwitchError overrides model failover, creating infinite retry loop #57812
Jobs that worked before the 2026.3.28 release are now broken

Testing

Traced the code path in src/cron/isolated-agent/run.ts and src/agents/live-model-switch.ts
The retry pattern is an exact mirror of the working retry in agent-runner-execution.ts (lines 615–625)
Added 4 unit tests covering the retry lifecycle

Fixes #57206

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 637d62a54e


chatgpt-codex-connector · 2026-03-30T22:45:52Z

+          provider = err.provider;
+          model = err.model;
+          fallbackProvider = err.provider;
+          fallbackModel = err.model;


Carry auth profile through model-switch retry

LiveSessionModelSwitchError is also used when only the auth profile changes (the live-selection diff includes authProfileId/authProfileIdSource), but this retry block updates only provider/model. Because runEmbeddedPiAgent is still called with the original authProfileId values captured earlier, a switch request that changes auth profile can keep throwing the same switch error and loop indefinitely inside while (true). Propagate err.authProfileId and err.authProfileIdSource into the retry state (and persisted session override as needed) before continue so the second attempt can actually converge.
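
A minimal sketch of the suggested convergence, under stated assumptions: the error class and `Selection` type are local stand-ins, `runWithSelectionRetry` and `makeRunner` are hypothetical names, and the `MAX_SWITCH_RETRIES` cap is an extra guard added for this illustration rather than something the review or the PR specifies.

```typescript
type Selection = {
  provider: string;
  model: string;
  authProfileId?: string | undefined;
  authProfileIdSource?: string | undefined;
};

// Local stand-in; the field names follow the review comment (err.authProfileId, ...).
class LiveSessionModelSwitchError extends Error {
  provider: string;
  model: string;
  authProfileId: string | undefined;
  authProfileIdSource: string | undefined;
  constructor(sel: Selection) {
    super("live session selection changed");
    this.name = "LiveSessionModelSwitchError";
    this.provider = sel.provider;
    this.model = sel.model;
    this.authProfileId = sel.authProfileId;
    this.authProfileIdSource = sel.authProfileIdSource;
  }
}

const MAX_SWITCH_RETRIES = 3; // assumption: not taken from the PR

async function runWithSelectionRetry(
  run: (sel: Selection) => Promise<string>,
  initial: Selection,
): Promise<string> {
  let current: Selection = { ...initial };
  for (let attempt = 0; ; attempt++) {
    try {
      return await run(current);
    } catch (err) {
      const canRetry =
        err instanceof LiveSessionModelSwitchError && attempt < MAX_SWITCH_RETRIES;
      if (!canRetry) throw err;
      // Carry the whole selection forward, not just provider/model.
      current = {
        provider: err.provider,
        model: err.model,
        authProfileId: err.authProfileId,
        authProfileIdSource: err.authProfileId ? err.authProfileIdSource : undefined,
      };
    }
  }
}

// Hypothetical runner that insists on an exact selection, including the auth profile.
function makeRunner(wanted: Selection) {
  let calls = 0;
  const run = async (sel: Selection): Promise<string> => {
    calls++;
    const matches =
      sel.provider === wanted.provider &&
      sel.model === wanted.model &&
      sel.authProfileId === wanted.authProfileId;
    if (!matches) throw new LiveSessionModelSwitchError(wanted);
    return `ok via ${sel.authProfileId ?? "default"}`;
  };
  return { run, callCount: () => calls };
}
```

If the catch block copied only `provider` and `model`, a runner like the one above would throw the same error on every attempt whenever only `authProfileId` differs, which is the non-converging case the comment describes.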


greptile-apps · 2026-03-30T22:50:08Z

Greptile Summary

This PR fixes a regression (introduced in the 2026.3.28 release) where cron jobs with sessionTarget: 'isolated' and a model override different from the agent's primary model would fail immediately with LiveSessionModelSwitchError. The fix adds a while (true) retry loop in src/cron/isolated-agent/run.ts that catches this error, updates the active provider/model state (and persists it to the session store), and retries, directly mirroring the existing working pattern in src/auto-reply/reply/agent-runner-execution.ts. Four unit tests cover the retry lifecycle.

Logic looks correct: provider and model are let-captured by the runPrompt closure, so updating them in the catch block is picked up on the next iteration without any extra plumbing.
Minor deviation from reference: The reference in agent-runner-execution.ts also propagates authProfileId/authProfileIdSource from the error on each switch (lines 619–622). The cron runner does not, because those are const captures at the call-site. For isolated sessions (fresh session each run), the resolved auth-profile shouldn't differ from what the error carries, so this is low-impact in practice.
Test assertion gap: The "updates provider/model on session entry" test only asserts updateSessionStoreMock was called, but that call also fires during the pre-run persistence step, so the assertion does not uniquely verify the model-switch persistence path.
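
The closure-capture point in the first two bullets can be illustrated with a small sketch. The function name, the `"primary-model"` and `"profile-a"` values, and the string format are hypothetical; only the `let` versus `const` distinction is taken from the summary.

```typescript
function demoClosureCapture(): { before: string; after: string } {
  let provider = "anthropic";
  let model = "primary-model";
  const authProfileId = "profile-a"; // const: the catch block cannot reassign it

  // The closure reads the bindings when it is called, not when it is defined.
  const runPrompt = (prompt: string): string =>
    `${provider}/${model} [${authProfileId}]: ${prompt}`;

  const before = runPrompt("hi");

  // Simulates the catch block updating state after a model-switch error.
  provider = "anthropic";
  model = "claude-sonnet-4-6";
  // authProfileId = "profile-b"; // would not compile: assignment to a const

  const after = runPrompt("hi");
  return { before, after };
}
```

Here `before` uses `primary-model` and `after` uses `claude-sonnet-4-6`, while the auth profile stays `profile-a` in both calls. That is why the model update needs no extra plumbing, and also why the auth profile cannot follow the error unless it is declared with `let`.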

Confidence Score: 5/5

Safe to merge — the core retry logic is correct and well-tested; remaining findings are minor test-quality suggestions.
All findings are P2 (style/test quality). The fix correctly mirrors the reference pattern in agent-runner-execution.ts, provider/model closure capture is sound, and the four new tests cover the primary retry scenarios. No logic bugs found.
No files require special attention.

Prompt To Fix All With AI

This is a comment left during a code review.
Path: src/cron/isolated-agent/run.live-session-model-switch.test.ts
Line: 133-146

Comment:
**Weak assertion on model-switch persistence**

The test only checks that `updateSessionStoreMock` was called at all — but `updateSessionStore` is also invoked during the pre-run persistence step (before the first `runPrompt` attempt, where `modelProvider` and `model` are written for the "in-progress" display). This means the assertion would pass even if the model-switch branch were deleted, as long as the pre-run call still fires.

To actually guard the model-switch path, consider asserting the _specific arguments_ passed on the call triggered by the switch, e.g.:

```typescript
const calls = updateSessionStoreMock.mock.calls;
// The second call should carry the switched model
const switchedEntry = calls[calls.length - 1][1]; // updater fn or direct arg
// Or inspect cronSession.sessionEntry.model directly if the harness exposes it
```

Alternatively, a stronger check would verify `callCount === 2` **and** that the second `updateSessionStoreMock` invocation contained `provider: "anthropic", model: "claude-sonnet-4-6"`.
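
A framework-free sketch of that stronger check, for illustration only. The recorder mimics the `.mock.calls` shape used above, but the real `updateSessionStore` signature is not shown in this PR, so the single-argument form, `makeCallRecorder`, `simulateRun`, and `assertSwitchPersisted` are all assumptions.

```typescript
type SessionEntry = { modelProvider: string; model: string };

// Hand-rolled stand-in for a mock function that records its arguments.
function makeCallRecorder() {
  const calls: SessionEntry[][] = [];
  const fn = (entry: SessionEntry): void => {
    calls.push([{ ...entry }]); // snapshot, so later mutation cannot mask a bug
  };
  return Object.assign(fn, { mock: { calls } });
}

// Hypothetical run: one pre-run persistence call, plus one more if a switch occurs.
function simulateRun(update: (entry: SessionEntry) => void, switched: boolean): void {
  const entry: SessionEntry = { modelProvider: "anthropic", model: "primary-model" };
  update(entry); // pre-run persistence
  if (switched) {
    entry.model = "claude-sonnet-4-6";
    update(entry); // model-switch persistence
  }
}

// Passes only if the switch path ran and persisted the switched model.
function assertSwitchPersisted(calls: SessionEntry[][]): void {
  if (calls.length !== 2) {
    throw new Error(`expected 2 persistence calls, got ${calls.length}`);
  }
  const last = calls[calls.length - 1]?.[0];
  if (!last || last.modelProvider !== "anthropic" || last.model !== "claude-sonnet-4-6") {
    throw new Error("last persistence call did not carry the switched model");
  }
}
```

With this shape, deleting the switch branch makes the assertion fail, because the pre-run call alone yields one recorded call carrying the original model.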

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: src/cron/isolated-agent/run.ts
Line: 571-578

Comment:
**`authProfileId`/`authProfileIdSource` not updated on model switch**

The reference implementation in `agent-runner-execution.ts` (lines 619–622) also propagates `authProfileId` and `authProfileIdSource` from the error on every switch:

```typescript
params.followupRun.run.authProfileId = err.authProfileId;
params.followupRun.run.authProfileIdSource = err.authProfileId ? err.authProfileIdSource : undefined;
```

The cron runner omits these updates because `authProfileId` is a `const` at line 430. For isolated sessions, the auth profile is always resolved fresh before the first attempt (via `resolveSessionAuthProfileOverride`), so in practice the error's `authProfileId` should match the already-resolved value and this divergence is harmless. However, if the embedded runner ever returns a `LiveSessionModelSwitchError` with a _different_ auth-profile hint (e.g., the session store's `authProfileOverride` was written by an external agent between the resolution step and the actual run), the retry would proceed with the stale auth profile and could fail with a 401.

Consider converting `authProfileId` (and `authProfileIdSource`) to `let` and updating them in the catch block to stay aligned with the reference pattern.

How can I resolve this? If you propose a fix, please make it concise.


greptile-apps · 2026-03-30T22:50:13Z

+      } catch (err) {
+        if (err instanceof LiveSessionModelSwitchError) {
+          provider = err.provider;
+          model = err.model;
+          fallbackProvider = err.provider;
+          fallbackModel = err.model;
+          cronSession.sessionEntry.modelProvider = err.provider;
+          cronSession.sessionEntry.model = err.model;


authProfileId/authProfileIdSource not updated on model switch

The reference implementation in agent-runner-execution.ts (lines 619–622) also propagates authProfileId and authProfileIdSource from the error on every switch:

params.followupRun.run.authProfileId = err.authProfileId;
params.followupRun.run.authProfileIdSource = err.authProfileId ? err.authProfileIdSource : undefined;

The cron runner omits these updates because authProfileId is a const at line 430. For isolated sessions, the auth profile is always resolved fresh before the first attempt (via resolveSessionAuthProfileOverride), so in practice the error's authProfileId should match the already-resolved value and this divergence is harmless. However, if the embedded runner ever returns a LiveSessionModelSwitchError with a different auth-profile hint (e.g., the session store's authProfileOverride was written by an external agent between the resolution step and the actual run), the retry would proceed with the stale auth profile and could fail with a 401.

Consider converting authProfileId (and authProfileIdSource) to let and updating them in the catch block to stay aligned with the reference pattern.


The main agent runner catches LiveSessionModelSwitchError and retries with the requested model, but cron isolated sessions hit this error and fail immediately. This extends the retry to cover cron execution.

When a cron job with `sessionTarget: 'isolated'` specifies a `model` different from the agent's primary, the embedded runner throws LiveSessionModelSwitchError (because the session initialized with the wrong model). The fix wraps the initial runPrompt call in a retry loop that catches this error, updates provider/model state, and re-runs, mirroring the existing retry logic in agent-runner-execution.ts.

Fixes openclaw#57206


obviyus

Reviewed latest changes; landing now.

obviyus · 2026-03-31T04:33:56Z

Landed on main.

Landed source commit: 7300b22
Merge commit: 10ac6ea

Thanks @issaba1.

* fix: handle LiveSessionModelSwitchError in cron isolated sessions

  The main agent runner catches LiveSessionModelSwitchError and retries with the requested model, but cron isolated sessions hit this error and fail immediately. This extends the retry to cover cron execution.

  When a cron job with `sessionTarget: 'isolated'` specifies a `model` different from the agent's primary, the embedded runner throws LiveSessionModelSwitchError (because the session initialized with the wrong model). The fix wraps the initial runPrompt call in a retry loop that catches this error, updates provider/model state, and re-runs, mirroring the existing retry logic in agent-runner-execution.ts.

  Fixes openclaw#57206

* fix: carry auth profile through cron model retry

* fix: complete cron isolated model-switch retry (openclaw#57972) (thanks @issaba1)

---------

Co-authored-by: Isaac Saba <isaacsaba@Isaacs-Mac-mini.local>
Co-authored-by: Ayaan Zaidi <hi@obviy.us>


issaba1 mentioned this pull request Mar 30, 2026

Isolated cron sessions ignore model override (LiveSessionModelSwitchError) #57206

Closed

openclaw-barnacle Bot added the size: M label Mar 30, 2026

chatgpt-codex-connector Bot reviewed Mar 30, 2026


greptile-apps Bot reviewed Mar 30, 2026


obviyus self-assigned this Mar 31, 2026

Isaac Saba and others added 2 commits March 31, 2026 10:00

fix: carry auth profile through cron model retry

baecfbb

obviyus force-pushed the fix/cron-isolated-model-switch branch from fd33bac to baecfbb Compare March 31, 2026 04:31

fix: complete cron isolated model-switch retry (openclaw#57972) (thanks @issaba1)

7300b22

obviyus approved these changes Mar 31, 2026


obviyus merged commit 10ac6ea into openclaw:main Mar 31, 2026
8 checks passed

github-actions Bot mentioned this pull request Mar 31, 2026

📡 Upstream Digest — 2026-03-31 07:04 UTC curtismercier/openclaw-mods#418

Open

aaronagent mentioned this pull request Mar 31, 2026

fix(cron): prevent agent default model from overriding cron payload model #58294

Merged

22 tasks

zeeyuu222 mentioned this pull request Mar 31, 2026

LiveSessionModelSwitch silently hijacks fallback in live sessions, creating infinite retry loop #58303

Closed

goncalovelosa mentioned this pull request Mar 31, 2026

Regression v2026.3.28: LiveSessionModelSwitchError when spawning subagent with different model than parent #58305

Closed

steipete mentioned this pull request Mar 31, 2026

Cron agentTurn isolated runs fail with LiveSessionModelSwitchError instead of creating a fresh session on model mismatch #56986

Closed

simonemacario mentioned this pull request Mar 31, 2026

[Feature]: Expose per-run token usage on the WebSocket lifecycle event stream #57404

Open

stemkat100 mentioned this pull request Apr 1, 2026

[Bug]: Cron payload model override ignored - LiveSessionModelSwitchError on 2026.3.28 #57112

Closed

martingarramon mentioned this pull request Apr 2, 2026

Isolated cron jobs fail with LiveSessionModelSwitchError when payload model differs from agent default #59657

Closed

martingarramon mentioned this pull request Apr 10, 2026

fix(cron): unwrap FailoverError to detect LiveSessionModelSwitchError in retry loop #59893

Closed

obviyus merged 3 commits into openclaw:main from issaba1:fix/cron-isolated-model-switch
