fix: handle LiveSessionModelSwitchError in cron isolated sessions #57972

Merged
obviyus merged 3 commits into openclaw:main from issaba1:fix/cron-isolated-model-switch
Mar 31, 2026

Conversation

issaba1 (Contributor) commented Mar 30, 2026

Problem

When a cron job with sessionTarget: 'isolated' specifies a model different from the agent's primary, the session throws LiveSessionModelSwitchError and fails immediately.

The embedded runner initializes the isolated session using the gateway's default agent model. When the cron payload carries a different model override, the runner detects the mismatch mid-execution and throws LiveSessionModelSwitchError. The main agent runner (agent-runner-execution.ts) already catches this error in a retry loop and re-runs with the requested model — but the cron isolated session runner (cron/isolated-agent/run.ts) does not, so every affected cron job fails.

Fix

Wrap the initial runPrompt(commandBody) call in a while (true) retry loop that catches LiveSessionModelSwitchError, updates the provider/model state (and persists it to the session store), and continues — mirroring the existing retry pattern in agent-runner-execution.ts exactly.
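
The retry described above can be sketched roughly as follows. This is a minimal illustration, not the code in `run.ts`: the error class is a hypothetical stand-in that models only `provider` and `model`, `runWithModelSwitchRetry`, `attempt`, and `persist` are names invented for the example, and the attempt is synchronous for brevity.

```typescript
// Hypothetical stand-in for the real error class; only provider/model are modeled.
class LiveSessionModelSwitchError extends Error {
  provider: string;
  model: string;

  constructor(provider: string, model: string) {
    super(`live session requested switch to ${provider}/${model}`);
    this.name = "LiveSessionModelSwitchError";
    this.provider = provider;
    this.model = model;
  }
}

interface RunState {
  provider: string;
  model: string;
}

// Re-runs `attempt` whenever it asks for a model switch.
// `persist` stands in for writing the new selection to the session store.
function runWithModelSwitchRetry<T>(
  state: RunState,
  attempt: (state: RunState) => T,
  persist: (state: RunState) => void,
): T {
  while (true) {
    try {
      return attempt(state);
    } catch (err) {
      if (!(err instanceof LiveSessionModelSwitchError)) {
        throw err; // any other failure is not retried
      }
      // Adopt the requested selection, record it, then loop again.
      state.provider = err.provider;
      state.model = err.model;
      persist(state);
    }
  }
}
```

As written, the sketch has no upper bound on retries. It terminates only because the second attempt runs with the selection the error asked for.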

Changed files:

  • src/cron/isolated-agent/run.ts — adds LiveSessionModelSwitchError import + retry loop around the initial runPrompt call
  • src/cron/isolated-agent/run.live-session-model-switch.test.ts — new test suite covering retry success, session entry update, non-infinite-loop on repeated errors, and no-retry on other errors

Impact

Testing

  • Traced the code path in src/cron/isolated-agent/run.ts and src/agents/live-model-switch.ts
  • The retry pattern is an exact mirror of the working retry in agent-runner-execution.ts (lines 615–625)
  • Added 4 unit tests covering the retry lifecycle

Fixes #57206

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 637d62a54e

Comment thread src/cron/isolated-agent/run.ts Outdated
Comment on lines +573 to +576
provider = err.provider;
model = err.model;
fallbackProvider = err.provider;
fallbackModel = err.model;
P1: Carry auth profile through model-switch retry

LiveSessionModelSwitchError is also used when only the auth profile changes (the live-selection diff includes authProfileId/authProfileIdSource), but this retry block updates only provider/model. Because runEmbeddedPiAgent is still called with the original authProfileId values captured earlier, a switch request that changes auth profile can keep throwing the same switch error and loop indefinitely inside while (true). Propagate err.authProfileId and err.authProfileIdSource into the retry state (and persisted session override as needed) before continue so the second attempt can actually converge.
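
A sketch of the convergence concern, under stated assumptions: the error class below is a hypothetical stand-in carrying the four fields named in this comment, `runUntilConverged` and the `maxSwitches` cap are illustrative additions rather than anything in the PR, and the attempt is synchronous for brevity. It shows the full selection, including the auth profile, being carried into the next attempt.

```typescript
interface Selection {
  provider: string;
  model: string;
  authProfileId?: string;
  authProfileIdSource?: string;
}

// Hypothetical stand-in for the real error class.
class LiveSessionModelSwitchError extends Error {
  provider: string;
  model: string;
  authProfileId?: string;
  authProfileIdSource?: string;

  constructor(requested: Selection) {
    super("live session selection changed");
    this.name = "LiveSessionModelSwitchError";
    this.provider = requested.provider;
    this.model = requested.model;
    this.authProfileId = requested.authProfileId;
    this.authProfileIdSource = requested.authProfileIdSource;
  }
}

// Build the next attempt's selection from the error, auth profile included.
function applySwitch(err: LiveSessionModelSwitchError): Selection {
  return {
    provider: err.provider,
    model: err.model,
    authProfileId: err.authProfileId,
    // Keep the source only when a profile id is actually present.
    authProfileIdSource: err.authProfileId ? err.authProfileIdSource : undefined,
  };
}

// Retry on switch errors, giving up after `maxSwitches` switches (an assumed cap).
function runUntilConverged<T>(
  initial: Selection,
  attempt: (selection: Selection) => T,
  maxSwitches = 3,
): T {
  let current = initial;
  for (let switches = 0; ; switches += 1) {
    try {
      return attempt(current);
    } catch (err) {
      if (!(err instanceof LiveSessionModelSwitchError) || switches >= maxSwitches) {
        throw err;
      }
      current = applySwitch(err);
    }
  }
}
```

If `applySwitch` copied only `provider` and `model`, an attempt that keeps demanding a different auth profile would never be satisfied. The cap turns that case into a thrown error instead of an endless loop.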


@greptile-apps
Contributor

greptile-apps Bot commented Mar 30, 2026

Greptile Summary

This PR fixes a regression (introduced in the 2026-03-28 release) where cron jobs with sessionTarget: 'isolated' and a model override different from the agent's primary model would fail immediately with LiveSessionModelSwitchError. The fix adds a while (true) retry loop in src/cron/isolated-agent/run.ts that catches this error, updates the active provider/model state (and persists it to the session store), and retries — directly mirroring the existing working pattern in src/auto-reply/reply/agent-runner-execution.ts. Four unit tests cover the retry lifecycle.

  • Logic looks correct: provider and model are let-captured by the runPrompt closure, so updating them in the catch block is picked up on the next iteration without any extra plumbing.
  • Minor deviation from reference: The reference in agent-runner-execution.ts also propagates authProfileId/authProfileIdSource from the error on each switch (lines 619–622). The cron runner does not, because those are const captures at the call-site. For isolated sessions (fresh session each run), the resolved auth-profile shouldn't differ from what the error carries, so this is low-impact in practice.
  • Test assertion gap: The "updates provider/model on session entry" test only asserts updateSessionStoreMock was called, but that call also fires during the pre-run persistence step, so the assertion does not uniquely verify the model-switch persistence path.
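
The closure-capture point in the first two bullets can be illustrated with a small sketch. The function name, the placeholder starting values, and the simplified synchronous `runPrompt` are assumptions made for the example, not code from the PR.

```typescript
function demoClosureCapture(): { before: string; after: string; snapshot: string } {
  let provider = "example-provider";
  let model = "example-default-model";

  // The closure reads the variables themselves, not their values at creation time.
  const runPrompt = (body: string): string => `${provider}/${model}: ${body}`;

  // A value copied out earlier (like a const captured at the call site)
  // does not follow later reassignments.
  const snapshot = `${provider}/${model}`;

  const before = runPrompt("ping");

  // Reassigning in a catch block would have the same effect as doing it here.
  provider = "anthropic";
  model = "claude-sonnet-4-6";

  const after = runPrompt("ping");
  return { before, after, snapshot };
}
```

Here `after` reflects the reassigned values while `snapshot` keeps the original ones. The auth-profile values behave like `snapshot`, which is why the review describes them as not being updated on retry.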

Confidence Score: 5/5

  • Safe to merge — the core retry logic is correct and well-tested; remaining findings are minor test-quality suggestions.
  • All findings are P2 (style/test quality). The fix correctly mirrors the reference pattern in agent-runner-execution.ts, provider/model closure capture is sound, and the four new tests cover the primary retry scenarios. No logic bugs found.
  • No files require special attention.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/cron/isolated-agent/run.live-session-model-switch.test.ts
Line: 133-146

Comment:
**Weak assertion on model-switch persistence**

The test only checks that `updateSessionStoreMock` was called at all — but `updateSessionStore` is also invoked during the pre-run persistence step (before the first `runPrompt` attempt, where `modelProvider` and `model` are written for the "in-progress" display). This means the assertion would pass even if the model-switch branch were deleted, as long as the pre-run call still fires.

To actually guard the model-switch path, consider asserting the _specific arguments_ passed on the call triggered by the switch, e.g.:

```typescript
const calls = updateSessionStoreMock.mock.calls;
// The second call should carry the switched model
const switchedEntry = calls[calls.length - 1][1]; // updater fn or direct arg
// Or inspect cronSession.sessionEntry.model directly if the harness exposes it
```

Alternatively, a stronger check would verify `callCount === 2` **and** that the second `updateSessionStoreMock` invocation contained `provider: "anthropic", model: "claude-sonnet-4-6"`.

How can I resolve this? If you propose a fix, please make it concise.
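
To make the gap concrete, here is a self-contained sketch that does not use the project's test harness. The recorder, `simulateRun`, and the assumption that the store update receives the entry directly are all hypothetical; the real `updateSessionStore` signature is not shown in this thread.

```typescript
interface SessionEntry {
  modelProvider: string;
  model: string;
}

// Minimal hand-rolled mock that records every argument it receives.
function createRecorder(): { fn: (entry: SessionEntry) => void; calls: SessionEntry[] } {
  const calls: SessionEntry[] = [];
  return { fn: (entry) => { calls.push(entry); }, calls };
}

// Hypothetical flow: one pre-run persist, plus one more if a switch happened.
function simulateRun(persist: (entry: SessionEntry) => void, switchTo?: SessionEntry): void {
  persist({ modelProvider: "example-provider", model: "example-default-model" });
  if (switchTo) {
    persist({ modelProvider: switchTo.modelProvider, model: switchTo.model });
  }
}

// Weak check: passes as soon as anything was persisted.
function wasCalled(calls: SessionEntry[]): boolean {
  return calls.length > 0;
}

// Stronger check: exactly two persists, the last one carrying the switched model.
function persistedSwitch(calls: SessionEntry[], expected: SessionEntry): boolean {
  if (calls.length !== 2) {
    return false;
  }
  const last = calls[1];
  return last.modelProvider === expected.modelProvider && last.model === expected.model;
}
```

With no switch at all, `wasCalled` still returns true because of the pre-run persist, while `persistedSwitch` returns false. Only the second check would catch a deleted switch branch.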

---

This is a comment left during a code review.
Path: src/cron/isolated-agent/run.ts
Line: 571-578

Comment:
**`authProfileId`/`authProfileIdSource` not updated on model switch**

The reference implementation in `agent-runner-execution.ts` (lines 619–622) also propagates `authProfileId` and `authProfileIdSource` from the error on every switch:

```typescript
params.followupRun.run.authProfileId = err.authProfileId;
params.followupRun.run.authProfileIdSource = err.authProfileId ? err.authProfileIdSource : undefined;
```

The cron runner omits these updates because `authProfileId` is a `const` at line 430. For isolated sessions, the auth profile is always resolved fresh before the first attempt (via `resolveSessionAuthProfileOverride`), so in practice the error's `authProfileId` should match the already-resolved value and this divergence is harmless. However, if the embedded runner ever returns a `LiveSessionModelSwitchError` with a _different_ auth-profile hint (e.g., the session store's `authProfileOverride` was written by an external agent between the resolution step and the actual run), the retry would proceed with the stale auth profile and could fail with a 401.

Consider converting `authProfileId` (and `authProfileIdSource`) to `let` and updating them in the catch block to stay aligned with the reference pattern.

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "fix: handle LiveSessionModelSwitchError ..."

Comment thread src/cron/isolated-agent/run.live-session-model-switch.test.ts
Comment thread src/cron/isolated-agent/run.ts Outdated
Comment on lines +571 to +578
} catch (err) {
  if (err instanceof LiveSessionModelSwitchError) {
    provider = err.provider;
    model = err.model;
    fallbackProvider = err.provider;
    fallbackModel = err.model;
    cronSession.sessionEntry.modelProvider = err.provider;
    cronSession.sessionEntry.model = err.model;
P2 authProfileId/authProfileIdSource not updated on model switch

The reference implementation in agent-runner-execution.ts (lines 619–622) also propagates authProfileId and authProfileIdSource from the error on every switch:

params.followupRun.run.authProfileId = err.authProfileId;
params.followupRun.run.authProfileIdSource = err.authProfileId ? err.authProfileIdSource : undefined;

The cron runner omits these updates because authProfileId is a const at line 430. For isolated sessions, the auth profile is always resolved fresh before the first attempt (via resolveSessionAuthProfileOverride), so in practice the error's authProfileId should match the already-resolved value and this divergence is harmless. However, if the embedded runner ever returns a LiveSessionModelSwitchError with a different auth-profile hint (e.g., the session store's authProfileOverride was written by an external agent between the resolution step and the actual run), the retry would proceed with the stale auth profile and could fail with a 401.

Consider converting authProfileId (and authProfileIdSource) to let and updating them in the catch block to stay aligned with the reference pattern.

obviyus self-assigned this Mar 31, 2026
Isaac Saba and others added 2 commits March 31, 2026 10:00
The main agent runner catches LiveSessionModelSwitchError and retries
with the requested model, but cron isolated sessions hit this error
and fail immediately. This extends the retry to cover cron execution.

When a cron job with `sessionTarget: 'isolated'` specifies a `model`
different from the agent's primary, the embedded runner throws
LiveSessionModelSwitchError (because the session initialized with the
wrong model). The fix wraps the initial runPrompt call in a retry loop
that catches this error, updates provider/model state, and re-runs —
mirroring the existing retry logic in agent-runner-execution.ts.

Fixes openclaw#57206
obviyus force-pushed the fix/cron-isolated-model-switch branch from fd33bac to baecfbb on March 31, 2026 04:31
@obviyus obviyus left a comment

Reviewed latest changes; landing now.

obviyus merged commit 10ac6ea into openclaw:main Mar 31, 2026
8 checks passed
obviyus commented Mar 31, 2026

Landed on main.

Thanks @issaba1.

pgondhi987 pushed a commit to pgondhi987/openclaw that referenced this pull request Mar 31, 2026
@issaba1)

* fix: handle LiveSessionModelSwitchError in cron isolated sessions

The main agent runner catches LiveSessionModelSwitchError and retries
with the requested model, but cron isolated sessions hit this error
and fail immediately. This extends the retry to cover cron execution.

When a cron job with `sessionTarget: 'isolated'` specifies a `model`
different from the agent's primary, the embedded runner throws
LiveSessionModelSwitchError (because the session initialized with the
wrong model). The fix wraps the initial runPrompt call in a retry loop
that catches this error, updates provider/model state, and re-runs —
mirroring the existing retry logic in agent-runner-execution.ts.

Fixes openclaw#57206

* fix: carry auth profile through cron model retry

* fix: complete cron isolated model-switch retry (openclaw#57972) (thanks @issaba1)

---------

Co-authored-by: Isaac Saba <isaacsaba@Isaacs-Mac-mini.local>
Co-authored-by: Ayaan Zaidi <hi@obviy.us>
amittell pushed a commit to amittell/openclaw that referenced this pull request Mar 31, 2026
lovewanwan pushed a commit to lovewanwan/openclaw that referenced this pull request Apr 28, 2026
ogt-redknie pushed a commit to ogt-redknie/OPENX that referenced this pull request May 2, 2026
Development

Successfully merging this pull request may close these issues.

Isolated cron sessions ignore model override (LiveSessionModelSwitchError)

2 participants