Skip to content

fix: preserve selected session model during fallback#40053

Closed
glitch418x wants to merge 1 commit intoopenclaw:mainfrom
glitch418x:fix/fallback-runtime-state
Closed

fix: preserve selected session model during fallback#40053
glitch418x wants to merge 1 commit intoopenclaw:mainfrom
glitch418x:fix/fallback-runtime-state

Conversation

@glitch418x
Copy link
Copy Markdown
Contributor

@glitch418x glitch418x commented Mar 8, 2026

Summary

  • Fallback turns were persisting the fallback model as the session runtime model, which made session status look like the session had permanently switched models.
  • This was misleading in Telegram/OpenClaw status because the selected model and the actively serving fallback model were being conflated.
  • The fix keeps the selected session model stable in persisted runtime state, while still exposing active fallback state separately in status.
  • No config/schema or failover-policy changes are included in this PR.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

  • Closes #
  • Related #

User-visible / Behavior Changes

  • Status/runtime model displays no longer silently replace the selected session model with the fallback model after a successful fallback turn.
  • Active fallback is still shown when fallback notice state is present.
  • No defaults/config changed.

Security Impact (required)

  • New permissions/capabilities? (Yes/No) No
  • Secrets/tokens handling changed? (Yes/No) No
  • New/changed network calls? (Yes/No) No
  • Command/tool execution surface changed? (Yes/No) No
  • Data access scope changed? (Yes/No) No
  • If any Yes, explain risk + mitigation:

Repro + Verification

Environment

  • OS: macOS
  • Runtime/container: OpenClaw source checkout
  • Model/provider: anthropic/claude-opus-4-6 with openai-codex/gpt-5.4 fallback
  • Integration/channel (if any): Telegram direct session
  • Relevant config (redacted): selected primary model with configured fallback

Steps

  1. Start a session with a selected primary model and a configured fallback.
  2. Force or simulate a successful fallback turn.
  3. Inspect persisted session runtime model and status output.

Expected

  • The persisted session runtime model stays on the selected model.
  • Status can still show that fallback is currently active.

Actual

  • Before this change, the fallback model could be persisted as the session runtime model.

Evidence

Attach at least one:

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Human Verification (required)

What you personally verified (not just CI), and how:

  • Verified scenarios: reviewed the persistence path, fallback status rendering path, and the added regression coverage.
  • Edge cases checked: active fallback still renders when fallback notice state matches the selected model; persisted runtime state remains pinned to the selected model while usage/context sizing still reflect the actual serving model.
  • What you did not verify: a post-deploy live Telegram round-trip.

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? (Yes/No) Yes
  • Config/env changes? (Yes/No) No
  • Migration needed? (Yes/No) No
  • If yes, exact upgrade steps:

Failure Recovery (if this breaks)

  • How to disable/revert this change quickly: revert this PR/commit.
  • Files/config to restore: src/commands/agent/session-store.ts, src/auto-reply/reply/session-usage.ts, src/auto-reply/model-runtime.ts, related tests.
  • Known bad symptoms reviewers should watch for: active fallback stops rendering, or persisted runtime model drifts to the fallback model again.

Risks and Mitigations

  • Risk: fallback status could be hidden if fallback notice state and selected model drift out of sync.
    • Mitigation: regression tests cover matching and mismatched fallback-state cases, and persisted runtime model remains independent from fallback notice state.

@openclaw-barnacle openclaw-barnacle Bot added commands Command implementations size: M labels Mar 8, 2026
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Mar 8, 2026

Greptile Summary

This PR fixes a session model drift bug: previously, completing a turn via a fallback model would overwrite the session's persisted modelProvider/model with the fallback model's identity, causing the agent to "forget" the user-selected model on subsequent turns. The fix preserves the selected model in the store via new persistedModel/persistedProvider params, and separately routes fallback-serving information through fallbackNoticeSelectedModel/fallbackNoticeActiveModel so the status display can still surface the active fallback without needing to read it from the runtime model fields.

Key changes:

  • session-store.ts / agent-runner.ts: always persist the user-selected provider/model pair; context-token sizing correctly continues to use the actual serving model's limits.
  • session-usage.ts: new persistedModel/persistedProvider params take priority over modelUsed/providerUsed in both write paths.
  • model-runtime.ts: resolveSelectedAndActiveModel now reads fallbackNoticeActiveModel from the session entry when the stored notice matches the current selection, so the UI shows the fallback model as "active" without the runtime fields lying about which model is configured.
  • Tests: unit and e2e coverage added for session-store persistence, status rendering, and the fallback-during-reply-agent path.

Confidence Score: 4/5

  • This PR is safe to merge; the core logic is correct and well-tested, with one minor latent guard-condition robustness concern.
  • The implementation correctly preserves the selected session model during fallback across both code paths (updateSessionStoreAfterAgentRun and persistSessionUsageUpdate), and the resolveSelectedAndActiveModel logic uses fallbackNoticeActiveModel consistently with how the notice is written. The minor deduction is for the second-branch guard in session-usage.ts (if (params.modelUsed || params.contextTokensUsed)) not including the new persistedModel/persistedProvider params — harmless today but a future footgun if callers pass only the new fields without modelUsed.
  • Pay close attention to src/auto-reply/reply/session-usage.ts — the second branch guard condition does not cover the new persistedModel/persistedProvider params.

Comments Outside Diff (1)

  1. src/auto-reply/reply/session-usage.ts, line 117 (link)

    persistedModel/persistedProvider silently ignored when guard condition misses

    The second branch condition if (params.modelUsed || params.contextTokensUsed) does not include params.persistedModel || params.persistedProvider. If a future caller passes only persistedModel/persistedProvider (with no modelUsed and no contextTokensUsed), those values would be silently dropped — neither branch would execute.

    Today this isn't a bug because every call site in agent-runner.ts always provides modelUsed (it falls back to defaultModel which is a required param), but the guard doesn't communicate that invariant and could trap a future caller.

Last reviewed commit: ce0c516

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

commands Command implementations size: M

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants