Fallback provider can become pinned in session metadata and trap a chat on unavailable LM Studio model #90462

@al-osokin

Summary

OpenClaw can trap a single chat/topic on an unavailable fallback model after a transient model-call failure.

In our case, one Telegram topic started replying with only:

Model is unloaded

The configured/default model had been restored to openai/gpt-5.5, but the affected topic continued using an effective session-level route to:

lmstudio/google/gemma-4-26b-a4b

The only way to recover the chat was to manually edit session metadata.

Environment

  • OpenClaw: 2026.5.28 (e932160)
  • Channel: Telegram direct topic
  • Default model: openai/gpt-5.5
  • Fallback provider involved: LM Studio
  • Affected session kind: direct topic session

Observed behavior

During an unrelated network/server incident, a regular OpenAI/Codex request failed while the Telegram topic session was active. The observed provider-side response was a 403 from:

https://chatgpt.com/backend-api/codex/responses

OpenClaw interpreted this transient network/provider/auth-like failure as a reason to trigger fallback.

Fallback selected:

lmstudio/google/gemma-4-26b-a4b

That fallback route was then persisted in session metadata for the affected Telegram topic. LM Studio could not serve the selected model and returned:

Model is unloaded

After that, every subsequent message in the topic produced only Model is unloaded.

Running:

/model openai/gpt-5.5

did not recover the topic. The visible/default model changed back to OpenAI, but the effective session route was still pinned to LM Studio.

Neighboring Telegram topics remained on openai/gpt-5.5, so this was not a full gateway outage.

Affected state

Local inspection found pinned model/provider metadata on the direct session and one Telegram topic session.

The affected topic metadata had:

modelProvider=lmstudio
model=google/gemma-4-26b-a4b

Other active Telegram topics still had:

modelProvider=openai
model=gpt-5.5
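
The divergence above can be detected mechanically. The following is a minimal sketch, not OpenClaw code: it assumes session metadata can be loaded into simple records carrying the modelProvider and model fields shown above. The SessionMeta type, the session identifiers, and find_pinned_sessions are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SessionMeta:
    session_id: str      # hypothetical identifier, not a real OpenClaw session id
    model_provider: str  # mirrors the modelProvider metadata field
    model: str           # mirrors the model metadata field


def find_pinned_sessions(
    sessions: List[SessionMeta], default_provider: str, default_model: str
) -> List[SessionMeta]:
    """Return sessions whose effective route differs from the configured default."""
    default_route = (default_provider, default_model)
    return [s for s in sessions if (s.model_provider, s.model) != default_route]


# Example data shaped like the affected and unaffected topics described above.
sessions = [
    SessionMeta("topic-a", "lmstudio", "google/gemma-4-26b-a4b"),
    SessionMeta("topic-b", "openai", "gpt-5.5"),
]
pinned = find_pinned_sessions(sessions, "openai", "gpt-5.5")
for s in pinned:
    print(f"{s.session_id} is pinned to {s.model_provider}/{s.model}")
```

A check of this kind would flag only the affected topic, which matches what local inspection found.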

Why this is a bug

Automatic fallback became a persistent per-session route after a transient failure, and the selected fallback model was unavailable.

That made the affected chat unrecoverable through normal user-facing model commands.

Problems observed:

  • fallback state was persisted too aggressively;
  • unavailable fallback model trapped the session;
  • configured/default model and effective session model diverged silently;
  • /model openai/gpt-5.5 changed the visible/default model but not the effective pinned route;
  • doctor did not clearly report session-level fallback pinning;
  • raw provider error Model is unloaded was sent repeatedly instead of a recovery hint.

Expected behavior

Automatic fallback should be safe, observable, and reversible.

Fallback should not permanently pin a session unless:

  • the fallback provider/model passes a healthcheck; and
  • the user/admin explicitly accepts persistent fallback, or OpenClaw has a documented policy for persisting it.

If a fallback model fails with Model is unloaded, OpenClaw should treat that fallback as failed and do one of the following:

  • try the next configured fallback;
  • return to the configured/default model if possible; or
  • send a diagnostic recovery message instead of repeating the raw provider error.
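
The three options above can be combined into one ordered chain. This is a rough sketch under stated assumptions, not OpenClaw's routing logic: call_model is a hypothetical function that either returns a reply or raises ModelCallError, and the route list is assumed to end with the configured/default model. The /model reset command mentioned in the message is the one proposed in this issue and may not exist.

```python
from typing import Callable, List, Sequence


class ModelCallError(Exception):
    """Raised by the hypothetical call_model function when a route fails."""


def reply_with_fallback(routes: Sequence[str], call_model: Callable[[str], str]) -> str:
    """Try each route in order; return a diagnostic message if every route fails."""
    failures: List[str] = []
    for route in routes:
        try:
            return call_model(route)
        except ModelCallError as exc:
            # Treat this route as failed and move on instead of surfacing the raw error.
            failures.append(f"{route}: {exc}")
    tried = "; ".join(failures)
    return (
        f"All configured models failed ({tried}). "
        "Check the fallback provider or run /model reset."
    )


def fake_call(route: str) -> str:
    # Stub standing in for a real provider call.
    if route.startswith("lmstudio/"):
        raise ModelCallError("Model is unloaded")
    return f"hello from {route}"


reply = reply_with_fallback(
    ["lmstudio/google/gemma-4-26b-a4b", "openai/gpt-5.5"], fake_call
)
print(reply)
```

With this ordering, an unloaded LM Studio model costs one failed attempt per message rather than trapping the topic.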

Suggested fixes

Diagnostics:

  • Add openclaw sessions inspect <session-id> showing configured vs effective provider/model, override source, and last automatic fallback reason/time.
  • Add /model status for the current chat/topic.
  • Make doctor warn when active sessions are pinned to fallback/local providers.
  • Include exact reset commands in doctor output.
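
As an illustration of what the proposed inspect output could contain, here is a small sketch. The RouteReport fields follow the list above (configured vs effective route, override source, last fallback reason/time); the type, the field names, the session id, and the output layout are all assumptions, not an existing OpenClaw format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RouteReport:
    session_id: str                      # hypothetical identifier
    configured: str                      # configured/default provider/model
    effective: str                       # provider/model actually used by the session
    override_source: Optional[str]       # e.g. "automatic fallback" (assumed value)
    last_fallback_reason: Optional[str]
    last_fallback_time: Optional[str]


def render_report(r: RouteReport) -> str:
    """Render a plain-text report, warning when configured and effective routes differ."""
    lines = [
        f"session:    {r.session_id}",
        f"configured: {r.configured}",
        f"effective:  {r.effective}",
    ]
    if r.configured != r.effective:
        source = r.override_source or "unknown"
        lines.append(f"WARNING: effective route differs (override source: {source})")
    if r.last_fallback_reason:
        when = r.last_fallback_time or "unknown time"
        lines.append(f"last automatic fallback: {r.last_fallback_reason} at {when}")
    return "\n".join(lines)


report = render_report(
    RouteReport(
        session_id="example-topic",
        configured="openai/gpt-5.5",
        effective="lmstudio/google/gemma-4-26b-a4b",
        override_source="automatic fallback",
        last_fallback_reason="403 from provider",
        last_fallback_time=None,
    )
)
print(report)
```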

Recovery:

  • Add /model reset to clear current chat/topic model/provider overrides.
  • Add openclaw sessions reset-model <session-id> to clear session-level model/provider override metadata.
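
The manual recovery we performed amounts to removing the override keys from the session metadata. A minimal sketch of that operation, assuming the metadata is available as a flat dictionary and that modelProvider and model are the only override keys involved (the real storage format may differ, and the channel key is a made-up example of unrelated data):

```python
from typing import Any, Dict

# Keys observed in the affected topic metadata; treated here as the override set.
OVERRIDE_KEYS = ("modelProvider", "model")


def reset_model_override(metadata: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the metadata without model/provider override keys."""
    return {k: v for k, v in metadata.items() if k not in OVERRIDE_KEYS}


pinned_meta = {
    "modelProvider": "lmstudio",
    "model": "google/gemma-4-26b-a4b",
    "channel": "telegram",  # hypothetical unrelated field that must survive the reset
}
cleaned = reset_model_override(pinned_meta)
print(cleaned)
```

Returning a copy rather than mutating in place keeps the original metadata available for logging or rollback.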

Fallback safety:

  • Do not persist automatic fallback as a session override after a single transient provider/auth/network error.
  • Healthcheck fallback providers/models before selecting or persisting them.
  • If LM Studio or another fallback returns Model is unloaded, clear or skip that fallback route instead of repeatedly returning the raw provider error.
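
One possible shape for the persistence rule, sketched under assumptions: is_healthy is a hypothetical healthcheck callable, user_accepted stands for explicit user/admin acceptance, and the min_failures threshold is an invented example of a "documented policy", not a value OpenClaw uses.

```python
from typing import Callable


def should_persist_fallback(
    route: str,
    is_healthy: Callable[[str], bool],
    user_accepted: bool,
    consecutive_primary_failures: int,
    min_failures: int = 3,  # hypothetical policy threshold
) -> bool:
    """Decide whether an automatic fallback may be stored as a session override."""
    # Never pin a session to a route that cannot currently serve requests.
    if not is_healthy(route):
        return False
    # Explicit acceptance is enough once the route is healthy.
    if user_accepted:
        return True
    # Otherwise require repeated primary failures, so one transient error never pins.
    return consecutive_primary_failures >= min_failures


def unloaded(route: str) -> bool:
    # Stub healthcheck: pretend the LM Studio model is unloaded.
    return not route.startswith("lmstudio/")


decision = should_persist_fallback(
    "lmstudio/google/gemma-4-26b-a4b",
    is_healthy=unloaded,
    user_accepted=False,
    consecutive_primary_failures=1,
)
print(decision)
```

Under this rule the incident described above would have used the fallback for at most one request and left the session metadata untouched.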

Related

This is related in spirit to model/session reset/reconcile concerns, but it is a concrete failure mode where automatic fallback persisted an unavailable provider route and trapped one Telegram topic.

Metadata

Labels

  • P1 - High-priority user-facing bug, regression, or broken workflow.
  • clawsweeper:needs-live-repro - ClawSweeper needs live local, crabbox, or manual validation to confirm this issue.
  • clawsweeper:needs-maintainer-review - ClawSweeper marked this issue as needing maintainer review before automation.
  • clawsweeper:needs-product-decision - ClawSweeper marked this issue as needing a product or behavior decision.
  • clawsweeper:no-new-fix-pr - ClawSweeper does not recommend queueing a new automated fix PR for this issue.
  • impact:auth-provider - Auth, provider routing, model choice, or SecretRef resolution may break.
  • impact:session-state - Session, memory, transcript, context, or agent state can drift or corrupt.
  • issue-rating: 🐚 platinum hermit - Good issue quality with a plausible reproduction path needing some confirmation.
