fix(provider): retry transient 401s for a known-good key + clearer auth error (#3146) #4106

Merged: esengine merged 1 commit into main-v2 from fix/3146-mimo-transient-401 on Jun 12, 2026

@esengine (Owner)

Problem

MiMo's token-plan gateway (token-plan-cn.xiaomimimo.com) can return a transient 401, rather than a 429, under load, quota pressure, or gateway hiccups. Two things turned that into an apparent dead key (#3146):

  1. We never retry a 401 — RetryableStatus treats every 4xx as unrecoverable.
  2. Every 401 is shown as "API key is missing, wrong, or expired", so a key that's actually fine reads as expired.

Users reported a working key "expiring" mid-session. Re-entering the same key fixed it, but only because saving the key rebuilds the connection, not because the key changed. We do not delete the key anywhere; the 401 came from the server.

Fix

  • Retry transient 401s for a known-good key. SendWithRetry now takes a SendOptions with RetryAuth (true once a request on this client has succeeded) and KeyPresent. A 401/403 on a key that has authenticated before backs off and retries up to maxAuthRetries (2); a key that has never worked still fails fast, so a genuinely bad key isn't hammered.
  • Clearer message. AuthError.HasKey lets the UI distinguish:
    • no key configured → "API key is missing or unset…"
    • key sent but rejected → "the server rejected your API key… may be a transient auth/quota issue — retried with backoff and still failed; try again shortly."

Wired through both the OpenAI-compatible provider (MiMo / DeepSeek / MiniMax) and Anthropic.
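
The retry rule described above can be sketched roughly as follows. This is a minimal illustration, not the merged code: `SendOptions`, `RetryAuth`, `KeyPresent`, `AuthError.HasKey`, and `maxAuthRetries` come from the description, while `sendFunc`, `flaky`, the lowercase `sendWithRetry` signature, and the `baseDelay` parameter are assumptions made so the example runs on its own.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"time"
)

// maxAuthRetries matches the limit stated in the description (2).
const maxAuthRetries = 2

// SendOptions holds the two fields the description names.
type SendOptions struct {
	RetryAuth  bool // true once a request on this client has succeeded
	KeyPresent bool // true when an API key was sent with the request
}

// AuthError is returned for a final 401/403; HasKey is copied from KeyPresent.
type AuthError struct {
	Status int
	HasKey bool
}

func (e *AuthError) Error() string {
	return fmt.Sprintf("auth failed with status %d (key present: %t)", e.Status, e.HasKey)
}

// sendFunc is a hypothetical stand-in for one HTTP attempt.
type sendFunc func() (status int, err error)

// sendWithRetry sketches only the auth branch. baseDelay is an assumed
// parameter; the real backoff schedule is not given in the description.
func sendWithRetry(send sendFunc, opts SendOptions, baseDelay time.Duration) error {
	for authRetries := 0; ; authRetries++ {
		status, err := send()
		if err != nil {
			return err
		}
		if status != http.StatusUnauthorized && status != http.StatusForbidden {
			if status >= 400 {
				return fmt.Errorf("unexpected status %d", status)
			}
			return nil
		}
		// Fail fast for a key that never worked, or once retries are used up.
		if !opts.RetryAuth || authRetries >= maxAuthRetries {
			return &AuthError{Status: status, HasKey: opts.KeyPresent}
		}
		time.Sleep(baseDelay << uint(authRetries)) // exponential backoff
	}
}

// flaky returns 401 for the first `failures` attempts, then 200, counting calls.
func flaky(failures int, calls *int) sendFunc {
	return func() (int, error) {
		*calls++
		if *calls <= failures {
			return http.StatusUnauthorized, nil
		}
		return http.StatusOK, nil
	}
}

func main() {
	var calls int
	err := sendWithRetry(flaky(2, &calls), SendOptions{RetryAuth: true, KeyPresent: true}, time.Millisecond)
	fmt.Println("known-good key: err =", err, "calls =", calls)

	calls = 0
	err = sendWithRetry(flaky(2, &calls), SendOptions{KeyPresent: true}, time.Millisecond)
	var authErr *AuthError
	fmt.Println("never-authed key: AuthError =", errors.As(err, &authErr), "calls =", calls)
}
```

With two simulated 401s, the known-good key recovers on the third call, while the never-authed key gets an `AuthError` after a single call.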

Tests

  • SendWithRetry retries a transient 401 for a known key and recovers (3 calls); gives up after maxAuthRetries with HasKey=true; still fails fast for a never-authed key.
  • explainError selects the missing vs. server-rejected message by HasKey.
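
The message selection covered by the second test might look like the sketch below. It assumes `explainError` takes an `error` and returns a string, which the description does not state; the two message constants are shortened paraphrases of the quoted UI text, not the exact strings, and the `AuthError` type here is a pared-down stand-in.

```go
package main

import (
	"errors"
	"fmt"
)

// Shortened paraphrases of the two UI messages; the real strings are longer.
const (
	msgMissingKey  = "API key is missing or unset"
	msgRejectedKey = "the server rejected your API key; this may be a transient auth/quota issue"
)

// AuthError is a pared-down stand-in carrying only the HasKey field.
type AuthError struct {
	HasKey bool
}

func (e *AuthError) Error() string { return "authentication failed" }

// explainError picks the user-facing message. The signature is an assumption.
func explainError(err error) string {
	var authErr *AuthError
	if !errors.As(err, &authErr) {
		return err.Error() // not an auth failure: pass the error text through
	}
	if !authErr.HasKey {
		return msgMissingKey
	}
	return msgRejectedKey
}

func main() {
	cases := []struct {
		name string
		err  error
	}{
		{"no key configured", &AuthError{HasKey: false}},
		{"key sent but rejected", &AuthError{HasKey: true}},
		{"wrapped auth error", fmt.Errorf("send failed: %w", &AuthError{HasKey: true})},
		{"unrelated error", errors.New("connection reset")},
	}
	for _, c := range cases {
		fmt.Printf("%s -> %s\n", c.name, explainError(c.err))
	}
}
```

Using `errors.As` rather than a direct type assertion means the selection still works if the provider wraps the `AuthError` before it reaches the UI.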

Closes #3146

…h error

MiMo's token-plan gateway returns a transient 401 under load/quota. We
surfaced every 401 as "API key missing, wrong, or expired" and never
retried it, so a working key looked like it had "expired" — re-entering
the same key only helped because saving it rebuilt the connection.

- SendWithRetry backs off and retries a 401/403 up to maxAuthRetries when
  the key has authenticated before (SendOptions.RetryAuth); a key that has
  never worked still fails fast, so a genuinely bad key isn't hammered.
- AuthError gains HasKey so the message separates "no key configured" from
  "server rejected your key (possibly transient)".

Closes #3146
@esengine requested a review from SivanCola as a code owner on June 12, 2026 02:45
@github-actions (bot) added the v2, agent, and provider labels on Jun 12, 2026
@esengine merged commit 6edcf5a into main-v2 on Jun 12, 2026
14 checks passed
@esengine deleted the fix/3146-mimo-transient-401 branch on June 12, 2026 02:49
Labels

  • agent: Core agent loop (internal/agent, internal/control)
  • provider: Model providers & selection (internal/provider)
  • v2: Go rewrite (1.x), main-v2 branch, active development

Development

Successfully merging this pull request may close these issues.

[Bug]: Calling MIMO fails