fix: probe stale rate-limit cooldown primaries by joshavant · Pull Request #87833 · openclaw/openclaw

joshavant · 2026-05-29T00:46:30Z

Summary

Allow throttled half-open probes for stale generic rate_limit cooldowns on primary model candidates even when fallbacks are configured.
Preserve provider-recorded reset windows (blockedUntil with provider block metadata) so known future quota windows still keep the fallback path preferred until near expiry.
Update model fallback tests for the new probe behavior and the recorded-window guard.
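
The decision described in this summary can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in src/agents/model-fallback.ts: the `CooldownState` type, `decideCooldownAction`, the `skip_to_fallback` and `use_candidate` labels, and the 30-second throttle interval are all hypothetical. Only the field names `cooldownUntil`, `cooldownReason`, and `blockedUntil` and the `probe_cooldown_candidate` label appear in this PR. The sketch also leaves out the near-expiry probing of provider reset windows.

```typescript
// Hypothetical shape: the field names mirror those mentioned in the PR, the type is assumed.
interface CooldownState {
  cooldownUntil?: number;  // epoch ms, local generic cooldown
  cooldownReason?: string; // e.g. "rate_limit"
  blockedUntil?: number;   // epoch ms, provider-recorded reset window
  blockedReason?: string;  // provider block metadata, e.g. "subscription_limit"
}

type Decision = "probe_cooldown_candidate" | "skip_to_fallback" | "use_candidate";

// Assumed minimum spacing between half-open probes of the same candidate.
const MIN_PROBE_INTERVAL_MS = 30_000;
const lastProbeAt = new Map<string, number>();

// A provider-recorded window counts as active while it carries metadata and lies in the future.
function hasActiveResetWindow(state: CooldownState, now: number): boolean {
  return (
    state.blockedUntil !== undefined &&
    state.blockedReason !== undefined &&
    state.blockedUntil > now
  );
}

function decideCooldownAction(key: string, state: CooldownState, now: number): Decision {
  const inCooldown = state.cooldownUntil !== undefined && state.cooldownUntil > now;
  const windowActive = hasActiveResetWindow(state, now);
  if (!inCooldown && !windowActive) return "use_candidate";
  // Known provider quota windows keep the fallback path preferred.
  if (windowActive) return "skip_to_fallback";
  // Only generic rate_limit cooldowns are eligible for an early probe.
  if (state.cooldownReason !== "rate_limit") return "skip_to_fallback";
  // Throttle: allow at most one probe per interval for each candidate key.
  const last = lastProbeAt.get(key);
  if (last !== undefined && now - last < MIN_PROBE_INTERVAL_MS) return "skip_to_fallback";
  lastProbeAt.set(key, now);
  return "probe_cooldown_candidate";
}
```

Keeping the throttle state per candidate key means a primary that is still rate-limited costs at most one extra request per interval before traffic returns to the fallbacks.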

Verification

.agents/skills/autoreview/scripts/autoreview --mode local
- Clean: no accepted/actionable findings.
node scripts/run-vitest.mjs src/agents/model-fallback.probe.test.ts src/agents/model-fallback.test.ts -- --reporter=verbose
- 4 files passed, 170 tests passed.
AWS Crabbox live repro with Ollama Cloud:
- provider: aws
- lease: cbx_33d131797efc
- run: run_4f42fc0b825a
- result: passed

Real behavior proof

Behavior addressed: A primary model with a stale generic rate_limit cooldown could remain bypassed behind configured fallbacks even after the provider recovered.

Real environment tested: AWS Crabbox Linux runner using live Ollama Cloud credentials, Gateway, and a real ollama-cloud/gemma3:4b model request.

Exact steps or command run after this patch: seeded an agent auth profile with a 30-minute generic cooldownUntil/cooldownReason: rate_limit, configured primary ollama-cloud/gemma3:4b plus a synthetic missing fallback, started Gateway, and sent a raw Gateway agent model-run request through Ollama Cloud.

Evidence after fix: Crabbox run run_4f42fc0b825a on AWS lease cbx_33d131797efc logged decision=probe_cooldown_candidate for ollama-cloud/gemma3:4b, then decision=candidate_succeeded for the same primary model.

Observed result after fix: The primary Ollama model was probed and succeeded; the seeded cooldown state was cleared (hasCooldownUntil: false, errorCount: 0, failureCounts: null), and the script ended with live_regression_repro_passed=true.

What was not tested: Full repository test suite; this change was covered with focused model fallback tests plus the live Gateway/Ollama regression repro.
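
As a rough model of the state transition reported above, the seeded and cleared profile records might look like the sketch below. The `ProfileUsage` type and the helper names are hypothetical, and the seeded `errorCount` and `failureCounts` values are assumptions. The PR text only names the `cooldownUntil`, `cooldownReason`, `errorCount`, and `failureCounts` fields and the reported `hasCooldownUntil` flag.

```typescript
// Hypothetical auth-profile usage record; only the field names come from the PR text.
interface ProfileUsage {
  cooldownUntil?: number;
  cooldownReason?: string;
  errorCount: number;
  failureCounts: Record<string, number> | null;
}

// Seed a 30-minute generic rate_limit cooldown, as in the repro.
// The non-zero error counters are assumed for illustration.
function seedCooldown(now: number): ProfileUsage {
  return {
    cooldownUntil: now + 30 * 60 * 1000,
    cooldownReason: "rate_limit",
    errorCount: 1,
    failureCounts: { rate_limit: 1 },
  };
}

// Assumed effect of a successful probe: cooldown fields dropped, counters reset.
function clearedUsage(): ProfileUsage {
  return { errorCount: 0, failureCounts: null };
}

// Mirrors the three values quoted in the observed result.
function summarize(usage: ProfileUsage) {
  return {
    hasCooldownUntil: usage.cooldownUntil !== undefined,
    errorCount: usage.errorCount,
    failureCounts: usage.failureCounts,
  };
}
```

Calling `summarize(clearedUsage())` yields the same three values the repro logged after the primary succeeded.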

clawsweeper · 2026-05-29T00:48:03Z

Codex review: needs maintainer review before merge. Reviewed May 28, 2026, 8:52 PM ET / 00:52 UTC.

Summary
The PR changes model fallback cooldown decisions so stale generic rate-limit cooldowns on primary candidates can be half-open probed despite configured fallbacks, while active provider reset windows keep fallback preference, and updates focused tests.

PR surface: Source +38, Tests +10. Total +48 across 3 files.

Reproducibility: Yes, from source inspection and PR proof: current main skips a cooldowned primary until near expiry, while the PR body reports a live Gateway/Ollama run where a seeded generic rate_limit cooldown was probed and cleared. I did not execute the repro in this read-only review.

Review metrics: 1 noteworthy metric.

Fallback cooldown decision: 1 changed decision path. The PR changes whether a cooldowned primary model is tried before configured fallbacks for generic rate_limit state.

Merge readiness
Overall: 🐚 platinum hermit
Proof: 🦞 diamond lobster
Patch quality: 🐚 platinum hermit
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.
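
The "weaker of the two" rule can be written as a small function. This is a sketch of the rule as stated, not ClawSweeper's code: `RANKS` and `overallRank` are hypothetical names, the ordering follows the rank legend in this review, and the "off-meta tidepool" rank is left out because it marks items the rating does not apply to.

```typescript
// Ranks ordered weakest to strongest, following the legend in this review.
const RANKS = [
  "unranked krab",
  "silver shellfish",
  "gold shrimp",
  "platinum hermit",
  "diamond lobster",
  "challenger crab",
] as const;
type Rank = (typeof RANKS)[number];

// Overall readiness is whichever of the two ratings sits lower in the ordering.
function overallRank(proof: Rank, patchQuality: Rank): Rank {
  return RANKS.indexOf(proof) <= RANKS.indexOf(patchQuality) ? proof : patchQuality;
}
```

With proof at diamond lobster and patch quality at platinum hermit, the function returns platinum hermit, which matches the overall rating reported here.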

Rank-up moves:

Have a maintainer explicitly accept the generic rate-limit primary-probe-before-fallback tradeoff under the protected label.

Risk before merge

[P2] Merging intentionally changes fallback preference for generic rate_limit cooldowns: existing fallback chains may spend one throttled primary probe before using fallbacks when the local cooldown is still active.

Maintainer options:

Accept the bounded probe tradeoff (recommended)
Merge if maintainers agree that a throttled primary probe for generic rate_limit cooldowns is preferable to staying on fallbacks until the local cooldown nears expiry.
Ask for broader provider proof
If maintainers are not comfortable generalizing from the focused tests and Ollama Cloud live repro, request an additional multi-profile or non-Ollama proof before merge.
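
To make the cost of the tradeoff concrete, here is a toy fallback chain. The `Candidate` type and `runChain` are hypothetical, and `call` stands in for a real model request. It only illustrates that a probe of a still-limited primary adds one failed attempt before the fallback serves the request.

```typescript
interface Candidate {
  model: string;
  call: () => boolean; // true on success; stands in for a real model request
}

// Try candidates in order and record each attempt, stopping at the first success.
function runChain(candidates: Candidate[]): { winner: string | null; attempts: string[] } {
  const attempts: string[] = [];
  for (const c of candidates) {
    attempts.push(c.model);
    if (c.call()) return { winner: c.model, attempts };
  }
  return { winner: null, attempts };
}

// Primary still rate-limited: the probe costs one failed attempt, then the fallback serves.
const stillLimited = runChain([
  { model: "primary", call: () => false },
  { model: "fallback", call: () => true },
]);

// Primary recovered: the probe succeeds and the fallback is never called.
const recovered = runChain([
  { model: "primary", call: () => true },
  { model: "fallback", call: () => true },
]);
```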

Next step before merge

No ClawSweeper repair is needed; maintainer review should decide the protected fallback-policy tradeoff and normal merge readiness.

Security
Cleared: The diff only touches model fallback runtime and tests; it does not change dependencies, workflows, credentials, scripts, or package resolution.

Review details

Best possible solution:

Land the focused fallback-policy fix after a maintainer accepts the bounded primary-probe tradeoff and required checks pass, keeping the provider reset-window guard intact.

Do we have a high-confidence way to reproduce the issue?

Yes, from source inspection and PR proof: current main skips a cooldowned primary until near expiry, while the PR body reports a live Gateway/Ollama run where a seeded generic rate_limit cooldown was probed and cleared. I did not execute the repro in this read-only review.

Is this the best way to solve the issue?

Yes, the implementation is a narrow fallback-decision change with focused tests for stale generic rate_limit cooldowns and provider-recorded reset windows. The remaining question is whether maintainers accept the fallback-policy tradeoff before merge.

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 61cf005437fd.

Label changes

add P1: The linked bug blocks real agent/message workflows after recovered provider 429s and the PR targets that urgent fallback/auth-provider path.
add merge-risk: 🚨 auth-provider: The diff changes provider routing and model-choice behavior while auth-profile cooldown state is active.
add proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes structured post-patch AWS Crabbox live Gateway/Ollama proof with exact setup, run and lease IDs, decision logs, and cleared cooldown state.
add rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🦞 diamond lobster and patch quality is 🐚 platinum hermit.
add status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (logs): The PR body includes structured post-patch AWS Crabbox live Gateway/Ollama proof with exact setup, run and lease IDs, decision logs, and cleared cooldown state.

Evidence reviewed

PR surface:

Area	Files	Added	Removed	Net
Source	1	44	6	+38
Tests	2	40	30	+10
Docs	0	0	0	0
Config	0	0	0	0
Generated	0	0	0	0
Other	0	0	0	0
Total	3	84	36	+48

What I checked:

Current main behavior: Current main only probes a primary candidate with configured fallbacks when the cooldown is expired or within PROBE_MARGIN_MS, so a 30-minute generic rate_limit cooldown remains skipped behind fallbacks. (src/agents/model-fallback.ts:946, 61cf005437fd)
PR implementation: The PR adds hasActiveProviderRateLimitResetWindow and passes the inferred cooldown reason into shouldProbePrimaryDuringCooldown so generic rate_limit cooldowns can probe immediately unless a provider-recorded blockedUntil reset window is still active. (src/agents/model-fallback.ts:947, fccb2f576541)
Focused test coverage: The probe tests now assert that a far-from-expiry generic rate_limit primary is probed and that a blockedUntil window with subscription_limit metadata still suspends instead of probing. (src/agents/model-fallback.probe.test.ts:343, fccb2f576541)
Provider reset-window contract: The existing auth usage code records blockedUntil with blockedReason, blockedSource, and blockedModel for provider quota windows, which matches the PR's guard for authoritative provider reset windows. (src/agents/auth-profiles/usage.ts:795, 61cf005437fd)
Real behavior proof in PR body: The PR body reports an AWS Crabbox live Gateway/Ollama Cloud repro, run_4f42fc0b825a on lease cbx_33d131797efc, with probe_cooldown_candidate followed by candidate_succeeded and cleared cooldown state. (fccb2f576541)
Not already implemented on main: The PR head is not an ancestor of the current main checkout, so the branch still carries the proposed fix rather than being obsolete. (fccb2f576541)
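
The timing difference between the two behaviors checked above can be sketched as a pair of predicates. This is an approximation for illustration only: `mainWouldProbe` and `prWouldProbe` are hypothetical names, the 2-minute value for `PROBE_MARGIN_MS` is assumed because the real constant's value is not shown on this page, the near-expiry handling of `blockedUntil` is also an assumption, and probe throttling is omitted.

```typescript
// Assumed value; the real PROBE_MARGIN_MS constant is not shown in this PR page.
const PROBE_MARGIN_MS = 2 * 60 * 1000;

// Behavior attributed to current main: with fallbacks configured, probe the primary
// only once the cooldown has expired or is within the margin of expiring.
function mainWouldProbe(cooldownUntil: number, now: number): boolean {
  return now >= cooldownUntil - PROBE_MARGIN_MS;
}

// Behavior attributed to the PR: generic rate_limit cooldowns may be probed right away,
// unless a provider-recorded blockedUntil window is still in the future.
function prWouldProbe(
  reason: string,
  cooldownUntil: number,
  blockedUntil: number | undefined,
  now: number,
): boolean {
  if (blockedUntil !== undefined && blockedUntil > now) {
    // Assumed: an active provider window is only probed near its own expiry.
    return now >= blockedUntil - PROBE_MARGIN_MS;
  }
  if (reason === "rate_limit") return true;
  return mainWouldProbe(cooldownUntil, now);
}

const MINUTE = 60 * 1000;
// A 30-minute generic cooldown seeded at time 0, checked immediately.
const onMain = mainWouldProbe(30 * MINUTE, 0);
const onPr = prWouldProbe("rate_limit", 30 * MINUTE, undefined, 0);
const onPrWithWindow = prWouldProbe("rate_limit", 30 * MINUTE, 60 * MINUTE, 0);
```

Under these assumptions `onMain` is false while `onPr` is true, and `onPrWithWindow` is false because the provider window is still far from expiry.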

Likely related people:

Peter Steinberger: git log shows many recent model fallback/auth test and policy commits in the touched files, including shared probe assertions and fallback decision logging work. (role: heavy recent area contributor; confidence: high; commits: 6a87d6e81426, 51c6b1c2bc56, 6739c28718ec; files: src/agents/model-fallback.ts, src/agents/model-fallback.probe.test.ts, src/agents/model-fallback.test.ts)
Ítalo Souza: Commit 39bb1b3 added the earlier auto-recovery path for primary models after rate-limit cooldown expiry and added the probe regression tests this PR extends. (role: introduced related recovery behavior; confidence: high; commits: 39bb1b33222e; files: src/agents/model-fallback.ts, src/agents/model-fallback.probe.test.ts, src/agents/auth-profiles/usage.ts)
sebslight: Commit d224776 extracted the cooldown probe decision helper that this PR modifies, and the earlier recovery commit records sebslight as reviewer/co-author. (role: recent refactor/reviewer; confidence: high; commits: d224776ffbb1, 39bb1b33222e; files: src/agents/model-fallback.ts, src/agents/model-fallback.probe.test.ts)
Vignesh Natarajan: Recent history includes explicit rate-limit cooldown probe behavior and unavailable-reason inference work in the same fallback path. (role: adjacent cooldown probe contributor; confidence: medium; commits: d45353f95b57, 5c7c37a02a3b; files: src/agents/model-fallback.ts, src/agents/model-fallback.probe.test.ts, src/agents/model-fallback.test.ts)

What the crustacean ranks mean

🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works

ClawSweeper keeps one durable marker-backed review comment per issue or PR.
Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
Maintainers can also comment @clawsweeper review to request a fresh review only.
Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

fix: probe stale rate-limit cooldown primaries

fccb2f5

openclaw-barnacle Bot added the agents (Agent runtime and tooling), size: S, and maintainer (Maintainer-authored PR) labels May 29, 2026

joshavant merged commit 92051f6 into main May 29, 2026
158 of 168 checks passed

joshavant deleted the fix/ollama-cooldown-half-open branch May 29, 2026 01:11

joshavant mentioned this pull request May 29, 2026

[Bug] Ollama Cloud rate-limit cooldown permanently blocks agents — not released after API recovery #87608

Closed

Haderach-Ram mentioned this pull request May 29, 2026

Ecosystem Digest — 2026-05-29 Haderach-Ram/openclaw-radar#22

Open

github-actions Bot mentioned this pull request May 29, 2026

📡 Upstream Digest — 2026-05-29 02:31 UTC curtismercier/openclaw-mods#968

Open

github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 29, 2026

fix: probe stale rate-limit cooldown primaries (openclaw#87833)

cd3092c

SYU8384 pushed a commit to SYU8384/openclaw that referenced this pull request Jun 3, 2026

fix: probe stale rate-limit cooldown primaries (openclaw#87833)

bd50f72

sablehead pushed a commit to sablehead/openclaw that referenced this pull request Jun 10, 2026

fix: probe stale rate-limit cooldown primaries (openclaw#87833)

6ed4fc7
