fix(agents): re-probe single-provider primary during cooldown by 849261680 · Pull Request #90717 · openclaw/openclaw

849261680 · 2026-06-05T16:46:51Z

Summary

With fallbacks: [] configured, a rate- or subscription-limited primary stayed suspended until the provider-reported blockedUntil timestamp arrived, which can be days away for subscription caps (e.g. "Next reset in 6 days"). Root cause: shouldProbePrimaryDuringCooldown returned false on !hasFallbackCandidates before checking any recovery condition, so a single-provider agent stayed silent for days even after the rolling cap had recovered.

What changed

Split the early-return guard in shouldProbePrimaryDuringCooldown (src/agents/model-fallback.ts): a non-primary candidate still returns false immediately, but a single-provider primary (!hasFallbackCandidates) now returns true (probe allowed) while the 30-second throttle slot is open. A single-provider primary has no fallback chain to prefer, so "is the primary callable yet?" is a recovery question independent of fallback configuration.
Simplified the billing branch in resolveCooldownDecision: removed the duplicated single-provider probe check (shouldProbeSingleProviderBilling); the unified shouldProbe now covers single-provider recovery for all reasons including billing. Net: one fewer special-case branch.
Updated the existing case that asserted the primary was never invoked with fallbacks: [] + rate-limit cooldown — it now expects the primary to be probed and succeed.
Added a regression case proving a far-future subscription_limit block with fallbacks: [] yields a probe attempt rather than suspend_lanes, while the 30-second probe throttle is still honored.

Multi-fallback setups are unchanged: they keep preferring the fallback chain and only probe the primary near cooldown expiry.
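
The split guard described above can be sketched as a small pure function. This is an illustrative sketch, not the code in src/agents/model-fallback.ts: isPrimary and hasFallbackCandidates are named in the PR, while throttleSlotOpen, nearCooldownExpiry, and the position of the throttle check are assumptions made to keep the example self-contained.

```typescript
// Hypothetical parameter shape. Only isPrimary and hasFallbackCandidates
// appear in the PR; the other two fields stand in for the real checks.
interface ProbeParams {
  isPrimary: boolean;
  hasFallbackCandidates: boolean;
  throttleSlotOpen: boolean; // assumed: true when no probe ran in the last 30 s
  nearCooldownExpiry: boolean; // assumed: outcome of the multi-fallback expiry check
}

function shouldProbeSketch(params: ProbeParams): boolean {
  // Non-primary candidates are never probed during cooldown.
  if (!params.isPrimary) {
    return false;
  }
  // The throttle gates every probe (ordering assumed for this sketch).
  if (!params.throttleSlotOpen) {
    return false;
  }
  // Single-provider primary: there is no fallback chain to prefer, so probe.
  if (!params.hasFallbackCandidates) {
    return true;
  }
  // Multi-fallback: keep preferring fallbacks until the cooldown is nearly over.
  return params.nearCooldownExpiry;
}
```

Under these assumptions, a single-provider primary with an open throttle slot is probed regardless of how far away blockedUntil is, while the multi-fallback branch behaves as before.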

Verification

Focused suites pass (src/agents/model-fallback.probe.test.ts: 14 tests; model-fallback.test.ts: 76; auth-profiles.cooldown-auto-expiry.test.ts: 6); tsgo -p tsconfig.core.json exits 0; oxfmt --check is clean. These supplement the runtime proof below.

Real behavior proof

Behavior addressed: With a single configured provider (fallbacks: []) whose only auth profile carries a far-future subscription_limit block, the model-fallback layer now re-probes the primary and serves the recovered call, instead of suspending lanes until blockedUntil.

Real environment tested: Local OpenClaw runtime, Node 22.22.2, isolated OPENCLAW_HOME, driving the production runWithModelFallback path. A real on-disk auth-profile store was seeded with the exact reported state (blockedUntil = now + 6 days, blockedReason: "subscription_limit", blockedSource: "wham", fallbacks: []); the real auth runtime loaded it (isProfileInCooldown(openai:acct-123)=true).
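
For readers who want to picture the seeded state, a minimal sketch follows. The field names blockedUntil, blockedReason, and blockedSource and their values come from the description above; the record shape, the epoch-millisecond unit, and the helper names are assumptions, and the real on-disk store layout is not shown in this PR.

```typescript
// Hypothetical, simplified record. The real auth-profile store holds more fields.
interface SeededProfileState {
  blockedUntil: number; // assumed unit: epoch milliseconds
  blockedReason: string;
  blockedSource: string;
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Builds the reported state: a subscription block that lifts six days from now.
function buildSeededState(now: number): SeededProfileState {
  return {
    blockedUntil: now + 6 * DAY_MS,
    blockedReason: "subscription_limit",
    blockedSource: "wham",
  };
}

// Whole days remaining until the provider-reported reset, never negative.
function daysUntilReset(state: SeededProfileState, now: number): number {
  return Math.max(0, Math.ceil((state.blockedUntil - now) / DAY_MS));
}
```

With this shape, daysUntilReset(buildSeededState(now), now) evaluates to 6, matching the "Next reset in 6 days" case from the report.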

Exact steps or command run after this patch: on the fixed branch, OPENCLAW_HOME=$(mktemp -d) node --import tsx repro-90702.mts; then for contrast git checkout origin/main -- src/agents/model-fallback.ts and rerun, then git checkout HEAD -- src/agents/model-fallback.ts.

Evidence after fix: console output / redacted runtime log captured from the local node OpenClaw runtime on the fixed branch — the primary is probed and serves the call:

[model-fallback/decision] decision=probe_cooldown_candidate requested=openai/gpt-4.1-mini candidate=openai/gpt-4.1-mini reason=rate_limit next=none
>>> upstream invoked: openai/gpt-4.1-mini (rolling cap recovered) -> OK
[model-fallback/decision] decision=candidate_succeeded requested=openai/gpt-4.1-mini candidate=openai/gpt-4.1-mini next=none
>>> runWithModelFallback result = "RECOVERED-OK"

Same seeded state on current main (before the change) — the primary is never called, matching the reporter's logs:

[model-fallback/decision] decision=skip_candidate requested=openai/gpt-4.1-mini candidate=openai/gpt-4.1-mini reason=rate_limit next=none detail=Provider openai is in cooldown (suspending lanes)
FallbackSummaryError: All models failed (1): openai/gpt-4.1-mini: Provider openai is in cooldown (suspending lanes) (rate_limit)

Observed result after fix: the cooldown decision flips from skip_candidate / suspend_lanes (the bug: indefinite silence and All models failed) to probe_cooldown_candidate; the primary is actually invoked and the recovered rolling cap resumes serving (RECOVERED-OK). The 30-second probe throttle is still honored, so recovery probing cannot hammer the upstream.
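
To compare the before and after logs mechanically, the decision lines can be split into key/value fields. The parser below is a hypothetical helper written for this illustration, not part of OpenClaw; it assumes only the line format visible in the output above.

```typescript
// Parses a line such as:
//   [model-fallback/decision] decision=skip_candidate requested=... reason=rate_limit next=none
// Returns null for lines that do not carry the decision prefix.
function parseDecisionLine(line: string): Record<string, string> | null {
  const prefix = "[model-fallback/decision] ";
  if (!line.startsWith(prefix)) {
    return null;
  }
  const fields: Record<string, string> = {};
  // Split only on whitespace that precedes "key=", so a free-text value
  // such as detail=... keeps its internal spaces.
  for (const part of line.slice(prefix.length).split(/\s+(?=[A-Za-z_]+=)/)) {
    const eq = part.indexOf("=");
    if (eq > 0) {
      fields[part.slice(0, eq)] = part.slice(eq + 1);
    }
  }
  return fields;
}
```

Applied to the two runs above, the decision field reads skip_candidate on main and probe_cooldown_candidate on the fixed branch, which is the behavioral difference this PR claims.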

What was not tested: a fully live OAuth/WHAM run against a real ChatGPT subscription in a hard-capped state. The decision pipeline and auth-profile cooldown reads exercised here are the production code paths; only the auth-profile store contents were seeded to reproduce the reported block.


clawsweeper · 2026-06-05T16:49:46Z

Codex review: needs maintainer review before merge. Reviewed June 5, 2026, 1:29 PM ET / 17:29 UTC.

Summary
The PR lets a primary model with no fallback candidates re-probe during an open cooldown throttle slot and updates model-fallback regression tests for rate-limit and subscription-limit cooldowns.

PR surface: Source +6, Tests +42. Total +48 across 2 files.

Reproducibility: yes. The current-main source and latest release both show the !hasFallbackCandidates early return before the probe logic, and the PR body supplies before/after terminal output for the production fallback path with seeded auth-profile state.

Review metrics: none identified.

Merge readiness
Overall: 🐚 platinum hermit
Proof: 🐚 platinum hermit
Patch quality: 🦞 diamond lobster
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

none.

Risk before merge

[P2] A fully live hard-capped ChatGPT subscription recovery cycle was not exercised; the supplied proof covers the production fallback decision path with seeded on-disk auth-profile state and redacted terminal output.

Maintainer options:

Decide the mitigation before merge
Land the narrow fallback decision change with its regression coverage after normal CI, keeping the existing per-provider 30-second throttle and multi-fallback reset-window behavior intact.
Pause or close
Do not merge this PR until maintainers decide whether the risk is worth taking.

Next step before merge

No automated repair is needed; the patch is reviewable as-is and the remaining action is maintainer acceptance plus normal merge gates.

Security
Cleared: The diff only changes fallback cooldown control flow and tests; it does not add dependencies, workflows, secret handling, or new code-execution paths.

Review details

Best possible solution:

Land the narrow fallback decision change with its regression coverage after normal CI, keeping the existing per-provider 30-second throttle and multi-fallback reset-window behavior intact.

Do we have a high-confidence way to reproduce the issue?

Yes. The current-main source and latest release both show the !hasFallbackCandidates early return before the probe logic, and the PR body supplies before/after terminal output for the production fallback path with seeded auth-profile state.

Is this the best way to solve the issue?

Yes. Reusing the existing cooldown throttle for no-fallback primaries is narrower than changing stored provider reset timestamps or adding a config knob, and it preserves the multi-fallback preference logic.

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 21aa297434e4.

Label changes

Label changes:

add P1: The linked bug can keep single-provider agents silent for days after a subscription cap, blocking scheduled and channel replies for real users.
add proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes after-fix terminal runtime output from the production fallback path with a seeded on-disk auth-profile store, plus contrasting current-main output.
add rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🐚 platinum hermit and patch quality is 🦞 diamond lobster.
remove rating: 🦐 gold shrimp: Current PR rating is rating: 🐚 platinum hermit, so this older rating label is no longer current.

Label justifications:

P1: The linked bug can keep single-provider agents silent for days after a subscription cap, blocking scheduled and channel replies for real users.
rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🐚 platinum hermit and patch quality is 🦞 diamond lobster.
status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (terminal): The PR body includes after-fix terminal runtime output from the production fallback path with a seeded on-disk auth-profile store, plus contrasting current-main output.
proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes after-fix terminal runtime output from the production fallback path with a seeded on-disk auth-profile store, plus contrasting current-main output.

Evidence reviewed

PR surface:

Source +6, Tests +42. Total +48 across 2 files.

View PR surface stats

Area	Files	Added	Removed	Net
Source	1	15	9	+6
Tests	1	56	14	+42
Docs	0	0	0	0
Config	0	0	0	0
Generated	0	0	0	0
Other	0	0	0	0
Total	2	71	23	+48

What I checked:

Current-main bug still present: On current main, shouldProbePrimaryDuringCooldown returns false when !params.hasFallbackCandidates before checking the throttle or recovery conditions, so a single-provider primary cannot re-probe while its profile has an active cooldown. (src/agents/model-fallback.ts:1063, 21aa297434e4)
Latest release has the same behavior: The latest release tag still has the !params.hasFallbackCandidates early return, so the central bug is not already shipped-fixed. (src/agents/model-fallback.ts:1060, 2e08f0f4221f)
PR merge result fixes the gate narrowly: In the PR merge result, non-primary candidates still return false, but a primary with no fallback candidates returns true after the existing probe throttle opens; multi-fallback logic still proceeds to the existing soonest-cooldown checks. (src/agents/model-fallback.ts:1063, 93a76ad466df)
Regression coverage covers the reported shape: The added regression case seeds a far-future subscription_limit block with fallbacks: [], expects an attempt with markProbe: true, and verifies the recent-probe throttle still suspends. (src/agents/model-fallback.probe.test.ts:363, 93a76ad466df)
Runtime-path coverage checks the actual fallback call: The updated runWithModelFallback case uses a single configured primary, far-future cooldown, and fallbacksOverride: [], then expects the primary run to be called with allowTransientCooldownProbe: true. (src/agents/model-fallback.probe.test.ts:727, 93a76ad466df)
OpenClaw auth-store contract supports the repro: OpenClaw reads WHAM usage data, stores active WHAM reset windows as blockedUntil with blockedReason: "subscription_limit", and resolves active blockedUntil windows as rate_limit, matching the linked bug's stored state. (src/agents/auth-profiles/usage.ts:82, 21aa297434e4)
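
The auth-store contract in the last item explains why the logs report reason=rate_limit for a profile stored with blockedReason: "subscription_limit". A minimal sketch of that mapping follows; the function name, types, and millisecond unit are assumptions for illustration and do not reproduce src/agents/auth-profiles/usage.ts.

```typescript
// Hypothetical, simplified view of a stored profile block.
interface ProfileBlock {
  blockedUntil?: number; // assumed unit: epoch milliseconds
  blockedReason?: string; // e.g. "subscription_limit"
}

type ResolvedReason = "rate_limit" | null;

// An active blockedUntil window resolves as rate_limit regardless of the
// stored blockedReason; an expired or absent window resolves to no reason.
function resolveActiveReason(profile: ProfileBlock, now: number): ResolvedReason {
  if (profile.blockedUntil !== undefined && profile.blockedUntil > now) {
    return "rate_limit";
  }
  return null;
}
```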

Likely related people:

steipete: git shortlog shows the heaviest recent authorship on src/agents/model-fallback.ts and the probe tests, with recent adjacent refactors and test maintenance in this area. (role: recent area contributor; confidence: high; commits: cb5bb9b936bb, 6a87d6e81426, dcc3392a1a40; files: src/agents/model-fallback.ts, src/agents/model-fallback.probe.test.ts)
altaywtf: The single-provider billing cooldown probe path was introduced in commit 0669b0d, and later same-provider cooldown probe work lists the same handle as co-author/reviewer. (role: adjacent behavior owner; confidence: medium; commits: 0669b0ddc265, 048e25c2b21d; files: src/agents/model-fallback.ts, src/agents/model-fallback.probe.test.ts)
sebslight: The shouldProbePrimaryDuringCooldown helper was extracted in commit d224776, and the original primary cooldown recovery PR lists this handle as reviewer/co-author. (role: helper extraction and reviewer; confidence: medium; commits: d224776ffbb1, 39bb1b33222e; files: src/agents/model-fallback.ts, src/agents/model-fallback.probe.test.ts)

What the crustacean ranks mean

🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works

ClawSweeper keeps one durable marker-backed review comment per issue or PR.
Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
Maintainers can also comment @clawsweeper review to request a fresh review only.
Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

849261680 · 2026-06-05T16:57:00Z

Source-level behavior proof

This fix is a pure control-flow change in shouldProbePrimaryDuringCooldown. Both the bug and the fix are reproducible from source, per the ClawSweeper review. A multi-day Codex subscription cap exhaustion and recovery cycle cannot easily be reproduced in a live environment.

Before (origin/main, src/agents/model-fallback.ts:1063):

if (!params.isPrimary || !params.hasFallbackCandidates) {
  return false;
}

→ fallbacks:[] → hasFallbackCandidates=false → returns false → primary never re-probed → agent silent for days.

After (this PR):

if (!params.isPrimary) {
  return false;
}
// Single-provider recovery probe — #90702
if (!params.hasFallbackCandidates) {
  return true;  // probe allowed when throttle slot is open
}

→ fallbacks:[] → returns true → primary re-probed on 30s throttle → recovers when upstream is callable.

Unchanged paths verified:

Non-primary (isPrimary=false): still returns false immediately.
Multi-fallback (hasFallbackCandidates=true): still goes through soonest / PROBE_MARGIN_MS logic unchanged.
Billing branch: simplified but equivalent — shouldProbe now handles single-provider recovery for all reasons.
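
For contrast with the single-provider branch, the multi-fallback rule of probing the primary only near cooldown expiry can be sketched as below. PROBE_MARGIN_MS is named above, but its real value is not shown in this PR, so the constant here is a placeholder; treating "soonest" as the earliest cooldown expiry in epoch milliseconds is also an assumption.

```typescript
// Placeholder margin for illustration only; the real PROBE_MARGIN_MS value
// is not given in the PR text.
const PROBE_MARGIN_MS_ASSUMED = 2 * 60 * 1000;

// soonestExpiry: assumed to be the earliest cooldown expiry for the provider.
// Returns true once "now" is within marginMs of that expiry (or past it).
function isNearCooldownExpiry(
  now: number,
  soonestExpiry: number,
  marginMs: number = PROBE_MARGIN_MS_ASSUMED,
): boolean {
  return now >= soonestExpiry - marginMs;
}
```

A block that expires six days out fails this check for almost its whole duration, which is acceptable when a fallback chain can serve requests but left a fallbacks: [] agent with nothing to call.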

Test coverage (14 passed):

re-probes a single-provider primary blocked by a far-future subscription_limit (#90702) — unit-level resolveCooldownDecision returns { type: "attempt" } not { type: "suspend_lanes" }.
re-probes a single-provider rate-limited primary instead of suspending — integration-level runWithModelFallback calls model run with allowTransientCooldownProbe: true.
30s throttle honored: test confirms a recent probe key results in suspend_lanes.
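
The throttle behavior these tests rely on can be pictured with a small in-memory tracker. This is a sketch under stated assumptions: the class, the key format, and the function name are hypothetical, and only the 30-second window and the attempt / suspend_lanes decision types come from the PR.

```typescript
const PROBE_THROTTLE_MS = 30_000; // the 30-second throttle described in the PR

// Hypothetical tracker: remembers when each key was last probed.
class ProbeThrottle {
  private lastProbeAt = new Map<string, number>();

  // True when the key has not been probed within the throttle window.
  isOpen(key: string, now: number): boolean {
    const last = this.lastProbeAt.get(key);
    return last === undefined || now - last >= PROBE_THROTTLE_MS;
  }

  markProbe(key: string, now: number): void {
    this.lastProbeAt.set(key, now);
  }
}

type Decision = { type: "attempt"; markProbe: true } | { type: "suspend_lanes" };

// Single-provider case: attempt when the slot is open, otherwise suspend.
function decideSingleProvider(throttle: ProbeThrottle, key: string, now: number): Decision {
  if (!throttle.isOpen(key, now)) {
    return { type: "suspend_lanes" };
  }
  return { type: "attempt", markProbe: true };
}
```

In this model a blocked single-provider primary is retried at most once per 30 seconds per key, so recovery probing stays bounded even when blockedUntil is days away.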

This is a proof: override case — maintainer verification requested.

sallyom · 2026-06-05T21:20:02Z

Land-ready maintainer review:

Reviewed PR #90717 for issue #90702 ("[Bug]: blockedUntil for subscription_limit set far in the future never re-probes when no fallback is configured") as a narrow agent fallback/auth cooldown fix.
No review findings found.
Verified current checks are green, including Real behavior proof and agent-runtime-boundary.
Local proof run on PR head: node scripts/run-vitest.mjs src/agents/model-fallback.probe.test.ts.
Local whitespace proof: git diff --check.
Stale-base proof: cherry-picked 6fa95d6143dc217eea6d0db950030788098d8303 cleanly onto current origin/main (c965141d67) and reran node scripts/run-vitest.mjs src/agents/model-fallback.probe.test.ts there.
Dependency contract checked directly in sibling Codex source: ../codex/codex-rs/protocol/src/protocol.rs, ../codex/codex-rs/backend-client/src/client.rs, and ../codex/codex-rs/app-server/README.md for rate-limit reset/window semantics.

Known gap: I did not personally run a live OAuth/WHAM exhausted account; the PR’s seeded on-disk proof plus focused tests and CI are sufficient for this narrow merge.

Selective sync from openclaw/openclaw. Applied one self-contained runtime fix; remaining recent high-value upstream fixes were assessed and found inapplicable to the fork's older/refactored baseline.

- Agents/model-fallback: re-probe single-provider primary during cooldown so a fallbacks:[] setup (the common local-model configuration) recovers from rate/subscription caps without waiting for a far-future provider-reported reset. The existing 30s probe throttle still gates recovery probes. (upstream 6da3b1f, openclaw#90717, fixes openclaw#90702)

…aw#90717)

Fixes openclaw#90702. Allow a single-provider primary to periodically probe through the existing cooldown throttle even when no fallback chain is configured. This lets WHAM/subscription-limit cooldown state recover without waiting for a far-future provider reset timestamp.

Verified:
- node scripts/run-vitest.mjs src/agents/model-fallback.probe.test.ts
- git diff --check
- cherry-pick onto current origin/main and rerun focused regression

…26.6.5) (#963) This PR contains the following updates: | Package | Update | Change | |---|---|---| | [ghcr.io/openclaw/openclaw](https://openclaw.ai) ([source](https://github.com/openclaw/openclaw)) | patch | `2026.6.1` → `2026.6.5` | --- ### Release Notes <details> <summary>openclaw/openclaw (ghcr.io/openclaw/openclaw)</summary> ### [`v2026.6.5`](https://github.com/openclaw/openclaw/blob/HEAD/CHANGELOG.md#202665) [Compare Source](openclaw/openclaw@v2026.6.1...v2026.6.5) ##### Highlights - QQBot now strips model reasoning/thinking scaffolding before native delivery, preventing raw `<thinking>` content from leaking into channel replies. ([#89913](openclaw/openclaw#89913), [#90132](openclaw/openclaw#90132)) Thanks [@openperf](https://github.com/openperf). - MCP tool results now coerce `resource_link`, `resource`, `audio`, malformed image, and future non-text/image blocks at the materialize boundary, preventing Anthropic 400s and poisoned session history after a tool returns richer MCP content. ([#90710](openclaw/openclaw#90710), [#90728](openclaw/openclaw#90728)) Thanks [@RanSHammer](https://github.com/RanSHammer) and [@849261680](https://github.com/849261680). - Anthropic extended-thinking sessions recover after prompt-cache expiry or Gateway restart because stream start events wait for `message_start`, letting pre-generation signature errors trigger the existing recovery retry. ([#90667](openclaw/openclaw#90667), [#90697](openclaw/openclaw#90697)) Thanks [@openperf](https://github.com/openperf). - Parallel is now a bundled `web_search` provider with `PARALLEL_API_KEY` discovery, guarded endpoint handling, cache-safe session ids, onboarding picker support, and docs. ([#85158](openclaw/openclaw#85158)) Thanks [@NormallyGaussian](https://github.com/NormallyGaussian). - Google Vertex ADC users get static catalog rows and runtime model resolution again, while single-provider cooldown recovery and memory adapter status checks are more reliable. 
([#90506](openclaw/openclaw#90506), [#90609](openclaw/openclaw#90609), [#90717](openclaw/openclaw#90717), [#90816](openclaw/openclaw#90816)) Thanks [@849261680](https://github.com/849261680). - Matrix can preflight voice notes before mention gating, preserve thread reads/replies through Matrix relations pagination, and carry QA coverage for voice and thread flows. ([#78016](openclaw/openclaw#78016), [#90415](openclaw/openclaw#90415)) - Auth and plugin install state is more durable: auth profiles now live in SQLite, official npm plugin install records keep their trusted pins, and prerelease fallback integrity checks avoid carrying stale integrity forward. ([#89102](openclaw/openclaw#89102), [#88585](openclaw/openclaw#88585)) - macOS node mode no longer silently self-reconnects away from a healthy direct Gateway session, reducing unexpected companion app session churn. ([#90668](openclaw/openclaw#90668), [#90815](openclaw/openclaw#90815)) Thanks [@vrurg](https://github.com/vrurg). - Upgrade and service paths are safer: cron legacy JSON stores migrate during doctor preflight, service env placeholders no longer mask state-dir secrets, WhatsApp startup waits are bounded, and disabled WhatsApp accounts tear down on config reload. ([#90072](openclaw/openclaw#90072), [#90208](openclaw/openclaw#90208), [#90277](openclaw/openclaw#90277), [#90488](openclaw/openclaw#90488), [#90486](openclaw/openclaw#90486), [#87951](openclaw/openclaw#87951), [#87965](openclaw/openclaw#87965)) Thanks [@MonkeyLeeT](https://github.com/MonkeyLeeT), [@sallyom](https://github.com/sallyom), [@mcaxtr](https://github.com/mcaxtr), and [@MukundaKatta](https://github.com/MukundaKatta). ##### Changes - Search/providers: add the Parallel bundled web-search plugin, live provider tests, registration contracts, onboarding/docs wiring, and guarded `api.parallel.ai/v1/search` support. ([#85158](openclaw/openclaw#85158)) Thanks [@NormallyGaussian](https://github.com/NormallyGaussian). 
- Matrix/channels: add voice-message preflight and thread-aware read/reply behavior, including Matrix QA scenario wiring and docs for voice-message behavior. ([#78016](openclaw/openclaw#78016), [#90415](openclaw/openclaw#90415)) - Skills/ClawHub: install ClawHub skills backed by GitHub repositories through the resolved install API, download the pinned GitHub commit, keep install-policy checks, and report install telemetry after success. ([#90478](openclaw/openclaw#90478)) Thanks [@Patrick-Erichsen](https://github.com/Patrick-Erichsen). - Google Chat/channels: add native approval card actions and click handling so Google Chat approvals use platform-native cards instead of generic message flow. - Mobile: Android provider/model screens now surface expiring, unavailable, unresolved, and attention states more clearly, while iOS settings and Talk tabs keep diagnostics, gateway rows, attachment labels, and unavailable Talk controls reachable. - Memory: QMD search can use the new rerank toggle, and memory adapter status uses the resolved default model identity when checking plain status. ([#61834](openclaw/openclaw#61834)) - Docs/tooling: add Parallel search docs, refresh weather-skill guidance toward `web_fetch`, clarify legacy `openai-codex` auth, document release/test helper scripts, and tighten changed-test routing docs for CI/debugging work. ([#90028](openclaw/openclaw#90028), [#90250](openclaw/openclaw#90250)) Thanks [@fuller-stack-dev](https://github.com/fuller-stack-dev). - Release/process: switch release trains to `YYYY.M.PATCH` monthly patch numbering, keep pre-transition tags compatible, and pin the June 2026 floor at `2026.6.5` after the published beta. - Platform maintenance: refresh Android, Swift/macOS, Docker, CodeQL, Buildx, Docker build/push, and Codex Action dependencies for this release train. 
([#74980](openclaw/openclaw#74980), [#81757](openclaw/openclaw#81757), [#86481](openclaw/openclaw#86481), [#86483](openclaw/openclaw#86483), [#90601](openclaw/openclaw#90601)) - QQBot: add `/bot-group-allways on|off` slash command (with named-account and default-account support) to toggle whether group messages require an `@mention` before the bot replies, and clear the runtime config snapshot after the write so the new account-level `defaultRequireMention` takes effect immediately without restart. ([#91423](openclaw/openclaw#91423)) Thanks [@cxyhhhhh](https://github.com/cxyhhhhh). ##### Fixes - Channel content boundaries: QQBot now strips reasoning/thinking tags before sending, preserving final answers while hiding internal model narration from users. ([#89913](openclaw/openclaw#89913), [#90132](openclaw/openclaw#90132)) Thanks [@openperf](https://github.com/openperf). - Agents/MCP/providers: coerce non-text/image MCP tool-result blocks before they reach provider converters, preserving valid images and turning richer MCP content into text instead of malformed image blocks. ([#90710](openclaw/openclaw#90710), [#90728](openclaw/openclaw#90728)) Thanks [@RanSHammer](https://github.com/RanSHammer) and [@849261680](https://github.com/849261680). - Anthropic/Codex/ACP/agent recovery: defer Anthropic stream start events until `message_start`, strip stale compaction thinking signatures before Anthropic replay, detect unsigned thinking-only stalls, refresh prompt fences after compaction writes, reject empty completion handoffs, preserve parent streaming-off overrides/shared progress commentary, forward heartbeat metadata to context-engine hooks, and cover Codex session/thread migration edge cases. 
([#90667](openclaw/openclaw#90667), [#90697](openclaw/openclaw#90697), [#90163](openclaw/openclaw#90163), [#90108](openclaw/openclaw#90108), [#89874](openclaw/openclaw#89874), [#89505](openclaw/openclaw#89505), [#90632](openclaw/openclaw#90632), [#89302](openclaw/openclaw#89302), [#90729](openclaw/openclaw#90729), [#90317](openclaw/openclaw#90317), [#90319](openclaw/openclaw#90319)) Thanks [@openperf](https://github.com/openperf), [@100yenadmin](https://github.com/100yenadmin), and [@ooiuuii](https://github.com/ooiuuii). - Provider/model resolution: preserve Google Vertex ADC auth markers in generated catalogs, re-probe a single-provider primary after cooldown, share Codex model visibility, fail closed for unknown model auth, preserve Codex alias availability, keep unresolved profile refs unknown, and avoid resolving auth while listing models. ([#90506](openclaw/openclaw#90506), [#90609](openclaw/openclaw#90609), [#90717](openclaw/openclaw#90717), [#90702](openclaw/openclaw#90702)) Thanks [@849261680](https://github.com/849261680). - Gateway/macOS/mobile: avoid duplicate Gateway probe warnings by identity, rate-limit node pairing requests while preserving paired-node reconnects, keep macOS node mode on a healthy direct Gateway session, keep iOS diagnostics and gateway rows reachable, and avoid Linux ARM Gradle resource tasks during Android builds. ([#85791](openclaw/openclaw#85791), [#90147](openclaw/openclaw#90147), [#90668](openclaw/openclaw#90668), [#90815](openclaw/openclaw#90815)) Thanks [@giodl73-repo](https://github.com/giodl73-repo) and [@vrurg](https://github.com/vrurg). - TUI/chat/Workboard/auto-reply: optimistic user messages stay stable across stale history reloads, runId reassignment, and abort windows instead of disappearing, jumping, or lingering as ghost rows; Workboard stale lifecycle bulk updates no longer overwrite newer status/provenance; message-tool sends now count as delivery. 
([#86205](openclaw/openclaw#86205), [#89600](openclaw/openclaw#89600), [#88592](openclaw/openclaw#88592), [#90123](openclaw/openclaw#90123)) Thanks [@RomneyDa](https://github.com/RomneyDa). - Cron/update/service env: doctor config preflight now migrates legacy cron JSON stores into SQLite before runtime reads, service env planning skips unresolved placeholders that would mask state-dir `.env` values, and session transcript rewrites keep registry markers/discriminants consistent. ([#90072](openclaw/openclaw#90072), [#90208](openclaw/openclaw#90208), [#90277](openclaw/openclaw#90277), [#90488](openclaw/openclaw#90488)) Thanks [@MonkeyLeeT](https://github.com/MonkeyLeeT) and [@sallyom](https://github.com/sallyom). - Security/config/tooling: guard MCP HTTP redirects, protect global agent config defaults, and keep release/test/tooling proof failures bounded and explicit. ([#89732](openclaw/openclaw#89732), [#90145](openclaw/openclaw#90145)) - Channels: WhatsApp restarts when per-account config changes, bounds background startup waits, closes failed sockets, and preserves reconnect behavior; Mattermost slash commands keep their state on `globalThis`; Feishu streaming cards preserve full merged content; voice-call tracks Twilio streams after connect; ClickClack reply tools respect `toolsAllow`. ([#87951](openclaw/openclaw#87951), [#87965](openclaw/openclaw#87965), [#90486](openclaw/openclaw#90486), [#68113](openclaw/openclaw#68113), [#90534](openclaw/openclaw#90534), [#90181](openclaw/openclaw#90181), [#90607](openclaw/openclaw#90607), [#89500](openclaw/openclaw#89500)) Thanks [@MukundaKatta](https://github.com/MukundaKatta), [@mcaxtr](https://github.com/mcaxtr), [@infoanton](https://github.com/infoanton), [@mushuiyu886](https://github.com/mushuiyu886), and [@sahibzada-allahyar](https://github.com/sahibzada-allahyar). 
- Feishu: retry transient send rate-limit errors (HTTP 429, per-chat code 230020, tenant-level code 11232) with linear backoff, including SDK responses that fulfill with rate-limit bodies instead of throwing, and route streaming-card sends through the retry wrapper. ([#89659](openclaw/openclaw#89659)) Thanks [@ladygege](https://github.com/ladygege). - Release/CI/E2E: main CI guard drift, PR merge diff scoping, live Docker credential staging, base-image qualification, installer Docker classification, Playwright dependency install recovery, API-key auth for Codex live Docker lanes, Parallels option terminators, and JSON-mode progress handling are tighter so release proof fails cleaner. ([#90532](openclaw/openclaw#90532), [#90287](openclaw/openclaw#90287), [#90058](openclaw/openclaw#90058)) Thanks [@RomneyDa](https://github.com/RomneyDa), [@hxy91819](https://github.com/hxy91819), and [@mrunalp](https://github.com/mrunalp). - Release/CI/E2E: Docker E2E and live Docker harness runs now apply default memory, CPU, and process ceilings while preserving explicit per-lane overrides. - Release/CI/E2E: plugin lifecycle matrix resource sampling now fails phases that exceed RSS, wall-clock, or CPU ceilings instead of only logging the measurements. - Release/CI/E2E: Codex npm plugin live assertions now cap transcript discovery and diagnostic log reads so failure proof stays bounded. - Tests/state isolation: QA Lab valid-tool-call metrics now require runtime tool-call evidence when runtime parity data is available instead of counting tool-backed scenario pass status alone. - Tests/state isolation: QA Lab runtime parity now fails planned-only tool-call rows without matching tool results instead of treating matching mock plans as real tool evidence. - Tests/state isolation: provider, media, auth, cron, task, session, sandbox, Gateway, and Codex timeout fixtures now scope more home/state/env data per test, reducing cross-test leakage and making release validation failures less noisy. 
([#90027](openclaw/openclaw#90027), [#89974](openclaw/openclaw#89974)) </details> --- ### Configuration 📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined). 🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied. ♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox. 🔕 **Ignore**: Close this PR and you won't be reminded about these updates again. --- - [ ] If you want to rebase/retry this PR, check this box --- This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).  Reviewed-on: https://git.erwanleboucher.dev/eleboucher/homelab/pulls/963

fix(agents): re-probe single-provider primary during cooldown (fixes o…

6fa95d6

…penclaw#90702)

openclaw-barnacle Bot added the agents (Agent runtime and tooling) and size: S labels Jun 5, 2026

849261680 mentioned this pull request Jun 5, 2026

[Bug]: blockedUntil for subscription_limit set far in the future never re-probes when no fallback is configured #90702

Closed

openclaw-barnacle Bot added the triage: needs-real-behavior-proof label (Candidate: external PR needs after-fix proof from a real setup.) Jun 5, 2026

openclaw-barnacle Bot added the triage: mock-only-proof label (Candidate: PR proof only shows tests, mocks, snapshots, lint, typecheck, or CI.) and removed the triage: needs-real-behavior-proof label Jun 5, 2026

clawsweeper Bot added the rating: 🦐 gold shrimp (Decent PR readiness signal, but merge confidence is limited.) and status: 👀 ready for maintainer look (ClawSweeper has no concrete contributor-facing blocker left for this PR.) labels Jun 5, 2026

sallyom self-assigned this Jun 5, 2026

sallyom merged commit 6da3b1f into openclaw:main Jun 5, 2026
288 of 312 checks passed

frankhli843 mentioned this pull request Jun 6, 2026

chore: upstream sync openclaw -> gemmaclaw 2026-06-06 gemmaclaw/gemmaclaw#271

Merged

6 tasks

Haderach-Ram mentioned this pull request Jun 6, 2026

Ecosystem Digest — 2026-06-06 Haderach-Ram/openclaw-radar#30

Open

teknium1 mentioned this pull request Jun 8, 2026

feat(nous): re-probe single-provider primary during rate-limit cooldown NousResearch/hermes-agent#41610

Closed

fix(agents): re-probe single-provider primary during cooldown #90717
sallyom merged 1 commit into openclaw:main from 849261680:fix/90702-single-provider-cooldown-reprobe
