[Bug]: blockedUntil for subscription_limit set far in the future never re-probes when no fallback is configured #90702

### Bug type

Behavior bug (incorrect output/state without crash)

### Beta release blocker

No

### Summary

On 2026.5.28, when an `openai-codex` auth profile hits its subscription cap and the upstream reports a "next reset in N days" timestamp, OpenClaw stores that timestamp verbatim in `auth-state.json` as `blockedUntil`. With `fallbacks: []`, the probe-during-cooldown path short-circuits on `hasFallbackCandidates`, so the profile is never re-probed and stays blocked for days, even after the rolling cap has recovered.

### Steps to reproduce

1. Install OpenClaw `2026.5.28` and `@openclaw/codex@2026.5.28`, then configure `agents.defaults.model.primary: openai-codex/gpt-5.5` and `fallbacks: []`.
2. Drive enough usage to exhaust the rolling weekly cap (in this case, an accidental heartbeat firing every 30 min for ~24 hours).
3. Observe that the upstream returns: `You've reached your Codex subscription usage limit. Next reset in 6 days, Jun 7 at 3:43 PM UTC.`
4. Check `auth-state.json` at `agents/main/agent/auth-state.json`:
   ```json
   "openai-codex:<account>": {
     "blockedUntil": 1780846982712,
     "blockedReason": "subscription_limit",
     "blockedSource": "wham",
     "errorCount": 1,
     "failureCounts": { "rate_limit": 1 },
     "lastFailureAt": 1780401970719
   }
   ```
5. Wait 3 days. Observe that every scheduled cron lane logs `decision=skip_candidate ... Provider openai-codex is in cooldown (suspending lanes)`. No model calls are made.
6. Run `openclaw infer model run --prompt "say hello in one word"` directly. It returns successfully, so the upstream API is callable. The block is purely stale OpenClaw-side state.
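
To confirm that the stored `blockedUntil` is the upstream's reported reset time rather than a locally computed cooldown, the two timestamps from the snippet in step 4 can be decoded in plain Node. This is only a sanity check on the values quoted above and uses no OpenClaw APIs.

```javascript
// Values copied from the auth-state.json snippet in step 4 (epoch milliseconds).
const blockedUntil = 1780846982712;
const lastFailureAt = 1780401970719;

// Matches the upstream message "Jun 7 at 3:43 PM UTC".
console.log(new Date(blockedUntil).toISOString()); // 2026-06-07T15:43:02.712Z

// Matches the first candidate_failed log entry (Jun 02 12:06:10).
console.log(new Date(lastFailureAt).toISOString()); // 2026-06-02T12:06:10.719Z

// Length of the block, in hours, measured from the failure that armed it.
const blockedHours = (blockedUntil - lastFailureAt) / 3_600_000;
console.log(blockedHours.toFixed(1)); // 123.6
```

The block therefore spans a little over five days from a single failure (`errorCount: 1`).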

### Expected behavior

After the upstream's rolling cap recovers (which happens before the reported "next reset" since it's a rolling window, not a discrete reset), OpenClaw should re-probe the primary and resume serving calls. With no fallback configured, recovery probing should still happen, since "is the primary callable yet?" is a recovery question, not a fallback-switching question.
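
As a rough illustration of the expected behavior, a recovery-probe check could be decided without consulting fallback configuration at all. This is a hypothetical sketch, not OpenClaw's code: `shouldRecoveryProbePrimary`, `lastProbeAt`, and `RECOVERY_PROBE_INTERVAL_MS` are names assumed for illustration, and the one-hour interval is an arbitrary choice.

```javascript
// Hypothetical sketch. Only isPrimary and the blockedReason values
// appear in the report; every other name here is an assumption.
const RECOVERY_PROBE_INTERVAL_MS = 60 * 60 * 1000; // assumed: at most one probe per hour

function shouldRecoveryProbePrimary({ isPrimary, blockedReason, lastProbeAt, now }) {
  if (!isPrimary) return false;

  // Only time-based throttles can clear on their own, so only those are probed.
  const isTimeBased =
    blockedReason === "subscription_limit" || blockedReason === "rate_limit";
  if (!isTimeBased) return false;

  // Note: hasFallbackCandidates is deliberately not consulted here.
  return now - lastProbeAt >= RECOVERY_PROBE_INTERVAL_MS;
}
```

With a check of this shape, a single-provider setup with `fallbacks: []` would still get periodic probes against the primary while it is in cooldown.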

### Actual behavior

The profile stays blocked until the literal `blockedUntil` timestamp passes, regardless of actual API state. In `dist/model-fallback-DRgKirrj.js`:

```js
function shouldProbePrimaryDuringCooldown(params) {
  if (!params.isPrimary || !params.hasFallbackCandidates) return false;
  // ...
}
```

The early return on `!hasFallbackCandidates` means that with `fallbacks: []`, no probe ever fires. Gateway logs confirm this: ~250 `skip_candidate` entries over 3 days and zero attempts against the actual upstream.

### OpenClaw version

2026.5.28

### Operating system

Ubuntu 24.04

### Install method

npm global

### Model

openai-codex/gpt-5.5

### Provider / routing chain

openclaw -> @openclaw/codex@2026.5.28 -> openai (ChatGPT Plus OAuth)

### Additional provider/model setup details

- Single auth profile: `openai-codex:<account>` (OAuth, ChatGPT Plus subscription)
- `agents.defaults.model.fallbacks: []` (no fallback configured)
- `compaction.maxActiveTranscriptBytes: "500kb"`, `truncateAfterCompaction: true`
- `auth.cooldowns: {}` (defaults)

### Logs, screenshots, and evidence

```
Jun 02 12:06:10 [model-fallback/decision] decision=candidate_failed
    requested=openai-codex/gpt-5.5 candidate=openai-codex/gpt-5.5
    reason=rate_limit next=none
    detail=You've reached your Codex subscription usage limit. Next reset in 6 days, Jun 7 at 3:43 PM UTC.
Jun 02 14:30:00 [model-fallback/decision] decision=skip_candidate
    requested=openai-codex/gpt-5.5 candidate=openai-codex/gpt-5.5
    reason=rate_limit next=none
    detail=Provider openai-codex is in cooldown (suspending lanes)
(repeats every scheduled cron tick for 3 days)
```

See also the `auth-state.json` snippet under "Steps to reproduce". A direct `openclaw infer model run` succeeded immediately after manually clearing `blockedUntil`.

### Impact and severity

- Affected: any single-host OpenClaw install with one upstream and `fallbacks: []` that hits a subscription cap.
- Severity: blocks workflow — scheduled crons and channel replies stop posting for the entire duration of `blockedUntil`.
- Frequency: triggered once per cap exhaustion, then sticks until manual intervention.
- Consequence: agents go silent for days. In our case, 3 days of no replies to scheduled Telegram interactions and four daily cron jobs not firing.

### Additional information

- Related design discussion: #54278 (proposes a `quota_wait` state separate from `reauth_required`). This bug is the concrete shape of one of the problems #54278 describes.
- Two suggested minimal fixes (either alone would have prevented this):
  1. **Cap `blockedUntil` for `subscription_limit` reasons.** Store `min(reportedReset, now + MAX_SUBSCRIPTION_BLOCK_MS)`. With a cap of e.g. 1 hour, the profile gets re-probed an hour later; if still exhausted, the upstream returns the same error and the block is re-armed; if recovered, work resumes. Keep the reported timestamp in a separate `expectedFullResetAt` field for display only.
  2. **Drop the `hasFallbackCandidates` short-circuit for recovery probes.** Split `shouldProbePrimaryDuringCooldown` into "should we try a fallback now?" (legitimately needs fallback candidates) and "should we re-probe the primary now?" (doesn't). The recovery-probe branch should fire on any time-based throttle regardless of fallback configuration.
- Workaround currently in place: hourly cron clearing `blockedUntil` for `subscription_limit` blocks where `blockedUntil > now + 12h` AND `lastFailureAt < now - 6h`.
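
A minimal sketch of fix 1 and of the workaround's staleness condition follows. The function names `computeSubscriptionBlock` and `isStaleSubscriptionBlock` are hypothetical, and the one-hour value for `MAX_SUBSCRIPTION_BLOCK_MS` is just the example cap mentioned above, not an existing OpenClaw setting.

```javascript
const HOUR_MS = 60 * 60 * 1000;
const MAX_SUBSCRIPTION_BLOCK_MS = 1 * HOUR_MS; // example cap from fix 1

// Fix 1 (hypothetical helper): cap the stored block, and keep the
// upstream's reported reset in a separate display-only field.
function computeSubscriptionBlock(reportedReset, now) {
  return {
    blockedUntil: Math.min(reportedReset, now + MAX_SUBSCRIPTION_BLOCK_MS),
    expectedFullResetAt: reportedReset,
  };
}

// Workaround condition (hypothetical helper): a subscription_limit block
// that ends more than 12h from now and was armed more than 6h ago.
function isStaleSubscriptionBlock(profile, now) {
  return (
    profile.blockedReason === "subscription_limit" &&
    profile.blockedUntil > now + 12 * HOUR_MS &&
    profile.lastFailureAt < now - 6 * HOUR_MS
  );
}

// Using the timestamps from the auth-state.json snippet in this report:
const reportedReset = 1780846982712;
const failedAt = 1780401970719;

const capped = computeSubscriptionBlock(reportedReset, failedAt);
console.log(capped.blockedUntil - failedAt === HOUR_MS); // true: re-probe one hour later

const profile = {
  blockedReason: "subscription_limit",
  blockedUntil: reportedReset,
  lastFailureAt: failedAt,
};
console.log(isStaleSubscriptionBlock(profile, failedAt + 7 * HOUR_MS)); // true
```

Under fix 1, a still-exhausted profile would simply fail the next probe and have the capped block re-armed, so the cost of being wrong is about one upstream call per cap interval.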

