Cron sessions inherit persisted model override instead of honoring payload.model, causing thundering herd under overload #58533

@Eugene9D

Summary

Cron jobs that specify model: openai/gpt-5.4-nano in their payload do not use that model when they run in isolated sessions. Instead, resolvePersistedLiveSelection() forces the session back to anthropic/claude-sonnet-4-6, overriding the model the cron declared. When Anthropic is overloaded, every cron therefore adds to the overload instead of degrading gracefully.

Root Cause

The scheduler fires multiple crons simultaneously (observed: 24 UUIDs in one burst). Each cron session goes through resolvePersistedLiveSelection(), which promotes the persisted Sonnet selection over payload.model. This means:

  1. Crons configured for nano/codex still hit Anthropic Sonnet
  2. A burst of 24 simultaneous crons = 24 simultaneous Sonnet requests under overload
  3. Each gets a 503, retries, and amplifies the cascade
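A rough count of the amplification described above, as a sketch. The retry policy is not stated in this issue, so retriesPerRequest is an assumed parameter and totalRequests is a hypothetical helper, not a function from the codebase.

```typescript
// Hypothetical helper: upper bound on requests sent to one provider when every
// request in a burst fails and is retried a fixed number of times.
function totalRequests(burstSize: number, retriesPerRequest: number): number {
  // One initial attempt plus the retries, for each cron in the burst.
  return burstSize * (1 + retriesPerRequest);
}

// Observed burst of 24 crons; 3 retries per request is an assumption.
const worstCase = totalRequests(24, 3); // 96
```

With no retries the burst is still 24 simultaneous Sonnet requests; each additional retry adds another 24 while the provider is overloaded.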

Impact

Observed 2026-03-31: SLBE nightly optimizer (configured model: openai/gpt-5.4-nano) failed at 22:00 PT with model_fallback_decision: candidate_failed because the effective model was Sonnet, not nano.

Expected Behavior

  1. Isolated cron sessions should honor payload.model as the effective model
  2. resolvePersistedLiveSelection() should not apply to ephemeral/isolated cron sessions
  3. The scheduler should add jitter to cron bursts (stagger simultaneous crons by 2-5s each)
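The first two points can be sketched as a single precedence rule. This is a minimal illustration, not the project's actual code: CronSession, its fields, and resolveEffectiveModel are hypothetical names, and the fallback order for non-isolated sessions is assumed from the behavior described in this issue.

```typescript
// Hypothetical shapes; the real session and payload types are not shown in this issue.
type Runtime = "isolated" | "persistent";

interface CronSession {
  runtime: Runtime;
  payloadModel?: string;   // model declared in the cron payload (payload.model)
  persistedModel?: string; // selection that resolvePersistedLiveSelection() would restore
}

// Returns the model a session should run with.
function resolveEffectiveModel(session: CronSession, defaultModel: string): string {
  if (session.runtime === "isolated" && session.payloadModel) {
    // Isolated sessions are ephemeral, so the payload is treated as authoritative.
    return session.payloadModel;
  }
  // Assumed current precedence for other sessions: persisted selection first.
  return session.persistedModel ?? session.payloadModel ?? defaultModel;
}

// The failing case from this issue: payload says nano, persisted state says Sonnet.
const effective = resolveEffectiveModel(
  {
    runtime: "isolated",
    payloadModel: "openai/gpt-5.4-nano",
    persistedModel: "anthropic/claude-sonnet-4-6",
  },
  "anthropic/claude-sonnet-4-6",
); // "openai/gpt-5.4-nano"
```

An isolated session without a payload model still falls through to the persisted selection and then the default, so the change only affects crons that declare a model explicitly.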

Proposed Fix

  1. Skip resolvePersistedLiveSelection() for sessions with runtime: isolated. These sessions are ephemeral and have no meaningful persisted state to restore
  2. Add scheduler jitter: when N crons are due at the same tick, spread them across a configurable window (default: 5s)
  3. Honor payload.model as authoritative for isolated sessions
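A minimal sketch of the jitter in item 2, assuming the scheduler can attach a start delay to each due job. DueJob, ScheduledJob, and spreadWithJitter are hypothetical names; the scheduler's real interface is not shown in this issue. Each job gets its own slot in the window plus a random offset inside that slot, so starts never collide and stay within the window.

```typescript
// Hypothetical types: a job due at the current tick, and the same job with a start delay.
interface DueJob { id: string; }
interface ScheduledJob { id: string; delayMs: number; }

// Spreads jobs that are due at the same tick across a window (default: 5s).
// `random` is injectable so the spread can be made deterministic in tests.
function spreadWithJitter(
  jobs: DueJob[],
  windowMs: number = 5000,
  random: () => number = Math.random,
): ScheduledJob[] {
  // A single job (or none) needs no staggering.
  if (jobs.length <= 1) return jobs.map((job) => ({ id: job.id, delayMs: 0 }));
  const slotMs = windowMs / jobs.length;
  return jobs.map((job, i) => ({
    id: job.id,
    // Slot start plus a random offset within the slot: delays stay in [0, windowMs).
    delayMs: Math.floor(i * slotMs + random() * slotMs),
  }));
}

// Usage: the observed burst of 24 crons, spread over the default window.
const burst: DueJob[] = Array.from({ length: 24 }, (_, i) => ({ id: `cron-${i}` }));
const plan = spreadWithJitter(burst);
```

With 24 jobs and a 5s window, consecutive slots are roughly 208 ms apart. Reaching the 2-5s per-cron stagger mentioned under Expected Behavior would require a wider window.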

Related

#24378 #32533
