fix(cron): honor subagent model fallbacks #82302

Merged
steipete merged 1 commit into main from fix/cron-subagent-fallbacks-74985
May 15, 2026

@steipete

Contributor

Fixes #74985.

Summary:

  • Track whether isolated cron model selection actually came from subagent model config, so hook, payload, session, and agent-primary model paths keep their existing fallback behavior.
  • Resolve and forward agents.*.subagents.model.fallbacks into both runWithModelFallback and the embedded runner's internal fallbackConfigured checks.
  • Add regression coverage for configured subagent fallbacks, explicit empty fallback overrides, embedded-runner passthrough, and the direct executor test caller.
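
The first summary bullet can be pictured with a minimal sketch of source-aware model selection. The selection order follows the review notes for this PR (per-agent `subagents.model`, then the agent's own `model`, then `agents.defaults.subagents.model`); every type and function name below (`AgentModelConfig` as a string-or-object union, `Selection`, `selectCronModel`, the `ModelSource` labels) is a hypothetical simplification for illustration, not the code in `model-selection.ts`.

```typescript
// Hypothetical, simplified shapes; the real types live in the OpenClaw config module.
type AgentModelConfig = string | { primary?: string; fallbacks?: string[] };

interface AgentConfig {
  model?: AgentModelConfig;
  subagents?: { model?: AgentModelConfig };
}

type ModelSource = "agent-subagents" | "agent-primary" | "default-subagents" | "none";

interface Selection {
  model?: string;
  source: ModelSource;
  // Only populated when the model came from a subagents.model entry.
  subagentFallbacks?: string[];
}

function primaryOf(cfg?: AgentModelConfig): string | undefined {
  return typeof cfg === "string" ? cfg : cfg?.primary;
}

function fallbacksOf(cfg?: AgentModelConfig): string[] | undefined {
  return typeof cfg === "object" ? cfg.fallbacks : undefined;
}

function selectCronModel(agent: AgentConfig | undefined, defaults: AgentConfig): Selection {
  // Candidates in priority order, each tagged with where it came from.
  const candidates: Array<[ModelSource, AgentModelConfig | undefined]> = [
    ["agent-subagents", agent?.subagents?.model],
    ["agent-primary", agent?.model],
    ["default-subagents", defaults.subagents?.model],
  ];
  for (const [source, cfg] of candidates) {
    const model = primaryOf(cfg);
    if (!model) continue;
    const fromSubagent = source !== "agent-primary";
    return { model, source, subagentFallbacks: fromSubagent ? fallbacksOf(cfg) : undefined };
  }
  return { source: "none" };
}
```

Returning the source alongside the model is what lets a later fallback policy decide whether the subagent fallback list applies, without changing behavior for models chosen through other paths.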

Verification:

  • node scripts/run-vitest.mjs src/cron/isolated-agent/run.message-tool-policy.test.ts src/cron/isolated-agent/run-fallback-policy.test.ts src/cron/isolated-agent/run.payload-fallbacks.test.ts src/agents/pi-embedded-runner/run/fallbacks.test.ts src/agents/agent-scope.test.ts src/cron/isolated-agent.model-formatting.test.ts src/cron/isolated-agent/run.cron-model-override-forwarding.test.ts src/agents/pi-embedded-runner/run/failover-observation.test.ts src/agents/pi-embedded-runner/run/failover-policy.test.ts
  • pnpm exec oxfmt --check --threads=1 CHANGELOG.md src/agents/agent-scope.ts src/agents/pi-embedded-runner/run.ts src/agents/pi-embedded-runner/run/fallbacks.ts src/agents/pi-embedded-runner/run/fallbacks.test.ts src/cron/isolated-agent.model-formatting.test.ts src/cron/isolated-agent/model-selection.ts src/cron/isolated-agent/run-execution.runtime.ts src/cron/isolated-agent/run-executor.ts src/cron/isolated-agent/run-fallback-policy.test.ts src/cron/isolated-agent/run-fallback-policy.ts src/cron/isolated-agent/run.payload-fallbacks.test.ts src/cron/isolated-agent/run.test-harness.ts src/cron/isolated-agent/run.ts src/cron/isolated-agent/run.message-tool-policy.test.ts
  • node scripts/run-oxlint.mjs CHANGELOG.md src/agents/agent-scope.ts src/agents/pi-embedded-runner/run.ts src/agents/pi-embedded-runner/run/fallbacks.ts src/agents/pi-embedded-runner/run/fallbacks.test.ts src/cron/isolated-agent.model-formatting.test.ts src/cron/isolated-agent/model-selection.ts src/cron/isolated-agent/run-execution.runtime.ts src/cron/isolated-agent/run-executor.ts src/cron/isolated-agent/run-fallback-policy.test.ts src/cron/isolated-agent/run-fallback-policy.ts src/cron/isolated-agent/run.payload-fallbacks.test.ts src/cron/isolated-agent/run.test-harness.ts src/cron/isolated-agent/run.ts src/cron/isolated-agent/run.message-tool-policy.test.ts
  • git diff --check
  • /Users/steipete/Projects/agent-scripts/skills/codex-review/scripts/codex-review --mode local

Behavior addressed: isolated cron agentTurn runs that select agents.defaults.subagents.model or per-agent subagents.model now use that model config's fallback list instead of silently falling back to the main agent model policy.
Real environment tested: local OpenClaw checkout with mocked cron/embedded-runner regression tests.
Exact steps or command run after this patch: focused Vitest command above, oxfmt check, oxlint, git diff check, and codex-review local pass.
Evidence after fix: regression tests assert subagent fallback lists reach runWithModelFallback and runEmbeddedPiAgent.modelFallbacksOverride, while hook/payload/session/agent-primary paths stay isolated.
Observed result after fix: focused suite passed with 11 test files and 179 tests; codex-review finished clean with no accepted/actionable findings.
What was not tested: live scheduled cron run against a real provider timeout; this PR covers the model/fallback routing contract through focused regression tests.
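
The routing contract described under "Behavior addressed" can be sketched as a small resolver. This is an illustration under stated assumptions, not the contents of `run-fallback-policy.ts`: the function name `resolveCronFallbacks`, its input shape, the rule that payload fallbacks take precedence over subagent fallbacks, and the fall-through to the normal policy when no subagent list is set are all assumptions. What the PR does state is that an explicit empty subagent fallback list is honored and that other model paths keep their existing behavior.

```typescript
// Hypothetical input shape for this sketch only.
interface FallbackInputs {
  payloadFallbacks?: string[];      // per-job cron payload.fallbacks
  modelFromSubagentConfig: boolean; // did selection come from subagents.model?
  subagentFallbacks?: string[];     // subagents.model.fallbacks, if configured
  normalPolicyFallbacks?: string[]; // what the agent/default model policy yields
}

function resolveCronFallbacks(input: FallbackInputs): string[] | undefined {
  // Assumption: an explicit per-job list still wins, as it did before the PR.
  if (input.payloadFallbacks !== undefined) return input.payloadFallbacks;

  // Check `!== undefined` rather than truthiness/length so that an explicit
  // empty list ([]) is honored and means "no fallbacks".
  if (input.modelFromSubagentConfig && input.subagentFallbacks !== undefined) {
    return input.subagentFallbacks;
  }

  // Every other path keeps the normal model policy.
  return input.normalPolicyFallbacks;
}
```

The key design point is distinguishing "not configured" (`undefined`) from "configured as empty" (`[]`); collapsing the two would let the main agent policy leak back in.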

@openclaw-barnacle openclaw-barnacle Bot added the agents (Agent runtime and tooling), size: M, and maintainer (Maintainer-authored PR) labels May 15, 2026
@clawsweeper

clawsweeper Bot commented May 15, 2026

Contributor

Codex review: needs maintainer review before merge.

Summary
The PR threads configured subagent model fallbacks through isolated cron model selection and embedded-runner failover checks, with regression tests and a changelog entry.

Reproducibility: yes, at source level. Current main can select subagents.model for isolated cron agentTurn runs while resolving fallbacks and embedded fallbackConfigured from the normal model policy; the linked issue includes a matching timeout log.

Real behavior proof
Override: A maintainer applied proof: override for this PR.

Next step before merge
The protected maintainer label and proof override make this a human maintainer review and landing decision; no narrow automated repair is needed.

Security
Cleared: The diff only forwards existing model fallback configuration through runtime code and tests; it does not add dependencies, workflows, downloads, package resolution, or secret-handling changes.

Review details

Best possible solution:

Land one targeted subagent-aware cron fallback fix after maintainer review, and keep the linked issue open until the chosen PR merges.

Do we have a high-confidence way to reproduce the issue?

Yes, at source level. Current main can select subagents.model for isolated cron agentTurn runs while resolving fallbacks and embedded fallbackConfigured from the normal model policy; the linked issue includes a matching timeout log.

Is this the best way to solve the issue?

Yes. The PR takes the narrow maintainable path by carrying the selected subagent fallback policy into both the outer fallback runner and embedded failover flag while preserving payload, hook, session, and agent-primary behavior.
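
For readers unfamiliar with the "outer fallback runner" mentioned here, a generic sketch of the pattern follows. It is not the real `runWithModelFallback`; the name `runWithFallbacksSketch`, its signature, and its error format are hypothetical, and it is kept synchronous for brevity where a real runner would presumably be asynchronous.

```typescript
// Hypothetical stand-in for an outer fallback runner: try the primary model,
// then each fallback in order, returning the first successful result.
function runWithFallbacksSketch<T>(
  primary: string,
  fallbacks: readonly string[],
  attempt: (model: string) => T,
): { model: string; result: T } {
  const failures: string[] = [];
  for (const model of [primary, ...fallbacks]) {
    try {
      return { model, result: attempt(model) };
    } catch (err) {
      // Record the failure and move on to the next candidate.
      failures.push(`${model}: ${err instanceof Error ? err.message : String(err)}`);
    }
  }
  throw new Error(`all model candidates failed (${failures.join("; ")})`);
}
```

With this shape, the list passed as `fallbacks` fully determines which models are tried after a timeout, which is why forwarding the wrong list (the main agent policy instead of the subagent one) changes runtime behavior.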

Acceptance criteria:

  • node scripts/run-vitest.mjs src/cron/isolated-agent/run.message-tool-policy.test.ts src/cron/isolated-agent/run-fallback-policy.test.ts src/cron/isolated-agent/run.payload-fallbacks.test.ts src/agents/pi-embedded-runner/run/fallbacks.test.ts src/agents/agent-scope.test.ts src/cron/isolated-agent.model-formatting.test.ts src/cron/isolated-agent/run.cron-model-override-forwarding.test.ts src/agents/pi-embedded-runner/run/failover-observation.test.ts src/agents/pi-embedded-runner/run/failover-policy.test.ts
  • pnpm exec oxfmt --check --threads=1 CHANGELOG.md src/agents/agent-scope.ts src/agents/pi-embedded-runner/run.ts src/agents/pi-embedded-runner/run/fallbacks.ts src/agents/pi-embedded-runner/run/fallbacks.test.ts src/cron/isolated-agent.model-formatting.test.ts src/cron/isolated-agent/model-selection.ts src/cron/isolated-agent/run-execution.runtime.ts src/cron/isolated-agent/run-executor.ts src/cron/isolated-agent/run-fallback-policy.test.ts src/cron/isolated-agent/run-fallback-policy.ts src/cron/isolated-agent/run.payload-fallbacks.test.ts src/cron/isolated-agent/run.test-harness.ts src/cron/isolated-agent/run.ts src/cron/isolated-agent/run.message-tool-policy.test.ts
  • node scripts/run-oxlint.mjs CHANGELOG.md src/agents/agent-scope.ts src/agents/pi-embedded-runner/run.ts src/agents/pi-embedded-runner/run/fallbacks.ts src/agents/pi-embedded-runner/run/fallbacks.test.ts src/cron/isolated-agent.model-formatting.test.ts src/cron/isolated-agent/model-selection.ts src/cron/isolated-agent/run-execution.runtime.ts src/cron/isolated-agent/run-executor.ts src/cron/isolated-agent/run-fallback-policy.test.ts src/cron/isolated-agent/run-fallback-policy.ts src/cron/isolated-agent/run.payload-fallbacks.test.ts src/cron/isolated-agent/run.test-harness.ts src/cron/isolated-agent/run.ts src/cron/isolated-agent/run.message-tool-policy.test.ts
  • git diff --check

What I checked:

  • Protected-label gate: The hydrated GitHub context shows this PR is open and labeled maintainer, which requires explicit maintainer handling rather than cleanup close; it also has proof: override. (d6f2bb8173ea)
  • Current main selects subagent models without recording source: Current main chooses agentConfigOverride.subagents.model, then agentConfigOverride.model, then agents.defaults.subagents.model, but the success result only returns provider/model, so the later fallback policy cannot tell when the selected model came from subagent config. (src/cron/isolated-agent/model-selection.ts:85, cce12697a1ee)
  • Current main fallback resolver ignores subagent fallback lists: Current main resolves cron fallbacks from payload fallbacks or resolveEffectiveModelFallbacks, which reads the normal agent/default model policy rather than subagents.model.fallbacks. (src/cron/isolated-agent/run-fallback-policy.ts:14, cce12697a1ee)
  • Current embedded runner failover flag uses normal model fallback config: runEmbeddedPiAgent currently computes fallbackConfigured with hasConfiguredModelFallbacks, so embedded timeout failover decisions do not see subagent-specific fallback arrays unless an override is forwarded. (src/agents/pi-embedded-runner/run.ts:482, cce12697a1ee)
  • Existing config contract supports subagent fallbacks: AgentModelConfig already supports { primary, fallbacks }, and agents.defaults.subagents.model plus per-agent subagents.model use that type, so this is a contract gap rather than a new config option. (src/config/types.agents-shared.ts:8, cce12697a1ee)
  • PR diff addresses the implicated path: The PR adds model-source tracking, resolves subagent fallback overrides, forwards them to runWithModelFallback, and passes the same override into the embedded runner's modelFallbacksOverride path. (src/cron/isolated-agent/run-fallback-policy.ts:1, d6f2bb8173ea)
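
The embedded-runner point in the list above can be illustrated with a minimal sketch of how a forwarded override might feed the `fallbackConfigured` decision. `modelFallbacksOverride` is the option name cited in this PR; `ModelPolicy`, `normalPolicyHasFallbacks` (standing in for `hasConfiguredModelFallbacks`), and `isFallbackConfigured` are hypothetical names, and the real logic in `run.ts` may differ.

```typescript
// Hypothetical, trimmed-down policy shape used only for this sketch.
interface ModelPolicy {
  fallbacks?: string[];
}

// Stand-in for the normal-policy check (hasConfiguredModelFallbacks in the source).
function normalPolicyHasFallbacks(policy: ModelPolicy): boolean {
  return (policy.fallbacks?.length ?? 0) > 0;
}

function isFallbackConfigured(policy: ModelPolicy, modelFallbacksOverride?: string[]): boolean {
  // When the caller forwards an override, it is authoritative, including
  // an explicit empty list, which means "no fallbacks configured".
  if (modelFallbacksOverride !== undefined) {
    return modelFallbacksOverride.length > 0;
  }
  // Without an override, fall back to the normal agent/default policy.
  return normalPolicyHasFallbacks(policy);
}
```

Under this sketch, a cron run that selected a subagent model with its own fallback list would report fallbacks as configured even when the main agent policy has none, and the reverse for an explicit empty override.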

Likely related people:

  • vincentkoc: Current-line blame and git show route the extracted isolated cron model-selection and fallback-policy files to commit 4e10969, which introduced the current modular cron runtime surface. (role: recent cron runtime contributor; confidence: high; commits: 4e10969aded0; files: src/cron/isolated-agent/model-selection.ts, src/cron/isolated-agent/run-fallback-policy.ts, src/agents/agent-scope.ts)
  • steipete: Recent local history shows multiple cron/fallback refactors and tests by Peter Steinberger, including owner-tool cron coverage and fallback assertion cleanup; this relevance is from merged history, not only authorship of this PR. (role: recent cron and fallback area contributor; confidence: medium; commits: 03a6e3b460a7, 8fd043abac71, b37dc4224072; files: src/cron/isolated-agent, src/agents/model-fallback.ts)
  • yinghaosang: Commit f902697 added the per-job cron payload.fallbacks behavior that the PR preserves and composes with the new subagent fallback path. (role: fallback-policy feature contributor; confidence: medium; commits: f902697bd5c4; files: src/cron/isolated-agent/run-fallback-policy.ts, src/cron/isolated-agent/run.payload-fallbacks.test.ts)
  • Aviral: Commit d0ca02e is titled as the fix for respecting subagents.model in isolated cron sessions, the selected-model behavior whose fallback policy this PR extends. (role: subagent cron model behavior contributor; confidence: medium; commits: d0ca02e963b6; files: src/cron/isolated-agent/model-selection.ts)

Remaining risk / open question:

  • No focused tests or live provider-timeout run were executed during this read-only review.
  • The PR body reports mocked regression tests only; the real-behavior proof gate is waived by proof: override, not satisfied by attached live cron/provider output.

Codex review notes: model gpt-5.5, reasoning high; reviewed against cce12697a1ee.

@steipete steipete force-pushed the fix/cron-subagent-fallbacks-74985 branch from 59c7741 to d6f2bb8 May 15, 2026 20:43
@steipete steipete added the proof: override Maintainer override for the external PR real behavior proof gate. label May 15, 2026
@steipete steipete merged commit e22a7e4 into main May 15, 2026
90 of 95 checks passed
@steipete steipete deleted the fix/cron-subagent-fallbacks-74985 branch May 15, 2026 20:54
@steipete

Contributor Author

Landed via rebase merge onto main.

  • Gate: focused Vitest suite (11 files / 179 tests), oxfmt, oxlint, git diff --check, final $codex-review, GitHub CI green enough to merge; Real behavior proof passed with maintainer proof: override for this locally proven routing fix.
  • Land commit: d6f2bb8
  • Merge commit: e22a7e4

Thanks @chrisgwynne!

Labels

  • agents: Agent runtime and tooling
  • maintainer: Maintainer-authored PR
  • proof: override: Maintainer override for the external PR real behavior proof gate.
  • size: M

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: Embedded agent Kimi timeout with no fallback despite fallbacks configured in model_stack

1 participant