Skip to content

test(cron): speed up isolated fallback tests#87520

Merged
RomneyDa merged 1 commit into
mainfrom
codex/perf-isolated-cron-tests
May 28, 2026
Merged

test(cron): speed up isolated fallback tests#87520
RomneyDa merged 1 commit into
mainfrom
codex/perf-isolated-cron-tests

Conversation

@RomneyDa

Copy link
Copy Markdown
Member

Summary

This PR speeds up the isolated cron test hotspot shard without changing production runtime behavior.

  • Adds a test-harness mock for resolveCliRuntimeExecutionProvider so the payload fallback tests can assert cron fallback planning without invoking real CLI runtime/backend discovery.
  • Lets targeted mocked runCronIsolatedAgentTurn suites opt into OPENCLAW_TEST_FAST=1 through setupRunCronIsolatedAgentTurnSuite({ fast: true }).
  • Removes default fake usage from the run harness results so tests that do not assert usage avoid the unrelated usage/cost finalization path.

Test Cost / Size Impact

  • Planned hotspot group before: 166.17s total, 143.35s in tests.
  • Planned hotspot group after: 3.55s total, 1.18s in tests.
  • src/cron/isolated-agent/run.payload-fallbacks.test.ts after focused fix: 2.00s total, 24ms in tests. The former slow Claude CLI compatibility assertion now reports 0-1ms in the grouped runs.
  • Diff size: 4 files, +28/-8.

Related Work Search

I searched GitHub and local repo tooling before implementing. I did not find an exact existing issue or PR covering this isolated cron test performance work.

Adjacent but not duplicate:

Verification

  • node scripts/run-vitest.mjs src/cron/isolated-agent/run.payload-fallbacks.test.ts src/cron/isolated-agent/run.fast-mode.test.ts src/cron/isolated-agent.hook-content-wrapping.test.ts src/cron/isolated-agent.delivery-awareness.test.ts --reporter=verbose
  • node scripts/run-vitest.mjs src/cron/isolated-agent/run.*.test.ts --reporter=verbose
  • pnpm exec oxfmt --check --threads=1 src/cron/isolated-agent/run.test-harness.ts src/cron/isolated-agent/run.suite-helpers.ts src/cron/isolated-agent/run.payload-fallbacks.test.ts src/cron/isolated-agent/run.fast-mode.test.ts
  • node scripts/run-oxlint.mjs --tsconfig config/tsconfig/oxlint.json src/cron/isolated-agent/run.test-harness.ts src/cron/isolated-agent/run.suite-helpers.ts src/cron/isolated-agent/run.payload-fallbacks.test.ts src/cron/isolated-agent/run.fast-mode.test.ts
  • node scripts/run-tsgo.mjs -p test/tsconfig/tsconfig.test.src.json --incremental --tsBuildInfoFile .artifacts/tsgo-cache/test-src.tsbuildinfo
  • .agents/skills/autoreview/scripts/autoreview --mode local

Real behavior proof

Behavior addressed: Isolated cron test performance for payload fallback and fast-mode planning coverage; no production runtime behavior change intended.
Real environment tested: Local Codex worktree on macOS using repo Node/Vitest wrappers.
Exact steps or command run after this patch: The Verification commands listed above.
Evidence after fix: The planned four-file hotspot group passed 18 tests in 3.55s after a baseline of 166.17s; the broader src/cron/isolated-agent/run.*.test.ts wrapper run passed 16 files and 117 tests in 5.02s; oxfmt, focused oxlint, test-source tsgo, and autoreview all exited cleanly.
Observed result after fix: The slow CLI compatibility fallback assertion is covered through the cron test harness mock instead of real runtime discovery, and the targeted mocked suites complete in seconds while preserving fallback ordering, compatible CLI execution attempts, fast-mode propagation, hook content wrapping, and delivery-awareness coverage.
What was not tested: Full pnpm check, full Vitest suite, and live provider behavior; this is a scoped test-only harness change in a Codex worktree.

@openclaw-barnacle openclaw-barnacle Bot added size: XS maintainer Maintainer-authored PR labels May 28, 2026
@clawsweeper

clawsweeper Bot commented May 28, 2026

Copy link
Copy Markdown
Contributor

Codex review: needs maintainer review before merge. Reviewed May 28, 2026, 1:41 AM ET / 05:41 UTC.

Summary
This PR updates the isolated cron test harness and two cron test suites to mock CLI runtime alias resolution, opt selected suites into OPENCLAW_TEST_FAST=1, and avoid default fake usage metadata in harness results.

PR surface: Source +16, Tests +4. Total +20 across 4 files.

Reproducibility: not applicable. this is a test-harness performance cleanup rather than a reported runtime bug. The review checked current main, the PR diff, and the PR body's before/after timing proof instead of a user-facing repro.

Review metrics: none identified.

Merge readiness
Overall: 🦞 diamond lobster
Proof: 🦞 diamond lobster
Patch quality: 🦞 diamond lobster
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Risk before merge

  • [P1] One check run was still in progress in the unauthenticated check-runs page, so final merge should wait for the repository's normal required CI despite the review finding no code defect.

Maintainer options:

  1. Decide the mitigation before merge
    Land the narrow test-harness speedup after maintainer approval and green required CI, keeping production cron runtime behavior unchanged.
  2. Pause or close
    Do not merge this PR until maintainers decide whether the risk is worth taking.

Next step before merge

  • No ClawSweeper repair lane is needed because no actionable patch defect was found; the maintainer-labeled member PR should proceed through maintainer approval and CI.

Security
Cleared: Cleared: the diff is limited to cron test harness and suite files, with no dependency, workflow, package script, secret-handling, or production runtime code changes.

Review details

Best possible solution:

Land the narrow test-harness speedup after maintainer approval and green required CI, keeping production cron runtime behavior unchanged.

Do we have a high-confidence way to reproduce the issue?

Not applicable; this is a test-harness performance cleanup rather than a reported runtime bug. The review checked current main, the PR diff, and the PR body's before/after timing proof instead of a user-facing repro.

Is this the best way to solve the issue?

Yes; the patch targets the expensive test-only seams with a harness mock and per-suite fast-mode opt-in, which is narrower than changing production cron fallback or runtime behavior.

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against e805ffd2eb75.

Label changes

Label justifications:

  • P3: This is a low-risk test-harness performance cleanup with no intended production runtime behavior change.
  • rating: 🦞 diamond lobster: Overall readiness is 🦞 diamond lobster; proof is 🦞 diamond lobster and patch quality is 🦞 diamond lobster.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (terminal): Sufficient: the PR body includes after-fix wrapper commands, before/after timings, broader test pass counts, and clean focused static checks for the test-harness speedup.
  • proof: sufficient: Contributor real behavior proof is sufficient. Sufficient: the PR body includes after-fix wrapper commands, before/after timings, broader test pass counts, and clean focused static checks for the test-harness speedup.
Evidence reviewed

PR surface:

Source +16, Tests +4. Total +20 across 4 files.

View PR surface stats
Area Files Added Removed Net
Source 2 20 4 +16
Tests 2 8 4 +4
Docs 0 0 0 0
Config 0 0 0 0
Generated 0 0 0 0
Other 0 0 0 0
Total 4 28 8 +20

What I checked:

  • Repository policy read: The full root AGENTS.md was read and applied; its review policy keeps maintainer/protected-label PRs in maintainer handling and treats test-performance changes as requiring scoped proof rather than auto-close cleanup. (AGENTS.md:18, e805ffd2eb75)
  • Scoped agent test guide read: The relevant scoped guide says agent tests are often import-bound and asks performance edits to keep expensive runtime work behind dependency injection or narrow helpers, which matches the PR's test-harness mock direction. (src/agents/AGENTS.md:1, e805ffd2eb75)
  • Current main lacks the new fast option: On current main, setupRunCronIsolatedAgentTurnSuite has no options parameter and always clears OPENCLAW_TEST_FAST, so the PR's opt-in fast-suite behavior is not already implemented. (src/cron/isolated-agent/run.suite-helpers.ts:11, e805ffd2eb75)
  • Runtime alias boundary matches the mock: The cron executor resolves resolveCliRuntimeExecutionProvider immediately before deciding whether to run the CLI path, so mocking that dependency in the harness is targeted to the expensive behavior under test. (src/cron/isolated-agent/run-executor.ts:203, e805ffd2eb75)
  • PR diff is test-only and narrow: The fetched PR diff changes only run.test-harness.ts, run.suite-helpers.ts, run.payload-fallbacks.test.ts, and run.fast-mode.test.ts; it adds the alias mock and fast-suite option without production runtime edits. (src/cron/isolated-agent/run.test-harness.ts:49, bb22882c06c6)
  • Contributor proof is present: The PR body reports before/after hotspot timings, focused and broader Vitest wrapper runs, focused oxfmt/oxlint/tsgo, and autoreview results after the patch. (bb22882c06c6)

Likely related people:

  • steipete: Local blame and the commits API show repeated recent ownership of the shared isolated cron harness, fast-mode tests, and payload fallback tests that this PR adjusts. (role: recent area contributor; confidence: high; commits: b1c30f0ba9b8, bb46b79d3c14, cd993f4584e7; files: src/cron/isolated-agent/run.suite-helpers.ts, src/cron/isolated-agent/run.test-harness.ts, src/cron/isolated-agent/run.fast-mode.test.ts)
  • yinghaosang: The payload fallback test surface traces back to the merged payload.fallbacks feature commit, which is the behavior this PR keeps covered while speeding the test. (role: feature-history contributor; confidence: medium; commits: f902697bd5c4; files: src/cron/isolated-agent/run.payload-fallbacks.test.ts)
  • vincentkoc: The test harness history shows adjacent cron runtime seam and lazy-load performance work in the same isolated cron area. (role: adjacent runtime-performance contributor; confidence: medium; commits: 35176f3cb730, a08fbfb1aea9, 101c16b0b133; files: src/cron/isolated-agent/run.test-harness.ts, src/cron/isolated-agent/run.ts)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@clawsweeper clawsweeper Bot added rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. P3 Low-priority cleanup, docs, polish, ergonomics, or speculative work. labels May 28, 2026
@clawsweeper

clawsweeper Bot commented May 28, 2026

Copy link
Copy Markdown
Contributor

ClawSweeper PR egg: ✨ hatched 🥚 common Clockwork Crabkin. Rarity: 🥚 common. Trait: sleeps inside passing CI.

Details

Share on X: post this hatch
Copy: My PR egg hatched a 🥚 common Clockwork Crabkin in ClawSweeper.
Hatchability:

  • Merged PRs are hatchable.
  • Open PRs are hatchable when they are status: 👀 ready for maintainer look, status: 🚀 automerge armed, or labeled clawsweeper:automerge.
  • Closed unmerged PRs are hatchable only when one of those hatchable labels is still present in the durable record.

About:

  • Eggs appear after real-behavior proof passes. They are collectible flavor only.
  • Review momentum changes the shell state: follow-up work warms it, re-review makes it wobble, and a clean final review lets it hatch.
  • The hatch is seeded from this repository and PR number, so the same PR keeps the same creature; the reviewed head SHA can only change safe visual details.
  • Rarity is just collectible sparkle: 🥚 common, 🌱 uncommon, 💎 rare, ✨ glimmer, and 🌈 legendary.

@clawsweeper clawsweeper Bot added proof: sufficient ClawSweeper judged the real behavior proof convincing. rating: 🦞 diamond lobster Very strong PR readiness with only minor maintainer review expected. and removed rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. labels May 28, 2026
@RomneyDa RomneyDa force-pushed the codex/perf-isolated-cron-tests branch from 1dbd3bc to bb22882 Compare May 28, 2026 05:36
@RomneyDa RomneyDa merged commit 127c0ad into main May 28, 2026
99 checks passed
@RomneyDa RomneyDa deleted the codex/perf-isolated-cron-tests branch May 28, 2026 05:45
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 29, 2026
SYU8384 pushed a commit to SYU8384/openclaw that referenced this pull request Jun 3, 2026
sablehead pushed a commit to sablehead/openclaw that referenced this pull request Jun 10, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

maintainer Maintainer-authored PR P3 Low-priority cleanup, docs, polish, ergonomics, or speculative work. proof: sufficient ClawSweeper judged the real behavior proof convincing. rating: 🦞 diamond lobster Very strong PR readiness with only minor maintainer review expected. size: XS status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant