Skip to content

test(cron): stabilize model precedence test and fix potential flakes#32165

Closed
markfietje wants to merge 4 commits intoopenclaw:mainfrom
markfietje:fix/cron-model-precedence-flake
Closed

test(cron): stabilize model precedence test and fix potential flakes#32165
markfietje wants to merge 4 commits intoopenclaw:mainfrom
markfietje:fix/cron-model-precedence-flake

Conversation

@markfietje
Copy link
Contributor

Summary

This PR stabilizes the cron model precedence tests to prevent intermittent CI failures caused by shared mock state or session clobbering.

Problem

Intermittent failures were observed in src/cron/isolated-agent.uses-last-non-empty-agent-text-as.test.ts where expectations on runEmbeddedPiAgent would fail (e.g., receiving anthropic when openai was expected).

This was likely due to:

  1. Shared mock state: Using mock.calls.at(-1) is sensitive to parallel or unexpected internal calls.
  2. Session clobbering: Reusing the same session key (cron:job-1) across multiple sequential calls in a test could lead to persistence race conditions if background tasks were involved.

Solution

  1. Updated expectEmbeddedProviderModel to use toHaveBeenCalledWith with expect.objectContaining, which is more robust than checking the last call manually.
  2. Ensured that runTurnWithStoredModelOverride uses unique session keys for each distinct verification step within the test case.
  3. Cleaned up duplicate helper definitions.

Verification

  • Verified: All 18 tests in src/cron/isolated-agent.uses-last-non-empty-agent-text-as.test.ts pass locally.
  • CI Checks: pnpm check and pnpm tsgo pass 100% green.

@greptile-apps
Copy link
Contributor

greptile-apps bot commented Mar 2, 2026

Greptile Summary

Stabilized the cron model precedence test by addressing two root causes of intermittent failures:

  • More robust mock assertions: Replaced fragile mock.calls.at(-1) pattern with toHaveBeenCalledWith + expect.objectContaining, making the test resilient to call order variations
  • Eliminated session clobbering: Made runTurnWithStoredModelOverride accept a sessionKey parameter and updated the precedence test to use unique session keys (cron:job-precedence-1, cron:job-stored, cron:job-payload) for each verification step, preventing persistence race conditions

The changes are minimal, focused, and directly address the stated problems without introducing new risks.

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk
  • The changes are purely test-focused with no impact on production code. The fixes are well-reasoned and directly address the root causes of test flakiness. The PR author verified all 18 tests pass locally and CI checks are green.
  • No files require special attention

Last reviewed commit: 6cd6a48

@markfietje markfietje force-pushed the fix/cron-model-precedence-flake branch from 6cd6a48 to 5af2616 Compare March 2, 2026 21:00
@markfietje
Copy link
Contributor Author

Closing as this has been superseded by internal stabilization fixes in main (e.g., 1212328). Glad to see the session store race is resolved.

@markfietje markfietje closed this Mar 2, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant