Skip to content

ci: stabilize nextest across platforms#82

Merged
EffortlessSteven merged 1 commit into
mainfrom
release/ci-nextest-stabilize
Apr 15, 2026
Merged

ci: stabilize nextest across platforms#82
EffortlessSteven merged 1 commit into
mainfrom
release/ci-nextest-stabilize

Conversation

@EffortlessSteven

Copy link
Copy Markdown
Member

Summary

  • Root cause: preflight_allow_dirty_snapshot used spawn_registry_not_found(1), but preflight issues two registry calls per crate (version_exists + check_new_crate). The second call timed out against the exhausted mock, producing a reqwest error that embedded a non-deterministic port and "operation timed out" text — flaking on Linux/macOS/Windows.
  • Fix: bump the mock to 2 expected requests so preflight completes cleanly. Regenerate the snapshot against the actual preflight report.
  • Drop the unknown profile.ci.junit.report-successful key from .config/nextest.toml — it was the source of the ignoring unknown configuration key warning in every CI log.
  • Add Timestamp: normalization to e2e_expanded::normalize_output (matching the helper already in cli_e2e) so the preflight report snapshot stays stable across runs.

Verification

  • cargo nextest show-config — no longer warns about the unknown key.
  • cargo test -p shipper-cli --test e2e_expanded — 126/126 pass locally (previously 125 pass, 1 fail).
  • New snapshot is deterministic: timestamps and plan IDs are normalized, no reqwest timeout text.

Test plan

  • Tests (nextest) (ubuntu-latest) green
  • Tests (nextest) (windows-latest) green
  • Tests (nextest) (macos-latest) green
  • No remaining nextest config warnings in CI logs

This is PR 2 in the release-hardening sprint (audit → nextest → fuzz → package-truth → rehearsal → publish).

@coderabbitai

coderabbitai Bot commented Apr 15, 2026

Copy link
Copy Markdown

Warning

Rate limit exceeded

@EffortlessSteven has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 34 minutes and 56 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 34 minutes and 56 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 327104e3-8a78-476d-bb7a-a13803c8b22e

📥 Commits

Reviewing files that changed from the base of the PR and between b9a5897 and be1e45f.

⛔ Files ignored due to path filters (1)
  • crates/shipper-cli/tests/snapshots/e2e_expanded__preflight_allow_dirty.snap is excluded by !**/*.snap
📒 Files selected for processing (2)
  • .config/nextest.toml
  • crates/shipper-cli/tests/e2e_expanded.rs

Walkthrough

Two targeted updates to test infrastructure: removed explicit successful test reporting from Nextest JUnit configuration, and enhanced e2e test stability by redacting timestamps in snapshot comparisons while correcting a mock registry request expectation count.

Changes

Cohort / File(s) Summary
Nextest CI Configuration
.config/nextest.toml
Removed report-successful = true setting from junit reporter output, changing which test results are included in the generated JUnit XML.
E2E Test Improvements
crates/shipper-cli/tests/e2e_expanded.rs
Enhanced normalize_output helper to redact timestamp values for snapshot stability, and corrected preflight_allow_dirty_snapshot test mock registry expectation from 1 to 2 spawns to align with actual preflight behavior.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~5 minutes

Poem

🐰 Timestamps now hidden with a clever redact,
Registry mocks aligned with the facts,
JUnit reports trim and neat,
Our tests grow more stable, what a treat!

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title 'ci: stabilize nextest across platforms' directly describes the main objective of fixing platform-specific flakiness in the nextest CI configuration and test snapshots.
Description check ✅ Passed The description is comprehensive and directly related to the changeset, explaining the root cause of test flakiness, the fixes applied, and verification steps.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch release/ci-nextest-stabilize

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request updates the test infrastructure and snapshots for the shipper-cli crate. Key changes include modifying the nextest configuration to disable successful test reporting in JUnit output, enhancing the test output normalization to handle timestamps, and updating the preflight_allow_dirty_snapshot test to account for multiple registry calls. Additionally, the preflight test snapshot was updated to reflect the expected report output. I have no feedback to provide.

The preflight flow issues two registry calls per crate (version_exists +
check_new_crate), but the test mock spawn_registry_not_found(1) only
responded to one. The second request timed out after 30s, producing a
non-deterministic reqwest error that included the mock's ephemeral port
number. Bump the mock to 2 expected requests so preflight completes
normally, then regenerate the snapshot against the actual preflight
report output.

Also:
- Drop the unknown profile.ci.junit.report-successful key from
  .config/nextest.toml to eliminate the spurious warning in every CI run.
- Add Timestamp: normalization to e2e_expanded::normalize_output to match
  the one already in cli_e2e, so the preflight report snapshot is stable.
@EffortlessSteven EffortlessSteven force-pushed the release/ci-nextest-stabilize branch from b9a5897 to be1e45f Compare April 15, 2026 21:16
@EffortlessSteven EffortlessSteven merged commit 1a674bb into main Apr 15, 2026
2 checks passed
@EffortlessSteven EffortlessSteven deleted the release/ci-nextest-stabilize branch April 15, 2026 21:16
EffortlessSteven added a commit that referenced this pull request Apr 16, 2026
Completes the PR-matrix / deep-lane split that the previous commit
on this branch stopped short of.

- `test` matrix (3 OSes, every trigger): PROPTEST_CASES=16 always.
  Drop the event-driven ternary; the matrix no longer carries the
  full-strength case count on any trigger. Also cap shrink iterations
  (PROPTEST_MAX_SHRINK_ITERS=1000) so failures surface fast.

- New `crypto-proptests-heavy` job: Ubuntu-only, runs just
  `shipper-encrypt` with PROPTEST_CASES=256. Gated on schedule,
  push-to-main, and workflow_dispatch. Not run on pull_request, so
  developer feedback stays fast; opt in per-branch via dispatch if
  needed.

Net effect:
- PR CI: three OSes at reduced intensity, no Argon2 tax.
- main push + nightly cron: full-strength crypto proof on Ubuntu.
- workflow_dispatch: full-strength on demand against any ref.

Rationale for Ubuntu-only heavy lane: crypto correctness is
platform-independent at the Rust layer we exercise here, and the
three-OS cost for full proptests was 30-60 min of cumulative wall
time every PR without proportional signal.

Note on the stale `profile.ci.junit.report-successful` nextest key:
already removed on main (see #82). No change needed to nextest.toml.
EffortlessSteven added a commit that referenced this pull request Apr 16, 2026
#87)

* ci: fast-path shipper-encrypt proptests on PR matrix

The Argon2-heavy crypto proptests in `shipper-encrypt` (derive_key_*,
decrypt_truncated_*, double_encrypt_roundtrip_*, each_encrypt_produces_*)
were dominating wall-clock on the three-platform `test` matrix at
PROPTEST_CASES=256, costing 10-20 minutes per platform on every PR.

Split the case count by event:
- PR / push to main: 16 cases (fast developer feedback)
- schedule (nightly cron) / workflow_dispatch: 256 cases (full strength)

Also pin coverage to 16 cases unconditionally, since llvm-cov
instrumentation on top of Argon2 proptests is the worst-of-both and
coverage does not need max-strength property exploration.

No change to nextest config, no new jobs, no platform carve-out yet.

* ci: split crypto proptests into Ubuntu-only heavy lane

Completes the PR-matrix / deep-lane split that the previous commit
on this branch stopped short of.

- `test` matrix (3 OSes, every trigger): PROPTEST_CASES=16 always.
  Drop the event-driven ternary; the matrix no longer carries the
  full-strength case count on any trigger. Also cap shrink iterations
  (PROPTEST_MAX_SHRINK_ITERS=1000) so failures surface fast.

- New `crypto-proptests-heavy` job: Ubuntu-only, runs just
  `shipper-encrypt` with PROPTEST_CASES=256. Gated on schedule,
  push-to-main, and workflow_dispatch. Not run on pull_request, so
  developer feedback stays fast; opt in per-branch via dispatch if
  needed.

Net effect:
- PR CI: three OSes at reduced intensity, no Argon2 tax.
- main push + nightly cron: full-strength crypto proof on Ubuntu.
- workflow_dispatch: full-strength on demand against any ref.

Rationale for Ubuntu-only heavy lane: crypto correctness is
platform-independent at the Rust layer we exercise here, and the
three-OS cost for full proptests was 30-60 min of cumulative wall
time every PR without proportional signal.

Note on the stale `profile.ci.junit.report-successful` nextest key:
already removed on main (see #82). No change needed to nextest.toml.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant