feat(harden): no-panic check + release CI gate (#187 PR 8b) by EffortlessSteven · Pull Request #229 · EffortlessMetrics/shipper

EffortlessSteven · 2026-05-12T07:49:15Z

Summary

Closes #187 by wiring the verify mode and release-time enforcement around the baseline detector that landed in PR 8a (#228).

cargo xtask no-panic check re-runs the AST detector, compares fresh findings to policy/no-panic-baseline.json keyed by (path, family, selector_kind, selector_callee, snippet), and reports diffs in four buckets:
- new entries — new debt; fails the gate
- count increases — same site, more occurrences; fails the gate
- resolved entries — debt that no longer exists; reported, never fails
- count decreases — same site, fewer occurrences; reported, never fails
--mode advisory writes target/policy/no-panic-report.json without bailing. Used by cargo xtask policy-report, which now includes no-panic as an eighth advisory area in the unified output.
--mode blocking (the CLI default for check) bails non-zero on any violation.
A new policy-gate job in .github/workflows/release.yml runs cargo xtask no-panic check --mode blocking and cargo xtask check-lint-policy. publish-crates-io depends on both msrv-gate and policy-gate, so a tag push cannot publish if either ledger has drifted since the SHA the baseline was generated at.
Scanner refactor: scan_and_group is now a shared helper called by both baseline() and check(). Keeps the production-vs-test classification identical at baseline-time and check-time; drift between the two would have been a subtle false-positive source.
15 unit tests covering the #[test]/#[bench]/#[cfg(test)]/#[cfg(any(test, ...))]/#[cfg(all(..., test))]/#[cfg(not(test))] classifications and the tests.rs/*_tests.rs/tests//benches//examples/ filename and directory exclusions.

Regression caught during implementation

The previous attrs_imply_test recursed into cfg(not(test)) and misclassified it as test code. That heuristic was technically a "any mention of test anywhere in the cfg" rule, which inverts the meaning of cfg(not(test)) (production-only code). Fixed by walking only any(...) and all(...), not not(...). The attrs_cfg_not_test_is_not_test unit test guards against re-regression.

PR 8a's baseline was generated before the fix, but the production source happened to contain no cfg(not(test)) sites, so the baseline file did not change as part of PR 8b. (Verified: current_entries == baseline_entries == 28 with 0 new, 0 resolved, 0 count_increase, 0 count_decrease.)

Test plan

cargo build -p xtask — clean
cargo test -p xtask — 15/15 passing
cargo clippy --workspace --all-targets -- -D warnings — clean
cargo fmt --all -- --check — clean
cargo xtask no-panic check — clean (28/28 entries match baseline)
cargo xtask no-panic check --mode advisory — writes report, returns 0 even with synthetic violations
cargo xtask policy-report — 8 areas, all clean
CI green (full nextest matrix + new policy-gate job hopefully not triggered since this isn't a tag push — but every other check should pass)

Followups

The 28 baseline entries are real debt to chip away at — most are .lock().unwrap() in engine/parallel/ (Mutex poisoning unhandled) and serde_json::Value indexing in state/execution_state/. First debt burn-down (per the rollout doc) will reduce that count.
The policy-gate job's release-side enforcement is the first gate that runs ONLY on tag push. PR-time CI doesn't run it; a future PR can mirror cargo xtask no-panic check --mode blocking into the PR-time Policy job so drift is caught earlier.

Closes #187.

`cargo xtask no-panic check` re-runs the AST detector landed in PR 8a, compares the fresh result against `policy/no-panic-baseline.json` keyed by (path, family, selector_kind, selector_callee, snippet), and reports the diff in four buckets: new entries new panic-family debt; fails the gate count increases same site, more occurrences; fails the gate resolved entries debt that no longer exists; reported, never fails count decreases same site, fewer occurrences; reported, never fails `--mode advisory` writes the report to target/policy/no-panic-report.json without bailing — used by `cargo xtask policy-report` which now includes no-panic as an eighth advisory area in its unified output. `--mode blocking` (the CLI default for the `check` subcommand) exits non-zero on any violation. A new `policy-gate` job in `.github/workflows/release.yml` runs `cargo xtask no-panic check --mode blocking` plus `check-lint-policy` before `publish-crates-io`, so a tag push cannot publish if either ledger has drifted since the SHA the baseline was generated at. The detector now ships with 15 unit tests covering: - attribute classification: #[test], #[bench], cfg(test), cfg(any( test, ...)), cfg(all(..., test)), and explicitly cfg(not(test)) which is *not* test code despite mentioning `test` (regression fixed during PR 8b — the previous heuristic recursed into not(...) and misclassified production-only sites) - directory exclusions: tests/, benches/, examples/ - filename exclusions: tests.rs and *_tests.rs - production-source negative cases (engine/mod.rs, lib.rs, testable_seam.rs) `scan_and_group` is now a shared helper called by both `baseline()` (write to disk) and `check()` (compare to disk). That keeps the production-vs-test classification identical at baseline-time and check-time — drift would have been a subtle source of false-positive "new debt" findings. The baseline file did not change as part of this PR (current_entries matches baseline_entries; 0 new, 0 resolved, 0 count_increase, 0 count_decrease). xtask/src/no_panic.rs shared scan_and_group helper, check(mode) with diff + report writer, attr_implies_test fix for cfg(not(test)), 15 unit tests xtask/src/main.rs new `no-panic check` subcommand with --mode flag (default Blocking) xtask/src/policy_report.rs no-panic registered as 8th area, `violations` added to headline priority .github/workflows/release.yml new `policy-gate` job; publish-crates-io depends on msrv-gate AND policy-gate CHANGELOG.md Added entry; #187 closed

factory-droid · 2026-05-12T07:49:38Z

Droid finished @EffortlessSteven's task —— View job

Droid Review — Phase 2 Validation Complete

PR #229 (feat/no-panic-check-187-pr8b)

Result: ✅ Approved — zero findings

Validation performed:

Reviewed PR description, candidates, full diff, and existing comments
Inspected all changed surfaces against shipper priority surfaces (publish correctness, registry ambiguity, resume/idempotency, CI workflow gates)
Applied no-panic policy lens via NO_PANIC_POLICY.md + droid-review.md rules

Why no comments:
The diff adds only policy-tooling (xtask/src/no_panic.rs, xtask/src/policy_report.rs) and CI wiring (.github/workflows/release.yml). No production code paths, registry interactions, state persistence, token resolution, or event-log ordering are touched. The scan_and_group refactor is purely structural. The cfg(not(test)) regression fix is correct.

Residual risk:
None identified — this is a pure xtask/CI change.

Validation signal:

Observed: PR test plan (cargo test -p xtask, cargo xtask no-panic check) shows 15/15 passing
Reported: CI workflow additions in release.yml follow established msrv-gate patterns
Not verified: Actual CI run on tag push (by design, policy-gate only triggers on push event)

View validated results →

coderabbitai · 2026-05-12T07:51:54Z

Warning

Rate limit exceeded

@EffortlessSteven has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 33 minutes and 58 seconds before requesting another review.

You’ve run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 5c11c00d-9ead-4852-a397-e1c564dd0686

📥 Commits

Reviewing files that changed from the base of the PR and between 31db93d and e623aba.

📒 Files selected for processing (5)

.github/workflows/release.yml
CHANGELOG.md
xtask/src/main.rs
xtask/src/no_panic.rs
xtask/src/policy_report.rs

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feat/no-panic-check-187-pr8b

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e623abaeac

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-05-12T07:53:21Z

+            } else if meta.path.is_ident("any") || meta.path.is_ident("all") {
                let _ = meta.parse_nested_meta(|inner| {


Recurse through nested cfg test predicates

Observed in the changed attr_implies_test logic: the parser now only checks direct children of any(...)/all(...), so test-only blocks behind nested predicates such as #[cfg(all(unix, any(test, feature = "foo")))] mod tests { ... } are classified as production. With the new release no-panic gate, unwraps in those test-only modules become false violations or get baselined as production debt, despite docs/NO_PANIC_POLICY.md excluding #[cfg(test)] subtrees. Fix direction: make the any/all predicate walk recursive while still treating not(test) as non-test, and validate with an added nested-any/all unit test.

Useful? React with 👍 / 👎.

gemini-code-assist

Code Review

This pull request implements the "No-panic check" and release CI gate. It introduces the cargo xtask no-panic check command, which compares a fresh scan of panic-family call sites against a stored baseline in policy/no-panic-baseline.json. The implementation includes logic to categorize changes as new debt, resolved entries, or count fluctuations, and supports both advisory and blocking modes. Additionally, the PR refactors the scanner to share logic between baseline generation and checking, fixes a regression in cfg(not(test)) classification, and adds unit tests for the detector's exclusion rules. I have no feedback to provide.

) #189 was filed with a richer lane-routing proposal (e.g. moving cross-platform from every-PR to labeled+nightly). That work is deferred because: - Current PR wall-clock is ~24-28 min (recent CI runs from this rollout), close to the original 25-min target without aggressive routing. - Shipper has Windows-specific code paths (path handling, process spawning, line endings). Moving Windows/macOS/aarch64 builds to nightly means platform regressions surface a day later instead of inside the PR — that is a real loss for a release-pipeline product. - Coverage-removing routing changes are hard to undo: contributors stop expecting platform signal, regressions accumulate, then "fixing" the lanes becomes a multi-PR cleanup. So #189 lands as a documentation pass: capture the actual current state of .github/workflows/ so future routing decisions have an accurate baseline. Changes to docs/ci/test-evidence-lanes.md: - Replace the conflated "Always-On" table with a complete workflow inventory: 10 workflows, ~28 jobs across them, columns for trigger / lane / required-for-merge. - Per-job lane map for ci.yml (the load-bearing PR workflow), with predicates, observed wall-clock, and what each job proves. Previous doc was missing fuzz-smoke, cross-platform, release-build, crypto- proptests-heavy, and policy entirely; conflated `lint` into separate `fmt`/`clippy` rows. - Policy gates section now lists all eight xtask checks (file-policy, generated, executable, dependency-surface, workflow, process, network, lint-policy) plus the release-time no-panic check, with the PR each was introduced in. - Advisory/Routed section adds droid-review and droid (was missing). - Scheduled section corrects mutation to "Sunday 04:00 UTC" (was "nightly") and adds droid-security-scan. - Release Proof: full release.yml job table including the policy-gate job added in #229. - New "Routing Changes Deferred to Follow-Up PRs" section enumerates concrete movements with rationale: release-build to release-only, path-filtered fuzz-smoke, split cross-platform so only Linux is every-PR. No workflow files changed. Coverage-removing routing decisions deserve their own focused PRs. CHANGELOG.md Added entry under Documentation docs/ci/test-evidence-lanes.md ~180 lines rewritten

chatgpt-codex-connector Bot reviewed May 12, 2026

View reviewed changes

gemini-code-assist Bot reviewed May 12, 2026

View reviewed changes

EffortlessSteven merged commit 77ff726 into main May 12, 2026
24 checks passed

EffortlessSteven deleted the feat/no-panic-check-187-pr8b branch May 12, 2026 08:19

EffortlessSteven mentioned this pull request May 12, 2026

docs(ci): audit and rewrite test-evidence-lanes (#189, docs-first) #234

Merged

5 tasks

EffortlessSteven mentioned this pull request May 12, 2026

fix(no-panic): first burn-down — 3 indexing sites in shipper-core (#190) #235

Merged

6 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat(harden): no-panic check + release CI gate (#187 PR 8b)#229

feat(harden): no-panic check + release CI gate (#187 PR 8b)#229
EffortlessSteven merged 1 commit into
mainfrom
feat/no-panic-check-187-pr8b

EffortlessSteven commented May 12, 2026

Uh oh!

factory-droid Bot commented May 12, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot commented May 12, 2026

Rate limit exceeded

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot May 12, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

		} else if meta.path.is_ident("any") \|\| meta.path.is_ident("all") {
		let _ = meta.parse_nested_meta(\|inner\| {

Uh oh!

Conversation

EffortlessSteven commented May 12, 2026

Summary

Regression caught during implementation

Test plan

Followups

Uh oh!

factory-droid Bot commented May 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Droid Review — Phase 2 Validation Complete

Uh oh!

coderabbitai Bot commented May 12, 2026

Rate limit exceeded

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot May 12, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

factory-droid Bot commented May 12, 2026 •

edited

Loading