Refactor publish orchestration into SRP helpers#247
Conversation
|
Warning You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again! |
|
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard. |
|
Warning Rate limit exceeded
You’ve run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (2)
WalkthroughThis PR refactors the publish execution pipeline in ChangesPublish pipeline refactoring
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Warning Review ran into problems🔥 ProblemsGit: Failed to clone repository. Please run the Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Droid is reviewing code and running a security check… Validation complete — 1 comment posted. |
| ) | ||
| } | ||
|
|
||
| #[allow(clippy::too_many_arguments)] |
There was a problem hiding this comment.
[P2] Missing send_completion_webhook call in finish_parallel_run
The finish_sequential_run function calls send_completion_webhook to send the PublishCompleted webhook, but finish_parallel_run does not. In the parallel path, the webhook is sent inside run_publish_parallel before finish_parallel_run is called, so notification still occurs, but the parallel path's finalization logic is inconsistent with the sequential path. Callers depending on webhooks from the finalize layer may observe inconsistent behavior between parallel and sequential runs.
|
Droid finished @EffortlessSteven's task —— View job Validation complete — 1 comment posted. PR #247 is a well-scoped SRP refactor. One latent panic bug confirmed: the |
| let short = short_state(&progress.state); | ||
| reporter.info(&format!( | ||
| "{}@{}: already complete ({})", | ||
| package.name, package.version, short | ||
| )); | ||
|
|
||
| // #125: explicitly record resume's "state already terminal, trusting it" | ||
| // decision so events.jsonl stays legible even though historical receipt | ||
| // shape excludes already-terminal packages in the resume path. | ||
| event_log.record(PublishEvent { |
There was a problem hiding this comment.
[P1] Unsafe .expect() on resume_from can panic
In apply_resume_from_gate at line 69, the code calls opts.resume_from.as_ref().expect("resume_from was checked"). The .expect() is reachable when reached_resume_point is false and opts.resume_from is None. Although validate_resume_target in bootstrap.rs checks that a resume_from target exists in the plan when specified, it does not establish any invariant that opts.resume_from cannot be None at this point. The mutable reached_resume_point flag is initialized as opts.resume_from.is_none(), so if that guard ever failed or was bypassed, this .expect() would panic. Should use unwrap() with a descriptive message or refactor to eliminate the expect entirely.
| let short = short_state(&progress.state); | |
| reporter.info(&format!( | |
| "{}@{}: already complete ({})", | |
| package.name, package.version, short | |
| )); | |
| // #125: explicitly record resume's "state already terminal, trusting it" | |
| // decision so events.jsonl stays legible even though historical receipt | |
| // shape excludes already-terminal packages in the resume path. | |
| event_log.record(PublishEvent { | |
| opts.resume_from.as_ref().unwrap_or_else(|| { | |
| unreachable!("resume_from must be Some when reached_resume_point is false") | |
| }) |
|
Droid finished @EffortlessSteven's task —— View job Validation complete — 0 comments posted. No review candidates were generated in this pass. The two existing PR review comments (on finalize.rs and resume.rs) were posted in a prior run and are not part of this candidates batch. |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/shipper-core/src/engine/parallel/tests.rs (1)
1618-1632:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winReduce webhook server timeout to avoid 30-second test delay.
The webhook server expects up to 3 requests with a 30-second timeout (line 1622), but the test now expects only 1 webhook (line 1662). After receiving the
PublishStartedevent, the server will wait ~30 seconds for a second request before timing out and breaking. This makes the test unnecessarily slow.⚡ Proposed fix to reduce test duration
Change the expected request count to match the assertion:
let webhook_handle = std::thread::spawn(move || { - // Collect up to 3 requests with a timeout - for _ in 0..3 { - match webhook_server.recv_timeout(Duration::from_secs(30)) { + // Collect up to 1 request with a short timeout + for _ in 0..1 { + match webhook_server.recv_timeout(Duration::from_secs(2)) {Alternatively, just reduce the timeout to a few seconds if you want to keep the loop count flexible:
- match webhook_server.recv_timeout(Duration::from_secs(30)) { + match webhook_server.recv_timeout(Duration::from_secs(2)) {Also applies to: 1662-1663
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/shipper-core/src/engine/parallel/tests.rs` around lines 1618 - 1632, The webhook thread is waiting up to 30s per iteration causing slow tests; change the spawn loop using webhook_server.recv_timeout (and/or the loop count around webhook_handle) so it matches the test expectation: either set the for loop to only iterate once (for _ in 0..1) or reduce the recv_timeout(Duration::from_secs(30)) to a few seconds (e.g., 2–5s) so the webhook_receiver (webhook_received_clone) stops promptly after the expected PublishStarted event; update webhook_handle accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/shipper-core/src/engine/publish/finalize.rs`:
- Around line 166-173: failure_count currently only counts PackageState::Failed
so ExecutionResult::PartialFailure can be reported with failure_count=0; update
the failure_count calculation (used in finalize code handling receipts) to count
any receipt whose PackageState is not a success and not skipped (i.e., treat
states like Uploaded/other non-success states as failures) instead of only
matching PackageState::Failed, leaving skipped_count as-is so webhook metrics
remain consistent with ExecutionResult::PartialFailure.
---
Outside diff comments:
In `@crates/shipper-core/src/engine/parallel/tests.rs`:
- Around line 1618-1632: The webhook thread is waiting up to 30s per iteration
causing slow tests; change the spawn loop using webhook_server.recv_timeout
(and/or the loop count around webhook_handle) so it matches the test
expectation: either set the for loop to only iterate once (for _ in 0..1) or
reduce the recv_timeout(Duration::from_secs(30)) to a few seconds (e.g., 2–5s)
so the webhook_receiver (webhook_received_clone) stops promptly after the
expected PublishStarted event; update webhook_handle accordingly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: d3c39e19-bc71-415f-8ba4-208ddd427134
📒 Files selected for processing (7)
crates/shipper-core/src/engine/mod.rscrates/shipper-core/src/engine/parallel/mod.rscrates/shipper-core/src/engine/parallel/tests.rscrates/shipper-core/src/engine/publish/bootstrap.rscrates/shipper-core/src/engine/publish/finalize.rscrates/shipper-core/src/engine/publish/mod.rscrates/shipper-core/src/engine/publish/resume.rs
|
Droid finished @EffortlessSteven's task —— View job Validation complete — 0 comments posted. No review candidates were generated in this pass. The diff is a clean SRP refactor extracting bootstrap, resume-gating, and finalization into focused submodules. Three comments were already posted in prior runs (factory-droid P1/P2 findings on resume.rs and finalize.rs; coderabbitai finding on finalize.rs). No functional bugs, logic errors, or breaking contract changes identified in this pass. |
Motivation
run_publishimplementation had accumulated many responsibilities (bootstrap, resume gating, event lifecycle, finalization) making it hard to reason about and test; the change breaks those responsibilities into focused units.Description
engine::publishmodule containing three SRP submodules:bootstrap,resume, andfinalizeincrates/shipper-core/src/engine/publish/to own discrete parts of the publish flow.publish::bootstrap(resume target validation, rehearsal gate enforcement, lock acquisition, git/environment capture, state load/initialization, and initial event log writes) and return aPublishBootstraphelper struct used byrun_publish.publish::resume(apply_resume_from_gateandrecord_terminal_resume_skip) so the per-package loop only consults a small resume API.publish::finalize(consistency drift recording,ExecutionFinishedevent, receipt writing, and completion webhook) and exposefinish_sequential_run/finish_parallel_run.run_publishto orchestrate the phases by calling the new helpers and keep the sequential publish loop focused on per-package attempts, retries, readiness checks, and state transitions.pub(in crate::engine)for helpers and tidy imports/usages; runcargo fmtand minor refactors to satisfy clippy warnings.Testing
cargo check -p shipper-corewhich completed successfully.cargo clippy -p shipper-core --all-targets --all-features -- -D warningswhich completed successfully.cargo fmt --all -- --checkandgit diff --checkwith no formatting or diff errors.cargo test -p shipper-coreand targeted tests; the test run did not fully complete cleanly in this environment because registry/mock-driven tests returned403 Forbiddenand a couple of parallel reconcile tests blocked/ran long: specificallyengine::parallel::tests::reconcile_bdd_ambiguous_resolves_to_publishedand related resume tests failed under the mocked registry responses, so those failures are environmental (mock server responses) rather than compile-time errors.Codex Task