Skip to content

Refactor publish orchestration into SRP helpers#247

Merged
EffortlessSteven merged 4 commits into
mainfrom
codex/break-complex-function-into-srp-submodules-45zsn9
May 14, 2026
Merged

Refactor publish orchestration into SRP helpers#247
EffortlessSteven merged 4 commits into
mainfrom
codex/break-complex-function-into-srp-submodules-45zsn9

Conversation

@EffortlessSteven

Copy link
Copy Markdown
Member

Motivation

  • The run_publish implementation had accumulated many responsibilities (bootstrap, resume gating, event lifecycle, finalization) making it hard to reason about and test; the change breaks those responsibilities into focused units.
  • Isolate mechanical phases (lock/state/bootstrap, resume decision, finalization) so each piece can evolve and be audited independently.

Description

  • Add a new engine::publish module containing three SRP submodules: bootstrap, resume, and finalize in crates/shipper-core/src/engine/publish/ to own discrete parts of the publish flow.
  • Move run-start bootstrap work into publish::bootstrap (resume target validation, rehearsal gate enforcement, lock acquisition, git/environment capture, state load/initialization, and initial event log writes) and return a PublishBootstrap helper struct used by run_publish.
  • Move resume decision logic into publish::resume (apply_resume_from_gate and record_terminal_resume_skip) so the per-package loop only consults a small resume API.
  • Move end-of-run work into publish::finalize (consistency drift recording, ExecutionFinished event, receipt writing, and completion webhook) and expose finish_sequential_run / finish_parallel_run.
  • Simplify run_publish to orchestrate the phases by calling the new helpers and keep the sequential publish loop focused on per-package attempts, retries, readiness checks, and state transitions.
  • Adjust visibility to pub(in crate::engine) for helpers and tidy imports/usages; run cargo fmt and minor refactors to satisfy clippy warnings.

Testing

  • Ran cargo check -p shipper-core which completed successfully.
  • Ran cargo clippy -p shipper-core --all-targets --all-features -- -D warnings which completed successfully.
  • Ran cargo fmt --all -- --check and git diff --check with no formatting or diff errors.
  • Ran cargo test -p shipper-core and targeted tests; the test run did not fully complete cleanly in this environment because registry/mock-driven tests returned 403 Forbidden and a couple of parallel reconcile tests blocked/ran long: specifically engine::parallel::tests::reconcile_bdd_ambiguous_resolves_to_published and related resume tests failed under the mocked registry responses, so those failures are environmental (mock server responses) rather than compile-time errors.

Codex Task

@gemini-code-assist

Copy link
Copy Markdown

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@chatgpt-codex-connector

Copy link
Copy Markdown

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.
To continue using code reviews, add credits to your account and enable them for code reviews in your settings.

@coderabbitai

coderabbitai Bot commented May 13, 2026

Copy link
Copy Markdown

Review Change Stack

Warning

Rate limit exceeded

@EffortlessSteven has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 47 minutes and 26 seconds before requesting another review.

You’ve run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 028daeac-e410-41cc-8219-977fe82fd8d0

📥 Commits

Reviewing files that changed from the base of the PR and between dabbaad and 33d067b.

📒 Files selected for processing (2)
  • crates/shipper-core/src/engine/parallel/tests.rs
  • crates/shipper-core/src/engine/publish/finalize.rs

Walkthrough

This PR refactors the publish execution pipeline in shipper-core by extracting complex inline logic from run_publish into three dedicated submodules: bootstrap handles startup, resume manages resume-from gating, and finalize handles end-of-run tasks. Behavior is preserved while improving code organization and testability. Webhook PublishCompleted dispatch is consolidated into finalization, removing it from parallel execution.

Changes

Publish pipeline refactoring

Layer / File(s) Summary
Publish submodule foundation and bootstrap initialization
crates/shipper-core/src/engine/publish/mod.rs, crates/shipper-core/src/engine/publish/bootstrap.rs
New publish submodule with bootstrap, resume, and finalize layers. PublishBootstrap struct bundles state directory, lock, git context, environment, registry client, event paths/logs, and execution state. prepare_publish_run orchestrates the full startup: lock acquisition, git/environment capture, state loading/initialization, execution-start event recording, and package state seeding. Supporting helpers handle lock timeout based on --force, state mismatch validation, and initial event log creation.
Resume-from package gating logic
crates/shipper-core/src/engine/publish/resume.rs
ResumeGate enum and apply_resume_from_gate function decide whether to skip or publish based on resume target and terminal state. record_terminal_resume_skip emits PackageSkipped events when a package is already terminal, writing and clearing the event log in the process.
End-of-run finalization with consistency drift and receipts
crates/shipper-core/src/engine/publish/finalize.rs
Finalization logic for both sequential and parallel: consistency-drift detection, execution-result computation from receipt states, completion webhook dispatch with package counts, and receipt construction with plan/registry identifiers, timestamps, package receipts, derived events path, and optional git context. Unit tests verify Uploaded receipts cause PartialFailure.
Main publish orchestration refactored to use helpers
crates/shipper-core/src/engine/mod.rs
run_publish now calls prepare_publish_run for setup instead of inline logic, replaces resume-from loop conditionals with apply_resume_from_gate and record_terminal_resume_skip, and delegates end-of-run to finish_sequential_run and finish_parallel_run. Imports reorganized to include new publish submodule dependencies.
Parallel engine cleanup and webhook centralization
crates/shipper-core/src/engine/parallel/mod.rs, crates/shipper-core/src/engine/parallel/tests.rs
Removes ExecutionResult import and PublishCompleted webhook dispatch from parallel execution, consolidating completion notifications in finalization. Webhook imports moved to test-only. Test assertions updated to expect only PublishStarted webhook from parallel execution.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Poem

🐰 Hops through refactored pipelines bright,
Bootstrap, resume, finalize—each in its right!
Inline logic scattered, now neatly arranged,
In submodules clean where responsibilities changed.
The rabbits rejoice as the code becomes clear! 🌿

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately reflects the main change: refactoring the publish orchestration logic into Single Responsibility Principle (SRP) helpers, which is the core architectural improvement described throughout the changeset.
Description check ✅ Passed The description comprehensively explains the motivation, implementation details, and testing approach for the refactoring. It directly relates to the changeset by detailing the new module structure and responsibilities moved into bootstrap, resume, and finalize submodules.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch codex/break-complex-function-into-srp-submodules-45zsn9

Warning

Review ran into problems

🔥 Problems

Git: Failed to clone repository. Please run the @coderabbitai full review command to re-trigger a full review. If the issue persists, set path_filters to include or exclude specific files.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@factory-droid

factory-droid Bot commented May 13, 2026

Copy link
Copy Markdown

Droid is reviewing code and running a security check…

Validation complete — 1 comment posted.

View job run

)
}

#[allow(clippy::too_many_arguments)]

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[P2] Missing send_completion_webhook call in finish_parallel_run

The finish_sequential_run function calls send_completion_webhook to send the PublishCompleted webhook, but finish_parallel_run does not. In the parallel path, the webhook is sent inside run_publish_parallel before finish_parallel_run is called, so notification still occurs, but the parallel path's finalization logic is inconsistent with the sequential path. Callers depending on webhooks from the finalize layer may observe inconsistent behavior between parallel and sequential runs.

@factory-droid

factory-droid Bot commented May 14, 2026

Copy link
Copy Markdown

Droid finished @EffortlessSteven's task —— View job


Validation complete — 1 comment posted.

PR #247 is a well-scoped SRP refactor. One latent panic bug confirmed: the .expect() on resume_from in apply_resume_from_gate would crash if control-flow invariants were ever bypassed. The parallel module and test changes are correct. No security issues found.

Comment on lines +60 to +69
let short = short_state(&progress.state);
reporter.info(&format!(
"{}@{}: already complete ({})",
package.name, package.version, short
));

// #125: explicitly record resume's "state already terminal, trusting it"
// decision so events.jsonl stays legible even though historical receipt
// shape excludes already-terminal packages in the resume path.
event_log.record(PublishEvent {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[P1] Unsafe .expect() on resume_from can panic

In apply_resume_from_gate at line 69, the code calls opts.resume_from.as_ref().expect("resume_from was checked"). The .expect() is reachable when reached_resume_point is false and opts.resume_from is None. Although validate_resume_target in bootstrap.rs checks that a resume_from target exists in the plan when specified, it does not establish any invariant that opts.resume_from cannot be None at this point. The mutable reached_resume_point flag is initialized as opts.resume_from.is_none(), so if that guard ever failed or was bypassed, this .expect() would panic. Should use unwrap() with a descriptive message or refactor to eliminate the expect entirely.

Suggested change
let short = short_state(&progress.state);
reporter.info(&format!(
"{}@{}: already complete ({})",
package.name, package.version, short
));
// #125: explicitly record resume's "state already terminal, trusting it"
// decision so events.jsonl stays legible even though historical receipt
// shape excludes already-terminal packages in the resume path.
event_log.record(PublishEvent {
opts.resume_from.as_ref().unwrap_or_else(|| {
unreachable!("resume_from must be Some when reached_resume_point is false")
})

@factory-droid

factory-droid Bot commented May 14, 2026

Copy link
Copy Markdown

Droid finished @EffortlessSteven's task —— View job



Validation complete — 0 comments posted.

No review candidates were generated in this pass. The two existing PR review comments (on finalize.rs and resume.rs) were posted in a prior run and are not part of this candidates batch.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/shipper-core/src/engine/parallel/tests.rs (1)

1618-1632: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Reduce webhook server timeout to avoid 30-second test delay.

The webhook server expects up to 3 requests with a 30-second timeout (line 1622), but the test now expects only 1 webhook (line 1662). After receiving the PublishStarted event, the server will wait ~30 seconds for a second request before timing out and breaking. This makes the test unnecessarily slow.

⚡ Proposed fix to reduce test duration

Change the expected request count to match the assertion:

     let webhook_handle = std::thread::spawn(move || {
-        // Collect up to 3 requests with a timeout
-        for _ in 0..3 {
-            match webhook_server.recv_timeout(Duration::from_secs(30)) {
+        // Collect up to 1 request with a short timeout
+        for _ in 0..1 {
+            match webhook_server.recv_timeout(Duration::from_secs(2)) {

Alternatively, just reduce the timeout to a few seconds if you want to keep the loop count flexible:

-            match webhook_server.recv_timeout(Duration::from_secs(30)) {
+            match webhook_server.recv_timeout(Duration::from_secs(2)) {

Also applies to: 1662-1663

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/shipper-core/src/engine/parallel/tests.rs` around lines 1618 - 1632,
The webhook thread is waiting up to 30s per iteration causing slow tests; change
the spawn loop using webhook_server.recv_timeout (and/or the loop count around
webhook_handle) so it matches the test expectation: either set the for loop to
only iterate once (for _ in 0..1) or reduce the
recv_timeout(Duration::from_secs(30)) to a few seconds (e.g., 2–5s) so the
webhook_receiver (webhook_received_clone) stops promptly after the expected
PublishStarted event; update webhook_handle accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/shipper-core/src/engine/publish/finalize.rs`:
- Around line 166-173: failure_count currently only counts PackageState::Failed
so ExecutionResult::PartialFailure can be reported with failure_count=0; update
the failure_count calculation (used in finalize code handling receipts) to count
any receipt whose PackageState is not a success and not skipped (i.e., treat
states like Uploaded/other non-success states as failures) instead of only
matching PackageState::Failed, leaving skipped_count as-is so webhook metrics
remain consistent with ExecutionResult::PartialFailure.

---

Outside diff comments:
In `@crates/shipper-core/src/engine/parallel/tests.rs`:
- Around line 1618-1632: The webhook thread is waiting up to 30s per iteration
causing slow tests; change the spawn loop using webhook_server.recv_timeout
(and/or the loop count around webhook_handle) so it matches the test
expectation: either set the for loop to only iterate once (for _ in 0..1) or
reduce the recv_timeout(Duration::from_secs(30)) to a few seconds (e.g., 2–5s)
so the webhook_receiver (webhook_received_clone) stops promptly after the
expected PublishStarted event; update webhook_handle accordingly.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: d3c39e19-bc71-415f-8ba4-208ddd427134

📥 Commits

Reviewing files that changed from the base of the PR and between 38d23d5 and dabbaad.

📒 Files selected for processing (7)
  • crates/shipper-core/src/engine/mod.rs
  • crates/shipper-core/src/engine/parallel/mod.rs
  • crates/shipper-core/src/engine/parallel/tests.rs
  • crates/shipper-core/src/engine/publish/bootstrap.rs
  • crates/shipper-core/src/engine/publish/finalize.rs
  • crates/shipper-core/src/engine/publish/mod.rs
  • crates/shipper-core/src/engine/publish/resume.rs

Comment thread crates/shipper-core/src/engine/publish/finalize.rs
@factory-droid

factory-droid Bot commented May 14, 2026

Copy link
Copy Markdown

Droid finished @EffortlessSteven's task —— View job



Validation complete — 0 comments posted.

No review candidates were generated in this pass. The diff is a clean SRP refactor extracting bootstrap, resume-gating, and finalization into focused submodules. Three comments were already posted in prior runs (factory-droid P1/P2 findings on resume.rs and finalize.rs; coderabbitai finding on finalize.rs). No functional bugs, logic errors, or breaking contract changes identified in this pass.

@EffortlessSteven EffortlessSteven merged commit 6599b51 into main May 14, 2026
25 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant