Skip to content

feat(reconcile): close ambiguous publish outcomes against registry truth (#99)#111

Merged
EffortlessSteven merged 2 commits into
mainfrom
feat/99-reconcile-ambiguous-publish
Apr 17, 2026
Merged

feat(reconcile): close ambiguous publish outcomes against registry truth (#99)#111
EffortlessSteven merged 2 commits into
mainfrom
feat/99-reconcile-ambiguous-publish

Conversation

@EffortlessSteven

Copy link
Copy Markdown
Member

Closes first half of #99 — the biggest single open safety gap per ROADMAP's five-pillar framing.

Problem

Cargo's publish flow uploads first, then polls the index for visibility. The poll can time out without affecting the upload. So a non-zero cargo exit can coexist with a successful upload. Prior behavior: `ErrorClass::Ambiguous` fell through to the same `backoff_delay + thread::sleep` retry path as `Retryable`, which risks a duplicate upload attempt (or wastes retries while the registry hasn't yet propagated).

Solution

New module `crates/shipper/src/engine/parallel/reconcile.rs`:

```rust
pub(super) fn reconcile_ambiguous_upload(
reg, name, version, config,
) -> ReconciliationOutcome
```

Wraps the existing `readiness::is_version_visible_with_backoff` polling loop and maps its return into three outcomes:

Outcome Caller action
`Published` mark Published, advance — no retry of cargo publish
`NotPublished` fall through to Retryable path (registry confirms no duplicate-upload risk)
`StillUnknown` do not retry; mark `PackageState::Ambiguous`, halt for operator decision

Wire-in at `engine/parallel/publish.rs`: after `classify_cargo_failure` returns Ambiguous, emit `PublishReconciling`, call the reconciliation, emit `PublishReconciled`, and branch on the outcome.

Type additions (shipper-types)

All additive — no breaking changes:

  • `ReconciliationOutcome` enum (`Published` / `NotPublished` / `StillUnknown`, each carrying `attempts` + `elapsed_ms`; `StillUnknown` also carries a `reason` string)
  • `EventType::PublishReconciling { method: ReadinessMethod }`
  • `EventType::PublishReconciled { outcome: ReconciliationOutcome }`

Out of scope (follow-ups)

Files

File Change
`crates/shipper-types/src/lib.rs` new `ReconciliationOutcome` enum + 2 new `EventType` variants
`crates/shipper/src/engine/parallel/reconcile.rs` new module (100 lines + 2 unit tests)
`crates/shipper/src/engine/parallel/mod.rs` `mod reconcile;`
`crates/shipper/src/engine/parallel/publish.rs` wire-in at retry loop (split `Retryable | Ambiguous` arm)
`crates/shipper/src/engine/parallel/CLAUDE.md` document reconcile.rs in the file layout

Net: 5 files, +266 / -1.

Test plan

  • `cargo check -p shipper-types` + `-p shipper` clean
  • `cargo clippy -p shipper -p shipper-types --all-targets -- -D warnings` clean
  • `cargo test -p shipper-types` — 268 pass
  • `cargo test -p shipper --lib parallel` — 68 pass (no regressions)
  • `cargo test -p shipper --lib reconcile` — 2 new unit tests pass
  • Pre-existing flake on main: `ops::git::cleanliness::tests::ensure_git_clean_new_phrasing` fails under full-suite parallel pressure, passes in isolation. Reproduces on `main` without these changes.

Related

Closes the Ambiguous branch of the publish retry loop, replacing blind
backoff+retry with a bounded reconciliation poll that resolves one of three
outcomes against the registry.

## Problem

Cargo's publish flow uploads first, then polls the index for visibility.
The poll can time out without affecting the upload. So a non-zero cargo
exit can coexist with a successful upload. Prior behavior: `ErrorClass::
Ambiguous` fell through to the same `backoff_delay + thread::sleep`
retry path as `Retryable`, which risks a duplicate upload attempt (or
wastes retries while the registry hasn't yet propagated).

This was the biggest single open safety gap per ROADMAP's five-pillar
framing (see #99, #102, #109).

## Design

New module `engine/parallel/reconcile.rs` with:

    pub(super) fn reconcile_ambiguous_upload(
        reg, name, version, config,
    ) -> ReconciliationOutcome

Wraps the existing `readiness::is_version_visible_with_backoff` polling
loop and maps its return into the three-outcome enum in shipper-types:

- `Published`     — version appeared on the registry; caller marks Published
                    and advances. No retry of cargo publish.
- `NotPublished`  — polling completed without the version appearing.
                    Caller falls through to the Retryable path (safe:
                    registry confirms no duplicate-upload risk).
- `StillUnknown`  — polling itself errored. Caller MUST NOT retry; marks
                    PackageState::Ambiguous and halts for operator decision.

Wire-in at `engine/parallel/publish.rs`: after `classify_cargo_failure`
returns Ambiguous, emit `PublishReconciling`, call the reconciliation,
emit `PublishReconciled`, and branch on the outcome. The Retryable arm
remains as the standard backoff+sleep path, now reached from Ambiguous
only when reconciliation has confirmed NotPublished.

## Type additions

shipper-types gains:
- `ReconciliationOutcome { Published | NotPublished | StillUnknown }`
  (each variant carries `attempts` and `elapsed_ms`; StillUnknown also
  carries a `reason` string)
- `EventType::PublishReconciling { method: ReadinessMethod }`
- `EventType::PublishReconciled { outcome: ReconciliationOutcome }`

All additive — no breaking changes to existing consumers.

## Scope

This PR closes the core retry-loop gap. Follow-ups (not in this PR):

- Make `shipper resume` reconcile packages found in `PackageState::
  Ambiguous` before dispatching any further work (tracks under #90/#101).
- Demote `cargo publish` stdout parsing to a hint in docs (trivial docs
  PR; tracks under #99's PR 2).
- Integration-level BDD scenarios exercising the three outcomes end-to-end
  against a `tiny_http` mock.

## Test plan

- [x] `cargo check -p shipper-types` + `-p shipper` clean
- [x] `cargo clippy -p shipper -p shipper-types --all-targets -- -D warnings` clean
- [x] `cargo test -p shipper-types` — 268 pass
- [x] `cargo test -p shipper --lib parallel` — 68 pass (no regressions)
- [x] `cargo test -p shipper --lib reconcile` — 2 new unit tests pass
- [ ] Full-suite flake: `ops::git::cleanliness::tests::ensure_git_clean_new_phrasing`
      fails under parallel-test pressure but passes in isolation. Pre-existing,
      reproduces on main, unrelated to these changes.

Closes first half of #99 (state machine + engine integration).
@coderabbitai

coderabbitai Bot commented Apr 17, 2026

Copy link
Copy Markdown

Warning

Rate limit exceeded

@EffortlessSteven has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 48 minutes and 24 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 48 minutes and 24 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 91a2a618-47a3-49c3-a3c0-0c1f56e84dea

📥 Commits

Reviewing files that changed from the base of the PR and between dc61f1e and 117d4f0.

📒 Files selected for processing (5)
  • crates/shipper-types/src/lib.rs
  • crates/shipper/src/engine/parallel/CLAUDE.md
  • crates/shipper/src/engine/parallel/mod.rs
  • crates/shipper/src/engine/parallel/publish.rs
  • crates/shipper/src/engine/parallel/reconcile.rs
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/99-reconcile-ambiguous-publish

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a reconciliation mechanism to handle ambiguous cargo publish outcomes, preventing blind retries by verifying the registry state. It adds the ReconciliationOutcome enum and a new reconcile module that leverages existing readiness polling. The parallel engine now handles ErrorClass::Ambiguous by either advancing on confirmed publication, retrying on confirmed absence, or halting on inconclusive results. Review feedback highlights a redundant success webhook notification, the need to preserve readiness evidence for auditability, and a missing failure notification when reconciliation remains inconclusive.

Comment on lines +357 to +367
maybe_send_event(
&opts.webhook,
WebhookEvent::PublishSucceeded {
plan_id: ws.plan.plan_id.clone(),
package_name: p.name.clone(),
package_version: p.version.clone(),
duration_ms: start_instant.elapsed().as_millis() as u64,
},
);

last_err = None;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This webhook notification is redundant. When reconciliation succeeds, last_err is set to None and the loop is broken (line 368), which leads to the generic success path at the end of the function (line 665) where the PublishSucceeded event is sent again. This results in duplicate notifications for the same package.

Suggested change
maybe_send_event(
&opts.webhook,
WebhookEvent::PublishSucceeded {
plan_id: ws.plan.plan_id.clone(),
package_name: p.name.clone(),
package_version: p.version.clone(),
duration_ms: start_instant.elapsed().as_millis() as u64,
},
);
last_err = None;
last_err = None;

Comment on lines +312 to +317
let outcome = reconcile_ambiguous_upload(
reg,
&p.name,
&p.version,
&readiness_config,
);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The readiness evidence collected during reconciliation is currently discarded. reconcile_ambiguous_upload should return the Vec<ReadinessEvidence> from its internal polling loop so it can be appended to readiness_evidence in this function. Without this, the final audit receipt will be missing the verification steps performed during reconciliation, which is a regression in auditability.

Comment on lines +389 to +396
return PackagePublishResult {
result: Err(anyhow::anyhow!(
"{}@{}: reconciliation inconclusive: {}",
p.name,
p.version,
reason
)),
};

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

When reconciliation is inconclusive (StillUnknown), the function returns early without sending a failure webhook. This is inconsistent with other failure paths (e.g., line 419) and means operators won't receive a real-time notification when the publish process halts in an ambiguous state.

Suggested change
return PackagePublishResult {
result: Err(anyhow::anyhow!(
"{}@{}: reconciliation inconclusive: {}",
p.name,
p.version,
reason
)),
};
maybe_send_event(
&opts.webhook,
WebhookEvent::PublishFailed {
plan_id: ws.plan.plan_id.clone(),
package_name: p.name.clone(),
package_version: p.version.clone(),
error_class: "Ambiguous".to_string(),
message: format!("reconciliation inconclusive: {}", reason),
},
);
return PackagePublishResult {
result: Err(anyhow::anyhow!(
"{}@{}: reconciliation inconclusive: {}",
p.name,
p.version,
reason
)),
};

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: dc7085b575

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

attempts: evidence.len() as u32,
elapsed_ms: start.elapsed().as_millis() as u64,
},
Ok((false, evidence)) => ReconciliationOutcome::NotPublished {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Classify failed reconciliation probes as StillUnknown

This Ok((false, evidence)) branch is treating all non-visible results as conclusively NotPublished, but is_version_visible_with_backoff currently converts API/index probe errors into false (unwrap_or(false) in the readiness helper). In ambiguous publish cases, transient registry/query failures can therefore be misclassified as safe-to-retry, and the caller will re-run cargo publish even though the prior upload may already have succeeded. These probe-failure cases should map to StillUnknown (or preserve error signal) so ambiguous uploads are halted instead of blindly retried.

Useful? React with 👍 / 👎.

…bhook, notify on StillUnknown

Four fixes from external PR #111 review:

1. **cargo fmt**: reconcile_ambiguous_upload call signature reformatted per rustfmt.

2. **Duplicate success webhook**: Published branch was emitting
   WebhookEvent::PublishSucceeded inline AND falling through to the
   end-of-function success emission at the end of publish_package.
   Removed the inline emission — end-of-function handles it. (Pre-existing
   same pattern in readiness-success path is out of scope for this fix.)

3. **Reconciliation evidence preserved**: reconcile_ambiguous_upload now
   returns (ReconciliationOutcome, Vec<ReadinessEvidence>). Both the
   Published and NotPublished branches assign the evidence into
   readiness_evidence so it reaches the package receipt. StillUnknown
   returns an empty evidence vec (we know nothing).

4. **Failure webhook on StillUnknown**: previously the StillUnknown halt
   path marked state Ambiguous and returned Err silently with no webhook.
   Now emits WebhookEvent::PublishFailed with error_class=Ambiguous and
   message including the reconciliation reason, so operators are notified
   the run halted for human decision.
@EffortlessSteven EffortlessSteven merged commit 35fb194 into main Apr 17, 2026
16 checks passed
@EffortlessSteven EffortlessSteven deleted the feat/99-reconcile-ambiguous-publish branch April 17, 2026 02:16
@codecov

codecov Bot commented Apr 17, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 50.98039% with 75 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
crates/shipper/src/engine/parallel/publish.rs 41.41% 58 Missing ⚠️
crates/shipper/src/engine/parallel/reconcile.rs 68.51% 17 Missing ⚠️

📢 Thoughts on this report? Let us know!

EffortlessSteven added a commit that referenced this pull request Apr 17, 2026
… follow-on)

Closes the remaining safety gap from #111: when a prior run left a package
in PackageState::Ambiguous (reconciliation was inconclusive at that time),
a subsequent `shipper resume` previously fell through to blind cargo retry.
That risked re-uploading a crate whose original upload may have actually
succeeded.

## What this adds

Both the sequential (engine::mod) and parallel (engine::parallel::publish)
publish paths now detect `PackageState::Ambiguous` BEFORE any cargo
activity and reconcile against registry truth first.

Outcomes follow the same state machine as in-flight reconciliation:

- **Published** → mark Published, skip any further cargo work for this
  crate, continue/return success receipt with no republish.
- **NotPublished** → clear Ambiguous back to Pending (registry confirms
  no prior upload), fall through to the normal publish flow.
- **StillUnknown** → emit PublishFailed webhook + halt with an
  operator-actionable error; do NOT retry.

## Parallel + sequential symmetry

- Sequential: new `sequential_reconcile` helper in engine::mod wraps the
  full `RegistryClient::is_version_visible_with_backoff` and returns the
  same `(ReconciliationOutcome, Vec<ReadinessEvidence>)` shape as the
  parallel path's helper. Called from the per-package state-check match
  arm.
- Parallel: reuses existing `reconcile_ambiguous_upload` from
  engine::parallel::reconcile. Called just before the retry-loop entry,
  returning a completed PackageResult early on Published or StillUnknown.

Both paths emit `PublishReconciling` and `PublishReconciled` events so
the event stream records the resume-time reconciliation alongside any
in-flight reconciliation from earlier runs.

## Test plan

- [x] `cargo clippy --all-targets -- -D warnings` clean
- [x] `cargo fmt --check` clean
- [x] 192 engine tests pass (no regressions)

BDD scenarios for resume-path reconciliation across all three outcomes
are a natural follow-up; the handler shape mirrors the in-flight path so
regressions here surface through existing integration tests that already
cover the outcome shape.

## Related

- Parent: #99 Reconcile (the biggest non-cosmetic safety gap from the
  v0.3.0-rc.1 retrospective)
- Sibling PRs already merged: #111 (in-flight reconcile), #112 (events-
  as-truth), #113 (retry narration), #114 (registry-aware backoff)
- Master roadmap: #109
EffortlessSteven added a commit that referenced this pull request Apr 17, 2026
… follow-on) (#115)

Closes the remaining safety gap from #111: when a prior run left a package
in PackageState::Ambiguous (reconciliation was inconclusive at that time),
a subsequent `shipper resume` previously fell through to blind cargo retry.
That risked re-uploading a crate whose original upload may have actually
succeeded.

## What this adds

Both the sequential (engine::mod) and parallel (engine::parallel::publish)
publish paths now detect `PackageState::Ambiguous` BEFORE any cargo
activity and reconcile against registry truth first.

Outcomes follow the same state machine as in-flight reconciliation:

- **Published** → mark Published, skip any further cargo work for this
  crate, continue/return success receipt with no republish.
- **NotPublished** → clear Ambiguous back to Pending (registry confirms
  no prior upload), fall through to the normal publish flow.
- **StillUnknown** → emit PublishFailed webhook + halt with an
  operator-actionable error; do NOT retry.

## Parallel + sequential symmetry

- Sequential: new `sequential_reconcile` helper in engine::mod wraps the
  full `RegistryClient::is_version_visible_with_backoff` and returns the
  same `(ReconciliationOutcome, Vec<ReadinessEvidence>)` shape as the
  parallel path's helper. Called from the per-package state-check match
  arm.
- Parallel: reuses existing `reconcile_ambiguous_upload` from
  engine::parallel::reconcile. Called just before the retry-loop entry,
  returning a completed PackageResult early on Published or StillUnknown.

Both paths emit `PublishReconciling` and `PublishReconciled` events so
the event stream records the resume-time reconciliation alongside any
in-flight reconciliation from earlier runs.

## Test plan

- [x] `cargo clippy --all-targets -- -D warnings` clean
- [x] `cargo fmt --check` clean
- [x] 192 engine tests pass (no regressions)

BDD scenarios for resume-path reconciliation across all three outcomes
are a natural follow-up; the handler shape mirrors the in-flight path so
regressions here surface through existing integration tests that already
cover the outcome shape.

## Related

- Parent: #99 Reconcile (the biggest non-cosmetic safety gap from the
  v0.3.0-rc.1 retrospective)
- Sibling PRs already merged: #111 (in-flight reconcile), #112 (events-
  as-truth), #113 (retry narration), #114 (registry-aware backoff)
- Master roadmap: #109
EffortlessSteven added a commit that referenced this pull request Apr 17, 2026
…low-on)

Closes one of the remaining #99 follow-ons identified in the
retrospectives: make it explicit, in the rustdoc that downstream tooling
and contributors will actually read, that cargo's stdout/stderr output is
a classification HINT for Shipper's retry loop — not the authoritative
answer about what happened.

## What this changes

- `ErrorClass` rustdoc (shipper-types): adds a "Classification is a hint,
  not truth" section explaining that pattern-matching cargo text drives
  retry scheduling but ambiguous outcomes MUST be reconciled against
  registry truth. Cross-refs ReconciliationOutcome and the reconcile
  module.
- `classify_cargo_failure` rustdoc (shipper::runtime::execution): states
  up-front "This is a hint, not authoritative truth" and points at the
  reconcile flow for the authoritative resolution on Ambiguous outcomes.
- `docs/explanation/why-shipper.md` gains a new section "Cargo stdout is
  a hint; the registry is the truth" making the same point at the product
  level — and connecting it to why Shipper can be safer than a naive
  shell-script `cargo publish` loop.

## Why this matters

Contributors reading `ErrorClass` and wondering "can I just pattern-match
more cargo strings?" now see the contract: cargo text is the fast-path
hint, registry queries are truth. The reconcile flow introduced in #111
(in-flight) and #115 (resume-path) is now documented as the authoritative
side of that contract, not just implemented.

## Scope

Pure docs/rustdoc change. No code changes. No schema changes. No
snapshot churn. Covered by existing tests.

- `cargo doc --no-deps -Dwarnings` clean
- `cargo test -p shipper-types` + `-p shipper --lib runtime::execution`
  pass (no regressions)

## Related

- Parent: #99 Reconcile
- Prior PRs: #111 (in-flight reconcile), #115 (resume-path reconcile)
- Pillar: Reconcile (#102)
EffortlessSteven added a commit that referenced this pull request Apr 17, 2026
…low-on) (#117)

Closes one of the remaining #99 follow-ons identified in the
retrospectives: make it explicit, in the rustdoc that downstream tooling
and contributors will actually read, that cargo's stdout/stderr output is
a classification HINT for Shipper's retry loop — not the authoritative
answer about what happened.

## What this changes

- `ErrorClass` rustdoc (shipper-types): adds a "Classification is a hint,
  not truth" section explaining that pattern-matching cargo text drives
  retry scheduling but ambiguous outcomes MUST be reconciled against
  registry truth. Cross-refs ReconciliationOutcome and the reconcile
  module.
- `classify_cargo_failure` rustdoc (shipper::runtime::execution): states
  up-front "This is a hint, not authoritative truth" and points at the
  reconcile flow for the authoritative resolution on Ambiguous outcomes.
- `docs/explanation/why-shipper.md` gains a new section "Cargo stdout is
  a hint; the registry is the truth" making the same point at the product
  level — and connecting it to why Shipper can be safer than a naive
  shell-script `cargo publish` loop.

## Why this matters

Contributors reading `ErrorClass` and wondering "can I just pattern-match
more cargo strings?" now see the contract: cargo text is the fast-path
hint, registry queries are truth. The reconcile flow introduced in #111
(in-flight) and #115 (resume-path) is now documented as the authoritative
side of that contract, not just implemented.

## Scope

Pure docs/rustdoc change. No code changes. No schema changes. No
snapshot churn. Covered by existing tests.

- `cargo doc --no-deps -Dwarnings` clean
- `cargo test -p shipper-types` + `-p shipper --lib runtime::execution`
  pass (no regressions)

## Related

- Parent: #99 Reconcile
- Prior PRs: #111 (in-flight reconcile), #115 (resume-path reconcile)
- Pillar: Reconcile (#102)
EffortlessSteven added a commit that referenced this pull request Apr 17, 2026
…omes (#99 follow-on)

Closes the last remaining #99 follow-on: end-to-end integration coverage
for the three reconciliation outcomes the state machine can produce.

## Three scenarios added to engine::parallel::tests

Each exercises publish_package() against a real tiny_http mock registry
with a fake cargo binary configured to exit ambiguously (exit 1, empty
stdout/stderr — which classify_publish_failure maps to
CargoFailureClass::Ambiguous per existing unit tests).

Readiness polling is disabled (readiness.enabled = false) so each
reconcile site reduces to a single registry query — avoids flaky request
counts that a full polling loop would produce.

### 1. reconcile_bdd_ambiguous_resolves_to_published

- cargo fails ambiguously
- entry + post-cargo quick checks return 404
- reconcile's single version_exists poll returns 200
- expected: PackageState::Published, attempts=1 (no second cargo)
- verifies PublishReconciling + PublishReconciled{Published} events
  emitted to the on-disk event log

### 2. reconcile_bdd_ambiguous_resolves_to_not_published_then_retries

- cargo fails ambiguously on every attempt
- registry consistently returns 404
- reconcile resolves to NotPublished, falls through to normal retry
- after max_attempts=2, package ends Failed
- verifies PublishReconciled{NotPublished} event emitted
- verifies cargo was actually retried (attempts=2)

### 3. reconcile_bdd_resume_from_ambiguous_state_skips_republish

- state pre-seeded with PackageState::Ambiguous (simulating a prior
  interrupted run)
- entry check returns 404 (bypasses "already published" short-circuit)
- resume-path reconcile (from PR #115) fires BEFORE retry loop
- single version_exists query returns 200 → Published
- expected: attempts=0 (cargo NOT invoked)
- verifies via SHIPPER_CARGO_ARGS_LOG that cargo binary was never called
- verifies PublishReconciling + PublishReconciled{Published} events emitted

## Verification

- cargo test -p shipper --lib -- reconcile_bdd: 3 pass
- cargo test -p shipper --lib -- engine::parallel: 70 pass (no regressions)
- cargo clippy --all-targets -- -D warnings: clean
- cargo fmt --check: clean

## Why events are read from disk

publish_package flushes event batches to events.jsonl and clears the
in-memory log after each write. Assertions read via
EventLog::read_from_file(&events_path) to see the full persisted stream.

## #99 tracker after this merge

- [x] In-flight reconcile state machine (#111)
- [x] Resume-path reconciliation (#115)
- [x] Docs/contracts: cargo text is a hint, registry is truth (#117)
- [x] BDD integration scenarios for all 3 outcomes (this PR)

#99 is now fully closed.
EffortlessSteven added a commit that referenced this pull request Apr 17, 2026
…omes (#99 follow-on) (#119)

Closes the last remaining #99 follow-on: end-to-end integration coverage
for the three reconciliation outcomes the state machine can produce.

## Three scenarios added to engine::parallel::tests

Each exercises publish_package() against a real tiny_http mock registry
with a fake cargo binary configured to exit ambiguously (exit 1, empty
stdout/stderr — which classify_publish_failure maps to
CargoFailureClass::Ambiguous per existing unit tests).

Readiness polling is disabled (readiness.enabled = false) so each
reconcile site reduces to a single registry query — avoids flaky request
counts that a full polling loop would produce.

### 1. reconcile_bdd_ambiguous_resolves_to_published

- cargo fails ambiguously
- entry + post-cargo quick checks return 404
- reconcile's single version_exists poll returns 200
- expected: PackageState::Published, attempts=1 (no second cargo)
- verifies PublishReconciling + PublishReconciled{Published} events
  emitted to the on-disk event log

### 2. reconcile_bdd_ambiguous_resolves_to_not_published_then_retries

- cargo fails ambiguously on every attempt
- registry consistently returns 404
- reconcile resolves to NotPublished, falls through to normal retry
- after max_attempts=2, package ends Failed
- verifies PublishReconciled{NotPublished} event emitted
- verifies cargo was actually retried (attempts=2)

### 3. reconcile_bdd_resume_from_ambiguous_state_skips_republish

- state pre-seeded with PackageState::Ambiguous (simulating a prior
  interrupted run)
- entry check returns 404 (bypasses "already published" short-circuit)
- resume-path reconcile (from PR #115) fires BEFORE retry loop
- single version_exists query returns 200 → Published
- expected: attempts=0 (cargo NOT invoked)
- verifies via SHIPPER_CARGO_ARGS_LOG that cargo binary was never called
- verifies PublishReconciling + PublishReconciled{Published} events emitted

## Verification

- cargo test -p shipper --lib -- reconcile_bdd: 3 pass
- cargo test -p shipper --lib -- engine::parallel: 70 pass (no regressions)
- cargo clippy --all-targets -- -D warnings: clean
- cargo fmt --check: clean

## Why events are read from disk

publish_package flushes event batches to events.jsonl and clears the
in-memory log after each write. Assertions read via
EventLog::read_from_file(&events_path) to see the full persisted stream.

## #99 tracker after this merge

- [x] In-flight reconcile state machine (#111)
- [x] Resume-path reconciliation (#115)
- [x] Docs/contracts: cargo text is a hint, registry is truth (#117)
- [x] BDD integration scenarios for all 3 outcomes (this PR)

#99 is now fully closed.
EffortlessSteven added a commit that referenced this pull request Apr 18, 2026
… follow-on)

Completes the third acceptance-criteria BDD scenario for #99. The existing
suite already covered Published (registry confirms upload landed) and
NotPublished (registry confirms upload didn't land); this adds the
operator-actionable StillUnknown case where reconciliation itself can't
reach a verdict because the registry is unhealthy.

Setup drives every registry query on /api/v1/crates/demo/0.1.0 to 500,
so version_exists bails with Err for non-200/404 statuses.
reconcile_ambiguous_upload translates that Err into
ReconciliationOutcome::StillUnknown.

Asserts:
  - publish returns Err with "reconciliation inconclusive" message
  - package state is PackageState::Ambiguous (so resume's reconcile
    block will fire rather than a silent retry)
  - exactly one cargo invocation — StillUnknown must not blind-retry
  - PublishReconciling + PublishReconciled{ StillUnknown } events
    are persisted to events.jsonl for operator visibility

Closes the last open AC on #99 acceptance list; reconciliation behavior
itself was already implemented in PR #111 and PR #115.
EffortlessSteven added a commit that referenced this pull request Apr 18, 2026
… follow-on) (#144)

Completes the third acceptance-criteria BDD scenario for #99. The existing
suite already covered Published (registry confirms upload landed) and
NotPublished (registry confirms upload didn't land); this adds the
operator-actionable StillUnknown case where reconciliation itself can't
reach a verdict because the registry is unhealthy.

Setup drives every registry query on /api/v1/crates/demo/0.1.0 to 500,
so version_exists bails with Err for non-200/404 statuses.
reconcile_ambiguous_upload translates that Err into
ReconciliationOutcome::StillUnknown.

Asserts:
  - publish returns Err with "reconciliation inconclusive" message
  - package state is PackageState::Ambiguous (so resume's reconcile
    block will fire rather than a silent retry)
  - exactly one cargo invocation — StillUnknown must not blind-retry
  - PublishReconciling + PublishReconciled{ StillUnknown } events
    are persisted to events.jsonl for operator visibility

Closes the last open AC on #99 acceptance list; reconciliation behavior
itself was already implemented in PR #111 and PR #115.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant