Skip to content

feat(consistency): enforce events-as-truth invariant at end-of-run (#93)#112

Merged
EffortlessSteven merged 1 commit into
mainfrom
feat/93-events-state-consistency
Apr 17, 2026
Merged

feat(consistency): enforce events-as-truth invariant at end-of-run (#93)#112
EffortlessSteven merged 1 commit into
mainfrom
feat/93-events-state-consistency

Conversation

@EffortlessSteven

Copy link
Copy Markdown
Member

Closes #93 PR 1 — turns the prose invariant in `docs/INVARIANTS.md` into code.

Problem

The docs say:

  • `events.jsonl` = truth (append-only)
  • `state.json` = projection
  • `receipt.json` = summary

But nothing in code enforced that the three agreed on which packages were published. A stale/corrupted `state.json` could silently mislead `shipper resume` — the kind of drift that causes duplicate uploads or refuses valid work.

What this adds

Types (`shipper-types`)

  • `StateEventDrift { in_events_only, in_state_only }` with `is_consistent()` helper
  • `EventType::StateEventDriftDetected { drift }` — new event variant for auditors/tooling

Implementation (`shipper::state::consistency`)

  • `verify_events_state_consistency(events_path, state) -> Result`: reads `events.jsonl`, builds sorted sets of packages with `PackagePublished` events vs packages with `PackageState::Published`, returns the delta
  • `format_drift_summary(drift) -> String`: human-readable rendering

Wire-in (`engine::run_publish`)

Both the parallel and sequential end-of-run paths run the check immediately before emitting `ExecutionFinished`:

Outcome Action
Consistent silent
Drift detected warn via Reporter + record `StateEventDriftDetected` event
Check errored (I/O) warn via Reporter, continue

The run is not failed on drift — the publishes already happened. The goal is legibility and audit, not punishment.

Tests

7 unit tests in `state::consistency::tests`:

  • consistent when state and events agree
  • detects `in_events_only` (the dangerous case — resume would duplicate)
  • detects `in_state_only` (event log bypassed)
  • empty state + empty events
  • ignores non-Published states (Failed / Skipped / Pending)
  • drift summary formatting (consistent + mixed-side)

No regressions: 125 engine tests pass, 206 targeted state/parallel/events tests pass, full `cargo clippy --all-targets -- -D warnings` clean, `cargo fmt --check` clean.

Out of scope

Files

File Change
`crates/shipper-types/src/lib.rs` new `StateEventDrift` struct + new `EventType::StateEventDriftDetected` variant
`crates/shipper/src/state/consistency.rs` new module (100 lines + 7 unit tests)
`crates/shipper/src/state/mod.rs` `pub mod consistency;`
`crates/shipper/src/engine/mod.rs` wire check into both end-of-run paths

Net: 4 files, +348 / 0.

Related

Turns the prose invariant in docs/INVARIANTS.md into code: events.jsonl
is authoritative, state.json is a projection, and the two must agree on
which packages were published. Any drift is now detected at end of
`shipper publish` (both parallel and sequential paths), emitted as a
structured event, and surfaced to the operator via the Reporter.

## Types (shipper-types)

- `StateEventDrift { in_events_only, in_state_only }` + `is_consistent()`.
  Labels in `name@version` format consistent with the rest of the event
  stream. Both lists empty iff the projection matches the truth.

- `EventType::StateEventDriftDetected { drift: StateEventDrift }` — a new
  event variant that surfaces drift in events.jsonl for auditors and
  tooling that consume the log.

## Implementation (shipper::state::consistency)

- `verify_events_state_consistency(events_path, state) -> Result<StateEventDrift>`:
  reads events.jsonl, builds the `BTreeSet` of packages with
  `PackagePublished` events, compares against the `BTreeSet` of packages
  with `PackageState::Published` in the in-memory `ExecutionState`, and
  returns the delta on each side.

- `format_drift_summary(drift) -> String`: human-readable one-or-three-line
  rendering used by the Reporter.

## Wire-in (engine::run_publish)

Both the parallel and sequential end-of-run paths in `engine/mod.rs` now
run the consistency check immediately before emitting `ExecutionFinished`:

- Drift detected → warn via Reporter + record `StateEventDriftDetected`
  event. The run is NOT failed — the publishes already happened. The goal
  is legibility, not punishment.
- Check itself errored (I/O) → warn via Reporter, continue (don't mask
  real success by failing on a diagnostic).
- Consistent → silent.

## Tests

7 unit tests in `state::consistency::tests` covering:
- consistent when state and events agree
- detects `in_events_only` (resume would duplicate)
- detects `in_state_only` (event log bypassed)
- empty state + empty events
- ignores non-Published states (Failed / Skipped / Pending)
- drift summary formatting (consistent + mixed-side)

125 engine tests continue to pass (no regressions). 206 targeted tests
across events/parallel/consistency all green.

## Scope

This PR closes #93's PR 1 (consistency check). Follow-up #93 PR 2 is
state-rebuild-from-events for corruption recovery, which is a separate
capability and a follow-up branch.
@gemini-code-assist

Copy link
Copy Markdown

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@coderabbitai

coderabbitai Bot commented Apr 17, 2026

Copy link
Copy Markdown

Warning

Rate limit exceeded

@EffortlessSteven has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 39 minutes and 8 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 39 minutes and 8 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 24325aa2-f226-447e-9362-2e99be2668d0

📥 Commits

Reviewing files that changed from the base of the PR and between 35fb194 and ef899ac.

📒 Files selected for processing (4)
  • crates/shipper-types/src/lib.rs
  • crates/shipper/src/engine/mod.rs
  • crates/shipper/src/state/consistency.rs
  • crates/shipper/src/state/mod.rs
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/93-events-state-consistency

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@EffortlessSteven EffortlessSteven merged commit 0eeb91f into main Apr 17, 2026
16 checks passed
@EffortlessSteven EffortlessSteven deleted the feat/93-events-state-consistency branch April 17, 2026 02:26

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ef899acaa2

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

let events_published: BTreeSet<String> = log
.all_events()
.iter()
.filter(|e| matches!(e.event_type, EventType::PackagePublished { .. }))

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Count all successful publish outcomes in drift check

This check only treats EventType::PackagePublished as proof of publication, but there are normal success paths that set PackageState::Published without emitting that event (for example, when a failed cargo exit is reconciled by a positive version_exists check in run_publish/parallel publish). In those runs, in_state_only will be reported even though no corruption occurred, producing noisy state_event_drift_detected warnings and reducing the signal of real drift.

Useful? React with 👍 / 👎.

@codecov

codecov Bot commented Apr 17, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 94.00000% with 12 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
crates/shipper/src/engine/mod.rs 59.09% 9 Missing ⚠️
crates/shipper/src/state/consistency.rs 98.28% 3 Missing ⚠️

📢 Thoughts on this report? Let us know!

EffortlessSteven added a commit that referenced this pull request Apr 17, 2026
… follow-on)

Closes the remaining safety gap from #111: when a prior run left a package
in PackageState::Ambiguous (reconciliation was inconclusive at that time),
a subsequent `shipper resume` previously fell through to blind cargo retry.
That risked re-uploading a crate whose original upload may have actually
succeeded.

## What this adds

Both the sequential (engine::mod) and parallel (engine::parallel::publish)
publish paths now detect `PackageState::Ambiguous` BEFORE any cargo
activity and reconcile against registry truth first.

Outcomes follow the same state machine as in-flight reconciliation:

- **Published** → mark Published, skip any further cargo work for this
  crate, continue/return success receipt with no republish.
- **NotPublished** → clear Ambiguous back to Pending (registry confirms
  no prior upload), fall through to the normal publish flow.
- **StillUnknown** → emit PublishFailed webhook + halt with an
  operator-actionable error; do NOT retry.

## Parallel + sequential symmetry

- Sequential: new `sequential_reconcile` helper in engine::mod wraps the
  full `RegistryClient::is_version_visible_with_backoff` and returns the
  same `(ReconciliationOutcome, Vec<ReadinessEvidence>)` shape as the
  parallel path's helper. Called from the per-package state-check match
  arm.
- Parallel: reuses existing `reconcile_ambiguous_upload` from
  engine::parallel::reconcile. Called just before the retry-loop entry,
  returning a completed PackageResult early on Published or StillUnknown.

Both paths emit `PublishReconciling` and `PublishReconciled` events so
the event stream records the resume-time reconciliation alongside any
in-flight reconciliation from earlier runs.

## Test plan

- [x] `cargo clippy --all-targets -- -D warnings` clean
- [x] `cargo fmt --check` clean
- [x] 192 engine tests pass (no regressions)

BDD scenarios for resume-path reconciliation across all three outcomes
are a natural follow-up; the handler shape mirrors the in-flight path so
regressions here surface through existing integration tests that already
cover the outcome shape.

## Related

- Parent: #99 Reconcile (the biggest non-cosmetic safety gap from the
  v0.3.0-rc.1 retrospective)
- Sibling PRs already merged: #111 (in-flight reconcile), #112 (events-
  as-truth), #113 (retry narration), #114 (registry-aware backoff)
- Master roadmap: #109
EffortlessSteven added a commit that referenced this pull request Apr 17, 2026
… follow-on) (#115)

Closes the remaining safety gap from #111: when a prior run left a package
in PackageState::Ambiguous (reconciliation was inconclusive at that time),
a subsequent `shipper resume` previously fell through to blind cargo retry.
That risked re-uploading a crate whose original upload may have actually
succeeded.

## What this adds

Both the sequential (engine::mod) and parallel (engine::parallel::publish)
publish paths now detect `PackageState::Ambiguous` BEFORE any cargo
activity and reconcile against registry truth first.

Outcomes follow the same state machine as in-flight reconciliation:

- **Published** → mark Published, skip any further cargo work for this
  crate, continue/return success receipt with no republish.
- **NotPublished** → clear Ambiguous back to Pending (registry confirms
  no prior upload), fall through to the normal publish flow.
- **StillUnknown** → emit PublishFailed webhook + halt with an
  operator-actionable error; do NOT retry.

## Parallel + sequential symmetry

- Sequential: new `sequential_reconcile` helper in engine::mod wraps the
  full `RegistryClient::is_version_visible_with_backoff` and returns the
  same `(ReconciliationOutcome, Vec<ReadinessEvidence>)` shape as the
  parallel path's helper. Called from the per-package state-check match
  arm.
- Parallel: reuses existing `reconcile_ambiguous_upload` from
  engine::parallel::reconcile. Called just before the retry-loop entry,
  returning a completed PackageResult early on Published or StillUnknown.

Both paths emit `PublishReconciling` and `PublishReconciled` events so
the event stream records the resume-time reconciliation alongside any
in-flight reconciliation from earlier runs.

## Test plan

- [x] `cargo clippy --all-targets -- -D warnings` clean
- [x] `cargo fmt --check` clean
- [x] 192 engine tests pass (no regressions)

BDD scenarios for resume-path reconciliation across all three outcomes
are a natural follow-up; the handler shape mirrors the in-flight path so
regressions here surface through existing integration tests that already
cover the outcome shape.

## Related

- Parent: #99 Reconcile (the biggest non-cosmetic safety gap from the
  v0.3.0-rc.1 retrospective)
- Sibling PRs already merged: #111 (in-flight reconcile), #112 (events-
  as-truth), #113 (retry narration), #114 (registry-aware backoff)
- Master roadmap: #109
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Document and enforce the events-as-truth / state-as-projection invariant

1 participant