docs: operator-trust pack — not_proven explainer + stalled-run triage + state-files cheat sheet by EffortlessSteven · Pull Request #118 · EffortlessMetrics/shipper

EffortlessSteven · 2026-04-17T09:24:30Z

Closes the "scary-but-correct UX" papercuts flagged across multiple external retrospectives. Three new docs covering operator-legibility gaps where Shipper's behaviour is correct but the operator's mental model wasn't supported.

New docs (Diátaxis)

| File | Quadrant | Purpose |
|---|---|---|
| `docs/explanation/finishability.md` | Explanation | Why `finishability = not_proven` is honest-not-alarming on first publish |
| `docs/how-to/inspect-a-stalled-run.md` | How-to | Live triage: "is the train alive?" 30-second check, question-to-file map, jq recipes |
| `docs/reference/state-files.md` | Reference | One-page cheat sheet: authority order, per-file fields, field-path caveats, jq one-liners |

Index updates

`docs/README.md` (Diátaxis index): new entries under their quadrants
Root `README.md`: extended quick-links strip

Why these three specifically

Three external retrospectives converged on the same list:

`not_proven` reads alarming but is epistemically honest for first-publish (ownership can't be verified on a crate that doesn't exist yet)
Operators don't always know which file (`events.jsonl` / `state.json` / `receipt.json`) answers which question
There's no live-triage guide for "is it still alive?" vs "did it stall?"

The existing `inspect-state-and-receipts.md` covers post-hoc inspection. The new `inspect-a-stalled-run.md` is distinct: it covers live triage while the run is still in progress.

Scope

Pure docs. No code, no schema, no snapshot churn. 5 files changed, +334 / -4.

Related

#103 Narrate umbrella: operator legibility is the theme; the scout comment proposed this pack
#117 "docs: demote cargo stdout to hint; registry is authoritative (#99 follow-on)": the cargo-stdout-as-hint docs this pack complements, merged earlier this session
#93 "Document and enforce the events-as-truth / state-as-projection invariant": the INVARIANTS the cheat sheet references

Verification

Markdown renders cleanly
All internal doc links resolve
All issue references are real open/closed issues: #91 (Surface retry/backoff state to operators: structured events + CLI visibility), #92 (Slim preflight_workspace_verify event: sidecar full output, strip ANSI), #93 (Document and enforce the events-as-truth / state-as-projection invariant), #97 (Rehearsal registry: strengthen preflight from dry-run to actual publish + install proof), #99 (Ambiguous publish reconciliation: state machine against registry truth, no blind retry)
No `cargo doc` changes needed (no rustdoc edits)

… + state-files cheat sheet

Closes the "scary-but-correct UX" papercuts flagged across multiple external retrospectives. Three new docs covering operator-legibility gaps where Shipper's behavior is correct but the operator's mental model wasn't supported.

## New docs (Diátaxis)

- **`docs/explanation/finishability.md`**: why `finishability = not_proven` is the correct answer on first publish, not danger. Maps each case (first publish, token lacks ownership, network flake) to concrete operator action. Notes the future rehearsal-registry path (#97) that will promote more NotProven cases to Proven.
- **`docs/how-to/inspect-a-stalled-run.md`**: live triage. "Is the train alive or hung?" 30-second check using events.jsonl tail. Maps common questions (current crate, how long waiting, what will resume do, why did it fail) to the authoritative file + jq recipe. Distinguished from the existing `inspect-state-and-receipts.md` which covers post-hoc "what happened."
- **`docs/reference/state-files.md`**: one-page cheat sheet. Authority order (events > state > receipt), per-file purpose table, key field paths (including the `.packages[].state.state` nesting caveat, a common misread), jq one-liners for the most frequent queries, sidecar files reference.

## Navigation updates

- `docs/README.md` (Diátaxis index): add all three new entries under their correct quadrants
- Root `README.md`: extend the quick-links strip to include stalled-run triage, state-files cheat sheet, and the not_proven explainer

## Scope

Pure docs. No code, no schema, no snapshot churn. ~450 lines of new content in 3 files + 2 index updates.

## Why these three specifically

Three external retrospectives converged on the same "scary-but-correct" list:

- `not_proven` reads alarming but is epistemically honest for first-publish
- Operators don't always know which file answers which question
- There's no live-triage guide for "is it still alive?" vs "did it stall?"

This pack closes all three in a single focused PR.

## Related

- #103 Narrate umbrella (operator legibility is the theme; see scout comment)
- #99 follow-ons: complements #117 (cargo-stdout-as-hint docs) merged earlier this session
- #93 events-as-truth (the INVARIANTS the cheat sheet references)

coderabbitai · 2026-04-17T09:24:38Z

Warning

Rate limit exceeded

@EffortlessSteven has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 12 minutes and 42 seconds before requesting another review.


gemini-code-assist

Code Review

This pull request significantly expands the project's documentation by adding a triage guide for stalled or interrupted runs, a reference cheat sheet for the .shipper/ state files, and an explanation of the finishability states used in preflight checks. The main README and documentation index have been updated to incorporate these new resources. Feedback was provided to improve the technical accuracy and formatting of the triage table in the stalled run guide, specifically to ensure that event outcomes mentioned in the documentation match the actual JSON output produced by the tool.

gemini-code-assist · 2026-04-17T09:26:02Z

+| Is the train alive or hung? | `events.jsonl` (latest entries) | `tail -n 20 .shipper/events.jsonl \| jq -c '.'` |
+| What's the current crate? | `events.jsonl` (last `package_started`) | see below |
+| How long has it been waiting? | `events.jsonl` (last `retry_backoff_started`) | see below |
+| Which crates finished? | `events.jsonl` (published events) OR `state.json` | see below |
+| What's next when I resume? | `state.json` (packages with `state.state == "pending"`) | see below |
+| Why did it fail? | `events.jsonl` (last `package_failed` / `publish_reconciled.StillUnknown`) | see below |
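For readers without jq to hand, the first three rows of the table can be approximated in a few lines of Python. This is a minimal sketch, not Shipper's own tooling: the event names `package_started` and `retry_backoff_started` come from the table above, but the field names `type`, `package`, and `timestamp` are assumptions made for illustration and should be checked against a real `.shipper/events.jsonl`.

```python
import json
from datetime import datetime, timezone


def parse_events(lines):
    """Parse JSONL lines into dicts, skipping blank or truncated lines."""
    events = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            events.append(json.loads(line))
        except json.JSONDecodeError:
            # A run killed mid-write can leave a partial final line.
            continue
    return events


def last_event_of(events, event_type, type_key="type"):
    """Return the most recent event of the given type, or None.

    `type_key` is an assumed field name, not confirmed by the source.
    """
    for event in reversed(events):
        if event.get(type_key) == event_type:
            return event
    return None


def seconds_since_last_event(events, now, ts_key="timestamp"):
    """Age of the newest event in seconds, or None if there are no events.

    Assumes an ISO-8601 UTC timestamp under the hypothetical `ts_key` field.
    """
    if not events:
        return None
    stamp = datetime.fromisoformat(events[-1][ts_key].replace("Z", "+00:00"))
    return (now - stamp).total_seconds()


# Hypothetical sample data; real events will carry more fields.
sample = [
    '{"type": "package_started", "package": "core", "timestamp": "2026-04-17T09:00:00Z"}',
    '{"type": "retry_backoff_started", "package": "core", "timestamp": "2026-04-17T09:01:00Z"}',
    '{"type": "package_st',
]
events = parse_events(sample)
now = datetime(2026, 4, 17, 9, 1, 30, tzinfo=timezone.utc)
current = last_event_of(events, "package_started")
age = seconds_since_last_event(events, now)
print(current["package"], age)
```

A small age on the newest event suggests the run is still making progress; a large age after a `retry_backoff_started` event may simply mean a long backoff, so the threshold for "hung" is a judgment call rather than something this sketch decides.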


chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1484b84b1a


chatgpt-codex-connector · 2026-04-17T09:29:14Z

+1. Download the `shipper-state-final` artifact from the cancelled run (or `shipper-state-preflight` / `shipper-state-plan` if later stages never ran).
+2. Trigger the `release-resume` workflow_dispatch with `mode=resume` and `artifact_run_id=<cancelled-run-id>`.


Correct resume instructions for non-final artifacts

This guidance says `release-resume` can continue from `shipper-state-preflight` or `shipper-state-plan`, but `.github/workflows/release.yml` hard-codes the download step to `name: shipper-state-final`. For a run that was cancelled or timed out before the final upload, following these steps fails with artifact-not-found instead of resuming. Please either limit the instructions to `shipper-state-final` or document the workflow change required to select a different artifact name.


chatgpt-codex-connector · 2026-04-17T09:29:14Z

+| What happened, in order? | `events.jsonl` |
+| What's the current state (fast lookup)? | `state.json` |
+| Did the whole release succeed, and what's the audit trail? | `receipt.json` |
+| What would `shipper resume` skip? | `state.json` (packages with `state.state == "published"`) |


Include skipped packages in resume-skip guidance

This row says resume skips only packages where `state.state == "published"`, but the engine also skips packages already in the `Skipped` state. Documenting only `published` underreports what `shipper resume` will bypass and can mislead operators during incident triage. The condition should include both `published` and `skipped`.
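The corrected condition can be sketched as a small query over `state.json`. This is an illustrative sketch only: the `.packages[].state.state` path is taken from the PR description, while the lowercase `"skipped"` spelling, the `name` field, and the sample data are assumptions. Because jq's `.packages[]` iterates both arrays and objects, the sketch accepts either shape.

```python
import json

# States that resume bypasses, per the review comment above.
# The lowercase "skipped" spelling is an assumption mirroring "published".
SKIPPED_BY_RESUME = {"published", "skipped"}


def packages_resume_would_skip(state):
    """Return names of packages whose nested state.state is published or skipped."""
    packages = state.get("packages", {})
    if isinstance(packages, dict):
        # Assumed shape: object keyed by package name.
        items = list(packages.items())
    else:
        # Assumed shape: array of objects with a hypothetical "name" field.
        items = [(pkg.get("name", f"#{i}"), pkg) for i, pkg in enumerate(packages)]
    skipped = []
    for name, pkg in items:
        # Note the double nesting: the status lives at state.state, not state.
        status = pkg.get("state", {}).get("state")
        if status in SKIPPED_BY_RESUME:
            skipped.append(name)
    return skipped


# Hypothetical sample; a real state.json carries more fields.
sample_state = json.loads("""
{
  "plan_id": "example-plan",
  "packages": {
    "core":   {"state": {"state": "published"}},
    "macros": {"state": {"state": "skipped"}},
    "cli":    {"state": {"state": "pending"}}
  }
}
""")
print(packages_resume_would_skip(sample_state))
```

Reading `pkg["state"]` alone would return an object rather than a string, which is the nesting misread the cheat sheet warns about.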


chatgpt-codex-connector · 2026-04-17T09:29:14Z

+
+## "What will resume do?"
+
+`shipper resume` reads `state.json`, validates the `plan_id` matches the current workspace, and continues from the first non-terminal package. Terminal states for resume: `Published`, `Skipped`. Non-terminal: `Pending`, `Failed`, `Ambiguous`.


Classify Uploaded as resumable in triage section

The state classification omits `Uploaded`, but interrupted runs can persist `state.state == "uploaded"`. Resume handles this as a distinct path (it skips `cargo publish` and continues with readiness/verification), so excluding it leaves the triage checklist incomplete and can cause operators to misinterpret what resume will actually do.
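A three-way classification that includes `Uploaded` might look like the sketch below. It encodes only what this thread states: `Published` and `Skipped` are terminal, `Uploaded` skips `cargo publish` and continues with readiness/verification, and `Pending`, `Failed`, and `Ambiguous` are non-terminal. The category labels, the lowercase spellings, and the function names are hypothetical, and the sketch does not model how Shipper treats each non-terminal state internally.

```python
# Hypothetical category labels, chosen for illustration only.
TERMINAL = "terminal"                # resume skips the package entirely
RESUME_VERIFICATION = "resume_verification"  # skip cargo publish, continue verification
NON_TERMINAL = "non_terminal"        # resume continues work on the package

# Lowercase spellings are assumed to match the values stored in state.json.
STATE_CATEGORIES = {
    "published": TERMINAL,
    "skipped": TERMINAL,
    "uploaded": RESUME_VERIFICATION,
    "pending": NON_TERMINAL,
    "failed": NON_TERMINAL,
    "ambiguous": NON_TERMINAL,
}


def classify(status):
    """Map a state.state value to its resume category; reject unknown values."""
    try:
        return STATE_CATEGORIES[status.lower()]
    except KeyError:
        raise ValueError(f"unknown package state: {status!r}") from None


def first_resumable(statuses):
    """Index of the first package resume still has work to do on, or None."""
    for index, status in enumerate(statuses):
        if classify(status) != TERMINAL:
            return index
    return None


print(classify("Uploaded"))
print(first_resumable(["published", "skipped", "uploaded", "pending"]))
```

Raising on unknown values is a deliberate choice in this sketch: during triage, a state the checklist does not recognize is safer surfaced than silently treated as pending.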


codecov · 2026-04-17T09:55:45Z

Codecov Report

✅ All modified and coverable lines are covered by tests.


gemini-code-assist Bot reviewed Apr 17, 2026


chatgpt-codex-connector Bot reviewed Apr 17, 2026


EffortlessSteven merged commit 0589ba4 into main Apr 17, 2026
19 checks passed

EffortlessSteven deleted the docs/scary-but-correct-operator-pack branch April 17, 2026 09:55

This was referenced Apr 19, 2026

Competency: Ergonomics — gap audit and roadmap #108

Open

Competency: Narrate — gap audit and roadmap #103

Open

Roadmap: post-v0.3.0-rc.1 product scorecard across nine competencies #109

Open
