Skip to content

feat(narrate): live retry countdown in CLI (#103 PR 1)#155

Merged
EffortlessSteven merged 3 commits into
mainfrom
feat/103-live-retry-countdown
Apr 21, 2026
Merged

feat(narrate): live retry countdown in CLI (#103 PR 1)#155
EffortlessSteven merged 3 commits into
mainfrom
feat/103-live-retry-countdown

Conversation

@EffortlessSteven

Copy link
Copy Markdown
Member

Summary

Implements PR 1 from the scout's plan on issue #103 — wires the existing RetryBackoffStarted event (landed in PR #113) to a live CLI countdown in the publish progress reporter, lifting the previously dead ProgressReporter::set_status function and replacing the silent thread::sleep(delay) window with per-second narration in TTY mode. Operators watching a CI log no longer stare at a silent terminal during retry-backoff sleeps.

  • New Reporter::retry_wait(...) default method on both engine::Reporter and engine::parallel::Reporter — owns the backoff sleep so richer UIs can render live progress. Default impl preserves the pre-Competency: Narrate — gap audit and roadmap #103 behavior (warn() line + thread::sleep), so every existing reporter implementation keeps working without changes.
  • The two retry-backoff helpers (emit_retry_backoff_event sequential, emit_retry_backoff parallel) record the RetryBackoffStarted event and then delegate the human-facing message + sleep to Reporter::retry_wait. The events.jsonl schema is unchanged.
  • ProgressReporter::set_status is lifted off #[allow(dead_code)] and a new ProgressReporter::retry_countdown(...) drives the rendering: TTY ticks once per second via set_status; non-TTY emits a single line via eprintln!; quiet mode stays silent but still blocks for the full delay.
  • CliReporter grows an Option<ProgressReporter> field installed by the publish/resume subcommands so the engine's retry narration actually drives the bar.

Acceptance criteria (from the scout's plan)

  • When RetryBackoffStarted is emitted, operator sees:
    • TTY: `[1/N] retrying name@ver in 45s... (attempt 2/5, reason: Retryable) - ` (countdown ticks down live).
    • Non-TTY (CI): `[retry] name@ver: (Retryable); next attempt in 45s (attempt 2/5)` (one-shot eprintln!, no timer spam).
  • Event is still emitted to events.jsonl; no changes to event log schema.
  • ProgressReporter::set_status is no longer dead code.
  • Existing reporter implementations (TestReporter, CollectingReporter, SendReporter) keep working via the default trait impl.

Test plan

  • cargo fmt --all -- --check passes.
  • cargo clippy --workspace --all-targets --all-features -- -D warnings passes.
  • cargo test -p shipper-cli — 113/113 lib tests + 22/22 cli_e2e integration tests pass (single-threaded; cli_e2e has a known pre-existing flake on parallel preflight_command_finishability_proven, unrelated to this change).
  • cargo test -p shipper-core — 1890/1890 tests pass.
  • cargo test -p shipper — 39/39 façade tests pass.
  • New progress::tests cases (4): silent-mode blocks for delay, zero delay returns immediately, non-TTY one-shot path doesn't panic, exotic attempt/max boundaries.
  • New lib::tests cases (3): CliReporter::retry_wait without progress blocks for delay, with progress routes through countdown, default trait impl preserves the canonical warn-line format.

Out of scope (deferred to later #103 PRs)

  • PR 2 — ExecutionState per-attempt history enrichment.
  • PR 3 — shipper inspect-events --tail / --follow streaming modes.
  • PR 4 — shipper watch live TUI subcommand.

Closes (partial): #103.

Implements PR 1 from the scout's plan on issue #103. Wires the existing
RetryBackoffStarted event (landed in PR #113) to a live CLI countdown
in the publish progress reporter, lifting the previously dead
`ProgressReporter::set_status` function and replacing the silent
`thread::sleep(delay)` window with per-second narration in TTY mode.

## What changed

- `engine::Reporter` (and the parallel `Reporter` mirror) gain a new
  `retry_wait(name, version, attempt, max_attempts, delay, reason, msg)`
  default method. The default impl preserves the pre-#103 behavior
  (single `warn()` line + `thread::sleep(delay)`), so every existing
  reporter implementation continues to behave exactly as before.
- The two retry-backoff helpers (`engine::emit_retry_backoff_event` and
  `engine::parallel::publish::emit_retry_backoff`) now record the
  `RetryBackoffStarted` event and then delegate the human-facing
  message + the backoff sleep to `Reporter::retry_wait`. The events.jsonl
  schema is unchanged.
- `ProgressReporter::set_status` is lifted off `#[allow(dead_code)]`.
- New `ProgressReporter::retry_countdown(...)` renders:
  - **TTY:** `[1/N] retrying name@ver in 45s... (attempt 2/5, reason: Retryable) - <msg>`
    refreshed once per second via `set_status` until the wait expires.
  - **Non-TTY (CI):** a single `[retry] name@ver: msg (Retryable); next attempt in 45s (attempt 2/5)`
    line via `eprintln!`, then `thread::sleep(delay)` — no per-second
    spam in pipeline logs.
  - **Quiet:** silent, but still blocks for the full delay.
- `CliReporter` grows an optional `progress: Option<ProgressReporter>`
  field. The `publish` and `resume` subcommands install the existing
  per-registry `ProgressReporter` on the reporter before the engine
  call (and `take_progress().finish()` after), so the engine's
  retry-narration path drives the live countdown.
- `CliReporter::retry_wait` overrides the trait default to route
  through the installed progress, with a graceful fallback to the
  warn-line behavior for callers without a progress bar (e.g. tests).

## Acceptance criteria from the scout's plan

- [x] When `RetryBackoffStarted` is emitted, operator sees:
      - TTY: countdown line ticking down once per second
      - Non-TTY: one-shot `[retry] ...` line then sleep
- [x] Event is still emitted to events.jsonl; no event-schema changes
- [x] `ProgressReporter::set_status` is no longer dead code
- [x] Existing reporter implementations (TestReporter, CollectingReporter,
      SendReporter) continue to compile and behave the same via the
      default trait impl

## Tests

- `progress::tests`: 4 new cases — silent-mode blocks for delay, zero
  delay returns immediately, non-TTY one-shot line drives the !is_tty
  branch, exotic attempt/max boundaries don't panic.
- `lib::tests`: 3 new cases — `CliReporter::retry_wait` without progress
  blocks for delay, with progress routes through countdown, default
  trait impl preserves the canonical warn-line format.
- All 113 `shipper-cli` lib tests pass; all 22 `cli_e2e` integration
  tests pass; all 1890 `shipper-core` tests pass; all 39 `shipper`
  facade tests pass.
@coderabbitai

coderabbitai Bot commented Apr 19, 2026

Copy link
Copy Markdown

Warning

Rate limit exceeded

@EffortlessSteven has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 19 minutes and 29 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 19 minutes and 29 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 07202581-00ab-45ca-af8e-6daf3b777a44

📥 Commits

Reviewing files that changed from the base of the PR and between 25e23ad and f49fd1e.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (8)
  • crates/shipper-cli/Cargo.toml
  • crates/shipper-cli/src/lib.rs
  • crates/shipper-cli/src/output/progress/mod.rs
  • crates/shipper-cli/src/output/progress/tests.rs
  • crates/shipper-core/src/engine/mod.rs
  • crates/shipper-core/src/engine/parallel/mod.rs
  • crates/shipper-core/src/engine/parallel/publish.rs
  • crates/shipper-core/src/engine/parallel/tests.rs
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/103-live-retry-countdown

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a live retry-backoff countdown for the CLI by delegating narration and sleep logic to the Reporter trait. Key changes include the addition of a retry_wait method to the Reporter trait and a ticking countdown implementation in ProgressReporter for TTY environments. Feedback indicates a performance issue in parallel mode where the retry sleep occurs while holding a mutex lock, effectively serializing tasks. Additionally, there is a suggestion to refactor CliReporter to reduce code duplication by utilizing the existing warn method.

Comment on lines 88 to 99
{
let mut rep = reporter.lock().unwrap();
rep.warn(&format!(
"{}@{}: {} ({:?}); next attempt in {} (attempt {}/{})",
rep.retry_wait(
pkg_name,
pkg_version,
message,
reason,
humantime::format_duration(delay),
attempt.saturating_add(1),
attempt,
max_attempts,
));
delay,
reason,
message,
);
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

In parallel mode, the reporter is shared across multiple threads via an Arc<Mutex<dyn Reporter>>. By calling rep.retry_wait(...) while holding the lock, the current thread will sleep for the entire backoff duration (delay) while preventing all other threads from reporting progress (info, warn, error). This effectively serializes the execution of parallel tasks whenever a retry occurs, which is a regression from the pre-#103 behavior where the sleep happened outside the lock.

To fix this, the narration should happen under the lock, but the sleep should occur after the lock is released. Since retry_wait is defined to include the sleep, you should manually narrate using warn and then sleep after the scope of the lock ends.

    {
        let mut rep = reporter.lock().unwrap();
        rep.warn(&format!(
            "{}@{}: {} ({:?}); next attempt in {} (attempt {}/{})",
            pkg_name,
            pkg_version,
            message,
            reason,
            humantime::format_duration(delay),
            attempt.saturating_add(1),
            max_attempts,
        ));
    }

    thread::sleep(delay);

Comment thread crates/shipper-cli/src/lib.rs Outdated
Comment on lines +491 to +511
if let Some(progress) = &self.progress {
progress.retry_countdown(
pkg_name,
pkg_version,
attempt,
max_attempts,
delay,
&format!("{reason:?}"),
message,
);
} else if !self.quiet {
eprintln!(
"[warn] {pkg_name}@{pkg_version}: {message} ({reason:?}); next attempt in {} (attempt {}/{})",
humantime::format_duration(delay),
attempt.saturating_add(1),
max_attempts,
);
std::thread::sleep(delay);
} else {
std::thread::sleep(delay);
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The logic for handling retries when no progress reporter is installed can be simplified by leveraging the existing self.warn method. This avoids duplicating the format string and the std::thread::sleep call, and it automatically respects the quiet flag since CliReporter::warn already checks it.

        if let Some(progress) = &self.progress {
            progress.retry_countdown(
                pkg_name,
                pkg_version,
                attempt,
                max_attempts,
                delay,
                &format!("{reason:?}"),
                message,
            );
        } else {
            self.warn(&format!(
                "{pkg_name}@{pkg_version}: {message} ({reason:?}); next attempt in {} (attempt {}/{})",
                humantime::format_duration(delay),
                attempt.saturating_add(1),
                max_attempts,
            ));
            std::thread::sleep(delay);
        }

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4fe862f882

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +89 to +90
let mut rep = reporter.lock().unwrap();
rep.warn(&format!(
"{}@{}: {} ({:?}); next attempt in {} (attempt {}/{})",
rep.retry_wait(

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Release reporter lock before entering retry backoff

In the parallel publish path, emit_retry_backoff acquires reporter.lock() and calls retry_wait while still holding that mutex. The default parallel::Reporter::retry_wait implementation sleeps for delay, and run_publish_parallel_inner uses SendReporter with that default, so a single retry backoff blocks all other workers that try to log (for example at each attempt start) until the sleep ends; this can effectively serialize parallel publishes and cause major runtime inflation/timeouts whenever retries occur.

Useful? React with 👍 / 👎.

@codecov

codecov Bot commented Apr 19, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 89.65517% with 42 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
crates/shipper-core/src/engine/parallel/mod.rs 74.78% 29 Missing ⚠️
crates/shipper-core/src/engine/parallel/publish.rs 82.85% 12 Missing ⚠️
crates/shipper-cli/src/lib.rs 99.34% 1 Missing ⚠️

📢 Thoughts on this report? Let us know!

@EffortlessSteven EffortlessSteven merged commit e3b4843 into main Apr 21, 2026
20 of 21 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant