feat(narrate): live retry countdown in CLI (#103 PR 1)#155
Conversation
Implements PR 1 from the scout's plan on issue #103. Wires the existing RetryBackoffStarted event (landed in PR #113) to a live CLI countdown in the publish progress reporter, lifting the previously dead `ProgressReporter::set_status` function and replacing the silent `thread::sleep(delay)` window with per-second narration in TTY mode. ## What changed - `engine::Reporter` (and the parallel `Reporter` mirror) gain a new `retry_wait(name, version, attempt, max_attempts, delay, reason, msg)` default method. The default impl preserves the pre-#103 behavior (single `warn()` line + `thread::sleep(delay)`), so every existing reporter implementation continues to behave exactly as before. - The two retry-backoff helpers (`engine::emit_retry_backoff_event` and `engine::parallel::publish::emit_retry_backoff`) now record the `RetryBackoffStarted` event and then delegate the human-facing message + the backoff sleep to `Reporter::retry_wait`. The events.jsonl schema is unchanged. - `ProgressReporter::set_status` is lifted off `#[allow(dead_code)]`. - New `ProgressReporter::retry_countdown(...)` renders: - **TTY:** `[1/N] retrying name@ver in 45s... (attempt 2/5, reason: Retryable) - <msg>` refreshed once per second via `set_status` until the wait expires. - **Non-TTY (CI):** a single `[retry] name@ver: msg (Retryable); next attempt in 45s (attempt 2/5)` line via `eprintln!`, then `thread::sleep(delay)` — no per-second spam in pipeline logs. - **Quiet:** silent, but still blocks for the full delay. - `CliReporter` grows an optional `progress: Option<ProgressReporter>` field. The `publish` and `resume` subcommands install the existing per-registry `ProgressReporter` on the reporter before the engine call (and `take_progress().finish()` after), so the engine's retry-narration path drives the live countdown. - `CliReporter::retry_wait` overrides the trait default to route through the installed progress, with a graceful fallback to the warn-line behavior for callers without a progress bar (e.g. tests). ## Acceptance criteria from the scout's plan - [x] When `RetryBackoffStarted` is emitted, operator sees: - TTY: countdown line ticking down once per second - Non-TTY: one-shot `[retry] ...` line then sleep - [x] Event is still emitted to events.jsonl; no event-schema changes - [x] `ProgressReporter::set_status` is no longer dead code - [x] Existing reporter implementations (TestReporter, CollectingReporter, SendReporter) continue to compile and behave the same via the default trait impl ## Tests - `progress::tests`: 4 new cases — silent-mode blocks for delay, zero delay returns immediately, non-TTY one-shot line drives the !is_tty branch, exotic attempt/max boundaries don't panic. - `lib::tests`: 3 new cases — `CliReporter::retry_wait` without progress blocks for delay, with progress routes through countdown, default trait impl preserves the canonical warn-line format. - All 113 `shipper-cli` lib tests pass; all 22 `cli_e2e` integration tests pass; all 1890 `shipper-core` tests pass; all 39 `shipper` facade tests pass.
|
Warning Rate limit exceeded
Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 19 minutes and 29 seconds. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: ASSERTIVE Plan: Pro Run ID: ⛔ Files ignored due to path filters (1)
📒 Files selected for processing (8)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request introduces a live retry-backoff countdown for the CLI by delegating narration and sleep logic to the Reporter trait. Key changes include the addition of a retry_wait method to the Reporter trait and a ticking countdown implementation in ProgressReporter for TTY environments. Feedback indicates a performance issue in parallel mode where the retry sleep occurs while holding a mutex lock, effectively serializing tasks. Additionally, there is a suggestion to refactor CliReporter to reduce code duplication by utilizing the existing warn method.
| { | ||
| let mut rep = reporter.lock().unwrap(); | ||
| rep.warn(&format!( | ||
| "{}@{}: {} ({:?}); next attempt in {} (attempt {}/{})", | ||
| rep.retry_wait( | ||
| pkg_name, | ||
| pkg_version, | ||
| message, | ||
| reason, | ||
| humantime::format_duration(delay), | ||
| attempt.saturating_add(1), | ||
| attempt, | ||
| max_attempts, | ||
| )); | ||
| delay, | ||
| reason, | ||
| message, | ||
| ); | ||
| } |
There was a problem hiding this comment.
In parallel mode, the reporter is shared across multiple threads via an Arc<Mutex<dyn Reporter>>. By calling rep.retry_wait(...) while holding the lock, the current thread will sleep for the entire backoff duration (delay) while preventing all other threads from reporting progress (info, warn, error). This effectively serializes the execution of parallel tasks whenever a retry occurs, which is a regression from the pre-#103 behavior where the sleep happened outside the lock.
To fix this, the narration should happen under the lock, but the sleep should occur after the lock is released. Since retry_wait is defined to include the sleep, you should manually narrate using warn and then sleep after the scope of the lock ends.
{
let mut rep = reporter.lock().unwrap();
rep.warn(&format!(
"{}@{}: {} ({:?}); next attempt in {} (attempt {}/{})",
pkg_name,
pkg_version,
message,
reason,
humantime::format_duration(delay),
attempt.saturating_add(1),
max_attempts,
));
}
thread::sleep(delay);| if let Some(progress) = &self.progress { | ||
| progress.retry_countdown( | ||
| pkg_name, | ||
| pkg_version, | ||
| attempt, | ||
| max_attempts, | ||
| delay, | ||
| &format!("{reason:?}"), | ||
| message, | ||
| ); | ||
| } else if !self.quiet { | ||
| eprintln!( | ||
| "[warn] {pkg_name}@{pkg_version}: {message} ({reason:?}); next attempt in {} (attempt {}/{})", | ||
| humantime::format_duration(delay), | ||
| attempt.saturating_add(1), | ||
| max_attempts, | ||
| ); | ||
| std::thread::sleep(delay); | ||
| } else { | ||
| std::thread::sleep(delay); | ||
| } |
There was a problem hiding this comment.
The logic for handling retries when no progress reporter is installed can be simplified by leveraging the existing self.warn method. This avoids duplicating the format string and the std::thread::sleep call, and it automatically respects the quiet flag since CliReporter::warn already checks it.
if let Some(progress) = &self.progress {
progress.retry_countdown(
pkg_name,
pkg_version,
attempt,
max_attempts,
delay,
&format!("{reason:?}"),
message,
);
} else {
self.warn(&format!(
"{pkg_name}@{pkg_version}: {message} ({reason:?}); next attempt in {} (attempt {}/{})",
humantime::format_duration(delay),
attempt.saturating_add(1),
max_attempts,
));
std::thread::sleep(delay);
}There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4fe862f882
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| let mut rep = reporter.lock().unwrap(); | ||
| rep.warn(&format!( | ||
| "{}@{}: {} ({:?}); next attempt in {} (attempt {}/{})", | ||
| rep.retry_wait( |
There was a problem hiding this comment.
Release reporter lock before entering retry backoff
In the parallel publish path, emit_retry_backoff acquires reporter.lock() and calls retry_wait while still holding that mutex. The default parallel::Reporter::retry_wait implementation sleeps for delay, and run_publish_parallel_inner uses SendReporter with that default, so a single retry backoff blocks all other workers that try to log (for example at each attempt start) until the sleep ends; this can effectively serialize parallel publishes and cause major runtime inflation/timeouts whenever retries occur.
Useful? React with 👍 / 👎.
Codecov Report❌ Patch coverage is 📢 Thoughts on this report? Let us know! |
Summary
Implements PR 1 from the scout's plan on issue #103 — wires the existing
RetryBackoffStartedevent (landed in PR #113) to a live CLI countdown in the publish progress reporter, lifting the previously deadProgressReporter::set_statusfunction and replacing the silentthread::sleep(delay)window with per-second narration in TTY mode. Operators watching a CI log no longer stare at a silent terminal during retry-backoff sleeps.Reporter::retry_wait(...)default method on bothengine::Reporterandengine::parallel::Reporter— owns the backoff sleep so richer UIs can render live progress. Default impl preserves the pre-Competency: Narrate — gap audit and roadmap #103 behavior (warn()line +thread::sleep), so every existing reporter implementation keeps working without changes.emit_retry_backoff_eventsequential,emit_retry_backoffparallel) record theRetryBackoffStartedevent and then delegate the human-facing message + sleep toReporter::retry_wait. Theevents.jsonlschema is unchanged.ProgressReporter::set_statusis lifted off#[allow(dead_code)]and a newProgressReporter::retry_countdown(...)drives the rendering: TTY ticks once per second viaset_status; non-TTY emits a single line viaeprintln!; quiet mode stays silent but still blocks for the full delay.CliReportergrows anOption<ProgressReporter>field installed by thepublish/resumesubcommands so the engine's retry narration actually drives the bar.Acceptance criteria (from the scout's plan)
RetryBackoffStartedis emitted, operator sees:eprintln!, no timer spam).events.jsonl; no changes to event log schema.ProgressReporter::set_statusis no longer dead code.TestReporter,CollectingReporter,SendReporter) keep working via the default trait impl.Test plan
cargo fmt --all -- --checkpasses.cargo clippy --workspace --all-targets --all-features -- -D warningspasses.cargo test -p shipper-cli— 113/113 lib tests + 22/22cli_e2eintegration tests pass (single-threaded;cli_e2ehas a known pre-existing flake on parallelpreflight_command_finishability_proven, unrelated to this change).cargo test -p shipper-core— 1890/1890 tests pass.cargo test -p shipper— 39/39 façade tests pass.progress::testscases (4): silent-mode blocks for delay, zero delay returns immediately, non-TTY one-shot path doesn't panic, exotic attempt/max boundaries.lib::testscases (3):CliReporter::retry_waitwithout progress blocks for delay, with progress routes through countdown, default trait impl preserves the canonical warn-line format.Out of scope (deferred to later #103 PRs)
ExecutionStateper-attempt history enrichment.shipper inspect-events --tail/--followstreaming modes.shipper watchlive TUI subcommand.Closes (partial): #103.