Skip to content
This repository was archived by the owner on May 14, 2026. It is now read-only.

feat(reporter): emit pnpm:progress and pnpm:fetching-progress#372

Merged
zkochan merged 8 commits into
mainfrom
feat/reporter-progress
May 2, 2026
Merged

feat(reporter): emit pnpm:progress and pnpm:fetching-progress#372
zkochan merged 8 commits into
mainfrom
feat/reporter-progress

Conversation

@zkochan

@zkochan zkochan commented May 2, 2026

Copy link
Copy Markdown
Member

Summary

Backfills two channels from #347 in one PR — both are "progress"
shaped so they share the tarball-side reporter threading.

pnpm:progress — per-package status transitions

Mirrors pnpm's emit at
installing/package-requester/src/packageRequester.ts:435.
Four status values fire per package:

  • resolved — emitted before the tarball download in both
    lockfile paths. In the no-lockfile path,
    install_package_from_registry::run emits once the registry has
    picked a version. In the frozen-lockfile path,
    create_virtual_store::run's warm batch and per-snapshot
    install_package_by_snapshot::run's entry emit on each snapshot
    (the lockfile is the resolution).
  • fetched — emitted from fetch_and_extract_with_retry on
    the successful Ok branch. Failed attempts that retry don't fire
    it; cache-hit early returns don't fire it.
  • found_in_store — emitted from run_without_mem_cache's
    cache-hit early returns (prefetched-cas + load_cached_cas_paths),
    from create_virtual_store::run's warm batch where prefetch
    already settled the bytes, and from run_with_mem_cache's
    in-process dedup hits (so the second requester of a shared URL
    also ticks the per-package counter).
  • imported — emitted after create_cas_files returns Ok in
    both create_virtual_dir_by_snapshot::run and
    install_package_from_registry::install_package_version. method
    is the optimistic wire-mapped configured method
    (Auto/CloneOrCopyclone); pacquet doesn't surface the
    per-package resolved method past link_file's install-scoped
    atomic. Tracked under chore(reporter): backfill missing log events in already-ported code #347 for refinement once that becomes
    per-package observable.

pnpm:fetching-progress — per-tarball download progress

Mirrors pnpm's emit at
installing/package-requester/src/packageRequester.ts:560
and
fetching/tarball-fetcher/src/remoteTarballFetcher.ts:143.
Two status values:

  • started — fires exactly once per HTTP attempt, including
    attempts that fail before the response head arrives (DNS /
    connect / timeout) so retried attempts stay visible in the
    reporter. size is the response's Content-Length when a
    response head arrives, and JSON null (i.e. None) when there
    is no response, or when the response is chunked / unknown-length.
    pnpm's reporter checks size != null before rendering a percent
    gauge.
  • in_progress — gated and throttled to match pnpm exactly:
    • Skip entirely when Content-Length is unknown (chunked /
      streaming) or the tarball is smaller than BIG_TARBALL_SIZE = 5 MB.
      Most npm packages are well under this; per-byte progress for
      tiny tarballs floods the JS reporter with values that would
      render as 100% before any UI tick can show them.
    • Throttle to 500ms (matches lodash.throttle(opts.onProgress, 500)).
    • Trailing edge: a final emit after the body finishes if the
      last in-flight downloaded value differs from the last
      reported one. Matches lodash.throttle's default
      {leading: true, trailing: true} so the consumer sees the
      actual end-of-download byte count.

run_with_mem_cache deadlock fix

Pre-existing bug exposed by review: when the owning requester's
run_without_mem_cache errored, the function returned without
flipping the cache slot to Available or notifying waiters.
Concurrent siblings parked on notify.notified().await forever
and the install hung — a single 404 could deadlock every later
consumer of the same URL.

Fixed by adding CacheValue::Failed. The owner uses match on
the result; on Err it sets Failed, removes the entry from
mem_cache (so a fresh request can retry without inheriting the
failure), and notifies waiters. Waiters check for Failed on
wake and surface a new TarballError::SiblingFetchFailed { url }
instead of the unreachable! the previous code would have hit.

Threading

pacquet-tarball gains a pacquet-reporter dependency.
DownloadTarballToStore carries a new requester: &'a str field
(the install root, same value as the prefix in
pacquet_reporter::StageLog). <R: Reporter> is threaded through
run_with_mem_cache, run_without_mem_cache,
fetch_and_extract_with_retry, and fetch_and_extract_once. The
generic monomorphises away — zero runtime cost. requester and
<R> cascade up through InstallPackageBySnapshot,
InstallPackageFromRegistry, CreateVirtualDirBySnapshot,
CreateVirtualStore, InstallFrozenLockfile,
InstallWithoutLockfile, and Install::run.

Convention compliance

  • All variants mirror upstream @pnpm/core-loggers payload
    field-for-field; imported deliberately omits packageId
    to match pnpm's ProgressMessage shape.
  • Code comments + commit messages + this PR all reference
    upstream permalinks at pinned SHA 086c5e91e8.
  • Wire-shape tests in pacquet-reporter cover every status
    value, the imported-omits-packageId invariant, and the
    size: null JSON-null path.
  • Recording-fake DI tests in pacquet-tarball cover the
    emit-on-success contract, pre-response-failure visibility,
    mem-cache hit emission, and deadlock recovery on owner
    fetch error (regression test for the
    CacheValue::Failed fix).
  • Recording-fake DI test in pacquet-package-manager
    (no_lockfile_install_emits_progress_sequence) pins the
    no-lockfile path's resolved → fetched → imported
    sequence end-to-end via AutoMockInstance.
  • Helper extractions
    (emit_warm_snapshot_progress, emit_progress_resolved)
    with their own unit tests so the constructor logic is
    covered even though the call sites live in the
    not-yet-test-covered frozen-lockfile install path.
  • Throttle / size gate on in_progress matches pnpm exactly
    (500ms, ≥5 MB known size, leading + trailing edges).
  • Verified each test catches the regression: temporarily
    commented out the relevant emit / removed the cleanup, ran
    the test, observed failure, restored.
  • Tick pnpm:progress and pnpm:fetching-progress boxes in
    chore(reporter): backfill missing log events in already-ported code #347 on merge.

Test plan

  • cargo nextest run -p pacquet-reporter — all wire-shape
    tests pass.
  • cargo nextest run -p pacquet-tarball — all 31 tests pass,
    including the new emit / deadlock-recovery regressions.
  • cargo nextest run -p pacquet-package-manager — all tests
    pass; new emit-helper unit tests + the no-lockfile sequence
    test included.
  • just ready — 271 tests pass, lint clean.

References

Copilot AI review requested due to automatic review settings May 2, 2026 07:51
@coderabbitai

coderabbitai Bot commented May 2, 2026

Copy link
Copy Markdown

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds pnpm-aligned progress channels (pnpm:progress, pnpm:fetching-progress) to the reporter, threads per-install context (requester, package_id) through installer and tarball code paths, and emits progress events at resolved/fetched/found_in_store/imported stages during downloads and virtual-store creation.

Changes

Reporter-Driven Progress & Requester Threading

Layer / File(s) Summary
Reporter Event Types
crates/reporter/src/lib.rs
Adds LogEvent::Progress(ProgressLog) and LogEvent::FetchingProgress(FetchingProgressLog) plus ProgressMessage and FetchingProgressMessage unions with pnpm NDJSON wire-shape and field renames.
Wire Format Tests
crates/reporter/src/tests.rs
Adds serialization tests asserting status tags and flattened field names (packageId, requester, method, to, size).
Data Shape — Requester Threading
crates/package-manager/src/{install.rs,install_frozen_lockfile.rs,install_without_lockfile.rs,install_package_from_registry.rs,install_package_by_snapshot.rs,create_virtual_store.rs,create_virtual_dir_by_snapshot.rs}, crates/tarball/src/lib.rs
Adds requester: &str fields (and package_id where applicable) and threads requester from Install::run through installers, CreateVirtualStore, InstallPackage*, CreateVirtualDirBySnapshot, and DownloadTarballToStore.
Core Progress Emission
crates/package-manager/src/{install_package_from_registry.rs,install_package_by_snapshot.rs,create_virtual_dir_by_snapshot.rs,create_virtual_store.rs}
Emits pnpm:progress resolved before downloads, found_in_store on cache/store reuse, and imported after virtual-dir layout; adds optimistic_wire_method mapping for import method->wire value.
Tarball Download Progress & Cache Tracking
crates/tarball/src/lib.rs
Makes fetch pipeline generic over R: Reporter, emits FetchingProgressMessage::Started per HTTP attempt and throttled in_progress updates, emits ProgressMessage::Fetched on success and FoundInStore on cache reuse; threads requester/package_id through cache and retry paths.
Wiring, Tests & Manifests
crates/tarball/src/tests.rs, crates/tarball/Cargo.toml, tasks/micro-benchmark/{Cargo.toml,src/main.rs}, crates/package-manager/src/*/tests.rs
Updates tests to pass requester and explicit reporter generic parameters, adds event-recording unit tests (resolved/found_in_store/imported/fetching progress), and adds pacquet-reporter workspace dependency where needed.

Sequence Diagram(s)

sequenceDiagram
    participant Installer
    participant Tarball
    participant Store
    participant Reporter
    Installer->>Reporter: pnpm:progress resolved(package_id, requester)
    Installer->>Tarball: download request(package_id, requester)
    Tarball->>Reporter: fetching-progress.started(attempt, package_id)
    Tarball->>Store: check mem-cache / prefetched CAS
    alt cache hit
        Store->>Reporter: pnpm:progress found_in_store(package_id, requester)
        Store-->>Installer: return CAS paths
    else network fetch
        Tarball->>Reporter: fetching-progress.in_progress(downloaded, package_id) (throttled)
        Tarball->>Store: store CAS paths
        Tarball->>Reporter: pnpm:progress fetched(package_id, requester)
    end
    Installer->>Installer: layout virtual dir (create/symlink)
    Installer->>Reporter: pnpm:progress imported(method, requester, to)
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Possibly related issues

Possibly related PRs

Suggested reviewers

  • KSXGitHub

Poem

🐇
I hopped through logs with whiskered cheer,
Tagged each package, far and near,
Resolved, fetched, found in store so bright,
Imported paths by moonlit night,
A carrot trail of progress, bold and clear!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The pull request title accurately and specifically describes the main feature: adding two new reporter channels (pnpm:progress and pnpm:fetching-progress) for emitting progress events, which is the core focus of the changeset across all modified files.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/reporter-progress

Comment @coderabbitai help to get the list of available commands and usage tips.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR extends Pacquet's pnpm-compatible NDJSON reporter with two new channels: per-package lifecycle events (pnpm:progress) and per-download transfer events (pnpm:fetching-progress). It threads reporter context from the install layer down into tarball fetching/import paths so the existing reporting engine can surface richer install progress.

Changes:

  • Added pnpm:progress and pnpm:fetching-progress event types and wire-shape tests in pacquet-reporter.
  • Threaded R: Reporter and a new requester field through tarball/package-manager flows, emitting resolved, fetched, found_in_store, imported, started, and throttled in_progress events.
  • Updated tarball tests and benchmark code to pass the new reporter generic and requester field.

Reviewed changes

Copilot reviewed 14 out of 15 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
tasks/micro-benchmark/src/main.rs Updates benchmark tarball callsites for the new reporter generic and requester field.
tasks/micro-benchmark/Cargo.toml Adds reporter dependency needed by the benchmark.
crates/tarball/src/tests.rs Adapts existing tests and adds reporter-event assertions for tarball download/cache flows.
crates/tarball/src/lib.rs Implements fetch/progress event emission and threads reporter support through tarball download paths.
crates/tarball/Cargo.toml Adds reporter dependency to the tarball crate.
crates/reporter/src/tests.rs Adds serialization tests for the new reporter channels.
crates/reporter/src/lib.rs Defines new log event variants and payload types for pnpm progress channels.
crates/package-manager/src/install_without_lockfile.rs Threads requester through no-lockfile install flow.
crates/package-manager/src/install_package_from_registry.rs Emits resolved/imported events in registry-driven installs and passes reporter context to tarball fetches.
crates/package-manager/src/install_package_by_snapshot.rs Emits resolved events for lockfile snapshots and passes reporter context into fetch/import steps.
crates/package-manager/src/install_frozen_lockfile.rs Threads requester through frozen-lockfile installs.
crates/package-manager/src/install.rs Passes install prefix down as reporter requester context.
crates/package-manager/src/create_virtual_store.rs Emits resolved/found_in_store events for warm prefetched snapshots and passes requester into virtual-dir creation.
crates/package-manager/src/create_virtual_dir_by_snapshot.rs Emits imported events after virtual-store import completes and exposes wire-method mapping.
Cargo.lock Records the new reporter dependencies for affected workspace crates.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread crates/tarball/src/lib.rs Outdated
Comment thread crates/tarball/src/lib.rs Outdated
@codecov

codecov Bot commented May 2, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 83.60656% with 30 lines in your changes missing coverage. Please review.
✅ Project coverage is 81.06%. Comparing base (f79f7a4) to head (1aa2ed3).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
crates/tarball/src/lib.rs 80.61% 19 Missing ⚠️
...package-manager/src/install_package_by_snapshot.rs 60.00% 6 Missing ⚠️
crates/package-manager/src/create_virtual_store.rs 82.14% 5 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #372      +/-   ##
==========================================
+ Coverage   80.17%   81.06%   +0.88%     
==========================================
  Files          64       64              
  Lines        3355     3517     +162     
==========================================
+ Hits         2690     2851     +161     
- Misses        665      666       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@github-actions

github-actions Bot commented May 2, 2026

Copy link
Copy Markdown

Micro-Benchmark Results

Linux

group                          main                                   pr
-----                          ----                                   --
tarball/download_dependency    1.00     15.8±0.56ms   274.9 KB/sec    1.00     15.8±0.24ms   275.4 KB/sec

zkochan added a commit that referenced this pull request May 2, 2026
…ures

Per Copilot review on #372:

**`pnpm:fetching-progress started` was silently dropped on
connection-level failures.** The previous emit fired *after*
`send().await`, so DNS / connect / timeout failures (which surface
from `send()` itself) returned before any reporter event — every
retry of those failures was invisible to the consumer even though
the retry loop kept iterating. Move the emit ahead of `send()` so
each attempt gets exactly one `started`, regardless of how it
terminates. `size` is `None` here (the response head hasn't arrived
yet); the wire shape allows JSON `null` per pnpm's `size: number |
null`.

**`run_with_mem_cache` cache-hit paths skipped `pnpm:progress`
entirely.** The mem cache deduplicates concurrent fetches of the
same tarball URL across different package_ids. Only the requester
that drives the actual download went through `run_without_mem_cache`
and emitted `fetched`; later requesters sharing the bytes via the
DashMap hit `Available` (or parked on `InProgress` then woke to
`Available`) and returned without any per-package progress event.
That collapsed pnpm's per-package "fetched" counter to first-only.
Emit `found_in_store` from both early-return branches with *this*
requester's package_id so the counter advances for every package.

Tests:
- `started_fires_for_connection_level_failures` drives the failure
  path with an unreachable URL (port 1, connect-refused) and
  asserts `started` fires anyway.
- `mem_cache_hit_emits_found_in_store_for_second_requester` runs
  two `run_with_mem_cache` calls for the same URL with different
  `package_id`s and asserts the second call emits `found_in_store`
  and does not re-hit the network.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
crates/tarball/src/tests.rs (1)

1425-1443: ⚡ Quick win

Add explicit diagnostic logging before new non-assert_eq! assertions

These new assert! event-shape checks should log captured values (eprintln!/dbg!) before asserting so failures are diagnosable without reruns.

Suggested pattern
     let captured = EVENTS.lock().unwrap();
+    eprintln!("captured events: {captured:#?}");
     assert!(
         captured.iter().any(|e| matches!(
             e,
@@
     assert!(
         !captured.iter().any(|e| matches!(
             e,
             LogEvent::Progress(log) if matches!(&log.message, ProgressMessage::Fetched { .. })
         )),
         "fetched must NOT fire on a mem-cache hit; got {captured:?}",
     );

Based on learnings: In Rust tests here, add logging around non-assert_eq! assertions so failures are diagnosable, while assert_eq! already includes differing values.

Also applies to: 1704-1722

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/tarball/src/tests.rs` around lines 1425 - 1443, Before the two
non-assert_eq! assertions that examine the captured events, print the captured
value so failures are diagnosable: add a diagnostic eprintln! or dbg! of the
captured collection immediately before the assert that checks for
LogEvent::Progress with ProgressMessage::FoundInStore for package_id
"second@2.0.0" and requester "/proj", and likewise before the assert that
ensures no ProgressMessage::Fetched occurs; reference the local variable
captured and the matching patterns (LogEvent::Progress,
ProgressMessage::FoundInStore, ProgressMessage::Fetched) so the logged output
shows the full captured contents when the assertion fails.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/tarball/src/lib.rs`:
- Around line 1026-1034: When run_without_mem_cache::<R>() can error, don't
return early while CacheValue::InProgress remains; before returning Err you must
evict or move the cache entry out of InProgress and notify/wake waiters so they
can recover instead of blocking or hitting the unreachable! path. Update the
branches that currently return on error (the one that checks
CacheValue::Available then returns Arc::clone(cas_paths) and the similar branch
around the 1045–1049 area) to replace the InProgress entry with an
evicted/failed state (or remove it) and wake waiters (the same mechanism used
when setting CacheValue::Available) so they observe a recoverable error path;
reference CacheValue::InProgress, run_without_mem_cache::<R>(),
emit_progress_found_in_store::<R>(), package_id, and requester when locating the
exact spots to change.

---

Nitpick comments:
In `@crates/tarball/src/tests.rs`:
- Around line 1425-1443: Before the two non-assert_eq! assertions that examine
the captured events, print the captured value so failures are diagnosable: add a
diagnostic eprintln! or dbg! of the captured collection immediately before the
assert that checks for LogEvent::Progress with ProgressMessage::FoundInStore for
package_id "second@2.0.0" and requester "/proj", and likewise before the assert
that ensures no ProgressMessage::Fetched occurs; reference the local variable
captured and the matching patterns (LogEvent::Progress,
ProgressMessage::FoundInStore, ProgressMessage::Fetched) so the logged output
shows the full captured contents when the assertion fails.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 015bbd98-6d6f-4e31-badf-8a64b56ff74b

📥 Commits

Reviewing files that changed from the base of the PR and between fb5b13c and 61a955d.

📒 Files selected for processing (2)
  • crates/tarball/src/lib.rs
  • crates/tarball/src/tests.rs

Comment thread crates/tarball/src/lib.rs Outdated
Copilot AI review requested due to automatic review settings May 2, 2026 08:48

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 14 out of 15 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread crates/tarball/src/lib.rs Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
crates/package-manager/src/create_virtual_store.rs (1)

475-493: ⚡ Quick win

Add explicit debug logging before the non-assert_eq! assertion.

This test uses assert!(matches!(...)) on a complex structure; add a dbg! right before the assertion to follow the repo’s test-logging rule.

Suggested patch
         let captured = EVENTS.lock().unwrap();
+        dbg!(&*captured);
         assert!(
             matches!(
                 captured.as_slice(),
                 [
                     LogEvent::Progress(r),

As per coding guidelines, "Follow test-logging guidance in CODE_STYLE_GUIDE.md: log before non-assert_eq! assertions, dbg! complex structures, skip logging for simple scalar assert_eq!."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/package-manager/src/create_virtual_store.rs` around lines 475 - 493,
Insert a debug print of the complex captured events before the non-assert_eq!
assertion: after acquiring captured via let captured = EVENTS.lock().unwrap();
add a dbg!(captured.as_slice()); then keep the existing assert!(matches!(...))
unchanged; this ensures the complex value in captured (containing
LogEvent::Progress and ProgressMessage variants) is logged for test diagnostics
before the pattern-match assertion involving LogEvent, ProgressMessage,
r.message and f.message.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@crates/package-manager/src/create_virtual_store.rs`:
- Around line 475-493: Insert a debug print of the complex captured events
before the non-assert_eq! assertion: after acquiring captured via let captured =
EVENTS.lock().unwrap(); add a dbg!(captured.as_slice()); then keep the existing
assert!(matches!(...)) unchanged; this ensures the complex value in captured
(containing LogEvent::Progress and ProgressMessage variants) is logged for test
diagnostics before the pattern-match assertion involving LogEvent,
ProgressMessage, r.message and f.message.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 7b02e502-e6e0-49fc-9238-24cae6db504d

📥 Commits

Reviewing files that changed from the base of the PR and between 61a955d and f16f0c4.

📒 Files selected for processing (3)
  • crates/package-manager/src/create_virtual_dir_by_snapshot.rs
  • crates/package-manager/src/create_virtual_store.rs
  • crates/package-manager/src/install_package_by_snapshot.rs
✅ Files skipped from review due to trivial changes (1)
  • crates/package-manager/src/create_virtual_dir_by_snapshot.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • crates/package-manager/src/install_package_by_snapshot.rs

@github-actions

github-actions Bot commented May 2, 2026

Copy link
Copy Markdown

Integrated-Benchmark Report (Linux)

Scenario: Frozen Lockfile

Command Mean [s] Min [s] Max [s] Relative
pacquet@HEAD 2.647 ± 0.066 2.561 2.786 1.00 ± 0.04
pacquet@main 2.639 ± 0.079 2.559 2.822 1.00
pnpm 6.550 ± 0.060 6.460 6.658 2.48 ± 0.08
BENCHMARK_REPORT.json
{
  "results": [
    {
      "command": "pacquet@HEAD",
      "mean": 2.64724065698,
      "stddev": 0.06550896764243246,
      "median": 2.6568139620799998,
      "user": 2.38549044,
      "system": 3.3007551199999994,
      "min": 2.56122431458,
      "max": 2.78558241358,
      "times": [
        2.78558241358,
        2.69077687158,
        2.56122431458,
        2.66611176158,
        2.60356195458,
        2.61834229358,
        2.66673332258,
        2.6569201265799998,
        2.6567077975799998,
        2.56644571358
      ]
    },
    {
      "command": "pacquet@main",
      "mean": 2.639225296979999,
      "stddev": 0.07856379802535639,
      "median": 2.61232723808,
      "user": 2.38384704,
      "system": 3.3116343199999996,
      "min": 2.5586889135799997,
      "max": 2.8216872935799997,
      "times": [
        2.8216872935799997,
        2.5993025615799996,
        2.60370918258,
        2.62700414558,
        2.6121360175799997,
        2.5586889135799997,
        2.7222741445799996,
        2.65915695158,
        2.61251845858,
        2.5757753005799997
      ]
    },
    {
      "command": "pnpm",
      "mean": 6.55007748468,
      "stddev": 0.05955593235876415,
      "median": 6.5587236540800005,
      "user": 9.00082784,
      "system": 4.450883319999999,
      "min": 6.45974326658,
      "max": 6.65773342558,
      "times": [
        6.519161592580001,
        6.57138418058,
        6.65773342558,
        6.56424565058,
        6.60138190458,
        6.55702750158,
        6.46193706858,
        6.54774044958,
        6.560419806580001,
        6.45974326658
      ]
    }
  ]
}

Scenario: Frozen Lockfile (Hot Cache)

Command Mean [ms] Min [ms] Max [ms] Relative
pacquet@HEAD 708.1 ± 29.7 681.9 785.7 1.00
pacquet@main 747.6 ± 90.9 680.1 986.9 1.06 ± 0.14
pnpm 2640.7 ± 103.3 2524.1 2814.1 3.73 ± 0.21
BENCHMARK_REPORT.json
{
  "results": [
    {
      "command": "pacquet@HEAD",
      "mean": 0.70810448812,
      "stddev": 0.029674395219407346,
      "median": 0.69861663952,
      "user": 0.24201302,
      "system": 1.3575238199999997,
      "min": 0.68190658752,
      "max": 0.7856979685200001,
      "times": [
        0.7856979685200001,
        0.71579659652,
        0.6935438795200001,
        0.6932343735200001,
        0.71779625352,
        0.68190658752,
        0.69489763052,
        0.68751152952,
        0.7023356485200001,
        0.7083244135200001
      ]
    },
    {
      "command": "pacquet@main",
      "mean": 0.74759666122,
      "stddev": 0.09090757336192092,
      "median": 0.72468368402,
      "user": 0.2572478199999999,
      "system": 1.35576092,
      "min": 0.68010260152,
      "max": 0.9869039995200001,
      "times": [
        0.76884968152,
        0.74673073152,
        0.7026366365200001,
        0.74910344052,
        0.69421689552,
        0.68627200152,
        0.69257168752,
        0.9869039995200001,
        0.7685789365200001,
        0.68010260152
      ]
    },
    {
      "command": "pnpm",
      "mean": 2.64071946082,
      "stddev": 0.10333161801721397,
      "median": 2.6115890685200003,
      "user": 3.17177152,
      "system": 2.2396764200000003,
      "min": 2.52413809052,
      "max": 2.8141229615200003,
      "times": [
        2.7237752375200004,
        2.52413809052,
        2.6072237355200003,
        2.8141229615200003,
        2.6159544015200002,
        2.6230961225200002,
        2.5659711025200003,
        2.5640551885200003,
        2.80318693352,
        2.56567083452
      ]
    }
  ]
}

zkochan added 5 commits May 2, 2026 21:10
Backfills two channels from #347 in one PR:

**`pnpm:progress`** — per-package status transitions. Mirrors
upstream emit at
https://github.com/pnpm/pnpm/blob/086c5e91e8/installing/package-requester/src/packageRequester.ts#L435.
Four `status` values:

- `resolved`: emitted before the tarball download in both the
  no-lockfile path (`install_package_from_registry::run` once the
  registry has picked a version) and the frozen-lockfile path
  (`create_virtual_store::run`'s warm batch + per-snapshot
  `install_package_by_snapshot::run`'s entry).
- `fetched`: emitted from `fetch_and_extract_with_retry` on the
  successful Ok branch — never on attempts that fail and retry.
- `found_in_store`: emitted from `run_without_mem_cache`'s cache-hit
  early returns (prefetched-cas + load_cached_cas_paths) and from
  `create_virtual_store::run`'s warm batch where prefetch already
  settled the bytes.
- `imported`: emitted after `create_cas_files` returns Ok in both
  the no-lockfile and frozen-lockfile paths. `method` is the
  optimistic wire-mapped configured method (Auto/CloneOrCopy →
  clone) — pacquet doesn't surface the per-package resolved method
  past `link_file`'s install-scoped atomic. Tracked under #347 for
  refinement once the resolved method becomes per-package
  observable.

**`pnpm:fetching-progress`** — per-tarball download progress.
Mirrors upstream emit at
https://github.com/pnpm/pnpm/blob/086c5e91e8/installing/package-requester/src/packageRequester.ts#L560.
Two `status` values:

- `started`: emitted once per HTTP attempt at the top of
  `fetch_and_extract_once`, including attempts that resolve to a
  non-2xx status. `size` is the response's `Content-Length`,
  serializing as JSON `null` for chunked / unknown-length responses
  to match pnpm's `size: number | null` shape.
- `in_progress`: emitted from inside the body-streaming loop,
  throttled to 200ms per attempt (matching pnpm's reporter
  coalescing window). Carries the running `downloaded` byte count.

**Threading.** `pacquet-tarball` gains a `pacquet-reporter`
dependency. `DownloadTarballToStore` carries a new `requester:
&'a str` field (the install root); `<R: Reporter>` is threaded
through `run_with_mem_cache`, `run_without_mem_cache`,
`fetch_and_extract_with_retry`, and `fetch_and_extract_once`. The
generic monomorphises away — zero runtime cost. `requester` and
`<R>` cascade up through `InstallPackageBySnapshot`,
`InstallPackageFromRegistry`, `CreateVirtualDirBySnapshot`,
`CreateVirtualStore`, `InstallFrozenLockfile`,
`InstallWithoutLockfile`, and `Install::run`.

**Tests.**

- New wire-shape tests in `pacquet-reporter`:
  `progress_event_matches_pnpm_wire_shape` (covers all four
  status values + the deliberate absence of `packageId` on
  `imported`),
  `fetching_progress_event_matches_pnpm_wire_shape` (covers
  `started` with both `size: Some` and `size: None` → JSON null,
  and `in_progress`).
- New integration tests in `pacquet-tarball`:
  `fetching_progress_and_fetched_events_fire_during_download`
  (asserts `started` fires once per attempt — including 5xx
  retries — and `fetched` fires exactly once on success);
  `found_in_store_event_fires_on_cache_hit` (populates the v11
  store via a first download, then asserts a second call emits
  `found_in_store` and does not emit `fetched` and does not hit
  the network again).

Closes the `pnpm:progress` and `pnpm:fetching-progress` items in
the #347 audit.
…ures

Per Copilot review on #372:

**`pnpm:fetching-progress started` was silently dropped on
connection-level failures.** The previous emit fired *after*
`send().await`, so DNS / connect / timeout failures (which surface
from `send()` itself) returned before any reporter event — every
retry of those failures was invisible to the consumer even though
the retry loop kept iterating. Move the emit ahead of `send()` so
each attempt gets exactly one `started`, regardless of how it
terminates. `size` is `None` here (the response head hasn't arrived
yet); the wire shape allows JSON `null` per pnpm's `size: number |
null`.

**`run_with_mem_cache` cache-hit paths skipped `pnpm:progress`
entirely.** The mem cache deduplicates concurrent fetches of the
same tarball URL across different package_ids. Only the requester
that drives the actual download went through `run_without_mem_cache`
and emitted `fetched`; later requesters sharing the bytes via the
DashMap hit `Available` (or parked on `InProgress` then woke to
`Available`) and returned without any per-package progress event.
That collapsed pnpm's per-package "fetched" counter to first-only.
Emit `found_in_store` from both early-return branches with *this*
requester's package_id so the counter advances for every package.

Tests:
- `started_fires_for_connection_level_failures` drives the failure
  path with an unreachable URL (port 1, connect-refused) and
  asserts `started` fires anyway.
- `mem_cache_hit_emits_found_in_store_for_second_requester` runs
  two `run_with_mem_cache` calls for the same URL with different
  `package_id`s and asserts the second call emits `found_in_store`
  and does not re-hit the network.
The codecov/patch threshold flagged 42 missing diff lines across
three files (`install_package_by_snapshot.rs`,
`create_virtual_dir_by_snapshot.rs`,
`create_virtual_store.rs`) — all in the frozen-lockfile
non-empty-snapshot install path that the existing test suite
doesn't exercise (the one CLI test that does is `#[ignore]`'d as
flaky on CI).

Three additions to lift coverage of the new emit code without
needing a full lockfile + populated CAFS in tests:

- `optimistic_wire_method_collapses_auto_and_clone_or_copy_to_clone`:
  asserts the configured-method → wire-method mapping covers every
  `pacquet_npmrc::PackageImportMethod` variant, so a future
  variant addition either extends the match or fails this test.
- `run_emits_imported_event_after_create_cas_files`: drives
  `CreateVirtualDirBySnapshot::run` with empty `cas_paths` (no
  files to import) and a default `SnapshotEntry`, asserting the
  `pnpm:progress imported` event fires with the threaded
  `method`/`requester`/`to` fields.
- Helper extractions in `create_virtual_store`
  (`emit_warm_snapshot_progress`) and `install_package_by_snapshot`
  (`emit_progress_resolved`) that move the
  `LogEvent`/`ProgressLog`/`ProgressMessage` constructor logic out
  of the closure / function bodies so it's directly unit-testable
  with a recording reporter, plus the corresponding tests.

The call sites in the not-yet-test-covered functions stay
unchanged in behavior; only the constructor logic moved. A future
PR that adds a non-`#[ignore]` frozen-lockfile install test would
cover the call sites too.
…sert

The new `run_emits_imported_event_after_create_cas_files` test used
`String::ends_with("/react@18.0.0/node_modules/react")` against the
emitted `imported.to`, which Windows fails — the OS-native path
separator there is `\`, so `to` looks like
`C:\\Users\\...\\react@18.0.0\\node_modules\\react`. Switch to
`Path::ends_with`, which is component-based (`react@18.0.0`,
`node_modules`, `react`) and works on every platform.
main landed #369 ("extract all inline test modules into dedicated
files") while this PR was open, switching the workspace convention
to external `tests.rs` files for every test module. The unit tests
this PR added in `create_virtual_dir_by_snapshot.rs`,
`install_package_by_snapshot.rs`, and `create_virtual_store.rs`
were inline; move each one to its dedicated `<parent>/tests.rs`
with a `#[cfg(test)] mod tests;` declaration on the parent, matching
the rest of the package-manager crate.

No behavioral change.
Copilot AI review requested due to automatic review settings May 2, 2026 19:40
@zkochan zkochan force-pushed the feat/reporter-progress branch from 25f0f2e to 5cc718e Compare May 2, 2026 19:40

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 18 out of 19 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread crates/tarball/src/lib.rs Outdated
Comment thread crates/package-manager/src/install_package_from_registry.rs

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
crates/tarball/src/tests.rs (1)

1425-1443: ⚡ Quick win

Add diagnostic logging before complex assert! predicates in event-capture tests.

At Line 1425 and Line 1704, these non-assert_eq! checks inspect complex captured reporter event state; please log captured before the assertions to match repo test-debugging conventions.

Proposed patch
@@
     let captured = EVENTS.lock().unwrap();
+    eprintln!("captured events: {captured:#?}");
     assert!(
         captured.iter().any(|e| matches!(
@@
     let captured = EVENTS.lock().unwrap();
+    eprintln!("captured events: {captured:#?}");
     assert!(
         captured.iter().any(|e| matches!(

As per coding guidelines: "Follow test-logging guidance in CODE_STYLE_GUIDE.md: log before non-assert_eq! assertions...".

Also applies to: 1704-1722

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/tarball/src/tests.rs` around lines 1425 - 1443, Add diagnostic logging
of the captured events immediately before the complex assert! predicates in the
tests by emitting a concise debug/info log of the captured variable so failures
show the event dump; locate the assertions that reference captured along with
patterns LogEvent::Progress, ProgressMessage::FoundInStore and
ProgressMessage::Fetched in tests.rs (the two spots around the asserts that
check for FoundInStore for "second@2.0.0" and the negative Fetched check) and
insert a log statement printing captured right before each assert! to follow the
test-logging convention.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@crates/tarball/src/tests.rs`:
- Around line 1425-1443: Add diagnostic logging of the captured events
immediately before the complex assert! predicates in the tests by emitting a
concise debug/info log of the captured variable so failures show the event dump;
locate the assertions that reference captured along with patterns
LogEvent::Progress, ProgressMessage::FoundInStore and ProgressMessage::Fetched
in tests.rs (the two spots around the asserts that check for FoundInStore for
"second@2.0.0" and the negative Fetched check) and insert a log statement
printing captured right before each assert! to follow the test-logging
convention.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: ae4d85b9-be19-44da-b512-2effeadebaf2

📥 Commits

Reviewing files that changed from the base of the PR and between 25f0f2e and 5cc718e.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (18)
  • crates/package-manager/src/create_virtual_dir_by_snapshot.rs
  • crates/package-manager/src/create_virtual_dir_by_snapshot/tests.rs
  • crates/package-manager/src/create_virtual_store.rs
  • crates/package-manager/src/create_virtual_store/tests.rs
  • crates/package-manager/src/install.rs
  • crates/package-manager/src/install_frozen_lockfile.rs
  • crates/package-manager/src/install_package_by_snapshot.rs
  • crates/package-manager/src/install_package_by_snapshot/tests.rs
  • crates/package-manager/src/install_package_from_registry.rs
  • crates/package-manager/src/install_package_from_registry/tests.rs
  • crates/package-manager/src/install_without_lockfile.rs
  • crates/reporter/src/lib.rs
  • crates/reporter/src/tests.rs
  • crates/tarball/Cargo.toml
  • crates/tarball/src/lib.rs
  • crates/tarball/src/tests.rs
  • tasks/micro-benchmark/Cargo.toml
  • tasks/micro-benchmark/src/main.rs
✅ Files skipped from review due to trivial changes (6)
  • crates/tarball/Cargo.toml
  • crates/package-manager/src/install_package_by_snapshot/tests.rs
  • crates/package-manager/src/install_package_from_registry/tests.rs
  • tasks/micro-benchmark/Cargo.toml
  • crates/package-manager/src/install.rs
  • crates/package-manager/src/install_package_by_snapshot.rs
🚧 Files skipped from review as they are similar to previous changes (5)
  • tasks/micro-benchmark/src/main.rs
  • crates/reporter/src/tests.rs
  • crates/package-manager/src/create_virtual_dir_by_snapshot.rs
  • crates/reporter/src/lib.rs
  • crates/tarball/src/lib.rs

zkochan added 2 commits May 2, 2026 21:45
…onse coverage

Per Copilot review on #372: the previous fix moved the
`pnpm:fetching-progress started` emit ahead of `send().await` to
ensure connection-level failures still emit, but it always set
`size: None` — even on successful downloads where
`Content-Length` was available. That broke the wire contract for
the percent-gauge use case: consumers couldn't render percent for
fixed-size responses because they never received the size.

Restructure the emit to fire exactly once per attempt, populating
`size` from the response's `Content-Length` when a response head
arrives, and leaving it `None` only when there's no response (DNS
/ connect / timeout failures). Both prior contracts hold:

- Pre-response failures still emit `started` once per retried
  attempt (the original Copilot finding from earlier).
- Successful downloads emit `started` with the populated
  `Content-Length` so consumers can render percent gauges (the
  new Copilot finding).

Tests:
- `fetching_progress_and_fetched_events_fire_during_download`
  extended to assert each `started` event in a 503 + 200 retry
  pattern carries a populated `size` (mockito sends Content-Length
  for both responses).
- `started_fires_for_connection_level_failures` extended to
  assert the connect-refused attempt's `started` carries
  `size: None`.
Per Copilot review on #372: the no-lockfile path's `pnpm:progress`
emits in `install_package_from_registry::run` (the `resolved` and
`imported` events) had no reporter-aware test. The frozen-lockfile
path covers similar emits via
`install::tests::install_emits_pnpm_event_sequence`, and the
tarball-side `fetched` / `found_in_store` are covered in
`pacquet-tarball`'s tests, but a regression in the per-package
ordering or `package_id` / `requester` payload through this path
would currently slip through.

`no_lockfile_install_emits_progress_sequence` runs
`InstallPackageFromRegistry::run` against the workspace's
`AutoMockInstance` (same pattern as
`install::tests::should_install_dependencies`) with a recording
reporter and pins the per-package event sequence:
`resolved → fetched → imported` (the mock store is a fresh
tempdir, so the first install always goes through the network
path), plus the `package_id` and `requester` on the leading
`Resolved` event.
Copilot AI review requested due to automatic review settings May 2, 2026 19:51

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 18 out of 19 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread crates/tarball/src/lib.rs Outdated
Comment thread crates/tarball/src/lib.rs Outdated
…etch errors

Two correctness fixes per Copilot review on #372.

**1. `pnpm:fetching-progress in_progress` gating + trailing edge**

Previously: emitted for every download regardless of size, throttled
at 200ms with leading-edge only. The trailing edge was dropped — a
download that finished shortly after the last 500ms tick would
leave consumers stuck at a stale `downloaded` value below the real
total.

Now matches upstream pnpm exactly (`fetching/tarball-fetcher/src/remoteTarballFetcher.ts:143`):

- Skip `in_progress` entirely when `Content-Length` is unknown
  (chunked / streaming responses) or the tarball is smaller than
  `BIG_TARBALL_SIZE = 5 MB`. Most npm packages are well under this;
  per-byte progress for tiny tarballs floods the JS reporter with
  values that would render as 100% before any UI tick can show
  them. `started` still fires for every package; only the
  per-chunk channel is gated.
- Throttle to 500ms (matches `lodash.throttle(opts.onProgress,
  500)`); leading edge fires on the first chunk, subsequent chunks
  are coalesced.
- Trailing edge: a final emit after the body finishes if the last
  in-flight `downloaded` value differs from the last reported
  one. Matches `lodash.throttle`'s default `{leading: true,
  trailing: true}`.

**2. `run_with_mem_cache` deadlock on owner fetch error**

Previously: when the *owning* requester's `run_without_mem_cache`
errored, the function returned without flipping the cache slot to
`Available` or notifying waiters. Concurrent siblings parked on
`notify.notified().await` forever and the install hung — a single
404 could deadlock every later consumer of the same URL.

Now: the owner uses `match` on the result; on `Err` it sets a new
`CacheValue::Failed`, removes the entry from `mem_cache` (so a
fresh request can retry without inheriting the failure), and
notifies waiters. Waiters check for `Failed` and surface a new
`TarballError::SiblingFetchFailed { url }` instead of the
unreachable! panic the previous code would have hit if `Notify`
had been triggered.

The owner's original error stays with the owner (it can't be
cloned past `reqwest::Error` / IO chains).

Tests:
- `run_with_mem_cache_recovers_from_owning_fetch_error` —
  drives two concurrent `run_with_mem_cache` calls against a 404
  endpoint; asserts both terminate (with a 30s wall-clock cap to
  catch the deadlock regression) and both surface errors.
@zkochan zkochan merged commit 78afb18 into main May 2, 2026
15 checks passed
@zkochan zkochan deleted the feat/reporter-progress branch May 2, 2026 20:49
zkochan added a commit that referenced this pull request May 2, 2026
Per Copilot review on #374:

- The placeholder `LogEvent::<Variant>(...)` looked like Rust
  generics syntax. Rephrase as concrete examples
  (`LogEvent::Stage(...)`, `LogEvent::Context(...)`) so a porter
  doesn't read it as `LogEvent::<Stage>(...)`.
- The "what not to do" example pointed at
  `pacquet-tarball::run_without_mem_cache`'s gate, but the
  `pnpm:fetching-progress in_progress` throttle actually lives one
  call down, in `fetch_and_extract_once`. Update the reference so
  the cross-link resolves.

The third Copilot finding (`ProgressMessage` and the
`pnpm:fetching-progress` channel referenced as if they don't exist)
was a rebase artifact — this branch was cut before #372 merged. The
rebase onto current `main` makes those references valid; no further
edit needed.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants