Skip to content
This repository was archived by the owner on May 14, 2026. It is now read-only.

feat(package-manager): is_built gate for the side-effects cache READ path (#421 slice A)#423

Merged
zkochan merged 4 commits into
mainfrom
feat/side-effects-cache-is-built-gate-421
May 12, 2026
Merged

feat(package-manager): is_built gate for the side-effects cache READ path (#421 slice A)#423
zkochan merged 4 commits into
mainfrom
feat/side-effects-cache-is-built-gate-421

Conversation

@zkochan

@zkochan zkochan commented May 12, 2026

Copy link
Copy Markdown
Member

Summary

Implements slice (A) of #421 — the READ-path gate that consumes the foundation merged in #422. With a matching side-effects-cache entry on disk (typically seeded by a prior pnpm install), BuildModules now skips lifecycle scripts for that snapshot, mirroring upstream's !node.isBuilt filter.

User-visible behavior: a warm pacquet install after pnpm install no longer rebuilds packages whose post-build state is already in the store.

What lands

  • pacquet_graph_hasher::detect_node_major() — spawns node --version once at install start, parses the major version. The parse helper is factored out so the rule is unit-tested without a child process. Missing-node returns None and the gate falls through to "rebuild" (safe).
  • DepsGraphNode flipped to own its strings — was borrowed &'a str, now String. Lets callers building the graph from a lockfile skip a separate-arena lifetime story. No external consumers besides in-crate tests yet, so this is internal-only.
  • PrefetchResult.side_effects_maps surfaces the per-row cache_key → FilesMap overlay that the prefetch loop previously dropped (since feat: foundation for the side-effects cache (#397 item #10, part 1) #422 added it to VerifyResult). Arc-wrapped for the same cold-batch cheap-clone reason cas_paths is.
  • CreateVirtualStore returns a new CreateVirtualStoreOutput struct: package_manifests (existing) plus side_effects_maps_by_snapshot: HashMap<PackageKey, Arc<…>> (new). Peer-variants of the same package share one Arc<…> because the store-index row is keyed peer-stripped.
  • New pacquet_package_manager::deps_graph::build_deps_graph adapter — walks snapshots + packages and builds a DepsGraph<PackageKey> for the hasher. full_pkg_id derives from <pkg_id>:<integrity> for registry / tarball resolutions, or <pkg_id>:<hashObject(resolution)> for git / directory / integrity-less tarball, matching upstream's createFullPkgId.
  • BuildModules is_built gate — after the policy gate, before extraction, the build loop computes calc_dep_state per snapshot (with include_dep_graph_hash: true mirroring upstream's hasSideEffects), looks the cache key up in the threaded-in side_effects_maps, and continues on hit. The Config.side_effects_cache: false flag short-circuits the lookup entirely.
  • InstallFrozenLockfile detects the engine name once and threads it (plus the prefetch's side-effects map and the config flag) into BuildModules.

Tests

  • using_side_effects_cache_skips_rebuild ports the upstream 'using side effects cache' case from installing/deps-installer/test/install/sideEffects.ts:79. Hand-builds the side_effects_maps_by_snapshot entry with the exact cache key BuildModules will compute, asserts no pnpm:lifecycle event fires for @pnpm.e2e/failing-postinstall's postinstall (i.e. the build was skipped — if the gate were broken the script would have run and the install would have failed with LifecycleScript(_)).
  • side_effects_cache_disabled_bypasses_the_gate (negative pair): same overlay shape with side_effects_cache: false, expects the build to run and the install to fail with the underlying exit 1 from the postinstall.
  • deps_graph::tests: registry-resolution uses integrity verbatim, dependencies + optionalDependencies fold into children, snapshot without metadata is skipped.

The upstream test runs the install twice — first to populate the cache via the WRITE path, then to consume it. Pacquet doesn't have a WRITE path yet (#421 slice B), so the port hand-crafts the same in-memory state directly. The behavior under test is identical.

Out of scope

Test plan

  • just ready (503 tests + clippy clean)
  • taplo format --check (clean — just ready doesn't run this, CI does)
  • CI green on Linux / macOS / Windows

Written by an agent (Claude Code, claude-opus-4-7).

Summary by CodeRabbit

  • New Features

    • Side-effects caching: installs can skip lifecycle/postinstall when a matching side-effects cache and engine key are present.
    • Host Node major-version detection added to improve cache-key determinism.
    • Prefetch and install flows now return per-snapshot side-effects overlays alongside manifests.
  • Bug Fixes & Improvements

    • Dependency-graph construction and state hashing made deterministic, stable, and memoized for reliable cache gating.
  • Tests

    • Expanded tests for cache gating, graph construction, and Node version parsing.

Review Change Stack

Implements slice (A) of #421 — pacquet now honors a
side-effects cache populated by a prior pnpm install. With a
matching cache entry on disk, the build phase skips lifecycle
scripts for that snapshot, mirroring upstream's `!node.isBuilt`
filter at <https://github.com/pnpm/pnpm/blob/b4f8f47ac2/building/during-install/src/index.ts#L73-L77>.

What lands:

- `pacquet_graph_hasher::detect_node_major()` — spawns
  `node --version` once at install start and parses the major
  version. Parse helper is factored out so the rule is unit-tested
  without a child process. `None` on missing-node falls through to
  the no-cache path.
- `DepsGraphNode` flipped from borrowed `&'a str` to owned
  `String` so callers building the graph from a lockfile don't
  have to keep a separate arena alive. No external consumers
  besides the in-crate tests yet.
- `PrefetchResult.side_effects_maps` surfaces the
  `verify_result.side_effects_maps` that the prefetch loop
  previously dropped. Keyed by the same `<integrity>\t<pkg_id>`
  store-index row key; values `Arc`-wrapped for the same
  cold-batch cheap-clone reason `cas_paths` is. Empty entries are
  skipped at insert.
- `CreateVirtualStore::run` returns a new `CreateVirtualStoreOutput`
  struct carrying `package_manifests` (existing) plus
  `side_effects_maps_by_snapshot: HashMap<PackageKey, Arc<…>>`
  (new). Peer-variants of the same package share one `Arc<…>`
  because the store-index row is keyed peer-stripped.
- New `pacquet_package_manager::deps_graph::build_deps_graph`
  adapter walks the lockfile (`snapshots` + `packages`) and builds
  a `DepsGraph<PackageKey>` for the hasher. `full_pkg_id` derives
  from `<pkg_id>:<integrity>` for registry / tarball resolutions,
  or `<pkg_id>:<hashObject(resolution)>` for git / directory /
  integrity-less tarball — matching upstream's `createFullPkgId`.
- `BuildModules` grows `packages` / `side_effects_maps_by_snapshot`
  / `engine_name` / `side_effects_cache` fields. After the
  policy gate, before extraction, the build loop computes
  `calc_dep_state` per snapshot (with `include_dep_graph_hash: true`
  to mirror `hasSideEffects` upstream), looks the cache key up in
  the threaded-in `side_effects_maps`, and `continue`s when it
  hits. `side_effects_cache: false` short-circuits entirely so the
  build runs unconditionally.
- `InstallFrozenLockfile` detects the engine name once and threads
  it (plus the prefetch's side-effects map and the config flag)
  into `BuildModules`.

Tests:

- `using_side_effects_cache_skips_rebuild` ports the upstream
  `'using side effects cache'` case from
  `installing/deps-installer/test/install/sideEffects.ts:79`.
  Hand-builds the side_effects_maps_by_snapshot entry with the
  exact cache key `BuildModules` will compute, asserts no
  `pnpm:lifecycle` event fires for `@pnpm.e2e/failing-postinstall`'s
  postinstall (i.e. the build was skipped — if the gate were
  broken the script would have run and the install would have
  failed with `LifecycleScript(_)`).
- `side_effects_cache_disabled_bypasses_the_gate` (negative pair):
  same overlay shape with `side_effects_cache: false`, expects
  the build to run and the install to fail with the underlying
  `exit 1` from the postinstall.
- Adapter unit tests in `deps_graph::tests`: registry-resolution
  uses integrity verbatim, dependencies + optionalDependencies
  fold into children, snapshot without metadata is skipped.

Out of scope for this PR (tracked in #421 slice B): the WRITE
path that lets pacquet seed the cache itself.
Copilot AI review requested due to automatic review settings May 12, 2026 12:41
@coderabbitai

coderabbitai Bot commented May 12, 2026

Copy link
Copy Markdown
📝 Walkthrough

Walkthrough

This PR implements the read path for side-effects caching during installation: DepsGraphNode is migrated from borrowed to owned strings; Node version detection is added for cache keys; a lockfile-to-graph converter builds dependency snapshots; prefetch extracts side-effects overlays from index.db; CreateVirtualStore returns both manifests and overlays; BuildModules computes deterministic cache keys and skips lifecycle scripts on cache hits; and the install flow threads all data together.

Changes

Side-Effects Cache Read Path Implementation

Layer / File(s) Summary
DepsGraphNode ownership migration
crates/graph-hasher/src/dep_state.rs
DepsGraphNode fields changed from lifetime-borrowed &str to owned String (full_pkg_id, children keys), removing the lifetime parameter. calc_dep_state and calc_dep_graph_hash signatures updated. Cycle detection and parent tracking use owned identifiers. All tests updated to use DepsGraphNode<String>.
Node major version detection
crates/graph-hasher/src/engine_name.rs, crates/graph-hasher/src/lib.rs
New detect_node_major() spawns node --version and extracts major version; internal parse_node_version_output handles optional leading v, pre-release suffixes, and invalid inputs. Tests cover valid variants and error cases. Function re-exported from crate API.
Dependency graph builder from lockfile
crates/package-manager/src/deps_graph.rs, crates/package-manager/src/deps_graph/tests.rs, crates/package-manager/src/lib.rs
New build_deps_graph module constructs DepsGraph<PackageKey> from pnpm lockfile snapshots and packages. Computes pnpm-compatible full_pkg_id (registry resolutions use <pkg_id>:<integrity>, others hash JSON resolution as base64). Flattens dependencies and optional_dependencies into children maps via SnapshotDepRef::resolve. Skips snapshots with missing metadata. Tests validate registry handling, dependency inclusion, optional merging, and missing-metadata skipping. Module re-exported from package-manager lib.
Tarball prefetch side-effects extraction
crates/tarball/src/lib.rs
Adds PrefetchedSideEffectsMaps type and extends PrefetchResult with side_effects_maps field. Updates prefetch_cas_paths to conditionally capture verify_result.side_effects_maps when verification passes and maps are non-empty.
CreateVirtualStore output enrichment
crates/package-manager/src/create_virtual_store.rs
Introduces SideEffectsMapsBySnapshot type and CreateVirtualStoreOutput struct pairing manifests with per-snapshot overlays. Changes CreateVirtualStore::run return type to CreateVirtualStoreOutput. No-snapshots path returns initialized empty output. Prefetch result handling destructures side_effects_maps and populates side_effects_maps_by_snapshot keyed by snapshot PackageKey. Final return wraps both.
BuildModules cache infrastructure
crates/package-manager/src/build_modules.rs, crates/package-manager/Cargo.toml
BuildModules struct adds optional package metadata, side-effects overlays, engine_name (Node major), and boolean cache gate. run() destructures inputs and expands dep-graph/cache initialization to support memoized key computation. Per-snapshot build loop adds conditional cache-hit check: when enabled and preconditions met (overlays present, dep graph available, engine_name known), computes cache key via calc_dep_state and skips lifecycle scripts on overlay entry match. Adds pacquet-graph-hasher workspace dependency.
BuildModules test coverage
crates/package-manager/src/build_modules/tests.rs
Updates all fixture constructors to explicitly set new side-effects/cache fields. Adds Unix-only tests: using_side_effects_cache_skips_rebuild verifies cache hits skip failing postinstall (no lifecycle events), and side_effects_cache_disabled_bypasses_the_gate confirms bypass when disabled (lifecycle error raised). Both use synthetic overlays and calc_dep_state for key computation.
InstallFrozenLockfile integration
crates/package-manager/src/install_frozen_lockfile.rs
run() destructures CreateVirtualStoreOutput for manifests and side-effects maps. Adds single per-install detect_node_major() call to compute engine_name. Threads side_effects_maps, engine_name, and cache config into BuildModules constructor. Adjusts imports.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related issues

  • pnpm/pacquet#421: This PR directly implements the side-effects cache READ path steps listed in the issue: prefetch→CreateVirtualStore output, deps_graph builder, calc_dep_state integration, BuildModules gating, and Node major version detection.

Possibly related PRs

  • pnpm/pacquet#422: Both PRs modify the same graph-hasher crate (dep_state, engine_name, lib); this PR updates ownership and node detection on top of that foundation.
  • pnpm/pacquet#419: Overlaps BuildModules changes—both PRs alter BuildModules struct and run logic in related ways.
  • pnpm/pacquet#391: Both PRs modify BuildModules (lifecycle gating and policy) and interact with similar build execution paths.

Suggested reviewers

  • anthonyshew

"🐰 I hopped through lockfiles, mapping every patch,
I asked Node its major and turned versions into match,
Side-effects tucked away where cache keys softly latch,
Lifecycle sleeps when overlays sing a perfect catch,
A rabbit's little hash dance — quick, clever, and spry."

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and concisely describes the main change: implementing the READ path for the side-effects cache gate. It is specific and directly reflects the primary objective of the changeset.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description check ✅ Passed The PR description is comprehensive and detailed, covering all major changes, test coverage, scope boundaries, and implementation details with clear references to upstream code.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/side-effects-cache-is-built-gate-421

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov

codecov Bot commented May 12, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 90.04975% with 20 lines in your changes missing coverage. Please review.
✅ Project coverage is 85.83%. Comparing base (9c340a7) to head (04cd888).

Files with missing lines Patch % Lines
crates/package-manager/src/create_virtual_store.rs 61.53% 10 Missing ⚠️
crates/package-manager/src/deps_graph.rs 87.50% 5 Missing ⚠️
crates/tarball/src/lib.rs 57.14% 3 Missing ⚠️
crates/graph-hasher/src/engine_name.rs 95.45% 1 Missing ⚠️
crates/package-manager/src/build_modules.rs 97.82% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #423      +/-   ##
==========================================
+ Coverage   85.71%   85.83%   +0.12%     
==========================================
  Files          82       83       +1     
  Lines        5704     5854     +150     
==========================================
+ Hits         4889     5025     +136     
- Misses        815      829      +14     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@github-actions

github-actions Bot commented May 12, 2026

Copy link
Copy Markdown

Micro-Benchmark Results

Linux

group                          main                                   pr
-----                          ----                                   --
tarball/download_dependency    1.02     16.4±0.33ms   264.9 KB/sec    1.00     16.1±0.12ms   269.8 KB/sec

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (3)
crates/package-manager/src/deps_graph.rs (1)

62-67: ⚡ Quick win

Avoid unnecessary string allocation in the no-strip path.

When strip_prefix returns None, the current code clones the original string via .to_string(). A better pattern would consume the original String in the no-strip case:

♻️ Proposed refactor
 fn full_pkg_id_for(pkg_key: &PackageKey, resolution: &LockfileResolution) -> String {
     let pkg_id = pkg_key.to_string();
-    // `PackageKey::to_string` produces `<name>@<ver>` (with a
-    // leading `/` for unscoped — strip it to match upstream's
-    // `pkgIdWithPatchHash` shape).
-    let pkg_id = pkg_id.strip_prefix('/').unwrap_or(&pkg_id).to_string();
+    let pkg_id = pkg_id
+        .strip_prefix('/')
+        .map(|s| s.to_string())
+        .unwrap_or(pkg_id);
     if let Some(integrity) = resolution.integrity() {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/package-manager/src/deps_graph.rs` around lines 62 - 67, The current
full_pkg_id_for creates pkg_id as a String and then calls
pkg_id.strip_prefix('/').unwrap_or(&pkg_id).to_string(), which allocates a new
String even when no prefix is present; change the logic in full_pkg_id_for so
you match on pkg_key.to_string() once and then use an if let or match on
pkg_id.strip_prefix('/') to return stripped.to_string() when a prefix exists but
return the original owned String (pkg_id) when it does not, referencing the
full_pkg_id_for function, the pkg_key.to_string() call and the pkg_id variable
to locate the code.
crates/graph-hasher/src/dep_state.rs (1)

22-29: 🏗️ Heavy lift

Keep full_pkg_id branded instead of exposing a raw String.

This identifier has a persisted, pnpm-compatible format, so making it a plain String on a public struct erases the type distinction right where the cache-key contract matters most. A small newtype here would preserve that boundary without changing the hashing logic.

As per coding guidelines, "Declare a newtype wrapper for branded string types instead of collapsing into plain String or &str to preserve the distinction between validating and non-validating brands" and "Template literal types (like ${string}@${string}) should be treated as branded string types with the same validation discipline applied".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/graph-hasher/src/dep_state.rs` around lines 22 - 29, The public
DepsGraphNode currently exposes full_pkg_id as a raw String which loses the
branded/persisted pnpm-compatible identity; introduce a newtype (e.g.,
FullPkgId(String) or FullPkgId::new) and replace DepsGraphNode::full_pkg_id:
String with full_pkg_id: FullPkgId to preserve the brand; implement thin
conveniences (From<String>, From<&str>, AsRef<str> and Display/Deref or
Borrow<str>) so existing hashing and equality code (the hashing logic that
consumes full_pkg_id) can remain unchanged, and update any construction sites to
wrap/unwrap via the newtype.
crates/package-manager/src/deps_graph/tests.rs (1)

52-71: ⚡ Quick win

Add the companion test for the non-integrity full_pkg_id branch.

This only pins the :<integrity> path. The new adapter’s other parity-critical branch—falling back to hashObject(resolution) when no integrity is present—is still untested here, so a regression there would miss this suite.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/package-manager/src/deps_graph/tests.rs` around lines 52 - 71, Add a
companion test that covers the non-integrity branch: create a snapshot entry for
the same package key but with a resolution that has no integrity (set
SnapshotEntry.resolution/Resolution to a registry resolution lacking integrity),
call build_deps_graph(&snapshots, &packages), get the node for the package and
assert node.full_pkg_id uses the fallback hash form (i.e., the full_pkg_id still
begins with the pkg id prefix like "@scope/foo@1.0.0:" but the portion after the
colon is the hashed resolution form produced by hashObject(resolution) rather
than an "sha512-" integrity string); mirror the existing test structure and
naming (e.g., registry_resolution_full_pkg_id_uses_hashobject_when_no_integrity)
and reuse helper symbols key, registry_metadata, SnapshotEntry,
build_deps_graph, and node.full_pkg_id to locate the code.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/package-manager/src/build_modules.rs`:
- Around line 210-225: The code eagerly builds dep_graph and initializes
deps_state_cache even when the side-effects cache gate is closed; change this so
dep_graph (created by crate::build_deps_graph) and the
pacquet_graph_hasher::DepsStateCache (DepsStateCache::new()) are only created
when the three preconditions hold (config.side_effects_cache is true, an
engine_name/Node major exists, and prefetched cache maps are non-empty), or
defer their initialization lazily on first snapshot that is eligible for
side-effects-cache logic; specifically, wrap the packages.map(...) and
deps_state_cache initialization in the same gate used for side-effects-cache
lookup or replace them with an Option that is constructed on-demand inside the
code paths that call dep_graph or deps_state_cache.

---

Nitpick comments:
In `@crates/graph-hasher/src/dep_state.rs`:
- Around line 22-29: The public DepsGraphNode currently exposes full_pkg_id as a
raw String which loses the branded/persisted pnpm-compatible identity; introduce
a newtype (e.g., FullPkgId(String) or FullPkgId::new) and replace
DepsGraphNode::full_pkg_id: String with full_pkg_id: FullPkgId to preserve the
brand; implement thin conveniences (From<String>, From<&str>, AsRef<str> and
Display/Deref or Borrow<str>) so existing hashing and equality code (the hashing
logic that consumes full_pkg_id) can remain unchanged, and update any
construction sites to wrap/unwrap via the newtype.

In `@crates/package-manager/src/deps_graph.rs`:
- Around line 62-67: The current full_pkg_id_for creates pkg_id as a String and
then calls pkg_id.strip_prefix('/').unwrap_or(&pkg_id).to_string(), which
allocates a new String even when no prefix is present; change the logic in
full_pkg_id_for so you match on pkg_key.to_string() once and then use an if let
or match on pkg_id.strip_prefix('/') to return stripped.to_string() when a
prefix exists but return the original owned String (pkg_id) when it does not,
referencing the full_pkg_id_for function, the pkg_key.to_string() call and the
pkg_id variable to locate the code.

In `@crates/package-manager/src/deps_graph/tests.rs`:
- Around line 52-71: Add a companion test that covers the non-integrity branch:
create a snapshot entry for the same package key but with a resolution that has
no integrity (set SnapshotEntry.resolution/Resolution to a registry resolution
lacking integrity), call build_deps_graph(&snapshots, &packages), get the node
for the package and assert node.full_pkg_id uses the fallback hash form (i.e.,
the full_pkg_id still begins with the pkg id prefix like "@scope/foo@1.0.0:" but
the portion after the colon is the hashed resolution form produced by
hashObject(resolution) rather than an "sha512-" integrity string); mirror the
existing test structure and naming (e.g.,
registry_resolution_full_pkg_id_uses_hashobject_when_no_integrity) and reuse
helper symbols key, registry_metadata, SnapshotEntry, build_deps_graph, and
node.full_pkg_id to locate the code.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 9d95cf0e-72fe-44ff-b1e5-8ad7b95d4d4b

📥 Commits

Reviewing files that changed from the base of the PR and between 9c340a7 and 797f0ca.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (12)
  • crates/graph-hasher/src/dep_state.rs
  • crates/graph-hasher/src/engine_name.rs
  • crates/graph-hasher/src/lib.rs
  • crates/package-manager/Cargo.toml
  • crates/package-manager/src/build_modules.rs
  • crates/package-manager/src/build_modules/tests.rs
  • crates/package-manager/src/create_virtual_store.rs
  • crates/package-manager/src/deps_graph.rs
  • crates/package-manager/src/deps_graph/tests.rs
  • crates/package-manager/src/install_frozen_lockfile.rs
  • crates/package-manager/src/lib.rs
  • crates/tarball/src/lib.rs
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Agent
  • GitHub Check: Lint and Test (windows-latest)
  • GitHub Check: Lint and Test (macos-latest)
  • GitHub Check: Run benchmark on ubuntu-latest
  • GitHub Check: Run benchmark on ubuntu-latest
  • GitHub Check: Code Coverage
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

**/*.rs: Log emissions are part of 'match pnpm'—when porting a function that fires pnpm:<channel> events through globalLogger, logger.debug(...), or streamParser.write(...), mirror the call site, payload, and ordering so @pnpm/cli.default-reporter parses pacquet's NDJSON the same way
Declare a newtype wrapper for branded string types instead of collapsing into plain String or &str to preserve the distinction between validating and non-validating brands
If upstream always validates before construction of a branded string type, implement validation via TryFrom<String> and/or FromStr without providing an infallible public constructor
If upstream never validates a branded string type, expose an infallible From<String> constructor to provide type-safety without runtime validation
When upstream occasionally constructs a branded string type without validation via bare as assertion, expose a from_str_unchecked constructor so callers can opt into the same unchecked path explicitly
Match upstream serde behavior for branded types—use #[serde(try_from = "String")] for deserialization to go through the validator and #[serde(into = "String")] for serialization
Use #[derive(derive_more::From)] and #[derive(derive_more::Into)] for mechanical conversion impls on branded types rather than handwriting impl blocks
Model string-literal unions (like 'auto' | 'always' | 'never') as Rust enums instead of newtype wrappers
Template literal types (like ${string}@${string}) should be treated as branded string types with the same validation discipline applied
Follow the code style guide in CODE_STYLE_GUIDE.md covering code-level conventions not enforced by tooling: imports, modules, naming, ownership and borrowing, parameter type selection, trait bounds, pattern matching, pipe-trait, error handling, test layout, and cloning of Arc and Rc
Warnings are errors in cargo clippy --locked -- --deny warnings, and should not be silenced with `#[allow(....

Files:

  • crates/package-manager/src/lib.rs
  • crates/graph-hasher/src/engine_name.rs
  • crates/graph-hasher/src/lib.rs
  • crates/package-manager/src/install_frozen_lockfile.rs
  • crates/package-manager/src/build_modules.rs
  • crates/tarball/src/lib.rs
  • crates/package-manager/src/create_virtual_store.rs
  • crates/package-manager/src/deps_graph/tests.rs
  • crates/package-manager/src/deps_graph.rs
  • crates/graph-hasher/src/dep_state.rs
  • crates/package-manager/src/build_modules/tests.rs
🧠 Learnings (2)
📚 Learning: 2026-05-07T23:19:08.272Z
Learnt from: KSXGitHub
Repo: pnpm/pacquet PR: 401
File: tasks/integrated-benchmark/src/work_env.rs:343-344
Timestamp: 2026-05-07T23:19:08.272Z
Learning: When reviewing Rust code in pnpm/pacquet for deprecated API usage, do not automatically treat `serde_saphyr::to_string` as deprecated. In `serde-saphyr` v0.0.25, `serde_saphyr::to_string` has no `#[deprecated]` attribute (the `#[deprecated]` later in `serde-saphyr-0.0.25/src/lib.rs` applies to a different function). Only flag `serde_saphyr::to_string` as deprecated if the resolved dependency version’s source shows `#[deprecated]` on that specific function.

Applied to files:

  • crates/package-manager/src/lib.rs
  • crates/graph-hasher/src/engine_name.rs
  • crates/graph-hasher/src/lib.rs
  • crates/package-manager/src/install_frozen_lockfile.rs
  • crates/package-manager/src/build_modules.rs
  • crates/tarball/src/lib.rs
  • crates/package-manager/src/create_virtual_store.rs
  • crates/package-manager/src/deps_graph/tests.rs
  • crates/package-manager/src/deps_graph.rs
  • crates/graph-hasher/src/dep_state.rs
  • crates/package-manager/src/build_modules/tests.rs
📚 Learning: 2026-05-01T10:01:33.766Z
Learnt from: zkochan
Repo: pnpm/pacquet PR: 349
File: crates/reporter/src/tests.rs:121-121
Timestamp: 2026-05-01T10:01:33.766Z
Learning: In Rust test code, follow the repo’s CODE_STYLE_GUIDE test-logging rule: add logging (e.g., `eprintln!`/`eprintln!(...)`) so that useful diagnostic values are printed when a test fails, unless the assertion is `assert_eq!` (where the differing values are already included). Concretely, if you use assertions like `assert!`, `assert_ne!`, etc., ensure the test logs the relevant actual/expected values (or context) before/around the assertion so failures can be diagnosed without rerunning.

Applied to files:

  • crates/package-manager/src/deps_graph/tests.rs
  • crates/package-manager/src/build_modules/tests.rs
🔇 Additional comments (12)
crates/tarball/src/lib.rs (2)

625-641: LGTM: Side-effects map type alias is well-documented.

The structure mirrors PrefetchedCasPaths and the documentation clearly explains the nested map shape and its relationship to pnpm's PackageFilesResponse.sideEffectsMaps.


776-790: LGTM: Conditional population logic is correct.

The side-effects maps are only inserted when integrity verification passed and the maps are non-empty, matching the guard pattern for manifests above. The Arc::new() wrapping provides clear visibility into the allocation cost.

crates/graph-hasher/src/engine_name.rs (1)

30-61: LGTM: Node version detection is robust and well-tested.

The implementation correctly:

  • Returns None when node is not on PATH or fails to launch
  • Checks the exit status before parsing stdout
  • Tolerates alternative runtimes that omit the leading v
  • Handles parse failures gracefully

The separation into a testable parse_node_version_output helper is clean.

crates/graph-hasher/src/lib.rs (1)

20-21: LGTM: API surface expanded appropriately.

The re-exports make DepsGraphNode, detect_node_major, and engine_name available to downstream crates for cache-key computation.

crates/package-manager/Cargo.toml (1)

21-21: LGTM: Dependency addition enables cache-key computation.

The pacquet-graph-hasher dependency is required for calc_dep_state calls in BuildModules.

crates/package-manager/src/create_virtual_store.rs (2)

37-63: LGTM: Output struct cleanly bundles both manifests and side-effects maps.

The CreateVirtualStoreOutput struct provides a clear signature for returning both pieces of data needed by downstream phases.


349-356: LGTM: Arc-sharing across peer variants is correct.

Peer-resolved variants of the same package share one store-index row, so Arc::cloneing the same side-effects map is the right optimization. The explicit Arc::clone call makes the cost visible at the call site.

crates/package-manager/src/lib.rs (1)

9-9: LGTM: Module wiring is standard.

The deps_graph module and re-export follow the existing pattern.

Also applies to: 31-31

crates/package-manager/src/install_frozen_lockfile.rs (2)

90-96: LGTM: Engine detection happens once per install.

Detecting the Node major version once at the start of the install (rather than per snapshot) is the correct optimization. The None fallback when node is not on PATH safely disables the cache gate.


140-145: LGTM: Side-effects cache inputs threaded correctly.

All three pieces needed by the is_built gate are passed: the per-snapshot overlays, the computed engine name, and the config flag.

crates/package-manager/src/deps_graph.rs (2)

36-51: LGTM: Graph construction logic matches upstream.

The function correctly:

  • Skips snapshots with missing metadata (line 43-45) rather than panicking
  • Computes full_pkg_id using the same logic as pnpm's createFullPkgId
  • Builds children from both dependencies and optional_dependencies

88-103: LGTM: Children map correctly flattens deps + optional deps.

The chaining of dependencies and optional_dependencies via flat_map correctly handles Option<HashMap<...>> and produces the unified alias → PackageKey map pnpm expects.

Comment thread crates/package-manager/src/build_modules.rs Outdated

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Implements the side-effects cache READ path for install builds by threading verified sideEffectsMaps from the store-index prefetch into BuildModules, then skipping lifecycle scripts when calc_dep_state’s cache key is present (pnpm parity with the upstream !node.isBuilt filter).

Changes:

  • Plumbs VerifyResult.side_effects_maps through pacquet_tarball::PrefetchResult and CreateVirtualStore to make per-snapshot side-effects overlays available during builds.
  • Adds a lockfile → deps-graph adapter so BuildModules can compute calc_dep_state cache keys for snapshots.
  • Adds Node-major detection (node --version) to build the upstream engine-name prefix once per install, and introduces tests covering cache-hit skip vs cache-disabled behavior.

Reviewed changes

Copilot reviewed 12 out of 13 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
crates/tarball/src/lib.rs Extends PrefetchResult to include side_effects_maps recovered from verified store-index rows.
crates/package-manager/src/lib.rs Adds and re-exports the new deps_graph module.
crates/package-manager/src/install_frozen_lockfile.rs Threads side_effects_maps_by_snapshot, engine_name, and the config flag into BuildModules.
crates/package-manager/src/deps_graph.rs New adapter to build a DepsGraph from lockfile snapshots/packages for dep-state hashing.
crates/package-manager/src/deps_graph/tests.rs Unit tests for full_pkg_id derivation and children wiring behavior.
crates/package-manager/src/create_virtual_store.rs Returns CreateVirtualStoreOutput including side_effects_maps_by_snapshot (peer variants share Arc).
crates/package-manager/src/build_modules.rs Adds the is_built side-effects-cache gate (cache-key lookup via calc_dep_state) to skip lifecycle scripts on hit.
crates/package-manager/src/build_modules/tests.rs Adds positive/negative tests for the gate; updates existing tests for new BuildModules fields.
crates/package-manager/Cargo.toml Adds pacquet-graph-hasher dependency for the package-manager crate.
crates/graph-hasher/src/lib.rs Re-exports DepsGraphNode and detect_node_major.
crates/graph-hasher/src/engine_name.rs Implements detect_node_major() and unit-tests version-output parsing.
crates/graph-hasher/src/dep_state.rs Updates DepsGraphNode to own strings (removes arena/lifetime coupling).
Cargo.lock Locks the updated workspace dependency graph including pacquet-graph-hasher usage.
Comments suppressed due to low confidence (1)

crates/package-manager/src/build_modules/tests.rs:336

  • The dbg! calls in this test will spam output in cargo test/CI runs and are usually rejected by clippy/lints in many repos. Please remove these debug prints (the assertions already cover the behavior).
    .run::<RecordingReporter>()
    .expect("optional build failure must NOT abort the install");
    dbg!(&ignored);

    let captured = EVENTS.lock().expect("lock").clone();
    dbg!(&captured);

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread crates/package-manager/src/build_modules.rs Outdated
Comment thread crates/package-manager/src/install_frozen_lockfile.rs Outdated
Comment thread crates/package-manager/src/deps_graph.rs Outdated
Comment thread crates/graph-hasher/src/engine_name.rs
Comment thread crates/package-manager/src/build_modules/tests.rs Outdated
Comment thread crates/package-manager/src/build_modules/tests.rs
Comment thread crates/package-manager/src/build_modules/tests.rs Outdated
- Replace unicode `…` with ASCII `...` so dylint's
  `perfectionist::unicode-ellipsis-in-comments` rule stops failing CI.
- Build `dep_graph` only when the side-effects-cache gate has a chance
  of firing (config on, `engine_name` present, prefetch surfaced
  cache rows). Otherwise the O(|snapshots|) graph construction is
  dead work on every install.
- Run `detect_node_major()` on `tokio::task::spawn_blocking` to keep
  the async install driver from stalling on the synchronous
  `node --version` spawn.
- Fix misleading "hex digest" comment in `full_pkg_id_for`: the
  fallback path uses base64, matching pnpm's `hashObject` default.
- Tighten `detect_node_major` docstring to reflect that the parser
  accepts a missing leading `v` and stops at the first `.` (or end
  of string).
- Drop `dbg!`/`eprintln!` test noise from the side-effects-cache
  tests.
Removing the `dbg!(&ignored)` in the previous commit left the binding
unused, which dylint rejects under `-D unused-variables`. The test
asserts via the captured event list, not the return value.
Copilot AI review requested due to automatic review settings May 12, 2026 12:58

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
crates/package-manager/src/build_modules/tests.rs (1)

503-504: ⚡ Quick win

Add assertion diagnostics for the error-variant check.

If this matches! assertion fails, the failure output lacks immediate error context. Please log err (or include it in the assert message), similar to the pattern used in fail_when_failing_postinstall_is_required.

Based on learnings: “In Rust test code, follow the repo’s CODE_STYLE_GUIDE test-logging rule ... if you use assertions like assert!, assert_ne!, etc., ensure the test logs the relevant actual/expected values (or context) ...”.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/package-manager/src/build_modules/tests.rs` around lines 503 - 504,
The assertion checking that err matches
crate::build_modules::BuildModulesError::LifecycleScript lacks diagnostic
output; update the test to include the actual error in the assertion or log it
before the assert (use the local variable err) so failures show context — e.g.,
either call eprintln!/dbg!(err) or change the assert! to include err in its
message when testing matches!(err,
crate::build_modules::BuildModulesError::LifecycleScript(_)); keep the same
variable and enum symbol names to locate the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@crates/package-manager/src/build_modules/tests.rs`:
- Around line 503-504: The assertion checking that err matches
crate::build_modules::BuildModulesError::LifecycleScript lacks diagnostic
output; update the test to include the actual error in the assertion or log it
before the assert (use the local variable err) so failures show context — e.g.,
either call eprintln!/dbg!(err) or change the assert! to include err in its
message when testing matches!(err,
crate::build_modules::BuildModulesError::LifecycleScript(_)); keep the same
variable and enum symbol names to locate the change.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: b8d6d999-f7fc-4f1a-bc20-817347d76525

📥 Commits

Reviewing files that changed from the base of the PR and between 1bceca5 and 122acb6.

📒 Files selected for processing (1)
  • crates/package-manager/src/build_modules/tests.rs
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: Run benchmark on ubuntu-latest
  • GitHub Check: Lint and Test (ubuntu-latest)
  • GitHub Check: Lint and Test (macos-latest)
  • GitHub Check: Lint and Test (windows-latest)
  • GitHub Check: Dylint
  • GitHub Check: Code Coverage
  • GitHub Check: Run benchmark on ubuntu-latest
  • GitHub Check: Agent
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

**/*.rs: Log emissions are part of 'match pnpm'—when porting a function that fires pnpm:<channel> events through globalLogger, logger.debug(...), or streamParser.write(...), mirror the call site, payload, and ordering so @pnpm/cli.default-reporter parses pacquet's NDJSON the same way
Declare a newtype wrapper for branded string types instead of collapsing into plain String or &str to preserve the distinction between validating and non-validating brands
If upstream always validates before construction of a branded string type, implement validation via TryFrom<String> and/or FromStr without providing an infallible public constructor
If upstream never validates a branded string type, expose an infallible From<String> constructor to provide type-safety without runtime validation
When upstream occasionally constructs a branded string type without validation via bare as assertion, expose a from_str_unchecked constructor so callers can opt into the same unchecked path explicitly
Match upstream serde behavior for branded types—use #[serde(try_from = "String")] for deserialization to go through the validator and #[serde(into = "String")] for serialization
Use #[derive(derive_more::From)] and #[derive(derive_more::Into)] for mechanical conversion impls on branded types rather than handwriting impl blocks
Model string-literal unions (like 'auto' | 'always' | 'never') as Rust enums instead of newtype wrappers
Template literal types (like ${string}@${string}) should be treated as branded string types with the same validation discipline applied
Follow the code style guide in CODE_STYLE_GUIDE.md covering code-level conventions not enforced by tooling: imports, modules, naming, ownership and borrowing, parameter type selection, trait bounds, pattern matching, pipe-trait, error handling, test layout, and cloning of Arc and Rc
Warnings are errors in cargo clippy --locked -- --deny warnings, and should not be silenced with `#[allow(....

Files:

  • crates/package-manager/src/build_modules/tests.rs
🧠 Learnings (2)
📚 Learning: 2026-05-01T10:01:33.766Z
Learnt from: zkochan
Repo: pnpm/pacquet PR: 349
File: crates/reporter/src/tests.rs:121-121
Timestamp: 2026-05-01T10:01:33.766Z
Learning: In Rust test code, follow the repo’s CODE_STYLE_GUIDE test-logging rule: add logging (e.g., `eprintln!`/`eprintln!(...)`) so that useful diagnostic values are printed when a test fails, unless the assertion is `assert_eq!` (where the differing values are already included). Concretely, if you use assertions like `assert!`, `assert_ne!`, etc., ensure the test logs the relevant actual/expected values (or context) before/around the assertion so failures can be diagnosed without rerunning.

Applied to files:

  • crates/package-manager/src/build_modules/tests.rs
📚 Learning: 2026-05-07T23:19:08.272Z
Learnt from: KSXGitHub
Repo: pnpm/pacquet PR: 401
File: tasks/integrated-benchmark/src/work_env.rs:343-344
Timestamp: 2026-05-07T23:19:08.272Z
Learning: When reviewing Rust code in pnpm/pacquet for deprecated API usage, do not automatically treat `serde_saphyr::to_string` as deprecated. In `serde-saphyr` v0.0.25, `serde_saphyr::to_string` has no `#[deprecated]` attribute (the `#[deprecated]` later in `serde-saphyr-0.0.25/src/lib.rs` applies to a different function). Only flag `serde_saphyr::to_string` as deprecated if the resolved dependency version’s source shows `#[deprecated]` on that specific function.

Applied to files:

  • crates/package-manager/src/build_modules/tests.rs
🔇 Additional comments (1)
crates/package-manager/src/build_modules/tests.rs (1)

350-461: Strong upstream-parity coverage for the side-effects-cache read gate.

This test is precise: it computes the same dep-state key path and validates rebuild skipping by asserting no LogEvent::Lifecycle emission.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 12 out of 13 changed files in this pull request and generated 3 comments.

Comment thread crates/package-manager/src/deps_graph.rs Outdated
Comment thread crates/package-manager/src/deps_graph/tests.rs Outdated
Comment thread crates/package-manager/src/deps_graph/tests.rs Outdated
Pacquet only parses v9 lockfiles, whose snapshot keys render as
`<name>@<ver>` with no leading `/` (pre-v6 used `/<name>/<ver>` —
unsupported). The `strip_prefix('/')` call in `full_pkg_id_for` and
the matching comment in the test were therefore dead/misleading.

Also drop the unused `hex_tail` parameter from the test integrity
helper.
@github-actions

Copy link
Copy Markdown

Integrated-Benchmark Report (Linux)

Scenario: Frozen Lockfile

Command Mean [s] Min [s] Max [s] Relative
pacquet@HEAD 2.575 ± 0.164 2.401 2.872 1.05 ± 0.07
pacquet@main 2.444 ± 0.069 2.358 2.581 1.00
pnpm 6.238 ± 0.082 6.107 6.377 2.55 ± 0.08
BENCHMARK_REPORT.json
{
  "results": [
    {
      "command": "pacquet@HEAD",
      "mean": 2.5754836944,
      "stddev": 0.16403784036417063,
      "median": 2.4805746404000004,
      "user": 2.48279752,
      "system": 3.50350124,
      "min": 2.4010603184000003,
      "max": 2.8723646104,
      "times": [
        2.7062244884,
        2.7449617414,
        2.4962450654,
        2.7022467284,
        2.8723646104,
        2.4625844984,
        2.4649042154000003,
        2.4566162374,
        2.4476290404000003,
        2.4010603184000003
      ]
    },
    {
      "command": "pacquet@main",
      "mean": 2.4442607843,
      "stddev": 0.06911809380864085,
      "median": 2.4371898884,
      "user": 2.46052832,
      "system": 3.44357544,
      "min": 2.3583072344000002,
      "max": 2.5809509514,
      "times": [
        2.3894768534,
        2.5809509514,
        2.5193827044,
        2.4308915884,
        2.4736726534,
        2.3676922814,
        2.4136424714,
        2.3583072344000002,
        2.4434881884,
        2.4651029164000002
      ]
    },
    {
      "command": "pnpm",
      "mean": 6.2375077939,
      "stddev": 0.08181468058189033,
      "median": 6.2554993624,
      "user": 9.116679620000003,
      "system": 4.539465339999998,
      "min": 6.1068963354,
      "max": 6.3767380544000005,
      "times": [
        6.3767380544000005,
        6.3040029204,
        6.1117151124,
        6.2205879244,
        6.2668058014,
        6.1068963354,
        6.2627605044,
        6.2482382204,
        6.2674135024,
        6.2099195634
      ]
    }
  ]
}

Scenario: Frozen Lockfile (Hot Cache)

Command Mean [ms] Min [ms] Max [ms] Relative
pacquet@HEAD 684.3 ± 38.6 650.1 788.8 1.00
pacquet@main 758.6 ± 48.8 688.4 864.4 1.11 ± 0.09
pnpm 2662.6 ± 81.1 2552.2 2814.7 3.89 ± 0.25
BENCHMARK_REPORT.json
{
  "results": [
    {
      "command": "pacquet@HEAD",
      "mean": 0.6842708955800001,
      "stddev": 0.038589254790412336,
      "median": 0.67647917368,
      "user": 0.35142826,
      "system": 1.47135334,
      "min": 0.65012200018,
      "max": 0.78882972818,
      "times": [
        0.78882972818,
        0.66896645718,
        0.68064075718,
        0.65012200018,
        0.67678005718,
        0.69014171118,
        0.66766728218,
        0.65928498718,
        0.68409768518,
        0.67617829018
      ]
    },
    {
      "command": "pacquet@main",
      "mean": 0.75863153358,
      "stddev": 0.0487957928976274,
      "median": 0.74697253918,
      "user": 0.34740536000000005,
      "system": 1.50975574,
      "min": 0.68835436818,
      "max": 0.86439147718,
      "times": [
        0.79756802018,
        0.68835436818,
        0.75188197518,
        0.71606527418,
        0.86439147718,
        0.74116361318,
        0.74206310318,
        0.75900062418,
        0.73726895518,
        0.78855792518
      ]
    },
    {
      "command": "pnpm",
      "mean": 2.6625847368800004,
      "stddev": 0.08105080724397883,
      "median": 2.6570760621800003,
      "user": 3.2390090599999994,
      "system": 2.26072694,
      "min": 2.55217601418,
      "max": 2.81474226018,
      "times": [
        2.7076934081800004,
        2.55217601418,
        2.6941505941800004,
        2.81474226018,
        2.6452080811800003,
        2.60804152518,
        2.61057699418,
        2.5763452641800004,
        2.7479691841800005,
        2.6689440431800002
      ]
    }
  ]
}

@zkochan zkochan merged commit 97d7439 into main May 12, 2026
16 checks passed
@zkochan zkochan deleted the feat/side-effects-cache-is-built-gate-421 branch May 12, 2026 13:30
zkochan added a commit that referenced this pull request May 12, 2026
`crate::upload` and `add_files_from_dir` each name both a module
and a function inside it; rustdoc's intra-doc-link resolver flags
the bare ``[`name`]`` form as ambiguous under `-D warnings`. Adding
the `()` suffix routes each link to the function explicitly,
matching the same fix from PR #423.
zkochan added a commit that referenced this pull request May 12, 2026
## Summary

Implements slice (B) of #421 — the WRITE path for pnpm's side-effects cache. With this PR, a successful postinstall now re-CAFSes the built package directory, diffs against the pristine `PackageFilesIndex.files` row, and mutates the row's `side_effects` map so a future install (pacquet *or* pnpm) can skip the rebuild via the READ path landed in #423.

User-visible behavior: lifecycle scripts that produce build artifacts (`node-gyp` outputs, generated source files, etc.) get cached the first time the install runs; subsequent installs against the same lockfile + Node major skip the script and reuse the cached overlay.

## What lands

- **`pacquet_store_dir::add_files_from_dir`** — directory walker that re-CAFSes a post-build package directory. Mirrors pnpm/pnpm@`7e3145f9fc`:`store/cafs/src/addFilesFromDir.ts:16-194` + `worker/src/start.ts:312-383`. Containment: `dunce::canonicalize` each symlink target, drop if out of root; read/recurse via the resolved canonical path so retargeting a symlink between containment check and read can't smuggle out-of-root data into CAFS. Skips top-level `node_modules` (matches upstream `includeNodeModules: false`). Cycle-safe via a recursion-stack `visited: HashSet<PathBuf>` (entries inserted on descend, removed on return).
- **`pacquet_store_dir::upload` + `calculate_diff`** — `upload(store_dir, built_pkg_location, files_index_file, cache_key, writer)` walks the package via `add_files_from_dir` and queues a `WriteMsg::SideEffectsUpload { key, cache_key, current_files }` to the writer task. The actual R/M/W of the existing `PackageFilesIndex` row happens inside the writer task (see below), so concurrent uploads to the same row stay commutative.
- **`StoreIndexWriter` now drives a `WriteMsg` enum** (`Replace` vs `SideEffectsUpload`). The batch loop coalesces per-key updates into one `PackageFilesIndex` value before flushing — multiple side-effects uploads for the same row apply in arrival order against a single in-memory state instead of racing against a stale read on a separate readonly handle. The R/M/W (load existing row → check `algo == HASH_ALGORITHM = "sha512"` → `calculate_diff` → insert into `side_effects` map) happens inside the writer's transaction. Loading short-circuits with a `debug!` / `warn!` when the base row is missing or unreadable, matching upstream's `if (!existingFilesIndex) return` / `ALGO_MISMATCH` bail-outs at `pnpm/pnpm@7e3145f9fc:worker/src/start.ts:342-371`.
- **`Config.side_effects_cache_readonly: bool`** (default `false`) — mirrors pnpm's `sideEffectsCacheReadonly`. Two derived helpers consolidate the gate semantics so precedence stays single-sourced:
  - `Config::side_effects_cache_read()` = `cache || readonly`
  - `Config::side_effects_cache_write()` = `cache && !readonly`
- **`BuildModules` WRITE-path call site** — after `run_postinstall_hooks` returns `Ok`, if `side_effects_cache_write` is on and `StoreIndexWriter` + `StoreDir` are both threaded in, calls `pacquet_store_dir::upload(...)` with the snapshot's `store_index_key(integrity, pkg_id)` and the `calc_dep_state` cache key. All upload errors are swallowed with `tracing::warn!`, mirroring upstream's `try { upload } catch { logger.warn }` at `pnpm/pnpm@7e3145f9fc:building/during-install/src/index.ts:208-215`.
- **`build_deps_subgraph`** — the cache-hashing dep graph is now bounded to the forward closure of `requires_build` snapshots. Pure-JS installs feed in an empty root iterator and the function returns immediately; installs with buildable deps walk only what `calc_dep_state` will actually traverse. Upstream's `lockfileToDepGraph` builds the full graph because its consumers extend beyond cache hashing; pacquet's graph is consumed only by `calc_dep_state` today, so the closure-bounded walk produces byte-identical cache keys with strictly less work. Restored the cold-install benchmark to parity with `main`.
- **`StoreIndexWriter` hoisted up to `InstallFrozenLockfile::run`** — previously spawned + drained inside `CreateVirtualStore::run`; now it spans both the prefetch/download phase and the build phase so the WRITE-path upload can queue through it. The final batch flushes after `BuildModules::run` returns.
- **Byte-stable msgpack encoding** — `encode_package_files_index` now iterates `files`, `side_effects`, and `SideEffectsDiff.added` in sorted-key order so two pacquet writes of the same logical row produce identical bytes. Was prompted by a review on PR #424 pointing out that `HashMap` iteration randomization would otherwise make every row look "changed" on byte-diff even when content was identical.
- **`cache_key` refactor in `BuildModules`** — `calc_dep_state` now fires once per snapshot (before the gate check) so both the READ gate and the new WRITE site read the same value.

## Tests

- **`add_files_from_dir::tests`** (9 cases): top-level files, nested forward-slash paths, exec-bit propagation, top-level-vs-nested `node_modules`, symlink-out-of-root drop, symlink-in-root follow, directory-cycle termination, missing-root error.
- **`upload::tests`** (6 cases): identical / added-only / deleted-only / digest change / mode change / mixed — covers every branch of upstream's `calculateDiff`.
- **`deps_graph::tests`** (new for slice B): empty-roots → empty subgraph; forward-closure inclusion + exclusion; cycle termination.
- **`workspace_yaml::tests`**: `sideEffectsCacheReadonly` parses + applies; the 4-state truth table for `side_effects_cache_read()` / `side_effects_cache_write()`.
- **`build_modules::tests`** (new):
  - `write_path_populates_side_effects_row` — full WRITE-path: pre-seed a base row, run a postinstall that creates `generated.txt`, drain the writer task, assert `side_effects[cache_key].added` contains `generated.txt` and NOT pristine `index.js`. Mirrors the spirit of upstream's `'a postinstall script does not modify the original sources added to the store'` at `pnpm/pnpm@7e3145f9fc:sideEffects.ts:189-223`.
  - `write_path_disabled_skips_upload` — same scenario with `side_effects_cache_write: false` → row's `side_effects` stays `None`. Counterpart of upstream's gate on `sideEffectsCacheWrite`.
  - `upload_error_does_not_interrupt_install` — postinstall produces a 0-permission file in the package dir; the walker fails on `fs::read` with `EACCES`, surfacing as `UploadError::AddFilesFromDir(_)` which `BuildModules` swallows with `tracing::warn!`. Asserts (a) the postinstall-created `generated.txt` is on disk and (b) the pre-seeded base row's `side_effects` stays `None`. Mirrors `'uploading errors do not interrupt installation'` at `pnpm/pnpm@7e3145f9fc:sideEffects.ts:166-187`.

## Out of scope

- **Patch-file hash in cache key** — depends on `patchedDependencies` (#397 item 9). Today's WRITE side stamps `patch_file_hash: None`; same as the existing READ side.
- **`requires_build` recompute** in the R/M/W path. Upstream recomputes from `(manifest, filesMap)` when the existing row's field is `None`; pacquet leaves the field as-is (cold-batch downloads already populate it with a real value).
- **GVS / hoisted-node-linker** variants of the upstream tests.
KSXGitHub pushed a commit that referenced this pull request May 13, 2026
Pull in 28 commits from upstream main, including the
`pacquet-npmrc` → `pacquet-config` rename (#420) plus features:
- supportedArchitectures + --cpu/--os/--libc (#456)
- frozen-lockfile (#442, #443, #447, #450)
- git-fetcher (#436 / #446, #451, #454)
- side-effects cache (#421 / #422, #423, #424)
- real-hoist + global-virtual-store (#432 / #438 / #444, #449, #452)
- patchedDependencies + allow-builds (#425, #427, #428)
- engine/platform installability (#434 / #439)

Conflicts resolved:
- `crates/npmrc/` files migrated under the renamed
  `crates/config/` directory; `Npmrc` → `Config` everywhere
  except `NpmrcAuth` (which keeps the `.npmrc`-domain name).
- `Config::current` reads the env-var DI generic `Api: EnvVar`
  for ${VAR}-substitution in `.npmrc`. Production turbofish in
  `cli_args.rs` is `Config::current::<RealApi, _, _, _, _>(...)`.
- Two-phase `NpmrcAuth::apply_*` retained so default-registry
  creds key at the yaml-resolved registry URL.
- New `Config::auth_headers` field plumbed through
  `install_package_by_snapshot`'s `DownloadTarballToStore`.
- Tests under `crates/config/src/workspace_yaml/tests.rs`
  pick up the new ParseYaml unit test added on this branch.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants