feat: side-effects cache WRITE path (#421 slice B)#424
Conversation
Compute `calc_dep_state` once per snapshot and store the resulting `Option<String>` before the `is_built` gate check. Pure refactor — the gate-side comparison still uses the same value the previous inline call produced. Routes the value to one binding so a future WRITE-path upload site can reuse the exact key the gate consumed, provably identical because both reads come from the same `let`. `deps_state_cache` makes a hypothetical second call free, but funneling through one site keeps the cache-key composition single-sourced.
Adds `Config.side_effects_cache_readonly: bool` (default `false`) and the corresponding `pnpm-workspace.yaml`'s `sideEffectsCacheReadonly` field via `WorkspaceSettings`. Mirrors upstream's pnpm/pnpm@7e3145f9fc:config/reader/src/Config.ts:124. Two helper methods on `Config` consolidate the gate semantics so the precedence stays single-sourced: - `side_effects_cache_read()` = `cache || readonly` - `side_effects_cache_write()` = `cache && !readonly` Upstream's derivation uses JS `??` (nullish coalesce on tri-state optional values) at config/reader/src/index.ts:614-615; pacquet collapses that to two booleans since `Config` uses eager defaults rather than `Option<bool>`. The user-intuitive semantics — setting `sideEffectsCacheReadonly: true` actually blocks writes — match the flag name, with a small drift from upstream when *both* flags are explicitly set (documented inline on `side_effects_cache_write`).
Adds `pacquet_store_dir::add_files_from_dir(store_dir, pkg_root)` —
walks a (post-build) package directory, writes every regular file
into CAFS via `StoreDir::write_cas_file`, and returns an `AddedFiles
{ files }` map keyed by forward-slash relative path. Ports
pnpm/pnpm@7e3145f9fc:store/cafs/src/addFilesFromDir.ts:16-194 +
worker/src/start.ts:312-383.
Mirrors upstream's containment + cycle defenses:
- Symlinks: `dunce::canonicalize` the target, drop if it resolves
outside `pkg_root` (`isSubdir(rootDir, realPath)`).
- Top-level `node_modules` skipped (`includeNodeModules` defaults
to `false` upstream); nested ones walk normally.
- `visited: HashSet<PathBuf>` carries canonical directory paths
so a `sub/up -> ..` loop terminates.
Pulls `walkdir` and `dunce` into `pacquet-store-dir/Cargo.toml`
from the workspace defaults.
Unit tests cover: top-level files, nested forward-slash paths,
exec-bit propagation, top-level-vs-nested node_modules,
out-of-root-symlink drop, in-root-symlink follow, directory
cycle termination, and missing-root error surfacing.
Wire-up to `BuildModules` lands in subsequent commits along with
the `calculate_diff` helper and the upload wrapper.
Adds `pacquet_store_dir::upload` and `pacquet_store_dir::calculate_diff`. `upload(store_dir, built_pkg_location, files_index_file, side_effects_cache_key, writer)` ports pnpm/pnpm@7e3145f9fc:store/controller/src/storeController/index.ts:90-99 and the worker-side body at pnpm/pnpm@7e3145f9fc:worker/src/start.ts:342-371. Re-hashes the post-build package dir via `add_files_from_dir`, opens a fresh read-only `StoreIndex`, fetches the existing row by `files_index_file`, bails `Ok(())` if missing (matches upstream's `if (!existingFilesIndex) return`), validates the row's `algo` matches `HASH_ALGORITHM = "sha512"` (matches upstream's `ALGO_MISMATCH` guard), inserts the diff into the row's `side_effects` map, and re-queues through `StoreIndexWriter::queue`. `calculate_diff` ports the set-difference helper at pnpm/pnpm@7e3145f9fc:worker/src/start.ts:411-434. Returns `SideEffectsDiff { added, deleted }` with `Option<...>` fields so empty sides elide from msgpack the same way they do upstream. Unit tests cover identical/added-only/deleted-only/digest-change/ mode-change/mixed cases — the upstream test vectors for the helper.
Adds the WRITE-path call site in `BuildModules`. After a successful
`run_postinstall_hooks`, if `side_effects_cache_write` is on and a
`StoreIndexWriter` + `StoreDir` are threaded in, `BuildModules` calls
`pacquet_store_dir::upload(...)` with the snapshot's
`store_index_key(integrity, pkg_id)` and the cache key computed by
`calc_dep_state`. Upload errors are swallowed with `tracing::warn!`,
mirroring upstream's `try { upload } catch { logger.warn }` at
pnpm/pnpm@7e3145f9fc:building/during-install/src/index.ts:208-215.
`cache_gate_active` now also opens up for the WRITE side so the dep
graph gets built even when no READ-side overlay is present, ensuring
`cache_key` is non-None when WRITE is enabled.
New `BuildModules` fields are all optional / bool-with-default-false
so existing callers (and the 6 existing test sites) need only the
three new field initializers; behavior is unchanged when WRITE is
off.
Tests:
- `write_path_populates_side_effects_row`: pre-seeds a base
`PackageFilesIndex` row, runs a postinstall that creates
`generated.txt`, drains the writer task, re-reads the row, and
asserts `side_effects[cache_key].added` contains `generated.txt`
and NOT pristine `index.js`. Mirrors the spirit of upstream's
`'a postinstall script does not modify the original sources
added to the store'` test at sideEffects.ts:189-223.
- `write_path_disabled_skips_upload`: same scenario with
`side_effects_cache_write: false` — row's `side_effects` stays
`None`. Counterpart of upstream's gate on `sideEffectsCacheWrite`.
The batched store-index writer was spawned + drained inside `CreateVirtualStore::run`. The side-effects-cache WRITE path in `BuildModules` now also needs to queue rows through that same writer, which means it has to outlive `CreateVirtualStore::run`. This commit hoists the `spawn` and the `drop + await` calls up to `InstallFrozenLockfile::run`, threads the `&Arc<StoreIndexWriter>` into both `CreateVirtualStore` (existing consumer) and `BuildModules` (new consumer), and keeps the existing best-effort error semantics (`warn!` + continue). Behaviour is unchanged for the prefetch / cold-batch path. The final batch now flushes after `BuildModules::run` returns, so any WRITE-path mutations queued during the build phase land before the install closes out.
Adds `upload_error_does_not_interrupt_install`, mirroring upstream's `'uploading errors do not interrupt installation'` at pnpm/pnpm@7e3145f9fc:installing/deps-installer/test/install/sideEffects.ts:166-187. Upstream stubs `opts.storeController.upload` to throw; pacquet reaches the same property by handing `BuildModules` a `store_dir` with no `index.db`. The WRITE path's `StoreIndex::open_readonly_in` returns an error, `upload` surfaces it as `UploadError::OpenIndex`, `BuildModules` swallows it with a `tracing::warn!` and proceeds — matching upstream's `try { upload } catch { logger.warn }` shape. The postinstall-generated artifact on disk is the proof that the build ran end-to-end past the swallowed upload error. The upstream `'a postinstall script does not modify the original sources added to the store'` test is already covered by `write_path_populates_side_effects_row` (this PR's earlier commit), which asserts the post-build diff carries the new file and the pristine file's digest is unchanged — same property as upstream's "original sources unchanged" assertion against `getFilePathByModeInCafs`.
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughThis PR implements the complete side-effects cache write path: Config introduces readonly and read/write gate helpers; new store-dir modules provide file indexing and diff computation; StoreIndexWriter gains side-effects upload coalescing; the writer is threaded through CreateVirtualStore/BuildModules/Install orchestration; tests cover gates, indexing, diffs, and upload resilience. ChangesSide-effects cache write path
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related issues
Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
`crate::upload` and `add_files_from_dir` each name both a module and a function inside it; rustdoc's intra-doc-link resolver flags the bare ``[`name`]`` form as ambiguous under `-D warnings`. Adding the `()` suffix routes each link to the function explicitly, matching the same fix from PR #423.
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/package-manager/src/install_frozen_lockfile.rs (1)
158-171:⚠️ Potential issue | 🟠 Major | ⚡ Quick winPass the read gate here, not the raw cache flag.
BuildModulesnow gets a separate write gate, but the read-side field is still wired toconfig.side_effects_cache. That breaks the new readonly mode:side_effects_cache=false+side_effects_cache_readonly=trueshould still consume cached side effects, but this path will force a rebuild.Suggested fix
let ignored_builds = BuildModules { virtual_store_dir: &config.virtual_store_dir, modules_dir: &config.modules_dir, lockfile_dir: manifest_dir, snapshots, packages, importers, allow_build_policy: &allow_build_policy, side_effects_maps_by_snapshot: Some(&side_effects_maps_by_snapshot), engine_name: engine_name.as_deref(), - side_effects_cache: config.side_effects_cache, + side_effects_cache: config.side_effects_cache_read(), side_effects_cache_write: config.side_effects_cache_write(), store_dir: Some(&config.store_dir), store_index_writer: Some(&store_index_writer), }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/package-manager/src/install_frozen_lockfile.rs` around lines 158 - 171, The BuildModules initializer is still using config.side_effects_cache for the read gate, breaking readonly mode; change the read-side argument to the dedicated read gate (e.g. pass config.side_effects_cache_readonly) or, to preserve current semantics, pass the OR of both flags (config.side_effects_cache || config.side_effects_cache_readonly) into the BuildModules read field (refer to the BuildModules struct construction and the side_effects_cache_read / side_effects_cache fields) while keeping side_effects_cache_write as-is.
🧹 Nitpick comments (2)
crates/config/src/workspace_yaml/tests.rs (1)
169-189: ⚡ Quick winPrefer
assert_eq!for the truth-table cases.These bare
assert!calls won't show which(read, write)pair failed. Folding each case into oneassert_eq!on the expected tuple keeps the coverage and makes failures self-describing.♻️ Suggested rewrite
fn side_effects_cache_gates_truth_table() { let mut config = Config::new(); - assert!(config.side_effects_cache_read()); - assert!(config.side_effects_cache_write()); + assert_eq!( + (config.side_effects_cache_read(), config.side_effects_cache_write()), + (true, true), + ); config.side_effects_cache = false; config.side_effects_cache_readonly = false; - assert!(!config.side_effects_cache_read()); - assert!(!config.side_effects_cache_write()); + assert_eq!( + (config.side_effects_cache_read(), config.side_effects_cache_write()), + (false, false), + ); config.side_effects_cache = true; config.side_effects_cache_readonly = true; - assert!(config.side_effects_cache_read()); - assert!(!config.side_effects_cache_write()); + assert_eq!( + (config.side_effects_cache_read(), config.side_effects_cache_write()), + (true, false), + ); config.side_effects_cache = false; config.side_effects_cache_readonly = true; - assert!(config.side_effects_cache_read()); - assert!(!config.side_effects_cache_write()); + assert_eq!( + (config.side_effects_cache_read(), config.side_effects_cache_write()), + (true, false), + ); }Based on learnings: In Rust test code, follow the repo’s CODE_STYLE_GUIDE test-logging rule: add logging (e.g.,
eprintln!/eprintln!(...)) so that useful diagnostic values are printed when a test fails, unless the assertion isassert_eq!.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/config/src/workspace_yaml/tests.rs` around lines 169 - 189, The test side_effects_cache_gates_truth_table currently uses separate assert! calls for read/write pairs which don't show which pair failed; replace each pair of assert!(config.side_effects_cache_read()) / assert!(config.side_effects_cache_write()) (and their negated forms) with a single assert_eq! comparing (config.side_effects_cache_read(), config.side_effects_cache_write()) to the expected (bool,bool) tuple for that case. Locate the assertions in the side_effects_cache_gates_truth_table test (created via Config::new(), toggling config.side_effects_cache and config.side_effects_cache_readonly) and update them to use assert_eq! so failures are self-describing; no extra eprintln! logging is needed once assert_eq! is used.crates/store-dir/src/upload/tests.rs (1)
41-48: ⚡ Quick winMake the
added_onlyassertion self-describing.
assert!(added.contains_key("new"))gives no visibility into the computed diff on failure. Either logaddedfirst or assert on the expected key set withassert_eq!.Based on learnings: In Rust test code, follow the repo’s CODE_STYLE_GUIDE test-logging rule: add logging (e.g.,
eprintln!/eprintln!(...)) so that useful diagnostic values are printed when a test fails, unless the assertion isassert_eq!.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/store-dir/src/upload/tests.rs` around lines 41 - 48, The test added_only should be made self-describing by either printing the computed `added` before asserting or by using an equality assertion on the expected key set; update the test function `added_only` (which calls `calculate_diff` and checks `diff.added`) to either emit `eprintln!("{:?}", added)` right before the assertion or replace `assert!(added.contains_key("new"))` with an `assert_eq!` that compares the `added` keys to the expected set (e.g., collect `added.keys()` into a Vec/HashSet and assert it equals the single-element collection containing "new") so failures show the actual diff.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/store-dir/src/add_files_from_dir/tests.rs`:
- Around line 169-172: The test uses a hardcoded Unix absolute path; make it
platform-neutral by creating a fresh temporary directory (e.g., via
tempfile::tempdir()) and building a non-existent path under that tempdir, then
call add_files_from_dir(&store_dir, &nonexistent_path) and assert it returns
Err(AddFilesFromDirError::CanonicalizeRoot { .. }); update the
missing_root_errors test to use make_store(), tempfile::tempdir(),
tempdir.path().join("nonexistent") (or similar) as the root path so the test is
deterministic on Windows and Unix.
In `@crates/store-dir/src/upload.rs`:
- Around line 81-104: The code performs a read-modify-write using the on-disk
index only and then calls writer.queue(files_index_file, existing), which causes
races where concurrent uploads overwrite earlier queued side_effects; fix by
merging any pending queued row for the same files_index_file from the writer
before inserting the new side_effects entry so queued mutations are accumulated.
Concretely, inside upload() after loading `existing` (the result of index.get)
and before modifying `existing.side_effects`, attempt to retrieve the
writer-owned pending entry for `files_index_file` (or otherwise ask the
StoreIndexWriter for a pending/queued row), deserialize/merge its
`.side_effects` into `existing.side_effects`, then compute `diff` and insert
into the merged map, and finally call
`writer.queue(files_index_file.to_string(), existing)`; ensure you reference
`StoreIndex::open_readonly_in`, `index.get`, `existing`,
`side_effects_cache_key`, `calculate_diff`, and `writer.queue` when locating the
code to change.
---
Outside diff comments:
In `@crates/package-manager/src/install_frozen_lockfile.rs`:
- Around line 158-171: The BuildModules initializer is still using
config.side_effects_cache for the read gate, breaking readonly mode; change the
read-side argument to the dedicated read gate (e.g. pass
config.side_effects_cache_readonly) or, to preserve current semantics, pass the
OR of both flags (config.side_effects_cache ||
config.side_effects_cache_readonly) into the BuildModules read field (refer to
the BuildModules struct construction and the side_effects_cache_read /
side_effects_cache fields) while keeping side_effects_cache_write as-is.
---
Nitpick comments:
In `@crates/config/src/workspace_yaml/tests.rs`:
- Around line 169-189: The test side_effects_cache_gates_truth_table currently
uses separate assert! calls for read/write pairs which don't show which pair
failed; replace each pair of assert!(config.side_effects_cache_read()) /
assert!(config.side_effects_cache_write()) (and their negated forms) with a
single assert_eq! comparing (config.side_effects_cache_read(),
config.side_effects_cache_write()) to the expected (bool,bool) tuple for that
case. Locate the assertions in the side_effects_cache_gates_truth_table test
(created via Config::new(), toggling config.side_effects_cache and
config.side_effects_cache_readonly) and update them to use assert_eq! so
failures are self-describing; no extra eprintln! logging is needed once
assert_eq! is used.
In `@crates/store-dir/src/upload/tests.rs`:
- Around line 41-48: The test added_only should be made self-describing by
either printing the computed `added` before asserting or by using an equality
assertion on the expected key set; update the test function `added_only` (which
calls `calculate_diff` and checks `diff.added`) to either emit
`eprintln!("{:?}", added)` right before the assertion or replace
`assert!(added.contains_key("new"))` with an `assert_eq!` that compares the
`added` keys to the expected set (e.g., collect `added.keys()` into a
Vec/HashSet and assert it equals the single-element collection containing "new")
so failures show the actual diff.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: e9b17bb0-c3af-4629-a74d-ad7912473796
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (15)
crates/config/src/lib.rscrates/config/src/workspace_yaml.rscrates/config/src/workspace_yaml/tests.rscrates/package-manager/Cargo.tomlcrates/package-manager/src/build_modules.rscrates/package-manager/src/build_modules/tests.rscrates/package-manager/src/create_virtual_store.rscrates/package-manager/src/install_frozen_lockfile.rscrates/package-manager/src/install_package_from_registry/tests.rscrates/store-dir/Cargo.tomlcrates/store-dir/src/add_files_from_dir.rscrates/store-dir/src/add_files_from_dir/tests.rscrates/store-dir/src/lib.rscrates/store-dir/src/upload.rscrates/store-dir/src/upload/tests.rs
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: Agent
- GitHub Check: Lint and Test (windows-latest)
- GitHub Check: Lint and Test (macos-latest)
- GitHub Check: Cargo Deny
- GitHub Check: Dylint
- GitHub Check: Run benchmark on ubuntu-latest
- GitHub Check: Run benchmark on ubuntu-latest
- GitHub Check: Code Coverage
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs
📄 CodeRabbit inference engine (AGENTS.md)
**/*.rs: Log emissions are part of 'match pnpm'—when porting a function that firespnpm:<channel>events throughglobalLogger,logger.debug(...), orstreamParser.write(...), mirror the call site, payload, and ordering so@pnpm/cli.default-reporterparses pacquet's NDJSON the same way
Declare a newtype wrapper for branded string types instead of collapsing into plainStringor&strto preserve the distinction between validating and non-validating brands
If upstream always validates before construction of a branded string type, implement validation viaTryFrom<String>and/orFromStrwithout providing an infallible public constructor
If upstream never validates a branded string type, expose an infallibleFrom<String>constructor to provide type-safety without runtime validation
When upstream occasionally constructs a branded string type without validation via bareasassertion, expose afrom_str_uncheckedconstructor so callers can opt into the same unchecked path explicitly
Match upstream serde behavior for branded types—use#[serde(try_from = "String")]for deserialization to go through the validator and#[serde(into = "String")]for serialization
Use#[derive(derive_more::From)]and#[derive(derive_more::Into)]for mechanical conversion impls on branded types rather than handwritingimplblocks
Model string-literal unions (like'auto' | 'always' | 'never') as Rustenums instead of newtype wrappers
Template literal types (like${string}@${string}) should be treated as branded string types with the same validation discipline applied
Follow the code style guide inCODE_STYLE_GUIDE.mdcovering code-level conventions not enforced by tooling: imports, modules, naming, ownership and borrowing, parameter type selection, trait bounds, pattern matching,pipe-trait, error handling, test layout, and cloning ofArcandRc
Warnings are errors incargo clippy --locked -- --deny warnings, and should not be silenced with `#[allow(....
Files:
crates/config/src/workspace_yaml.rscrates/package-manager/src/install_frozen_lockfile.rscrates/package-manager/src/install_package_from_registry/tests.rscrates/config/src/lib.rscrates/package-manager/src/build_modules.rscrates/config/src/workspace_yaml/tests.rscrates/store-dir/src/add_files_from_dir/tests.rscrates/package-manager/src/create_virtual_store.rscrates/store-dir/src/add_files_from_dir.rscrates/store-dir/src/upload.rscrates/store-dir/src/upload/tests.rscrates/store-dir/src/lib.rscrates/package-manager/src/build_modules/tests.rs
🧠 Learnings (2)
📚 Learning: 2026-05-07T23:19:08.272Z
Learnt from: KSXGitHub
Repo: pnpm/pacquet PR: 401
File: tasks/integrated-benchmark/src/work_env.rs:343-344
Timestamp: 2026-05-07T23:19:08.272Z
Learning: When reviewing Rust code in pnpm/pacquet for deprecated API usage, do not automatically treat `serde_saphyr::to_string` as deprecated. In `serde-saphyr` v0.0.25, `serde_saphyr::to_string` has no `#[deprecated]` attribute (the `#[deprecated]` later in `serde-saphyr-0.0.25/src/lib.rs` applies to a different function). Only flag `serde_saphyr::to_string` as deprecated if the resolved dependency version’s source shows `#[deprecated]` on that specific function.
Applied to files:
crates/config/src/workspace_yaml.rscrates/package-manager/src/install_frozen_lockfile.rscrates/package-manager/src/install_package_from_registry/tests.rscrates/config/src/lib.rscrates/package-manager/src/build_modules.rscrates/config/src/workspace_yaml/tests.rscrates/store-dir/src/add_files_from_dir/tests.rscrates/package-manager/src/create_virtual_store.rscrates/store-dir/src/add_files_from_dir.rscrates/store-dir/src/upload.rscrates/store-dir/src/upload/tests.rscrates/store-dir/src/lib.rscrates/package-manager/src/build_modules/tests.rs
📚 Learning: 2026-05-01T10:01:33.766Z
Learnt from: zkochan
Repo: pnpm/pacquet PR: 349
File: crates/reporter/src/tests.rs:121-121
Timestamp: 2026-05-01T10:01:33.766Z
Learning: In Rust test code, follow the repo’s CODE_STYLE_GUIDE test-logging rule: add logging (e.g., `eprintln!`/`eprintln!(...)`) so that useful diagnostic values are printed when a test fails, unless the assertion is `assert_eq!` (where the differing values are already included). Concretely, if you use assertions like `assert!`, `assert_ne!`, etc., ensure the test logs the relevant actual/expected values (or context) before/around the assertion so failures can be diagnosed without rerunning.
Applied to files:
crates/package-manager/src/install_package_from_registry/tests.rscrates/config/src/workspace_yaml/tests.rscrates/store-dir/src/add_files_from_dir/tests.rscrates/store-dir/src/upload/tests.rscrates/package-manager/src/build_modules/tests.rs
There was a problem hiding this comment.
Pull request overview
Implements the side-effects cache WRITE path so pacquet can persist postinstall-produced artifacts into the store-index, enabling subsequent installs (pacquet or pnpm) to skip rebuilds via the already-landed READ gate.
Changes:
- Added
add_files_from_dir+uploadinpacquet-store-dirto re-CAFS a built package dir, compute a side-effects diff, and queue a read-modify-write update toPackageFilesIndex.side_effects. - Introduced
Config.side_effects_cache_readonlyplusside_effects_cache_read()/side_effects_cache_write()helpers to mirror pnpm’s read/write gating semantics. - Hoisted
StoreIndexWriterto span download + build phases, and wired a best-effort upload call after successful postinstall inBuildModules, with extensive new test coverage.
Reviewed changes
Copilot reviewed 15 out of 16 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| crates/store-dir/src/add_files_from_dir.rs | Directory walker that re-CAFSes an on-disk package directory with symlink containment + cycle handling. |
| crates/store-dir/src/add_files_from_dir/tests.rs | Unit tests for directory walking, node_modules skipping, symlink containment, and cycle termination. |
| crates/store-dir/src/upload.rs | Worker-side orchestrator to diff vs pristine index row and queue side-effects cache updates; includes calculate_diff. |
| crates/store-dir/src/upload/tests.rs | Unit tests covering calculateDiff-style cases (identical/add/delete/mode/digest/mixed). |
| crates/store-dir/src/lib.rs | Exposes new add_files_from_dir and upload APIs from the crate. |
| crates/store-dir/Cargo.toml | Adds dependencies needed for the new walker/upload code. |
| crates/config/src/lib.rs | Adds side_effects_cache_readonly and helper methods for READ/WRITE gate semantics. |
| crates/config/src/workspace_yaml.rs | Parses and applies sideEffectsCacheReadonly from pnpm-workspace.yaml. |
| crates/config/src/workspace_yaml/tests.rs | Tests YAML parsing and the read/write truth table for cache gates. |
| crates/package-manager/src/install_frozen_lockfile.rs | Owns StoreIndexWriter for the full install and threads it into download + build phases; awaits final flush at end. |
| crates/package-manager/src/create_virtual_store.rs | Accepts caller-owned StoreIndexWriter instead of spawning/awaiting internally. |
| crates/package-manager/src/build_modules.rs | Adds WRITE-path upload call site after successful postinstall; refactors cache key computation to be shared. |
| crates/package-manager/src/build_modules/tests.rs | Adds end-to-end-ish tests for write path population, write-disabled behavior, and swallowed upload errors. |
| crates/package-manager/src/install_package_from_registry/tests.rs | Updates test config construction for the new readonly flag. |
| crates/package-manager/Cargo.toml | Adds sha2 as a dev-dependency for new test helpers. |
| Cargo.lock | Lockfile updates for newly referenced crates. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #424 +/- ##
==========================================
+ Coverage 85.82% 86.33% +0.50%
==========================================
Files 83 85 +2
Lines 5854 6123 +269
==========================================
+ Hits 5024 5286 +262
- Misses 830 837 +7 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Micro-Benchmark ResultsLinux |
- Move side-effects R/M/W into the writer task (coderabbit). `StoreIndexWriter` now drives a `WriteMsg` enum with `Replace` and `SideEffectsUpload` variants. The batch loop coalesces per-key updates into one `PackageFilesIndex` value before flushing, so two uploads against the same row in the same batch no longer race the stale read on a separate readonly handle. `upload()` shrinks to walk-then-queue. - Close TOCTOU on symlinks (coderabbit): the walker now reads files and recurses into directories via the resolved canonical path instead of the original entry, so retargeting a symlink between containment check and read can't smuggle out-of-root data into CAFS. - Make `missing_root_errors` platform-neutral (coderabbit): use a tempdir subpath instead of `/this/does/not/exist/anywhere` so Windows CI doesn't see a spuriously-present path. - Pass `config.side_effects_cache_read()` to `BuildModules.side_effects_cache` (copilot): the field is the READ gate, so the precedence helper should drive it. - Make `calculate_diff` iteration deterministic (copilot): use `BTreeSet` over the union of base/current keys so the resulting `deleted: Vec<String>` is lexically sorted and the msgpack payload is byte-stable for the same input. - Drop the unused `walkdir` workspace dep from `pacquet-store-dir` (copilot): the walker uses `std::fs::read_dir`.
`[crate::upload]` was the last remaining ambiguous intra-doc link between the `upload` module and the function. Adding `()` routes it to the function so rustdoc's `-D warnings` build stops failing.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
crates/store-dir/src/upload.rs (1)
59-65: ⚡ Quick winAccept
&StoreIndexWriterhere instead of&Arc<_>.
upload()only needs shared access to the writer, so&Arc<StoreIndexWriter>unnecessarily couples the API to one ownership strategy. Taking&StoreIndexWriterkeeps the signature broader and existingArccall sites still work via deref coercion.Suggested change
pub fn upload( store_dir: &StoreDir, built_pkg_location: &Path, files_index_file: &str, side_effects_cache_key: &str, - writer: &Arc<StoreIndexWriter>, + writer: &StoreIndexWriter, ) -> Result<(), UploadError> {As per coding guidelines, "Choose owned vs. borrowed parameters to minimize copies; widen to the most encompassing type (
&Pathover&PathBuf,&strover&String) when it doesn't force extra copies".🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/store-dir/src/upload.rs` around lines 59 - 65, Change the upload function signature to take a borrowed StoreIndexWriter (pub fn upload(..., writer: &StoreIndexWriter, ...) -> Result<..., UploadError>) instead of &Arc<StoreIndexWriter>; update any internal uses to work with &StoreIndexWriter (no cloning or Arc-specific methods), and update all call sites that currently pass an Arc<StoreIndexWriter> to pass a reference to the inner value (e.g., writer.as_ref() or &*writer) so existing Arc ownership remains unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/store-dir/src/store_index.rs`:
- Around line 197-200: The side_effects HashMap and the SideEffectsDiff::added
map must be serialized deterministically; add a custom serde serializer (e.g.
serialize_map_sorted or serialize_hashmap_as_btreemap) that converts HashMap
entries into a BTreeMap (sorting keys) before calling the serializer, then
annotate the problematic fields with #[serde(serialize_with =
"serialize_map_sorted")] (apply this to the row.side_effects field in the struct
that contains it and to the SideEffectsDiff::added field definition). Implement
the serializer to accept the usual (S: Serializer) signature and forward
serialization of the BTreeMap so msgpack output is byte-stable and matches the
project's existing pattern.
---
Nitpick comments:
In `@crates/store-dir/src/upload.rs`:
- Around line 59-65: Change the upload function signature to take a borrowed
StoreIndexWriter (pub fn upload(..., writer: &StoreIndexWriter, ...) ->
Result<..., UploadError>) instead of &Arc<StoreIndexWriter>; update any internal
uses to work with &StoreIndexWriter (no cloning or Arc-specific methods), and
update all call sites that currently pass an Arc<StoreIndexWriter> to pass a
reference to the inner value (e.g., writer.as_ref() or &*writer) so existing Arc
ownership remains unchanged.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 650b01c5-11e1-4eb2-9a65-ee562c0cb845
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (6)
crates/package-manager/src/install_frozen_lockfile.rscrates/store-dir/Cargo.tomlcrates/store-dir/src/add_files_from_dir.rscrates/store-dir/src/add_files_from_dir/tests.rscrates/store-dir/src/store_index.rscrates/store-dir/src/upload.rs
✅ Files skipped from review due to trivial changes (1)
- crates/store-dir/Cargo.toml
🚧 Files skipped from review as they are similar to previous changes (3)
- crates/package-manager/src/install_frozen_lockfile.rs
- crates/store-dir/src/add_files_from_dir/tests.rs
- crates/store-dir/src/add_files_from_dir.rs
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Code Coverage
- GitHub Check: Lint and Test (macos-latest)
- GitHub Check: Lint and Test (windows-latest)
- GitHub Check: Run benchmark on ubuntu-latest
- GitHub Check: Run benchmark on ubuntu-latest
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs
📄 CodeRabbit inference engine (AGENTS.md)
**/*.rs: Log emissions are part of 'match pnpm'—when porting a function that firespnpm:<channel>events throughglobalLogger,logger.debug(...), orstreamParser.write(...), mirror the call site, payload, and ordering so@pnpm/cli.default-reporterparses pacquet's NDJSON the same way
Declare a newtype wrapper for branded string types instead of collapsing into plainStringor&strto preserve the distinction between validating and non-validating brands
If upstream always validates before construction of a branded string type, implement validation viaTryFrom<String>and/orFromStrwithout providing an infallible public constructor
If upstream never validates a branded string type, expose an infallibleFrom<String>constructor to provide type-safety without runtime validation
When upstream occasionally constructs a branded string type without validation via bareasassertion, expose afrom_str_uncheckedconstructor so callers can opt into the same unchecked path explicitly
Match upstream serde behavior for branded types—use#[serde(try_from = "String")]for deserialization to go through the validator and#[serde(into = "String")]for serialization
Use#[derive(derive_more::From)]and#[derive(derive_more::Into)]for mechanical conversion impls on branded types rather than handwritingimplblocks
Model string-literal unions (like'auto' | 'always' | 'never') as Rustenums instead of newtype wrappers
Template literal types (like${string}@${string}) should be treated as branded string types with the same validation discipline applied
Follow the code style guide inCODE_STYLE_GUIDE.mdcovering code-level conventions not enforced by tooling: imports, modules, naming, ownership and borrowing, parameter type selection, trait bounds, pattern matching,pipe-trait, error handling, test layout, and cloning ofArcandRc
Warnings are errors incargo clippy --locked -- --deny warnings, and should not be silenced with `#[allow(....
Files:
crates/store-dir/src/upload.rscrates/store-dir/src/store_index.rs
🧠 Learnings (1)
📚 Learning: 2026-05-07T23:19:08.272Z
Learnt from: KSXGitHub
Repo: pnpm/pacquet PR: 401
File: tasks/integrated-benchmark/src/work_env.rs:343-344
Timestamp: 2026-05-07T23:19:08.272Z
Learning: When reviewing Rust code in pnpm/pacquet for deprecated API usage, do not automatically treat `serde_saphyr::to_string` as deprecated. In `serde-saphyr` v0.0.25, `serde_saphyr::to_string` has no `#[deprecated]` attribute (the `#[deprecated]` later in `serde-saphyr-0.0.25/src/lib.rs` applies to a different function). Only flag `serde_saphyr::to_string` as deprecated if the resolved dependency version’s source shows `#[deprecated]` on that specific function.
Applied to files:
crates/store-dir/src/upload.rscrates/store-dir/src/store_index.rs
Integrated-Benchmark Report (Linux)Scenario: Frozen Lockfile
BENCHMARK_REPORT.json{
"results": [
{
"command": "pacquet@HEAD",
"mean": 2.4689039816399996,
"stddev": 0.04780507005721563,
"median": 2.4491665432399996,
"user": 2.58218524,
"system": 3.4716213000000002,
"min": 2.41898022624,
"max": 2.5420708292399996,
"times": [
2.44893726224,
2.5420708292399996,
2.51206773224,
2.41898022624,
2.42238800624,
2.5323051952399998,
2.43452208124,
2.50072212524,
2.4276505342399997,
2.44939582424
]
},
{
"command": "pacquet@main",
"mean": 2.4678160689399995,
"stddev": 0.07200481956971332,
"median": 2.46928902374,
"user": 2.55450604,
"system": 3.3838941,
"min": 2.36701871324,
"max": 2.55915867524,
"times": [
2.5508388302399996,
2.55915867524,
2.42198255324,
2.43932594624,
2.36701871324,
2.4221017992399996,
2.3722784982399996,
2.4992521012399997,
2.52728649124,
2.5189170812399997
]
},
{
"command": "pnpm",
"mean": 5.92899292664,
"stddev": 0.08322227215606012,
"median": 5.93109513524,
"user": 8.73746164,
"system": 4.343755300000001,
"min": 5.789286720240001,
"max": 6.03226841624,
"times": [
5.90861164824,
5.789286720240001,
5.8075996432400006,
5.95357862224,
5.89604724324,
6.03226841624,
5.99601960424,
6.01162277224,
5.9881097722400005,
5.90678482424
]
}
]
}Scenario: Frozen Lockfile (Hot Cache)
BENCHMARK_REPORT.json{
"results": [
{
"command": "pacquet@HEAD",
"mean": 0.6956557923,
"stddev": 0.06422375898028904,
"median": 0.6753533603999999,
"user": 0.35833357999999993,
"system": 1.49355988,
"min": 0.6416866079,
"max": 0.8651463669,
"times": [
0.8651463669,
0.7211930129,
0.6732501849,
0.6987764299,
0.6989813569000001,
0.6696046889,
0.6416866079,
0.6513977209,
0.6774565359,
0.6590650179
]
},
{
"command": "pacquet@main",
"mean": 0.7024205165,
"stddev": 0.08080488487627817,
"median": 0.6735188819,
"user": 0.34971257999999994,
"system": 1.51433668,
"min": 0.6552940409,
"max": 0.9205762319,
"times": [
0.7351236929,
0.6734529659,
0.6816737759,
0.7083415179,
0.6585410649000001,
0.6617263689,
0.6558907079,
0.9205762319,
0.6735847979,
0.6552940409
]
},
{
"command": "pnpm",
"mean": 2.5261615645,
"stddev": 0.11989112782652504,
"median": 2.4853160134000003,
"user": 2.96723048,
"system": 2.2039737799999997,
"min": 2.4146114469000004,
"max": 2.7850201139000004,
"times": [
2.6163077129000003,
2.4177157519000003,
2.4146114469000004,
2.7850201139000004,
2.4360639569000004,
2.4656249519,
2.5050070749000004,
2.4591027329000004,
2.6413636409000003,
2.5207982619000004
]
}
]
} |
- Add a custom sorted-map serializer for `side_effects` and `SideEffectsDiff.added` (coderabbit, copilot): converts each `HashMap` to a `BTreeMap` (sorted by key) before emitting, so the msgpack payload is byte-stable for the same logical overlay. Without this the maps go on the wire in `HashMap`-randomized order even though `calculate_diff` builds them deterministically. - Honest comment on `calculate_diff` (copilot): drops the overclaimed "byte-stable msgpack" promise; the in-loop `BTreeSet` only orders the `deleted` vector, and byte-stability of `added` is now provided by the new serializer. - Rewrite `upload_error_does_not_interrupt_install` to actually fail the upload (copilot): the previous test relied on `StoreIndex::open_readonly_in` returning an error when there's no `index.db`, but `StoreIndexWriter::spawn` auto-creates the file. The new fixture has the postinstall produce a 0-permission file in the package directory, which makes `add_files_from_dir` fail with `EACCES` on `fs::read` and forces a real `UploadError::AddFilesFromDir`. The test asserts the build kept going despite the upload error.
`serialize_sorted_map_opt` is a private helper, so intra-doc links to it fail under `-D warnings` with `this item is private`. Drop the link brackets and keep the name as bare code style so the doc still surfaces the connection without the rustdoc error.
There was a problem hiding this comment.
🧹 Nitpick comments (1)
crates/store-dir/src/upload.rs (1)
120-127: 💤 Low valueConsider deriving
CloneforCafsFileInfo.The manual
clone_infohelper could be eliminated by addingClonetoCafsFileInfo's derives, making the code more idiomatic.-#[derive(Debug, PartialEq, Serialize, Deserialize)] +#[derive(Debug, Clone, PartialEq, Serialize, Deserialize)] pub struct CafsFileInfo {Then in
calculate_diff:- added.insert(file.to_string(), clone_info(now)); + added.insert(file.to_string(), now.clone());🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/store-dir/src/upload.rs` around lines 120 - 127, The manual clone_info function is redundant—derive Clone on the CafsFileInfo struct and remove clone_info; then update callers (e.g., in calculate_diff) to call .clone() on CafsFileInfo instances instead of using clone_info. Ensure the struct definition for CafsFileInfo includes Clone (and keep other derives intact), delete the clone_info function, and replace its usages with direct .clone() calls.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@crates/store-dir/src/upload.rs`:
- Around line 120-127: The manual clone_info function is redundant—derive Clone
on the CafsFileInfo struct and remove clone_info; then update callers (e.g., in
calculate_diff) to call .clone() on CafsFileInfo instances instead of using
clone_info. Ensure the struct definition for CafsFileInfo includes Clone (and
keep other derives intact), delete the clone_info function, and replace its
usages with direct .clone() calls.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 99edeed0-f05e-42c6-abf6-0370ed09f294
📒 Files selected for processing (3)
crates/package-manager/src/build_modules/tests.rscrates/store-dir/src/store_index.rscrates/store-dir/src/upload.rs
🚧 Files skipped from review as they are similar to previous changes (1)
- crates/package-manager/src/build_modules/tests.rs
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: Agent
- GitHub Check: Run benchmark on ubuntu-latest
- GitHub Check: Lint and Test (ubuntu-latest)
- GitHub Check: Lint and Test (macos-latest)
- GitHub Check: Lint and Test (windows-latest)
- GitHub Check: Run benchmark on ubuntu-latest
- GitHub Check: Code Coverage
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs
📄 CodeRabbit inference engine (AGENTS.md)
**/*.rs: Log emissions are part of 'match pnpm'—when porting a function that firespnpm:<channel>events throughglobalLogger,logger.debug(...), orstreamParser.write(...), mirror the call site, payload, and ordering so@pnpm/cli.default-reporterparses pacquet's NDJSON the same way
Declare a newtype wrapper for branded string types instead of collapsing into plainStringor&strto preserve the distinction between validating and non-validating brands
If upstream always validates before construction of a branded string type, implement validation viaTryFrom<String>and/orFromStrwithout providing an infallible public constructor
If upstream never validates a branded string type, expose an infallibleFrom<String>constructor to provide type-safety without runtime validation
When upstream occasionally constructs a branded string type without validation via bareasassertion, expose afrom_str_uncheckedconstructor so callers can opt into the same unchecked path explicitly
Match upstream serde behavior for branded types—use#[serde(try_from = "String")]for deserialization to go through the validator and#[serde(into = "String")]for serialization
Use#[derive(derive_more::From)]and#[derive(derive_more::Into)]for mechanical conversion impls on branded types rather than handwritingimplblocks
Model string-literal unions (like'auto' | 'always' | 'never') as Rustenums instead of newtype wrappers
Template literal types (like${string}@${string}) should be treated as branded string types with the same validation discipline applied
Follow the code style guide inCODE_STYLE_GUIDE.mdcovering code-level conventions not enforced by tooling: imports, modules, naming, ownership and borrowing, parameter type selection, trait bounds, pattern matching,pipe-trait, error handling, test layout, and cloning ofArcandRc
Warnings are errors incargo clippy --locked -- --deny warnings, and should not be silenced with `#[allow(....
Files:
crates/store-dir/src/store_index.rscrates/store-dir/src/upload.rs
🧠 Learnings (1)
📚 Learning: 2026-05-07T23:19:08.272Z
Learnt from: KSXGitHub
Repo: pnpm/pacquet PR: 401
File: tasks/integrated-benchmark/src/work_env.rs:343-344
Timestamp: 2026-05-07T23:19:08.272Z
Learning: When reviewing Rust code in pnpm/pacquet for deprecated API usage, do not automatically treat `serde_saphyr::to_string` as deprecated. In `serde-saphyr` v0.0.25, `serde_saphyr::to_string` has no `#[deprecated]` attribute (the `#[deprecated]` later in `serde-saphyr-0.0.25/src/lib.rs` applies to a different function). Only flag `serde_saphyr::to_string` as deprecated if the resolved dependency version’s source shows `#[deprecated]` on that specific function.
Applied to files:
crates/store-dir/src/store_index.rscrates/store-dir/src/upload.rs
🔇 Additional comments (8)
crates/store-dir/src/store_index.rs (4)
769-790: LGTM! Byte-stable serialization.The
serialize_sorted_map_opthelper correctly addresses the deterministic serialization requirement by convertingHashMaptoBTreeMapbefore emission. This ensures msgpack payloads are byte-identical for the same logical overlay across runs.
62-91: LGTM! Clean enum design for message-passing.The
WriteMsgenum cleanly separates the two write patterns, and the doc comments thoroughly explain why routing R/M/W through the writer task ensures commutativity. This correctly addresses the stale-read race condition from the prior review.
146-204: LGTM! Correct coalescing implementation.The batch-coalescing logic correctly accumulates multiple mutations for the same key:
- Loading from
pendingfirst ensures sequentialSideEffectsUploadmessages build on each other- The graceful fallback to SQLite handles the first sighting of a key in a batch
- Algo mismatch and missing-row cases are logged and skipped appropriately
234-269: LGTM! Clean API design.Good refactoring with the
send_msghelper to avoid duplicating the channel-send + warning-guard logic. The public API clearly distinguishes full replacement (queue) from R/M/W (queue_side_effects_upload).crates/store-dir/src/upload.rs (4)
1-21: LGTM! Well-documented module.Clear module documentation with upstream pnpm references, and focused imports without star-imports.
23-36: LGTM! Clean error type and algorithm constant.The
UploadErrorenum follows the crate's error patterns withderive_moreandmiette, andHASH_ALGORITHMprovides a single source of truth for the algo check in both upload and writer task.
59-74: LGTM! Clean orchestrator.The function correctly delegates the heavy lifting:
add_files_from_dirfor rehashing andqueue_side_effects_uploadfor the R/M/W (which happens inside the writer task for commutativity). This addresses the stale-read race from the prior review.
89-118: LGTM! Correct diff algorithm.The set-difference logic correctly identifies added, modified, and deleted files. Using
BTreeSetfor deterministic iteration ensuresdeletedis sorted, whileadded's stability is handled by theserialize_sorted_map_optserializer.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 16 out of 17 changed files in this pull request and generated 4 comments.
Comments suppressed due to low confidence (1)
crates/package-manager/src/install_frozen_lockfile.rs:102
StoreIndexWriter::spawnis awaited only on the happy path. Any?return before the finaldrop(store_index_writer)/writer_task.await(e.g.CreateVirtualStore/BuildModuleserrors) will drop theJoinHandleand may skip flushing queued rows (and may leave a blocking task running until runtime shutdown). Consider using a guard/RAII pattern so the sender is dropped and the writer task is awaited on all exit paths.
// Spawn the batched store-index writer here so it lives
// across both the prefetch/download phase (consumers in
// `CreateVirtualStore`) and the build phase (the new
// side-effects-cache WRITE-path upload site in
// `BuildModules`). We drop the orchestrator's clone and
// await the join handle at the end of `run`, so the final
// batch flushes once every queued row from both phases has
// been processed. A writer open / task failure is degraded
// to a `warn!` and the install still succeeds — pacquet's
// existing best-effort stance on cache writes.
let (store_index_writer, writer_task) = StoreIndexWriter::spawn(&config.store_dir);
let CreateVirtualStoreOutput { package_manifests, side_effects_maps_by_snapshot } =
CreateVirtualStore {
http_client,
config,
packages,
snapshots,
logged_methods,
requester,
store_index_writer: &store_index_writer,
}
.run::<R>()
.await
.map_err(InstallFrozenLockfileError::CreateVirtualStore)?;
- Preserve same-batch Replace on algo-mismatch (copilot). `SideEffectsUpload`'s previous shape removed the row from `pending` before the algo check, then `continue`'d without reinserting — dropping any earlier-in-batch `Replace` for the same key. Switched to a `HashMap::entry()` flow so the row stays in `pending` across every short-circuit; the side-effects mutation is what gets suppressed, not the row write. - Strengthen the upload-error test (copilot). Now seeds a base `PackageFilesIndex` row before the install runs and asserts the row's `side_effects` stays `None` after the swallowed upload error, matching upstream's `filesIndex2.sideEffects toBeFalsy()` at sideEffects.ts:186.
Cold-install benchmark on PR #424 showed a ~5% regression vs main (2.446s vs 2.333s on Linux). Tracked it to `write_gate_active` firing on every install — default config has `side_effects_cache_write = true`, `engine_name = Some` when node is on PATH, and `store_index_writer = Some` always; so the dep graph was being built O(|snapshots|) even for pure-JS installs with zero buildable packages. The WRITE-path upload site only ever runs for snapshots whose `requires_build` is true (it's the same loop body that gates on that bit). So if no snapshot requires a build, the dep graph is dead work — the upload-site closure never fires. Gate `write_gate_active` on `requires_build_map.values().any(|&v| v)` to restore the pre-PR behavior on cold installs without postinstalls. Behaviour preserved when at least one snapshot requires a build: the dep graph still gets built and the WRITE-path runs as before. The existing `write_path_*` tests use the `failing-postinstall` / `postinstall-modifies-source` fixtures which produce `requires_build = true`, so they continue to exercise the active-gate path.
When a lockfile has only a handful of `requires_build` snapshots, `build_deps_graph` was still walking every entry in `snapshots:` — needless work for the cache-key compute, which only ever recurses into the children of `requires_build` roots via `calc_dep_state`. Add a sibling `build_deps_subgraph` that takes an iterator of root keys and BFS-walks the forward closure (dependencies + optional dependencies). `BuildModules` now feeds in the `requires_build` keys; the resulting graph contains exactly the nodes `calc_dep_state` will visit and no more. For a pure-JS install (zero `requires_build` keys), the iterator is empty and the function returns an empty graph in O(0). For an install with a single leaf-dep postinstall, the walk visits just that snapshot. For a postinstall on a top-level dep, the walk visits the full subgraph below it — still bounded to what `calc_dep_state` would have traversed anyway. Upstream's `lockfileToDepGraph` (pnpm/pnpm@7e3145f9fc:deps/graph-builder/src/lockfileToDepGraph.ts:123-170) builds the full graph because its consumers extend well beyond cache hashing (direct-deps map, hierarchy + hoisting, bin linking). In pacquet today the only consumer is `calc_dep_state`, so the trimmed walk produces identical cache keys with much less work. Tests cover empty-roots → empty graph, forward-closure inclusion + exclusion, and cycle termination.
`build_deps_subgraph` already returns an empty graph in O(0) when the root iterator is empty (which is the case for pure-JS installs where no snapshot is `requires_build`). The extra `any_requires_build` check on `cache_gate_active` and `write_gate_active` was belt-and-suspenders that the subgraph builder makes unnecessary — same behavior on every input, less code.
- Sort maps in the msgpackr-records encoder for real byte stability (coderabbit). The previous "fix" used a serde `serialize_with` helper, but `StoreIndex::set_many` writes via the bespoke `encode_package_files_index` which iterates `HashMap`s directly — the serde attribute was dead code. Now `files`, `side_effects`, and `SideEffectsDiff.added` all iterate via a `sorted_by_key` helper that emits in lexicographic key order, so row payloads are byte-stable across pacquet writes of the same logical state. Drops the now-unused `serialize_sorted_map_opt` helper and the misleading doc-comments. - Add `store_dir.is_some()` to `write_gate_active` (copilot). The upload site is later gated on both `store_index_writer` AND `store_dir`; this matches the gate to the use sites. - Clarify `WalkCtx.visited` semantics (copilot). It's a recursion-stack set (entries inserted on descend, removed on return), not a global visited set. Updated the doc-comment to spell this out and reference upstream's same shape. Also extract `apply_write_msg` / `load_pending_row` out of `StoreIndexWriter::spawn` to flatten 7-8 levels of nesting inside the spawn-blocking closure — same control flow, easier to read.
Pull in 28 commits from upstream main, including the `pacquet-npmrc` → `pacquet-config` rename (#420) plus features: - supportedArchitectures + --cpu/--os/--libc (#456) - frozen-lockfile (#442, #443, #447, #450) - git-fetcher (#436 / #446, #451, #454) - side-effects cache (#421 / #422, #423, #424) - real-hoist + global-virtual-store (#432 / #438 / #444, #449, #452) - patchedDependencies + allow-builds (#425, #427, #428) - engine/platform installability (#434 / #439) Conflicts resolved: - `crates/npmrc/` files migrated under the renamed `crates/config/` directory; `Npmrc` → `Config` everywhere except `NpmrcAuth` (which keeps the `.npmrc`-domain name). - `Config::current` reads the env-var DI generic `Api: EnvVar` for ${VAR}-substitution in `.npmrc`. Production turbofish in `cli_args.rs` is `Config::current::<RealApi, _, _, _, _>(...)`. - Two-phase `NpmrcAuth::apply_*` retained so default-registry creds key at the yaml-resolved registry URL. - New `Config::auth_headers` field plumbed through `install_package_by_snapshot`'s `DownloadTarballToStore`. - Tests under `crates/config/src/workspace_yaml/tests.rs` pick up the new ParseYaml unit test added on this branch.
Summary
Implements slice (B) of #421 — the WRITE path for pnpm's side-effects cache. With this PR, a successful postinstall now re-CAFSes the built package directory, diffs against the pristine
PackageFilesIndex.filesrow, and mutates the row'sside_effectsmap so a future install (pacquet or pnpm) can skip the rebuild via the READ path landed in #423.User-visible behavior: lifecycle scripts that produce build artifacts (
node-gypoutputs, generated source files, etc.) get cached the first time the install runs; subsequent installs against the same lockfile + Node major skip the script and reuse the cached overlay.What lands
pacquet_store_dir::add_files_from_dir— directory walker that re-CAFSes a post-build package directory. Mirrors pnpm/pnpm@7e3145f9fc:store/cafs/src/addFilesFromDir.ts:16-194+worker/src/start.ts:312-383. Containment:dunce::canonicalizeeach symlink target, drop if out of root; read/recurse via the resolved canonical path so retargeting a symlink between containment check and read can't smuggle out-of-root data into CAFS. Skips top-levelnode_modules(matches upstreamincludeNodeModules: false). Cycle-safe via a recursion-stackvisited: HashSet<PathBuf>(entries inserted on descend, removed on return).pacquet_store_dir::upload+calculate_diff—upload(store_dir, built_pkg_location, files_index_file, cache_key, writer)walks the package viaadd_files_from_dirand queues aWriteMsg::SideEffectsUpload { key, cache_key, current_files }to the writer task. The actual R/M/W of the existingPackageFilesIndexrow happens inside the writer task (see below), so concurrent uploads to the same row stay commutative.StoreIndexWriternow drives aWriteMsgenum (ReplacevsSideEffectsUpload). The batch loop coalesces per-key updates into onePackageFilesIndexvalue before flushing — multiple side-effects uploads for the same row apply in arrival order against a single in-memory state instead of racing against a stale read on a separate readonly handle. The R/M/W (load existing row → checkalgo == HASH_ALGORITHM = "sha512"→calculate_diff→ insert intoside_effectsmap) happens inside the writer's transaction. Loading short-circuits with adebug!/warn!when the base row is missing or unreadable, matching upstream'sif (!existingFilesIndex) return/ALGO_MISMATCHbail-outs atpnpm/pnpm@7e3145f9fc:worker/src/start.ts:342-371.Config.side_effects_cache_readonly: bool(defaultfalse) — mirrors pnpm'ssideEffectsCacheReadonly. Two derived helpers consolidate the gate semantics so precedence stays single-sourced:Config::side_effects_cache_read()=cache || readonlyConfig::side_effects_cache_write()=cache && !readonlyBuildModulesWRITE-path call site — afterrun_postinstall_hooksreturnsOk, ifside_effects_cache_writeis on andStoreIndexWriter+StoreDirare both threaded in, callspacquet_store_dir::upload(...)with the snapshot'sstore_index_key(integrity, pkg_id)and thecalc_dep_statecache key. All upload errors are swallowed withtracing::warn!, mirroring upstream'stry { upload } catch { logger.warn }atpnpm/pnpm@7e3145f9fc:building/during-install/src/index.ts:208-215.build_deps_subgraph— the cache-hashing dep graph is now bounded to the forward closure ofrequires_buildsnapshots. Pure-JS installs feed in an empty root iterator and the function returns immediately; installs with buildable deps walk only whatcalc_dep_statewill actually traverse. Upstream'slockfileToDepGraphbuilds the full graph because its consumers extend beyond cache hashing; pacquet's graph is consumed only bycalc_dep_statetoday, so the closure-bounded walk produces byte-identical cache keys with strictly less work. Restored the cold-install benchmark to parity withmain.StoreIndexWriterhoisted up toInstallFrozenLockfile::run— previously spawned + drained insideCreateVirtualStore::run; now it spans both the prefetch/download phase and the build phase so the WRITE-path upload can queue through it. The final batch flushes afterBuildModules::runreturns.encode_package_files_indexnow iteratesfiles,side_effects, andSideEffectsDiff.addedin sorted-key order so two pacquet writes of the same logical row produce identical bytes. Was prompted by a review on PR feat: side-effects cache WRITE path (#421 slice B) #424 pointing out thatHashMapiteration randomization would otherwise make every row look "changed" on byte-diff even when content was identical.cache_keyrefactor inBuildModules—calc_dep_statenow fires once per snapshot (before the gate check) so both the READ gate and the new WRITE site read the same value.Tests
add_files_from_dir::tests(9 cases): top-level files, nested forward-slash paths, exec-bit propagation, top-level-vs-nestednode_modules, symlink-out-of-root drop, symlink-in-root follow, directory-cycle termination, missing-root error.upload::tests(6 cases): identical / added-only / deleted-only / digest change / mode change / mixed — covers every branch of upstream'scalculateDiff.deps_graph::tests(new for slice B): empty-roots → empty subgraph; forward-closure inclusion + exclusion; cycle termination.workspace_yaml::tests:sideEffectsCacheReadonlyparses + applies; the 4-state truth table forside_effects_cache_read()/side_effects_cache_write().build_modules::tests(new):write_path_populates_side_effects_row— full WRITE-path: pre-seed a base row, run a postinstall that createsgenerated.txt, drain the writer task, assertside_effects[cache_key].addedcontainsgenerated.txtand NOT pristineindex.js. Mirrors the spirit of upstream's'a postinstall script does not modify the original sources added to the store'atpnpm/pnpm@7e3145f9fc:sideEffects.ts:189-223.write_path_disabled_skips_upload— same scenario withside_effects_cache_write: false→ row'sside_effectsstaysNone. Counterpart of upstream's gate onsideEffectsCacheWrite.upload_error_does_not_interrupt_install— postinstall produces a 0-permission file in the package dir; the walker fails onfs::readwithEACCES, surfacing asUploadError::AddFilesFromDir(_)whichBuildModulesswallows withtracing::warn!. Asserts (a) the postinstall-createdgenerated.txtis on disk and (b) the pre-seeded base row'sside_effectsstaysNone. Mirrors'uploading errors do not interrupt installation'atpnpm/pnpm@7e3145f9fc:sideEffects.ts:166-187.Out of scope
patchedDependencies(Lifecycle script subsystem: gaps vs upstream pnpm #397 item 9). Today's WRITE side stampspatch_file_hash: None; same as the existing READ side.requires_buildrecompute in the R/M/W path. Upstream recomputes from(manifest, filesMap)when the existing row's field isNone; pacquet leaves the field as-is (cold-batch downloads already populate it with a real value).Test plan
just ready(clippy clean, all tests pass)taplo format --check(clean —just readydoesn't run this, CI does)Written by an agent (Claude Code, claude-opus-4-7).