Skip to content
This repository was archived by the owner on May 14, 2026. It is now read-only.

feat(package-manager): NODE_EXTRAS ignore filter for runtime archives (#437 slice D2)#496

Merged
zkochan merged 1 commit into
mainfrom
feat/437-slice-d2
May 13, 2026
Merged

feat(package-manager): NODE_EXTRAS ignore filter for runtime archives (#437 slice D2)#496
zkochan merged 1 commit into
mainfrom
feat/437-slice-d2

Conversation

@zkochan

@zkochan zkochan commented May 13, 2026

Copy link
Copy Markdown
Member

Summary

Slice D2 of #437 — wires the per-archive ignore filter at the install dispatcher. Matches upstream's archiveFilters: { node: NODE_EXTRAS_IGNORE_PATTERN } per-package-name table.

For unscoped node, the filter strips bundled npm / corepack from the Node.js runtime archive during the CAS write. pnpm (and pacquet) install pnpm itself as the package manager, so the bundled tooling is dead weight and would also shadow the user's pnpm via node_modules/.bin/. Every other package gets None and the full archive lands in the CAS unfiltered.

  • node_extras_filter is a hand-coded port of upstream's NODE_EXTRAS_IGNORE_PATTERN regex (^(?:(?:lib/)?node_modules/(?:npm|corepack)(?:/|$)|bin/(?:npm|npx|corepack)$|(?:npm|npx|corepack)(?:\.(?:cmd|ps1))?$)). Hand-coded so pacquet-tarball doesn't need a regex engine; the three branches mirror the alternation exactly.
  • archive_filter_for(&PackageKey) is the per-package dispatcher. Returns Some(NODE_EXTRAS) only for unscoped node; everything else gets None. OnceLock-cached so per-snapshot Arc::clones share one trait object.

What's not in this slice

  • D3 — bin-link cmd-shims for runtime executables (BinaryResolution::bin), @runtime: substring handling in skip lists / reporter prefixes.
  • E--no-runtime flag.
  • F — end-to-end runtime install fixtures.

Test plan

  • node_extras_filter_matches_upstream_regex_alternations — every positive case for each branch plus the negative cases the regex deliberately doesn't match (e.g. lib/node_modules/yarn/..., bin/npm.cmd, npm.bat, src/node_modules/npm/foo). Verified by temporarily collapsing the bin/ branch to a no-op; the test fails as expected.
  • archive_filter_for_only_returns_filter_for_unscoped_node — pins the pkg.name == "node" (unscoped) key match. @foo/node, react, bun all get None.
  • All 5 install_package_by_snapshot tests pass.
  • just ready, taplo format --check, just dylint, cargo doc -D warnings green.

Written by an agent (Claude Code, claude-opus-4-7).

Summary by CodeRabbit

  • Bug Fixes

    • Improved Node.js package installation by filtering out bundled artifacts (npm and corepack) from runtime archives, ensuring cleaner installations for the unscoped node package.
  • Tests

    • Added test coverage to verify artifact filtering behavior and package detection logic.

Review Change Stack

…#437 slice D2)

Construct the per-archive ignore filter at the install dispatcher.
For unscoped `node` the filter matches upstream's
[`NODE_EXTRAS_IGNORE_PATTERN`](https://github.com/pnpm/pnpm/blob/94240bc046/engine/runtime/node-resolver/src/index.ts),
which strips bundled `npm` / `corepack` from the Node.js runtime
archive during the CAS write — pnpm (and pacquet) install pnpm
itself as the package manager, so the bundled tooling is dead
weight and would shadow the user's pnpm via `node_modules/.bin/`.

Wiring matches upstream's
[`archiveFilters: { node: NODE_EXTRAS_IGNORE_PATTERN }`](https://github.com/pnpm/pnpm/blob/94240bc046/installing/client/src/index.ts):
the per-package-name table is keyed by `pkg.name`, so `@foo/node`
and other packages keep `None` and the full archive lands
unfiltered.

Pacquet uses a hand-coded matcher rather than the upstream regex
so `pacquet-tarball` doesn't have to pull in a regex engine. The
three branches mirror the regex alternation exactly:

1. `^(?:lib/)?node_modules/(?:npm|corepack)(?:/|$)`
2. `^bin/(?:npm|npx|corepack)$`
3. `^(?:npm|npx|corepack)(?:\.(?:cmd|ps1))?$`

A `OnceLock` caches the `Arc<IgnoreEntryFilter>` so per-snapshot
clones share one trait object.

Unit tests pin every branch of the alternation plus the negative
cases (e.g. `lib/node_modules/yarn/...` is *not* matched,
`bin/npm.cmd` is *not* matched — the regex's `$` and arm-specific
extension rules are deliberately asymmetric). Verified by
temporarily collapsing the `bin/` branch to a no-op — the test
fails as expected.

Bin-link cmd-shims for the runtime executables and `@runtime:`
substring handling in skip lists / reporter prefixes are Slice
D3. End-to-end runtime install fixtures land in Slice F.
Copilot AI review requested due to automatic review settings May 13, 2026 21:20
@coderabbitai

coderabbitai Bot commented May 13, 2026

Copy link
Copy Markdown

Caution

Review failed

Pull request was closed or merged during review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: bfd4ec01-c50e-45e1-b187-910710745025

📥 Commits

Reviewing files that changed from the base of the PR and between 55083a4 and f64cead.

📒 Files selected for processing (2)
  • crates/package-manager/src/install_package_by_snapshot.rs
  • crates/package-manager/src/install_package_by_snapshot/tests.rs
📜 Recent review details
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

**/*.rs: Preserve existing method chains and pipe-trait chains; do not break them into intermediate let bindings unless there is a concrete justification such as a compilation failure, borrow checker rejection, meaningful performance improvement, or other technical necessity. Refactoring for style alone is not sufficient justification.
Choose owned vs. borrowed parameters to minimize copies; prefer borrowed types (&Path over &PathBuf, &str over &String) when it does not force extra copies.
Prefer Arc::clone(&x) and Rc::clone(&x) over x.clone() for reference-counted types to make the cost visible at the call site.
Do not use star imports inside module bodies. Write use super::{Foo, bar} instead of use super::*; for any glob whose target is a module you control. External-crate preludes (e.g., use rayon::prelude::*;) and root-of-module re-exports (e.g., pub use submodule::*; in lib.rs) are exceptions.
Follow Rust API Guidelines for naming, as documented in https://rust-lang.github.io/api-guidelines/naming.html.
Declare a newtype wrapper for any branded string type being ported from TypeScript pnpm. Do not collapse the brand into a plain String or &str; give the type its own struct so misuse is a type error.
When porting branded string types where upstream TypeScript always validates before construction, validate in the Rust port too. Construct the wrapper only via TryFrom<String> and/or FromStr; do not provide an infallible public constructor that takes an arbitrary string.
For branded string types where upstream TypeScript never validates (used purely for type-safety to prevent confusion between string slots), expose an infallible From<String> and From<&str> constructor in the Rust wrapper.
When upstream TypeScript occasionally constructs a branded type without validation (via bare as assertion), add a from_str_unchecked (or similarly named) constructor on the Rust side. Keep the validating constructor as well; `from_str_u...

Files:

  • crates/package-manager/src/install_package_by_snapshot.rs
  • crates/package-manager/src/install_package_by_snapshot/tests.rs
🧠 Learnings (4)
📚 Learning: 2026-05-07T23:19:08.272Z
Learnt from: KSXGitHub
Repo: pnpm/pacquet PR: 401
File: tasks/integrated-benchmark/src/work_env.rs:343-344
Timestamp: 2026-05-07T23:19:08.272Z
Learning: When reviewing Rust code in pnpm/pacquet for deprecated API usage, do not automatically treat `serde_saphyr::to_string` as deprecated. In `serde-saphyr` v0.0.25, `serde_saphyr::to_string` has no `#[deprecated]` attribute (the `#[deprecated]` later in `serde-saphyr-0.0.25/src/lib.rs` applies to a different function). Only flag `serde_saphyr::to_string` as deprecated if the resolved dependency version’s source shows `#[deprecated]` on that specific function.

Applied to files:

  • crates/package-manager/src/install_package_by_snapshot.rs
  • crates/package-manager/src/install_package_by_snapshot/tests.rs
📚 Learning: 2026-05-13T19:22:48.951Z
Learnt from: zkochan
Repo: pnpm/pacquet PR: 478
File: crates/package-manager/src/hoisted_dep_graph.rs:51-55
Timestamp: 2026-05-13T19:22:48.951Z
Learning: When reviewing this Rust codebase, avoid introducing/using a newtype like `PkgIdWithPatchHash` in only one module (e.g., `hoisted_dep_graph.rs`) if other related pacquet modules still represent the upstream pnpm “pkg id with patch hash” as a plain `String` (e.g., `virtual_store_layout`). For type consistency, either keep `pkg_id_with_patch_hash` as `String` here, or require a workspace-wide sweep that defines/extracts the `PkgIdWithPatchHash` newtype once and updates all call sites together (otherwise defer the refactor to a coordinated follow-up PR).

Applied to files:

  • crates/package-manager/src/install_package_by_snapshot.rs
  • crates/package-manager/src/install_package_by_snapshot/tests.rs
📚 Learning: 2026-05-13T20:09:22.171Z
Learnt from: zkochan
Repo: pnpm/pacquet PR: 486
File: crates/package-manager/src/hoisted_dep_graph.rs:472-476
Timestamp: 2026-05-13T20:09:22.171Z
Learning: In pnpm/pacquet, when generating serialized `hoisted_locations` strings (used for `.modules.yaml`), normalize path separators to forward slashes (e.g., replace `\\` with `/`). This preserves cross-platform portability: the output must not use OS-native separators because upstream pnpm’s `path.relative` can return platform-specific separators, which would break consistency with pnpm/pacquet’s other serialized formats on Windows.

Applied to files:

  • crates/package-manager/src/install_package_by_snapshot.rs
  • crates/package-manager/src/install_package_by_snapshot/tests.rs
📚 Learning: 2026-05-01T10:01:33.766Z
Learnt from: zkochan
Repo: pnpm/pacquet PR: 349
File: crates/reporter/src/tests.rs:121-121
Timestamp: 2026-05-01T10:01:33.766Z
Learning: In Rust test code, follow the repo’s CODE_STYLE_GUIDE test-logging rule: add logging (e.g., `eprintln!`/`eprintln!(...)`) so that useful diagnostic values are printed when a test fails, unless the assertion is `assert_eq!` (where the differing values are already included). Concretely, if you use assertions like `assert!`, `assert_ne!`, etc., ensure the test logs the relevant actual/expected values (or context) before/around the assertion so failures can be diagnosed without rerunning.

Applied to files:

  • crates/package-manager/src/install_package_by_snapshot/tests.rs
🔇 Additional comments (2)
crates/package-manager/src/install_package_by_snapshot.rs (1)

22-23: LGTM!

Also applies to: 281-282, 336-337, 447-532, 566-567, 584-585, 604-605

crates/package-manager/src/install_package_by_snapshot/tests.rs (1)

1-4: LGTM!

Also applies to: 6-8, 115-209


📝 Walkthrough

Walkthrough

The PR extends the binary package installation path to filter out bundled npm and corepack artifacts from unscoped node packages by implementing pnpm's NODE_EXTRAS_IGNORE_PATTERN as a hand-coded matcher and threading the resulting filter into the tarball and zip extraction pipelines during CAS download.

Changes

Node Package Artifact Filtering

Layer / File(s) Summary
Archive filtering helpers and tests
crates/package-manager/src/install_package_by_snapshot.rs, crates/package-manager/src/install_package_by_snapshot/tests.rs
node_extras_filter implements a hand-coded path matcher for bundled npm/corepack artifacts (lib/node_modules, bin/ forms, .cmd/.ps1 extensions), and archive_filter_for returns a cached IgnoreEntryFilter only for unscoped node packages. Unit tests verify both helpers with positive and negative fixture paths.
Binary CAS download integration
crates/package-manager/src/install_package_by_snapshot.rs
fetch_binary_resolution_to_cas now accepts an ignore_file_pattern parameter and passes it through to tarball and zip extraction. Call sites in InstallPackageBySnapshot::run compute and pass archive_filter_for(package_key) for both LockfileResolution::Binary and the selected Variations variant. Import statement updated to include TarballError.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Possibly related issues

  • pnpm/pacquet#433: Both changes modify install_package_by_snapshot.rs and extend InstallPackageBySnapshot::run with archive filtering logic for package installation handling.

Possibly related PRs

  • pnpm/pacquet#472: The main PR's threading of ignore_file_pattern into tar/zip CAS download paths directly builds on the retrieved PR's addition of zip extraction and ignore_file_pattern support in DownloadTarballToStore / DownloadZipArchiveToStore.
  • pnpm/pacquet#457: Both PRs evolve crates/package-manager/src/install_package_by_snapshot.rs, extending LockfileResolution::Binary and Variations handling with filtering logic in the binary download dispatch.
  • pnpm/pacquet#492: Both PRs modify install_package_by_snapshot.rs to extend the fetch_binary_resolution_to_cas helper and binary+Variations dispatch with archive filtering support.

Poem

A rabbit hops through archive trees,
Filtering extras with great ease—
No bundled npm, no corepack bloat,
Just clean node packages that float,
🐰 hop hop to the CAS!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and specifically describes the main change: adding NODE_EXTRAS ignore filter for runtime archives in the package manager, with a reference to the upstream issue.
Description check ✅ Passed The PR description is comprehensive and includes all key sections: Summary (clearly explaining what's being implemented and why), Linked issue (references upstream behavior and implementation), Test plan (detailed test coverage), and Upstream reference (links to pnpm source code). The description is well-structured and provides adequate context.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/437-slice-d2

Comment @coderabbitai help to get the list of available commands and usage tips.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@codecov

codecov Bot commented May 13, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 86.48649% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 88.40%. Comparing base (c359dd7) to head (f64cead).
⚠️ Report is 5 commits behind head on main.

Files with missing lines Patch % Lines
...package-manager/src/install_package_by_snapshot.rs 86.48% 5 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #496      +/-   ##
==========================================
+ Coverage   88.34%   88.40%   +0.06%     
==========================================
  Files         121      121              
  Lines       12891    13001     +110     
==========================================
+ Hits        11389    11494     +105     
- Misses       1502     1507       +5     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@github-actions

Copy link
Copy Markdown

Micro-Benchmark Results

Linux

group                          main                                   pr
-----                          ----                                   --
tarball/download_dependency    1.01     15.9±0.54ms   273.4 KB/sec    1.00     15.8±0.17ms   275.0 KB/sec

@github-actions

Copy link
Copy Markdown

Integrated-Benchmark Report (Linux)

Scenario: Frozen Lockfile

Command Mean [s] Min [s] Max [s] Relative
pacquet@HEAD 2.873 ± 0.602 2.567 4.568 1.11 ± 0.23
pacquet@main 2.587 ± 0.060 2.507 2.666 1.00
pnpm 6.113 ± 0.062 6.036 6.248 2.36 ± 0.06
BENCHMARK_REPORT.json
{
  "results": [
    {
      "command": "pacquet@HEAD",
      "mean": 2.87317483936,
      "stddev": 0.6016035091656307,
      "median": 2.70658032596,
      "user": 2.63084056,
      "system": 3.7261554400000003,
      "min": 2.56741450946,
      "max": 4.56839424846,
      "times": [
        2.78863302746,
        2.66744055246,
        2.63700183646,
        2.77917047446,
        4.56839424846,
        2.7457200994599997,
        2.78011810146,
        2.56741450946,
        2.58066280546,
        2.61719273846
      ]
    },
    {
      "command": "pacquet@main",
      "mean": 2.5866579661599998,
      "stddev": 0.06014248180945031,
      "median": 2.59593676846,
      "user": 2.5478426599999997,
      "system": 3.6456132400000003,
      "min": 2.5069965444599998,
      "max": 2.66565329746,
      "times": [
        2.66565329746,
        2.62303287846,
        2.5214656674600002,
        2.50722847646,
        2.59156893846,
        2.66451993246,
        2.62235604946,
        2.5069965444599998,
        2.60030459846,
        2.56345327846
      ]
    },
    {
      "command": "pnpm",
      "mean": 6.11289905346,
      "stddev": 0.06190679244910986,
      "median": 6.0989098219599995,
      "user": 9.08290036,
      "system": 4.456002839999999,
      "min": 6.03637327946,
      "max": 6.24755415846,
      "times": [
        6.1237828284599995,
        6.10106023246,
        6.09149093646,
        6.11318593746,
        6.09675941146,
        6.06312284646,
        6.03637327946,
        6.18528831046,
        6.07037259346,
        6.24755415846
      ]
    }
  ]
}

Scenario: Frozen Lockfile (Hot Cache)

Command Mean [ms] Min [ms] Max [ms] Relative
pacquet@HEAD 756.2 ± 58.0 717.5 913.2 1.00
pacquet@main 806.7 ± 32.9 760.9 858.1 1.07 ± 0.09
pnpm 2636.4 ± 208.7 2509.0 3203.3 3.49 ± 0.38
BENCHMARK_REPORT.json
{
  "results": [
    {
      "command": "pacquet@HEAD",
      "mean": 0.75621033478,
      "stddev": 0.058012162561440414,
      "median": 0.74076697588,
      "user": 0.38468391999999996,
      "system": 1.60158832,
      "min": 0.71745643288,
      "max": 0.91318885888,
      "times": [
        0.91318885888,
        0.77251400188,
        0.71745643288,
        0.75885229588,
        0.72364100688,
        0.7339816528800001,
        0.72608381688,
        0.74755229888,
        0.74828011788,
        0.72055286488
      ]
    },
    {
      "command": "pacquet@main",
      "mean": 0.8066924887800001,
      "stddev": 0.03286007094161814,
      "median": 0.80584799988,
      "user": 0.39535792000000003,
      "system": 1.61888072,
      "min": 0.76085144388,
      "max": 0.8580945078800001,
      "times": [
        0.8580945078800001,
        0.81753353588,
        0.81619706788,
        0.83290821488,
        0.8455747678800001,
        0.76557939588,
        0.79549893188,
        0.78860448888,
        0.7860825328800001,
        0.76085144388
      ]
    },
    {
      "command": "pnpm",
      "mean": 2.6364145615799996,
      "stddev": 0.20871347816611038,
      "median": 2.56461417738,
      "user": 3.1644257199999997,
      "system": 2.18683172,
      "min": 2.50901679688,
      "max": 3.20325764188,
      "times": [
        3.20325764188,
        2.6569867558799998,
        2.58139819588,
        2.52640535388,
        2.61962345188,
        2.50901679688,
        2.51273629688,
        2.54783015888,
        2.52171677588,
        2.68517418788
      ]
    }
  ]
}

@zkochan

zkochan commented May 13, 2026

Copy link
Copy Markdown
Member Author

@coderabbitai full review

@coderabbitai

coderabbitai Bot commented May 13, 2026

Copy link
Copy Markdown
✅ Actions performed

Full review triggered.

@zkochan zkochan merged commit e10d0e5 into main May 13, 2026
14 of 17 checks passed
@zkochan zkochan deleted the feat/437-slice-d2 branch May 13, 2026 22:04
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants