Skip to content
This repository was archived by the owner on May 14, 2026. It is now read-only.

feat: bin#333

Merged
zkochan merged 82 commits into
mainfrom
claude/link-dependency-binaries-JWwJN
May 12, 2026
Merged

feat: bin#333
zkochan merged 82 commits into
mainfrom
claude/link-dependency-binaries-JWwJN

Conversation

@KSXGitHub

@KSXGitHub KSXGitHub commented Apr 29, 2026

Copy link
Copy Markdown
Contributor

Resolves #330.

Mirrors pnpm v11's @pnpm/bins.resolver, @pnpm/bins.linker, and @zkochan/cmd-shim (resolved at pnpm 4750fd370c and cmd-shim 0d79ca9534):

  • New pacquet-cmd-shim crate parses the bin field of a package manifest into commands. Both bin: string | object and the directories.bin glob fallback are supported, with the same path-traversal and URL-safe-name guards as upstream. Conflict resolution follows pkgOwnsBin + lexical compare.
  • Shim generator writes all three flavors per bin: Unix /bin/sh (byte-compatible with pnpm's generateShShim), Windows .cmd (CRLF, %~dp0 paths), and PowerShell .ps1 ($basedir/$exe header with Windows/POSIX-pwsh detection). Mirrors generateCmdShim and generatePwshShim minus the unused nodePath/prependToPath/progArgs branches.
  • All filesystem IO behind per-capability DI traits — FsReadHead, FsReadFile, FsReadString, FsReadDir, FsWalkFiles, FsCreateDirAll, FsWrite, FsSetExecutable, FsEnsureExecutableBits — following the pattern documented in #332. FsWrite is a thin shim over std::fs::write and is not atomic; a future atomic-write capability can build on it without renaming. Production callers turbofish the unit-struct provider ::<RealApi>; tests inject unit-struct fakes to cover IO error paths real fs can't trigger portably. Generic param is named Api (not Fs) so the same provider can grow non-fs capabilities without a rename.
  • New LinkVirtualStoreBins step walks each virtual-store slot and links its child packages' bins into the slot's own node_modules/.bin, matching linkBinsOfDependencies in pnpm's building/during-install. Slot iteration, per-slot child reads, and shim writes parallelised on rayon.
  • SymlinkDirectDependencies and InstallWithoutLockfile call the bin linker after the symlink layout is in place. Direct-dep linking and the symlink loop are exposed as free link_direct_dep_bins / symlink_direct_deps_into_node_modules functions so they're unit-testable without the mock-registry scaffold.

Tests port the relevant bins/resolver and bins/linker cases from pnpm (directories.bin discovery, scoped bin names, dangerous locations, idempotent shim skip, conflict resolution) and add fakes-injected coverage for the IO error variants.

Not in this PR: hoisted-bin precedence and lifecycle-script-created bins. Both depend on subsystems pacquet hasn't built yet — hoisting (hoistPattern / publicHoistPattern resolution + the lift step) and lifecycle scripts (exec/lifecycle/ runner + allowBuilds plumbing). The bin-linking side already does the right thing for both (shims are paths, so they pick up files generated post-extract; the precedence rule is a small addition once hoisting tags candidates as hoisted). Tracked separately in with full porting coordinates.

Performance

When the bin-linking step landed it added ~7% to the warm-cache Frozen Lockfile install (benchmark). Closing that gap took a sequence of changes that mirror how pnpm v11 avoids the same costs in its own linkBinsOfDependencies (4750fd370c):

  • Bundled manifests in index.db. extract_tarball_entries captures the parsed package.json, picks the subset pnpm caches via a port of normalizeBundledManifest (name, version, bin, directories, dependencies, lifecycle scripts, …), and stuffs it into PackageFilesIndex.manifest. prefetch_cas_paths surfaces it back as a HashMap<cache_key, Arc<Value>> so the bin linker doesn't have to re-read package.json per child off disk.
  • msgpackr-records encoder learns serde_json::Value. The encoder previously errored with ManifestNotSupported if anything ever set manifest; now it records-encodes nested JSON objects (slot-allocated per distinct key set) so useRecords: true readers see them as JS Objects rather than Maps — necessary for pnpm's manifest.bin / manifest.directories?.bin property accesses to resolve.
  • Arc<Value> for PackageBinSource.manifest. The lockfile-driven bin-link path produces slots × children clones of the parsed manifest (~6k on the integrated-benchmark fixture). Sharing via Arc turns each push into a refcount bump.
  • hasBin filter via lockfile. LinkVirtualStoreBins now takes the lockfile packages: section, pre-computes an Option<HashSet<PackageKey>> of hasBin: true keys at install start, and skips snapshot children that aren't in the set before any path-building or manifest lookup. ~95% of a real lockfile's packages don't declare a bin, so this short-circuits the bulk of the per-slot work. None is reserved for the pathological case where the lockfile lacks a packages: section (fall back to processing every child); Some(empty) is authoritative — lockfile says no package has a bin, every slot short-circuits. Mirrors pnpm's filter at building/during-install/src/index.ts:283.
  • Lockfile-driven slot iteration. run_lockfile_driven walks snapshots directly and builds slot paths lexically. The previous run_with_readdir path enumerated every slot via Api::read_dir(virtual_store_dir) and probed each slot's own package directory with a wasted read_dir (~1267 wasted open(O_DIRECTORY) + close per warm install). The readdir-driven path survives as a fallback for install_without_lockfile and retains the existence probe there (small N, no virtual-store invariant from create_virtual_dir_by_snapshot to lean on).
  • Parallel prefetch decode. StoreIndex::get_many_raw returns undecoded row bytes; the SQLite mutex releases before prefetch_cas_paths fans the msgpackr decode + integrity check across rayon.

Latest CI benchmark on this branch (job 75462998251):

Scenario pacquet@HEAD pacquet@main Relative
Frozen Lockfile 2.074s ± 102ms 2.026s ± 70ms +2.4% (within noise)
Frozen Lockfile (Hot Cache) 479.6ms ± 38ms 535.8ms ± 37ms ~10% faster than main

The follow-up #417 splits manifests into a dedicated package_manifests SQLite table so the unfiltered package_index prefetch payload stays at its original size and the manifest fetch becomes a targeted SELECT against just the hasBin: true subset.

Summary by CodeRabbit

  • New Features

    • Automatic binary linking and shimming for installed packages (root and virtual-store), with deterministic conflict resolution and idempotent updates.
    • Cross-platform shim generation (Unix shell, .cmd, PowerShell) with robust runtime detection.
  • Bug Fixes / Improvements

    • Safer bin name and path validation plus lexical path normalization to prevent unsafe links.
    • Install progress now includes a dedicated bin-linking phase.
  • Chores

    • Workspace and configuration updates.

Review Change Stack

Summary by CodeRabbit

  • New Features

    • Robust bin discovery and multi-platform shim generation with safe path handling, deterministic conflict resolution, and idempotent updates; bin linking now runs during install (direct deps and virtual-store slots).
    • Package manifest prefetching and caching improved to surface bundled manifests for faster, more reliable linking and extraction.
    • Store index/raw-decode helper added to enable parallel decoding.
  • Tests

    • Extensive tests covering bin resolution, shim generation, linking, manifest handling, and error cases.
  • Chores

    • Workspace and config updates; typo dictionary extended.

Review Change Stack

Mirrors pnpm v11's `@pnpm/bins.resolver`, `@pnpm/bins.linker`, and
`@zkochan/cmd-shim` (resolved at pnpm `4750fd370c` and cmd-shim
`0d79ca9534`):

- New `pacquet-cmd-shim` crate parses the `bin` field of a package
  manifest, resolves bin commands with the same path-traversal and
  URL-safe-name guards as upstream, and writes Unix shell shims
  byte-compatible with pnpm's `generateShShim`. Conflict resolution
  follows `pkgOwnsBin` + lexical compare.
- New `LinkVirtualStoreBins` step walks each virtual-store slot and
  links its child packages' bins into the slot's own
  `node_modules/.bin`, matching `linkBinsOfDependencies` in pnpm's
  `building/during-install`.
- `SymlinkDirectDependencies` and `InstallWithoutLockfile` now call
  the bin linker after the symlink layout is in place, so direct deps'
  bins land under `<modules_dir>/.bin/`.

Per `plans/TEST_PORTING.md` "Link Dependency Binaries", this is the
direct-dependency-bins-first slice. Hoisted-bin precedence,
`directories.bin` discovery, lifecycle-script-created bins, and
Windows `.cmd`/`.ps1` shims are deferred.
@KSXGitHub KSXGitHub requested review from a team, anonrig and zkochan April 29, 2026 13:59
@KSXGitHub

This comment was marked as resolved.

The link triggered `rustdoc::private_intra_doc_links` under
`-D warnings` (the Doc CI job), which fails CI. Replace with plain
backticks since the function is intentionally private.
@codecov

codecov Bot commented Apr 29, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 88.14815% with 112 lines in your changes missing coverage. Please review.
✅ Project coverage is 84.57%. Comparing base (adba116) to head (6a34b28).
⚠️ Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
crates/package-manager/src/link_bins.rs 70.73% 72 Missing ⚠️
crates/store-dir/src/msgpackr_records.rs 82.43% 13 Missing ⚠️
crates/tarball/src/lib.rs 84.50% 11 Missing ⚠️
crates/package-manager/src/create_virtual_store.rs 63.15% 7 Missing ⚠️
crates/cmd-shim/src/link_bins.rs 97.33% 4 Missing ⚠️
crates/cmd-shim/src/shim.rs 98.43% 3 Missing ⚠️
crates/cmd-shim/src/bin_resolver.rs 97.77% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #333      +/-   ##
==========================================
+ Coverage   83.68%   84.57%   +0.88%     
==========================================
  Files          69       75       +6     
  Lines        4260     5159     +899     
==========================================
+ Hits         3565     4363     +798     
- Misses        695      796     +101     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@github-actions

github-actions Bot commented Apr 29, 2026

Copy link
Copy Markdown

Micro-Benchmark Results

Linux

group                          main                                   pr
-----                          ----                                   --
tarball/download_dependency    1.02     17.1±0.50ms   254.1 KB/sec    1.00     16.8±0.29ms   258.8 KB/sec

@github-actions

github-actions Bot commented Apr 29, 2026

Copy link
Copy Markdown

Integrated-Benchmark Report (Linux)

Scenario: Frozen Lockfile

Command Mean [s] Min [s] Max [s] Relative
pacquet@HEAD 2.592 ± 0.112 2.452 2.790 1.06 ± 0.06
pacquet@main 2.443 ± 0.100 2.322 2.673 1.00
pnpm 6.254 ± 0.102 6.094 6.373 2.56 ± 0.11
BENCHMARK_REPORT.json
{
  "results": [
    {
      "command": "pacquet@HEAD",
      "mean": 2.5918448134800007,
      "stddev": 0.11171097798591177,
      "median": 2.56717565408,
      "user": 2.52928088,
      "system": 3.43436424,
      "min": 2.45213270608,
      "max": 2.79008502408,
      "times": [
        2.79008502408,
        2.45213270608,
        2.52115285708,
        2.6266596540800005,
        2.7688762120800003,
        2.58837483908,
        2.54597646908,
        2.5377177890800002,
        2.48773920608,
        2.5997333780800003
      ]
    },
    {
      "command": "pacquet@main",
      "mean": 2.44288746538,
      "stddev": 0.10028409602270291,
      "median": 2.4079852925800003,
      "user": 2.40243988,
      "system": 3.3772794399999997,
      "min": 2.3217915390800004,
      "max": 2.6731706920800002,
      "times": [
        2.37978561308,
        2.5428539100800003,
        2.47831268008,
        2.40789506108,
        2.40807552408,
        2.6731706920800002,
        2.38587150108,
        2.4013009260800002,
        2.3217915390800004,
        2.42981720708
      ]
    },
    {
      "command": "pnpm",
      "mean": 6.25374860388,
      "stddev": 0.10170106945773101,
      "median": 6.28511785208,
      "user": 9.036589379999999,
      "system": 4.485877240000001,
      "min": 6.09435619108,
      "max": 6.37261110408,
      "times": [
        6.37261110408,
        6.096802386079999,
        6.34019519608,
        6.331626594079999,
        6.26531160808,
        6.205647734079999,
        6.191780877079999,
        6.334230252079999,
        6.09435619108,
        6.30492409608
      ]
    }
  ]
}

Scenario: Frozen Lockfile (Hot Cache)

Command Mean [ms] Min [ms] Max [ms] Relative
pacquet@HEAD 677.4 ± 57.7 640.5 837.2 1.00
pacquet@main 794.5 ± 58.4 742.9 910.5 1.17 ± 0.13
pnpm 2677.1 ± 82.9 2530.5 2806.1 3.95 ± 0.36
BENCHMARK_REPORT.json
{
  "results": [
    {
      "command": "pacquet@HEAD",
      "mean": 0.67741800416,
      "stddev": 0.0577009849500688,
      "median": 0.66179080276,
      "user": 0.35493272,
      "system": 1.4631175599999997,
      "min": 0.64045146226,
      "max": 0.83716713926,
      "times": [
        0.83716713926,
        0.66283738726,
        0.66151857626,
        0.65511925626,
        0.68866231126,
        0.66206302926,
        0.66776614326,
        0.64045146226,
        0.65557402326,
        0.64302071326
      ]
    },
    {
      "command": "pacquet@main",
      "mean": 0.79450122656,
      "stddev": 0.058394655164998935,
      "median": 0.77183622276,
      "user": 0.26607082,
      "system": 1.40717396,
      "min": 0.74293681026,
      "max": 0.91048906026,
      "times": [
        0.91048906026,
        0.75176440826,
        0.74293681026,
        0.75151129726,
        0.86773088026,
        0.79190803726,
        0.74797063626,
        0.74664929026,
        0.82267619626,
        0.81137564926
      ]
    },
    {
      "command": "pnpm",
      "mean": 2.67705237256,
      "stddev": 0.08285584132050684,
      "median": 2.6911332857600003,
      "user": 3.1653496199999998,
      "system": 2.23074106,
      "min": 2.5305021132600003,
      "max": 2.80605876026,
      "times": [
        2.6083950982600004,
        2.6694047422600002,
        2.69129296826,
        2.80605876026,
        2.75294408426,
        2.5816265092600004,
        2.5305021132600003,
        2.71926339026,
        2.6909736032600002,
        2.72006245626
      ]
    }
  ]
}

The integrated benchmark on the 1267-package fixture showed a ~4-12%
regression vs. main on both `Frozen Lockfile` and
`Frozen Lockfile (Hot Cache)` after introducing bin linking. Each
virtual-store slot's bin walk was running serially:

- `LinkVirtualStoreBins::run` did `read_dir + is_dir` per slot inside
  a serial `for` loop. With ~1300 slots in the benchmark, the per-slot
  syscall overhead alone accounted for most of the new wall time.
- `link_bins_of_packages` wrote every shim (read shebang, write file,
  chmod) on the calling thread.
- `collect_bin_sources` (direct deps) read every dependency's
  `package.json` serially.

All three loops are work-list shaped with no shared state, so drive
them on rayon. Mirrors the per-snapshot warm batch in
`create_virtual_store.rs` and pnpm's own `Promise.all`-style fan-out
in `linkBinsOfDependencies`.
Comment thread crates/cmd-shim/src/bin_resolver.rs Outdated
claude added 3 commits April 29, 2026 15:33
Per CODE_STYLE_GUIDE.md "Unit test file layout": once a `mod tests`
block is long enough to obscure the surrounding module, it moves to
`<parent>/tests.rs`. The four test blocks introduced for bin linking
(78-130 lines each) were past that threshold.

Moved:
- crates/cmd-shim/src/bin_resolver.rs       -> .../bin_resolver/tests.rs
- crates/cmd-shim/src/shim.rs               -> .../shim/tests.rs
- crates/cmd-shim/src/link_bins.rs          -> .../link_bins/tests.rs
- crates/package-manager/src/link_bins.rs   -> .../link_bins/tests.rs

No test logic changed.
Patch coverage on the bin-linking PR was 83.41% (70 lines missing) per
codecov. Three sources of gaps:

1. Edge cases I'd skipped: every fallback extension in
   `extension_program`, the `prog: None` arm of `generate_sh_shim`, the
   `path::isAbsolute(shTarget)` branch upstream uses, the lexical
   normalize `..`-past-root and `CurDir` paths, and `parse_shebang`
   empty-prog.

2. pnpm tests not yet ported. Port the ones that translate to pacquet's
   no-`directories.bin` first iteration:
   - `bins/resolver/test/index.ts`: \$-as-command-name, dangerous bin
     names, dangerous bin locations, scoped bin name strip, scoped bin
     name with traversal.
   - `bins/linker/test/index.ts`: idempotent skip when shim already
     points at the same target.

3. Unreachable IO error paths. `search_script_runtime`'s non-`NotFound`
   error path can't be triggered with real fs portably. Apply the
   per-capability DI pattern documented in
   <#332 (comment)>:
   - New `FsReadHead` trait + `RealFs` provider.
   - `search_script_runtime<Fs: FsReadHead>` and `read_shebang<Fs>`
     bind the capability; production callers (`write_shim`) turbofish
     `::<RealFs>`.
   - Tests inject `PermissionDeniedFs` and `EmptyReadFs` to cover the
     propagate-error and zero-bytes-fallthrough paths.
   - New pure `parse_shebang_from_bytes` is independently testable.

Also extract `link_direct_dep_bins` from `SymlinkDirectDependencies::run`
into a free function under `package-manager::link_bins`, so the direct-
dep bin-linking step can be unit-tested without spinning up the
mock-registry-backed install pipeline. Adds five tests for it covering
empty deps, missing manifest, and symlink-followed manifest read.

Test count: cmd-shim 19 → 49, package-manager (lib) 28 → 35.
The new \`link_direct_dep_bins_follows_symlink_to_real_package\` test
called \`std::os::windows::fs::symlink_dir\` directly, which requires
\`SeCreateSymbolicLinkPrivilege\` and fails on Windows CI runners that
don't have it. Production code already routes through pacquet_fs's
\`symlink_dir\` (which uses junctions on Windows for exactly this
reason). Switch the test to the same helper so behavior matches the
production path on every platform.
@KSXGitHub

This comment was marked as resolved.

`Path::is_absolute("/abs/elsewhere/cli")` returns true on Unix but
false on Windows (Windows requires a drive letter), so the test forced
the wrong branch on Windows CI. The absolute-vs-relative classification
of shim targets is itself platform-dependent; pin the test to Unix
only.

This comment was marked as resolved.

@KSXGitHub KSXGitHub force-pushed the claude/link-dependency-binaries-JWwJN branch from c200ae2 to 2ffb2d1 Compare April 29, 2026 16:08
@KSXGitHub

This comment was marked as resolved.

claude added 3 commits April 29, 2026 16:29
Closes the rest of the bin-resolver / cmd-shim feature parity gap with
pnpm v11 within this PR's scope.

- \`directories.bin\` fallback (\`bin_resolver.rs\`):
  walkdir over every regular file under the package's nominated dir,
  with the same \`is_subdir\` traversal guard pnpm uses. Missing dir
  degrades to an empty list. Ports five upstream tests:
  index.ts:37 / :150 / :160 + path-traversal.test.ts.

- Windows \`.cmd\` and \`.ps1\` shim generation (\`shim.rs\`):
  ports \`generateCmdShim\` and \`generatePwshShim\` from
  @zkochan/cmd-shim 0d79ca9534. CRLF line endings on the .cmd flavor;
  \$basedir+\$exe header on the .ps1 flavor with Windows/POSIX-pwsh
  detection. Both exec arms (with-prog / no-prog) ported. Skips the
  \`nodeExecPath\` / \`nodePath\` / \`prependToPath\` / \`progArgs\`
  branches we don't use.
- \`link_bins.rs\` writes all three shim flavors per bin so a project
  installed on Linux remains usable on Windows after a clone. Idempotent
  short-circuit only fires when the canonical .sh shim already points
  at the same target.

- Extract \`symlink_direct_deps_into_node_modules\` from
  \`SymlinkDirectDependencies::run\`. Same shape as the
  \`link_direct_dep_bins\` extraction this PR already does — closes
  the unit-test gap that codecov flagged at 44.11% on
  symlink_direct_dependencies.rs without needing the mock-registry
  scaffold.

Test count: cmd-shim 49 → 59, package-manager (lib) 35 → 37.
Extends the per-capability DI pattern from \`FsReadHead\` (PR #333) to
the rest of the bin-linking IO surface. New traits in
\`crates/cmd-shim/src/fs_capabilities.rs\`:

- \`FsReadDir\` — list a directory's entries
- \`FsReadFile\` — read a file into Vec<u8>  (package.json reads)
- \`FsReadString\` — read a file into String (warm-skip shim probe)
- \`FsCreateDirAll\` — mkdir -p
- \`FsWriteAtomic\` — write a file replacing if it exists
- \`FsSetPermissions\` — chmod 0o755 / OR-in 0o111

\`RealFs\` provides all of them; \`link_bins\`, \`link_bins_of_packages\`,
\`write_shim\`, \`collect_packages_in_modules_dir\`, and \`read_package\`
take a generic \`Fs: ...\` bound binding only the capabilities they
consume. Production callers (\`InstallWithoutLockfile\`,
\`SymlinkDirectDependencies\`, \`LinkVirtualStoreBins::run\`) turbofish
\`::<RealFs>\`. \`LinkVirtualStoreBins\` keeps a non-generic
\`run\` and exposes \`run_with::<Fs>\` for tests.

Adds five new fakes-injected tests covering error paths real fs can't
trigger portably:

- \`link_bins\` propagates non-NotFound \`read_dir\` failure on
  modules_dir → \`CreateBinDir\` variant
- \`link_bins_of_packages\` propagates \`create_dir_all\` failure →
  \`CreateBinDir\`
- \`link_bins_of_packages\` propagates write failure on the .sh shim →
  \`WriteShim\`
- \`link_bins_of_packages\` propagates chmod failure → \`Chmod\`
- \`LinkVirtualStoreBins::run_with\` propagates non-NotFound read_dir
  failure on the virtual store → \`ReadVirtualStore\`

Each fake is the per-test unit-struct shape documented at
#332 (comment 4345054524) — no \`&self\`, only the
capability methods that get called are non-trivial; the rest are
\`unreachable!()\` to assert the call path.

Test count: cmd-shim 59 → 63, package-manager (lib) 37 → 38.
…cases

Local cargo-llvm-cov on the cmd-shim crate showed the following
uncovered lines after the broader DI work:

- bin_resolver.rs: walkdir-filter unsafe-name continue, empty-key
  guard inside is_safe_bin_name, CurDir branch in lexical_normalize
- link_bins.rs: pick_winner (true, false) arm, ProbeShimSource
  propagation, ReadManifest propagation

Added focused tests:

- empty bin key in object form is rejected
- directories.bin filters files whose basename fails the URL-safe
  guard (file with a space)
- directories.bin handles ./<dir> via the CurDir branch
- ownership_breaks_bin_conflicts_when_existing_owns covers the
  (true, false) pick_winner arm
- DI fakes for ProbeShimSource and ReadManifest error variants

Coverage on cmd-shim went 93.99% -> 95.32% line, 93.58% -> 94.57%
region. Remaining gaps are platform-conditional or compiler-elided
empty match arms (Component::CurDir => {}, Windows .cmd absolute-path
branch) that aren't fruitful targets without changing the
implementation shape.

Test count: cmd-shim 63 -> 69.

This comment was marked as resolved.

…main neutrality

Renames the generic parameter and concrete provider type from \`Fs\`/
\`RealFs\` to \`Api\`/\`RealApi\` across pacquet-cmd-shim and the
pacquet-package-manager call sites. Also renames the module
\`fs_capabilities.rs\` -> \`capabilities.rs\`.

The trait names keep their \`Fs\` prefix (\`FsReadDir\`, \`FsReadFile\`,
\`FsReadString\`, \`FsReadHead\`, \`FsCreateDirAll\`, \`FsWriteAtomic\`,
\`FsSetPermissions\`) — domain prefixes belong on traits, not on the
provider.

Reasoning, per the revised DI reference at
#332 (comment 4345054524):

The reference uses \`RealApi\` to model a single provider that
implements every capability the install pipeline needs, regardless
of domain. Crates that today only have fs-side capabilities (this
one, modules-yaml) will likely cross domains in the next milestones
— lifecycle scripts need \`GetNodeExecPath\` or similar, fixture-
based install tests need \`GetCurrentTime\` for \`prunedAt\`. Doing the
rename now avoids a churning rename when the first non-\`Fs*\`
capability lands.

No semantic changes; tests all still pass (cmd-shim 69, package-
manager-lib 38).
@KSXGitHub KSXGitHub mentioned this pull request Apr 29, 2026
3 tasks
claude added 2 commits April 29, 2026 17:08
The latest revision of the DI reference comment
(#332 issuecomment-4345054524) added principle 6:

> Provider names are domain-neutral; trait names are domain-scoped.
> The generic and its production impl are \`Api\` and \`RealApi\`
> regardless of how many domains they cover.

That principle is about the *production* provider. Test fakes
follow a different rule the same revision spells out:

> Test fakes describe what behaviour they fake, not their provider role.

The upstream example (\`BadYamlFs\`) names a single-trait fake by
behaviour + domain of the trait it fakes. My earlier rename had
turned \`PermissionDeniedFs\` / \`EmptyReadFs\` into
\`PermissionDeniedApi\` / \`EmptyReadApi\` — that was wrong: those
fakes implement only \`FsReadHead\`, not the full provider role,
so the \`Api\` suffix overstates what they do. Revert to the
behaviour+domain shape that matches \`BadYamlFs\`.

Multi-trait fakes (\`FailingProbe\`, \`DenyManifestRead\`, etc.)
already follow behaviour-only naming and don't need a domain
suffix — they describe the failure scenario directly.
…aviour + trait domain"

This reverts commit d234176. The DI reference principle is being
revised to keep the \`Api\` suffix on single-trait test fakes for
consistency with the production provider naming, so restore
\`PermissionDeniedApi\` / \`EmptyReadApi\` until the principle update
lands.
Two doc fixes from the Copilot review on PR #333
(pullrequestreview-4267779783 / -4267945452):

1. `normalize_bundled_manifest` claimed `{ name, version }`-shaped
   manifests typically return `None` because those fields are "NOT
   in the pick list". Wrong on both counts: `name` is in
   `BUNDLED_MANIFEST_FIELDS` and `version` is special-cased first.
   A real npm-published `package.json` always carries both, so the
   typical case is `Some(...)`, not `None`. Updated to describe
   when `None` actually fires (non-object input, all fields
   absent or `null`).

2. `PrefetchedManifests` claimed `None` entries are stored to
   distinguish "no manifest stored" from "package wasn't prefetched
   at all", but the type is `HashMap<String, Arc<Value>>` — there's
   no way to express `None`. Rewrote: a missing key is either
   case, and callers cross-reference with [`PrefetchedCasPaths`]
   from the same [`PrefetchResult`] when they need to tell them
   apart.

No code change.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 30 out of 31 changed files in this pull request and generated 2 comments.

Comment thread crates/tarball/src/lib.rs
Comment thread crates/tarball/src/lib.rs
@zkochan

zkochan commented May 11, 2026

Copy link
Copy Markdown
Member

@coderabbitai full review

@coderabbitai

coderabbitai Bot commented May 11, 2026

Copy link
Copy Markdown
✅ Actions performed

Full review triggered.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/tarball/src/tests.rs (1)

160-230: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Avoid live registry calls in unit tests.

packages_under_orgs_should_work and should_throw_error_on_checksum_mismatch now depend on registry.npmjs.org, so DNS/rate-limit/outage noise can fail this suite even when the tarball code is fine. This file already has a hermetic pattern via FASTIFY_ERROR_TARBALL and mockito; please switch these two tests to that fixture too.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/tarball/src/tests.rs` around lines 160 - 230, Both tests call out to
registry.npmjs.org; switch them to use the existing hermetic fixture and mockito
instead. For packages_under_orgs_should_work and
should_throw_error_on_checksum_mismatch, replace the literal package_url with
the FASTIFY_ERROR_TARBALL fixture and register a mockito::mock that serves that
tarball (matching path and response used elsewhere in this file), and ensure the
http client used by DownloadTarballToStore points to the mockito server (as
other tests do). Keep the same package_id, integrity, and size; reuse the
existing FASTIFY_ERROR_TARBALL + mock setup pattern so the tests are fully
hermetic.
🧹 Nitpick comments (2)
crates/tarball/src/tests.rs (1)

405-530: ⚡ Quick win

Add coverage for PrefetchResult.manifests.

These updated prefetch tests only assert prefetched.cas_paths. A regression in manifest extraction/splitting would still pass here even though downstream bin-linking now consumes prefetched.manifests. A small fixture row with manifest: Some(...) would close that gap.

As per coding guidelines, "When porting behavior from pnpm, port the relevant pnpm tests too as Rust tests whenever they translate to match test coverage and prove behavioral parity."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/tarball/src/tests.rs` around lines 405 - 530, Add a small manifest
fixture to the seeded index rows and assert PrefetchResult.manifests is
populated: in the tests that call prefetch_cas_paths (e.g.,
prefetch_cas_paths_omits_failed_integrity_entries and
prefetch_cas_paths_skips_filesystem_checks_when_verify_disabled) set
PackageFilesIndex.manifest = Some(your_manifest_string_or_bytes) when building
`entry`, then after `prefetched` is returned check
`prefetched.manifests.get(&index_key)` exists and contains the expected manifest
value; this verifies prefetch_cas_paths' manifest-extraction/splitting path
alongside cas_paths.
crates/package-manager/src/link_bins.rs (1)

193-202: ⚡ Quick win

Don't treat packages: {} the same as “metadata unavailable”.

build_has_bin_set() returns an empty set both when packages is None and when the lockfile actually has a packages: section but no entry advertises hasBin: true. run_lockfile_driven() then interprets that empty set as “disable filtering”, so a no-bin workspace falls back to walking every child manifest even though the lockfile already proved no shims can be produced.

Returning Option<HashSet<PackageKey>> or a separate has_package_metadata flag would keep the fast-path active for that case.

Also applies to: 230-234

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/package-manager/src/link_bins.rs` around lines 193 - 202, The function
build_has_bin_set currently conflates "no package metadata available" with
"there are packages but none advertise has_bin", so change build_has_bin_set to
return Option<HashSet<PackageKey>> (None = metadata unavailable, Some(set) =
known set possibly empty), update its callers (e.g., run_lockfile_driven and any
other uses) to treat Some(empty) as the fast-path "no shims" case and only
disable filtering when the function returns None, and adjust types/signatures
where it was previously assumed to be HashSet to propagate the Option and handle
the None vs Some(empty) distinction.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/cmd-shim/src/link_bins.rs`:
- Around line 306-320: The current no-op check uses cmd_exists/ps1_exists which
only tests file existence and lets stale/corrupt .cmd/.ps1 siblings bypass
repair; change those to validate file contents like sh_marker_ok does: replace
cmd_exists and ps1_exists with content checks (e.g.
matches!(Api::read_to_string(&cmd_path), Ok(existing) if
is_cmd_pointing_at(&existing, target_path)) and similarly for ps1_path) or, if
helpers aren’t present, read the files with Api::read_to_string and verify they
contain/resolve to target_path (or match the expected template); then compute
already_correct = sh_marker_ok && cmd_marker_ok && ps1_marker_ok so we only skip
when all three sh/cmd/ps1 actually point at target_path.

In `@crates/cmd-shim/src/shim.rs`:
- Around line 101-104: The current parse_shebang_from_bytes uses trim_start(),
which incorrectly accepts shebangs not at byte 0; change it so the first line is
taken exactly as-is (no leading-trim) and only passed to parse_shebang when it
begins with "#!" at byte 0. Concretely, in parse_shebang_from_bytes remove
trim_start(), extract the first line (e.g.,
head.split('\n').next().unwrap_or("") and trim_end_matches('\r')), then check
first_line.starts_with("#!") — if true call parse_shebang(first_line) otherwise
return None; keep function names parse_shebang_from_bytes and parse_shebang to
locate the change.

In `@crates/package-manager/src/link_bins.rs`:
- Around line 397-421: find_slot_own_package_dir now returns a syntactic PathBuf
even when the actual package dir (slot/node_modules/<own>) doesn’t exist,
causing link_bins_excluding to create phantom <own>/node_modules/.bin shims;
modify link_bins_excluding to track whether the existing walk actually
encountered the exclude (i.e. found a real own package directory) and if no
exclude was found, bail out before writing shims; specifically, inside
link_bins_excluding (where it currently calls find_slot_own_package_dir and
collects sibling bins) add a boolean flag (e.g. saw_exclude) that is set only
when the directory exists on disk or when an exclude entry was observed during
the traversal, and check that flag before creating the <own>/node_modules/.bin
directory or writing any shims so slots without a real own package are skipped.
- Around line 462-465: The code currently swallows all errors from
Api::read_dir(&path) inside the name_str.starts_with('@') branch by using an
unconditional continue; change this so only a NotFound (concurrent delete) is
skipped while other errors are propagated. Specifically, replace the current let
Ok(scope_entries) = Api::read_dir(&path) else { continue; } with a match on
Api::read_dir(&path) that: binds Ok(scope_entries) as before; if Err(e) and
e.kind() == std::io::ErrorKind::NotFound then continue; otherwise propagate or
return the error (e.g., return Err(e) or use ?), so failures like
PermissionDenied/EIO are surfaced instead of dropping the whole scope.

---

Outside diff comments:
In `@crates/tarball/src/tests.rs`:
- Around line 160-230: Both tests call out to registry.npmjs.org; switch them to
use the existing hermetic fixture and mockito instead. For
packages_under_orgs_should_work and should_throw_error_on_checksum_mismatch,
replace the literal package_url with the FASTIFY_ERROR_TARBALL fixture and
register a mockito::mock that serves that tarball (matching path and response
used elsewhere in this file), and ensure the http client used by
DownloadTarballToStore points to the mockito server (as other tests do). Keep
the same package_id, integrity, and size; reuse the existing
FASTIFY_ERROR_TARBALL + mock setup pattern so the tests are fully hermetic.

---

Nitpick comments:
In `@crates/package-manager/src/link_bins.rs`:
- Around line 193-202: The function build_has_bin_set currently conflates "no
package metadata available" with "there are packages but none advertise
has_bin", so change build_has_bin_set to return Option<HashSet<PackageKey>>
(None = metadata unavailable, Some(set) = known set possibly empty), update its
callers (e.g., run_lockfile_driven and any other uses) to treat Some(empty) as
the fast-path "no shims" case and only disable filtering when the function
returns None, and adjust types/signatures where it was previously assumed to be
HashSet to propagate the Option and handle the None vs Some(empty) distinction.

In `@crates/tarball/src/tests.rs`:
- Around line 405-530: Add a small manifest fixture to the seeded index rows and
assert PrefetchResult.manifests is populated: in the tests that call
prefetch_cas_paths (e.g., prefetch_cas_paths_omits_failed_integrity_entries and
prefetch_cas_paths_skips_filesystem_checks_when_verify_disabled) set
PackageFilesIndex.manifest = Some(your_manifest_string_or_bytes) when building
`entry`, then after `prefetched` is returned check
`prefetched.manifests.get(&index_key)` exists and contains the expected manifest
value; this verifies prefetch_cas_paths' manifest-extraction/splitting path
alongside cas_paths.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: f6f0f425-a087-48b3-951b-524ee483e6de

📥 Commits

Reviewing files that changed from the base of the PR and between adba116 and 846e1d3.

⛔ Files ignored due to path filters (5)
  • Cargo.lock is excluded by !**/*.lock
  • crates/cli/tests/snapshots/add__should_install_all_dependencies.snap is excluded by !**/*.snap
  • crates/cli/tests/snapshots/add__should_symlink_correctly.snap is excluded by !**/*.snap
  • crates/cli/tests/snapshots/install__should_install_dependencies.snap is excluded by !**/*.snap
  • crates/package-manager/src/install/snapshots/pacquet_package_manager__install__tests__should_install_dependencies.snap is excluded by !**/*.snap
📒 Files selected for processing (26)
  • .typos.toml
  • Cargo.toml
  • crates/cmd-shim/Cargo.toml
  • crates/cmd-shim/src/bin_resolver.rs
  • crates/cmd-shim/src/bin_resolver/tests.rs
  • crates/cmd-shim/src/capabilities.rs
  • crates/cmd-shim/src/lib.rs
  • crates/cmd-shim/src/link_bins.rs
  • crates/cmd-shim/src/link_bins/tests.rs
  • crates/cmd-shim/src/path_util.rs
  • crates/cmd-shim/src/shim.rs
  • crates/cmd-shim/src/shim/tests.rs
  • crates/package-manager/Cargo.toml
  • crates/package-manager/src/create_virtual_store.rs
  • crates/package-manager/src/install_frozen_lockfile.rs
  • crates/package-manager/src/install_without_lockfile.rs
  • crates/package-manager/src/lib.rs
  • crates/package-manager/src/link_bins.rs
  • crates/package-manager/src/link_bins/tests.rs
  • crates/package-manager/src/symlink_direct_dependencies.rs
  • crates/store-dir/src/msgpackr_records.rs
  • crates/store-dir/src/msgpackr_records/tests.rs
  • crates/store-dir/src/store_index.rs
  • crates/tarball/Cargo.toml
  • crates/tarball/src/lib.rs
  • crates/tarball/src/tests.rs
📜 Review details
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

**/*.rs: Log emissions are part of 'match pnpm'—when porting a function that fires pnpm:<channel> events through globalLogger, logger.debug(...), or streamParser.write(...), mirror the call site, payload, and ordering so @pnpm/cli.default-reporter parses pacquet's NDJSON the same way
Declare a newtype wrapper for branded string types instead of collapsing into plain String or &str to preserve the distinction between validating and non-validating brands
If upstream always validates before construction of a branded string type, implement validation via TryFrom<String> and/or FromStr without providing an infallible public constructor
If upstream never validates a branded string type, expose an infallible From<String> constructor to provide type-safety without runtime validation
When upstream occasionally constructs a branded string type without validation via bare as assertion, expose a from_str_unchecked constructor so callers can opt into the same unchecked path explicitly
Match upstream serde behavior for branded types—use #[serde(try_from = "String")] for deserialization to go through the validator and #[serde(into = "String")] for serialization
Use #[derive(derive_more::From)] and #[derive(derive_more::Into)] for mechanical conversion impls on branded types rather than handwriting impl blocks
Model string-literal unions (like 'auto' | 'always' | 'never') as Rust enums instead of newtype wrappers
Template literal types (like ${string}@${string}) should be treated as branded string types with the same validation discipline applied
Follow the code style guide in CODE_STYLE_GUIDE.md covering code-level conventions not enforced by tooling: imports, modules, naming, ownership and borrowing, parameter type selection, trait bounds, pattern matching, pipe-trait, error handling, test layout, and cloning of Arc and Rc
Warnings are errors in cargo clippy --locked -- --deny warnings, and should not be silenced with `#[allow(....

Files:

  • crates/cmd-shim/src/path_util.rs
  • crates/cmd-shim/src/lib.rs
  • crates/package-manager/src/lib.rs
  • crates/tarball/src/tests.rs
  • crates/package-manager/src/symlink_direct_dependencies.rs
  • crates/store-dir/src/store_index.rs
  • crates/cmd-shim/src/shim/tests.rs
  • crates/package-manager/src/link_bins/tests.rs
  • crates/package-manager/src/install_without_lockfile.rs
  • crates/store-dir/src/msgpackr_records/tests.rs
  • crates/package-manager/src/link_bins.rs
  • crates/store-dir/src/msgpackr_records.rs
  • crates/tarball/src/lib.rs
  • crates/cmd-shim/src/bin_resolver.rs
  • crates/package-manager/src/create_virtual_store.rs
  • crates/package-manager/src/install_frozen_lockfile.rs
  • crates/cmd-shim/src/link_bins.rs
  • crates/cmd-shim/src/capabilities.rs
  • crates/cmd-shim/src/shim.rs
  • crates/cmd-shim/src/link_bins/tests.rs
  • crates/cmd-shim/src/bin_resolver/tests.rs
🧠 Learnings (3)
📚 Learning: 2026-05-07T23:19:08.272Z
Learnt from: KSXGitHub
Repo: pnpm/pacquet PR: 401
File: tasks/integrated-benchmark/src/work_env.rs:343-344
Timestamp: 2026-05-07T23:19:08.272Z
Learning: When reviewing Rust code in pnpm/pacquet for deprecated API usage, do not automatically treat `serde_saphyr::to_string` as deprecated. In `serde-saphyr` v0.0.25, `serde_saphyr::to_string` has no `#[deprecated]` attribute (the `#[deprecated]` later in `serde-saphyr-0.0.25/src/lib.rs` applies to a different function). Only flag `serde_saphyr::to_string` as deprecated if the resolved dependency version’s source shows `#[deprecated]` on that specific function.

Applied to files:

  • crates/cmd-shim/src/path_util.rs
  • crates/cmd-shim/src/lib.rs
  • crates/package-manager/src/lib.rs
  • crates/tarball/src/tests.rs
  • crates/package-manager/src/symlink_direct_dependencies.rs
  • crates/store-dir/src/store_index.rs
  • crates/cmd-shim/src/shim/tests.rs
  • crates/package-manager/src/link_bins/tests.rs
  • crates/package-manager/src/install_without_lockfile.rs
  • crates/store-dir/src/msgpackr_records/tests.rs
  • crates/package-manager/src/link_bins.rs
  • crates/store-dir/src/msgpackr_records.rs
  • crates/tarball/src/lib.rs
  • crates/cmd-shim/src/bin_resolver.rs
  • crates/package-manager/src/create_virtual_store.rs
  • crates/package-manager/src/install_frozen_lockfile.rs
  • crates/cmd-shim/src/link_bins.rs
  • crates/cmd-shim/src/capabilities.rs
  • crates/cmd-shim/src/shim.rs
  • crates/cmd-shim/src/link_bins/tests.rs
  • crates/cmd-shim/src/bin_resolver/tests.rs
📚 Learning: 2026-05-01T10:01:33.766Z
Learnt from: zkochan
Repo: pnpm/pacquet PR: 349
File: crates/reporter/src/tests.rs:121-121
Timestamp: 2026-05-01T10:01:33.766Z
Learning: In Rust test code, follow the repo’s CODE_STYLE_GUIDE test-logging rule: add logging (e.g., `eprintln!`/`eprintln!(...)`) so that useful diagnostic values are printed when a test fails, unless the assertion is `assert_eq!` (where the differing values are already included). Concretely, if you use assertions like `assert!`, `assert_ne!`, etc., ensure the test logs the relevant actual/expected values (or context) before/around the assertion so failures can be diagnosed without rerunning.

Applied to files:

  • crates/tarball/src/tests.rs
  • crates/cmd-shim/src/shim/tests.rs
  • crates/package-manager/src/link_bins/tests.rs
  • crates/store-dir/src/msgpackr_records/tests.rs
  • crates/cmd-shim/src/link_bins/tests.rs
  • crates/cmd-shim/src/bin_resolver/tests.rs
📚 Learning: 2026-05-01T10:01:33.766Z
Learnt from: zkochan
Repo: pnpm/pacquet PR: 349
File: crates/reporter/src/tests.rs:121-121
Timestamp: 2026-05-01T10:01:33.766Z
Learning: In this repo’s Rust tests (run via `cargo nextest`), stdout/stderr from passing tests is captured and is only shown when a test fails. As a result, `eprintln!` (and similar stderr logging) in these tests should generally not be flagged as CI “noise” on the happy path; CI output should only appear for failing tests.

Applied to files:

  • crates/tarball/src/tests.rs
🔇 Additional comments (8)
.typos.toml (1)

10-10: Looks good.
Allowlisting pn is appropriate for pnpm-related terminology and avoids noisy typo false positives.

crates/tarball/Cargo.toml (1)

26-26: Dependency update is appropriate.
Adding rayon here matches the parallelization work and keeps the manifest consistent with usage.

crates/cmd-shim/Cargo.toml (1)

1-23: Nice crate manifest setup.
The dependency set and workspace wiring look clean and consistent for the new pacquet-cmd-shim crate.

Cargo.toml (1)

17-17: Workspace dependency registration looks correct.
This cleanly exposes pacquet-cmd-shim to member crates via workspace = true.

crates/package-manager/Cargo.toml (1)

14-14: Integration dependency is correctly declared.
pacquet-cmd-shim is wired in the right place for package-manager bin-linking integration.

Also applies to: 36-36

crates/cmd-shim/src/path_util.rs (1)

28-50: Path normalization implementation is solid.
The anchored/unanchored .. handling in lexical_normalize and is_anchored is well-structured and matches the intended lexical containment behavior.

crates/store-dir/src/msgpackr_records.rs (1)

639-850: Great extension of the encoder for manifest support.
The JSON record-encoding path and write_int addition fit the existing msgpackr records model and keep slot-collision handling consistent.

Also applies to: 1019-1043

crates/store-dir/src/msgpackr_records/tests.rs (1)

597-730: Excellent coverage for the new manifest encoding path.
These tests exercise the right interop-critical behaviors (round-trip, nested-object records, shape reuse, and mixed-field payloads).

Comment thread crates/cmd-shim/src/link_bins.rs Outdated
Comment thread crates/cmd-shim/src/shim.rs
Comment thread crates/package-manager/src/link_bins.rs
Comment thread crates/package-manager/src/link_bins.rs
zkochan added 2 commits May 12, 2026 01:27
Two fixes from CodeRabbit's full review on PR #333 (review
4267999547 at 23:23):

1. **Restore the slot's-own-pkg-dir probe in the readdir
   fallback** (`r3222744362`). The lockfile-driven path doesn't
   need it because `create_virtual_dir_by_snapshot` enforces the
   `<slot>/node_modules/<pkg>` invariant before bin-linking — that
   was the rationale for dropping the probe in 6e079bf (1267
   wasted readdirs on warm install). The readdir fallback path
   (`InstallWithoutLockfile` only) has no such guarantee: if a
   slot's `node_modules/<pkg>` is missing but sibling deps are
   still present, `link_bins_excluding` would `create_dir_all` the
   phantom `<pkg>` chain to hold shims, leaving an orphan package
   directory on disk. Re-add `Api::read_dir(&self_pkg_dir).is_err()`
   gating before calling `link_bins_excluding` *only* in this
   fallback path — visits ~direct-deps slots (small N), so the
   probe cost is trivial here.

2. **Surface non-`NotFound` errors on scope read_dir**
   (`r3222744370`). `link_bins_excluding` was swallowing every
   error on `Api::read_dir(<scope-dir>)` with a bare
   `let Ok(...) else continue`. A `PermissionDenied`, `EIO`, or
   AppArmor deny there would silently drop every bin under the
   scope. Match the `NotFound`-only-skip pattern the rest of the
   file uses (`modules_dir` read, manifest read).

All 13 link_bins tests still pass. fmt + clippy clean.
CodeRabbit's review (4267999547) flagged two pre-existing issues in
the cmd-shim crate that the perf work didn't touch but are worth
fixing alongside the others (and @zkochan explicitly authorized
edits in this file).

1. **Byte-compare `.cmd` / `.ps1` against the generated body before
   skipping the rewrite** (`r3222744353`). `link_bins.rs::write_shim`'s
   `already_correct` gate was checking only that the `.cmd` and
   `.ps1` files existed (`.is_ok()` on a `read_to_string`), so a
   stale or hand-edited `.cmd` pointing at the wrong target would
   slip through and never get repaired on warm reinstalls.
   Generated bodies are stable across pacquet versions (only the
   target-path segment moves), so byte equality is a sound
   equivalence check — same `matches!` shape the `.sh` flavor uses
   but comparing against the freshly generated body instead of a
   marker.

2. **Don't accept shebangs after leading whitespace**
   (`r3222744356`). `shim.rs::parse_shebang_from_bytes` was
   running `String::from_utf8_lossy(bytes).trim_start()` before
   looking at the first line. The kernel and upstream cmd-shim
   only honour `#!` at byte 0 of the file, so a script that just
   happens to start with whitespace and then has `#!` later would
   have been misclassified as a valid shebang and select the
   wrong runtime. Drop the `trim_start()`; let `parse_shebang`'s
   `strip_prefix("#!")` match at column 0 of the first line, as
   upstream's `^#!\s*...` regex implicitly does.

All 444 workspace tests still pass.
Copilot AI review requested due to automatic review settings May 11, 2026 23:30

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 30 out of 31 changed files in this pull request and generated 1 comment.

Comment thread crates/cmd-shim/src/link_bins.rs Outdated
zkochan added 2 commits May 12, 2026 01:35
The doc comment I added in 1e3232e linked `[\`parse_shebang\`]` from
`parse_shebang_from_bytes` (public) to `parse_shebang` (private),
which trips `rustdoc::private_intra_doc_links` under `-D warnings`
when CI runs `cargo doc` *without* `--document-private-items`. The
link only resolved because my local run used the private-items
flag.

Replace the intra-doc link with plain backticks. Same meaning to
the reader, no broken rustdoc target.
Mirror site of the same fix `5194d9bb` made in
`crates/package-manager/src/link_bins.rs` — Copilot flagged the
parallel pattern in `collect_packages_in_modules_dir`
(<#333 (comment)>).

Was:

    let Ok(scope_entries) = Api::read_dir(&path) else {
        continue;
    };

Swallowed every error on a scope directory. `PermissionDenied`,
`EIO`, or AppArmor deny would silently drop every bin under that
scope. Match the file's own `modules_dir` read policy: skip only
`NotFound` (concurrent delete), surface everything else as
`LinkBinsError::ReadModulesDir`.
Copilot AI review requested due to automatic review settings May 11, 2026 23:36

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 30 out of 31 changed files in this pull request and generated 2 comments.

Comment thread crates/tarball/src/lib.rs Outdated
Comment thread crates/package-manager/src/link_bins.rs Outdated
Two Copilot findings on the latest review:

1. **Bundled manifest must be last-wins on duplicate `package.json`
   tar entries** (`r3222807498`). pnpm's `addFilesFromTarball` at
   [`store/cafs/src/addFilesFromTarball.ts:41-43`](https://github.com/pnpm/pnpm/blob/4750fd370c/store/cafs/src/addFilesFromTarball.ts#L41-L43)
   unconditionally re-assigns `manifestBuffer = fileBuffer` on every
   `package.json` entry — same overwrite semantics as
   `filesIndex.set(...)`, so `manifest` and `files` stay consistent
   when a tarball publishes duplicate paths. Pacquet was first-wins
   via `pkg_files_idx.manifest.is_none()`, so a divergent later
   `package.json` would leave a stale earlier manifest in the index
   row. Drop the guard, parse + assign on every `package.json`
   entry, clear to `None` on parse failure (matches pnpm's "parse
   once at end, undefined on failure" via early degrade).

2. **`unfiltered = has_bin_set.is_empty()` conflated two
   semantically different cases** (`r3222807548`):
   - lockfile `packages:` section missing — we have no info,
     fall back to processing all children.
   - lockfile present but no entry has `hasBin: true` — the
     lockfile *says* no package has a bin and every slot should
     short-circuit.
   Pnpm's filter at `during-install/src/index.ts:283` returns
   `dep.hasBin && dep.installable !== false`; on a lockfile with
   no `hasBin: true` entries pnpm rejects every child. Change
   `build_has_bin_set` to return `Option<HashSet<PackageKey>>`:
   `None` = info absent (process all), `Some(set)` = info present
   and authoritative (filter via the set, even when empty).

All 444 workspace tests pass.

@zkochan zkochan left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The performance regression has been fixed.

@zkochan zkochan merged commit 6862e70 into main May 12, 2026
16 checks passed
@zkochan zkochan deleted the claude/link-dependency-binaries-JWwJN branch May 12, 2026 00:07
zkochan added a commit that referenced this pull request May 14, 2026
…op-level bin re-link (#342) (#523)

Two related bin-linking behaviors deferred from #333.

Behavior 1 — hoisted-bin precedence:
- Add `BinOrigin { Direct, Hoisted }` discriminator to `PackageBinSource`.
- New top tier in `pick_winner`: Direct wins outright over Hoisted
  regardless of ownership / lexical order. Mirrors upstream's
  `preferDirectCmds` partition at
  https://github.com/pnpm/pnpm/blob/4750fd370c/bins/linker/src/index.ts#L92.
- New `link_top_level_bins(modules_dir, direct, hoisted)` helper in
  `pacquet-package-manager` mixes both candidate lists into one
  `link_bins_of_packages` call so the new tier resolves conflicts in
  a single pass — previously the two passes (SymlinkDirectDependencies
  for direct + hoist pass for publicly-hoisted) wrote shims
  independently and a hoisted bin could shadow a direct one when its
  package name was lexically smaller.

Behavior 2 — lifecycle-script-created bins:
- Add a post-`BuildModules` per-importer top-level bin link pass.
  Re-reading each direct dep's `package.json` after lifecycle
  scripts run picks up bins generated by `postinstall` (the
  `@pnpm.e2e/generated-bins` upstream fixture). Idempotent for
  unchanged shims via `is_shim_pointing_at`.
- Mirrors upstream's `linkBinsOfImporter` pass that runs after
  `buildModules` at
  https://github.com/pnpm/pnpm/blob/4750fd370c/installing/deps-installer/src/install/index.ts#L1539.

Supporting changes:
- `PackageBinSource::new(location, manifest)` constructor + `with_origin`
  builder so existing call sites don't have to spell out the new field.
- Public `direct_dep_names_for_importer` helper extracted from
  `SymlinkDirectDependencies` so the post-build pass uses the same
  filter (skipped / first-wins / link_only) as the symlink phase.
- `InstallFrozenLockfileError::TopLevelBinLink` for the new failure
  surface.

Tests:
- `direct_origin_wins_over_hoisted_regardless_of_lexical` — pins the
  new tier overrides lexical ordering.
- `hoisted_origin_loses_to_existing_direct` — pins both arms of the
  new tier (Direct incumbent vs Hoisted candidate).
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Link dependency binaries

4 participants