feat: bin#333
Conversation
Mirrors pnpm v11's `@pnpm/bins.resolver`, `@pnpm/bins.linker`, and `@zkochan/cmd-shim` (resolved at pnpm `4750fd370c` and cmd-shim `0d79ca9534`): - New `pacquet-cmd-shim` crate parses the `bin` field of a package manifest, resolves bin commands with the same path-traversal and URL-safe-name guards as upstream, and writes Unix shell shims byte-compatible with pnpm's `generateShShim`. Conflict resolution follows `pkgOwnsBin` + lexical compare. - New `LinkVirtualStoreBins` step walks each virtual-store slot and links its child packages' bins into the slot's own `node_modules/.bin`, matching `linkBinsOfDependencies` in pnpm's `building/during-install`. - `SymlinkDirectDependencies` and `InstallWithoutLockfile` now call the bin linker after the symlink layout is in place, so direct deps' bins land under `<modules_dir>/.bin/`. Per `plans/TEST_PORTING.md` "Link Dependency Binaries", this is the direct-dependency-bins-first slice. Hoisted-bin precedence, `directories.bin` discovery, lifecycle-script-created bins, and Windows `.cmd`/`.ps1` shims are deferred.
This comment was marked as resolved.
This comment was marked as resolved.
The link triggered `rustdoc::private_intra_doc_links` under `-D warnings` (the Doc CI job), which fails CI. Replace with plain backticks since the function is intentionally private.
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #333 +/- ##
==========================================
+ Coverage 83.68% 84.57% +0.88%
==========================================
Files 69 75 +6
Lines 4260 5159 +899
==========================================
+ Hits 3565 4363 +798
- Misses 695 796 +101 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Micro-Benchmark ResultsLinux |
Integrated-Benchmark Report (Linux)Scenario: Frozen Lockfile
BENCHMARK_REPORT.json{
"results": [
{
"command": "pacquet@HEAD",
"mean": 2.5918448134800007,
"stddev": 0.11171097798591177,
"median": 2.56717565408,
"user": 2.52928088,
"system": 3.43436424,
"min": 2.45213270608,
"max": 2.79008502408,
"times": [
2.79008502408,
2.45213270608,
2.52115285708,
2.6266596540800005,
2.7688762120800003,
2.58837483908,
2.54597646908,
2.5377177890800002,
2.48773920608,
2.5997333780800003
]
},
{
"command": "pacquet@main",
"mean": 2.44288746538,
"stddev": 0.10028409602270291,
"median": 2.4079852925800003,
"user": 2.40243988,
"system": 3.3772794399999997,
"min": 2.3217915390800004,
"max": 2.6731706920800002,
"times": [
2.37978561308,
2.5428539100800003,
2.47831268008,
2.40789506108,
2.40807552408,
2.6731706920800002,
2.38587150108,
2.4013009260800002,
2.3217915390800004,
2.42981720708
]
},
{
"command": "pnpm",
"mean": 6.25374860388,
"stddev": 0.10170106945773101,
"median": 6.28511785208,
"user": 9.036589379999999,
"system": 4.485877240000001,
"min": 6.09435619108,
"max": 6.37261110408,
"times": [
6.37261110408,
6.096802386079999,
6.34019519608,
6.331626594079999,
6.26531160808,
6.205647734079999,
6.191780877079999,
6.334230252079999,
6.09435619108,
6.30492409608
]
}
]
}Scenario: Frozen Lockfile (Hot Cache)
BENCHMARK_REPORT.json{
"results": [
{
"command": "pacquet@HEAD",
"mean": 0.67741800416,
"stddev": 0.0577009849500688,
"median": 0.66179080276,
"user": 0.35493272,
"system": 1.4631175599999997,
"min": 0.64045146226,
"max": 0.83716713926,
"times": [
0.83716713926,
0.66283738726,
0.66151857626,
0.65511925626,
0.68866231126,
0.66206302926,
0.66776614326,
0.64045146226,
0.65557402326,
0.64302071326
]
},
{
"command": "pacquet@main",
"mean": 0.79450122656,
"stddev": 0.058394655164998935,
"median": 0.77183622276,
"user": 0.26607082,
"system": 1.40717396,
"min": 0.74293681026,
"max": 0.91048906026,
"times": [
0.91048906026,
0.75176440826,
0.74293681026,
0.75151129726,
0.86773088026,
0.79190803726,
0.74797063626,
0.74664929026,
0.82267619626,
0.81137564926
]
},
{
"command": "pnpm",
"mean": 2.67705237256,
"stddev": 0.08285584132050684,
"median": 2.6911332857600003,
"user": 3.1653496199999998,
"system": 2.23074106,
"min": 2.5305021132600003,
"max": 2.80605876026,
"times": [
2.6083950982600004,
2.6694047422600002,
2.69129296826,
2.80605876026,
2.75294408426,
2.5816265092600004,
2.5305021132600003,
2.71926339026,
2.6909736032600002,
2.72006245626
]
}
]
} |
The integrated benchmark on the 1267-package fixture showed a ~4-12% regression vs. main on both `Frozen Lockfile` and `Frozen Lockfile (Hot Cache)` after introducing bin linking. Each virtual-store slot's bin walk was running serially: - `LinkVirtualStoreBins::run` did `read_dir + is_dir` per slot inside a serial `for` loop. With ~1300 slots in the benchmark, the per-slot syscall overhead alone accounted for most of the new wall time. - `link_bins_of_packages` wrote every shim (read shebang, write file, chmod) on the calling thread. - `collect_bin_sources` (direct deps) read every dependency's `package.json` serially. All three loops are work-list shaped with no shared state, so drive them on rayon. Mirrors the per-snapshot warm batch in `create_virtual_store.rs` and pnpm's own `Promise.all`-style fan-out in `linkBinsOfDependencies`.
Per CODE_STYLE_GUIDE.md "Unit test file layout": once a `mod tests` block is long enough to obscure the surrounding module, it moves to `<parent>/tests.rs`. The four test blocks introduced for bin linking (78-130 lines each) were past that threshold. Moved: - crates/cmd-shim/src/bin_resolver.rs -> .../bin_resolver/tests.rs - crates/cmd-shim/src/shim.rs -> .../shim/tests.rs - crates/cmd-shim/src/link_bins.rs -> .../link_bins/tests.rs - crates/package-manager/src/link_bins.rs -> .../link_bins/tests.rs No test logic changed.
Patch coverage on the bin-linking PR was 83.41% (70 lines missing) per
codecov. Three sources of gaps:
1. Edge cases I'd skipped: every fallback extension in
`extension_program`, the `prog: None` arm of `generate_sh_shim`, the
`path::isAbsolute(shTarget)` branch upstream uses, the lexical
normalize `..`-past-root and `CurDir` paths, and `parse_shebang`
empty-prog.
2. pnpm tests not yet ported. Port the ones that translate to pacquet's
no-`directories.bin` first iteration:
- `bins/resolver/test/index.ts`: \$-as-command-name, dangerous bin
names, dangerous bin locations, scoped bin name strip, scoped bin
name with traversal.
- `bins/linker/test/index.ts`: idempotent skip when shim already
points at the same target.
3. Unreachable IO error paths. `search_script_runtime`'s non-`NotFound`
error path can't be triggered with real fs portably. Apply the
per-capability DI pattern documented in
<#332 (comment)>:
- New `FsReadHead` trait + `RealFs` provider.
- `search_script_runtime<Fs: FsReadHead>` and `read_shebang<Fs>`
bind the capability; production callers (`write_shim`) turbofish
`::<RealFs>`.
- Tests inject `PermissionDeniedFs` and `EmptyReadFs` to cover the
propagate-error and zero-bytes-fallthrough paths.
- New pure `parse_shebang_from_bytes` is independently testable.
Also extract `link_direct_dep_bins` from `SymlinkDirectDependencies::run`
into a free function under `package-manager::link_bins`, so the direct-
dep bin-linking step can be unit-tested without spinning up the
mock-registry-backed install pipeline. Adds five tests for it covering
empty deps, missing manifest, and symlink-followed manifest read.
Test count: cmd-shim 19 → 49, package-manager (lib) 28 → 35.
The new \`link_direct_dep_bins_follows_symlink_to_real_package\` test called \`std::os::windows::fs::symlink_dir\` directly, which requires \`SeCreateSymbolicLinkPrivilege\` and fails on Windows CI runners that don't have it. Production code already routes through pacquet_fs's \`symlink_dir\` (which uses junctions on Windows for exactly this reason). Switch the test to the same helper so behavior matches the production path on every platform.
This comment was marked as resolved.
This comment was marked as resolved.
`Path::is_absolute("/abs/elsewhere/cli")` returns true on Unix but
false on Windows (Windows requires a drive letter), so the test forced
the wrong branch on Windows CI. The absolute-vs-relative classification
of shim targets is itself platform-dependent; pin the test to Unix
only.
This comment was marked as resolved.
This comment was marked as resolved.
c200ae2 to
2ffb2d1
Compare
This comment was marked as resolved.
This comment was marked as resolved.
Closes the rest of the bin-resolver / cmd-shim feature parity gap with pnpm v11 within this PR's scope. - \`directories.bin\` fallback (\`bin_resolver.rs\`): walkdir over every regular file under the package's nominated dir, with the same \`is_subdir\` traversal guard pnpm uses. Missing dir degrades to an empty list. Ports five upstream tests: index.ts:37 / :150 / :160 + path-traversal.test.ts. - Windows \`.cmd\` and \`.ps1\` shim generation (\`shim.rs\`): ports \`generateCmdShim\` and \`generatePwshShim\` from @zkochan/cmd-shim 0d79ca9534. CRLF line endings on the .cmd flavor; \$basedir+\$exe header on the .ps1 flavor with Windows/POSIX-pwsh detection. Both exec arms (with-prog / no-prog) ported. Skips the \`nodeExecPath\` / \`nodePath\` / \`prependToPath\` / \`progArgs\` branches we don't use. - \`link_bins.rs\` writes all three shim flavors per bin so a project installed on Linux remains usable on Windows after a clone. Idempotent short-circuit only fires when the canonical .sh shim already points at the same target. - Extract \`symlink_direct_deps_into_node_modules\` from \`SymlinkDirectDependencies::run\`. Same shape as the \`link_direct_dep_bins\` extraction this PR already does — closes the unit-test gap that codecov flagged at 44.11% on symlink_direct_dependencies.rs without needing the mock-registry scaffold. Test count: cmd-shim 49 → 59, package-manager (lib) 35 → 37.
Extends the per-capability DI pattern from \`FsReadHead\` (PR #333) to the rest of the bin-linking IO surface. New traits in \`crates/cmd-shim/src/fs_capabilities.rs\`: - \`FsReadDir\` — list a directory's entries - \`FsReadFile\` — read a file into Vec<u8> (package.json reads) - \`FsReadString\` — read a file into String (warm-skip shim probe) - \`FsCreateDirAll\` — mkdir -p - \`FsWriteAtomic\` — write a file replacing if it exists - \`FsSetPermissions\` — chmod 0o755 / OR-in 0o111 \`RealFs\` provides all of them; \`link_bins\`, \`link_bins_of_packages\`, \`write_shim\`, \`collect_packages_in_modules_dir\`, and \`read_package\` take a generic \`Fs: ...\` bound binding only the capabilities they consume. Production callers (\`InstallWithoutLockfile\`, \`SymlinkDirectDependencies\`, \`LinkVirtualStoreBins::run\`) turbofish \`::<RealFs>\`. \`LinkVirtualStoreBins\` keeps a non-generic \`run\` and exposes \`run_with::<Fs>\` for tests. Adds five new fakes-injected tests covering error paths real fs can't trigger portably: - \`link_bins\` propagates non-NotFound \`read_dir\` failure on modules_dir → \`CreateBinDir\` variant - \`link_bins_of_packages\` propagates \`create_dir_all\` failure → \`CreateBinDir\` - \`link_bins_of_packages\` propagates write failure on the .sh shim → \`WriteShim\` - \`link_bins_of_packages\` propagates chmod failure → \`Chmod\` - \`LinkVirtualStoreBins::run_with\` propagates non-NotFound read_dir failure on the virtual store → \`ReadVirtualStore\` Each fake is the per-test unit-struct shape documented at #332 (comment 4345054524) — no \`&self\`, only the capability methods that get called are non-trivial; the rest are \`unreachable!()\` to assert the call path. Test count: cmd-shim 59 → 63, package-manager (lib) 37 → 38.
…cases
Local cargo-llvm-cov on the cmd-shim crate showed the following
uncovered lines after the broader DI work:
- bin_resolver.rs: walkdir-filter unsafe-name continue, empty-key
guard inside is_safe_bin_name, CurDir branch in lexical_normalize
- link_bins.rs: pick_winner (true, false) arm, ProbeShimSource
propagation, ReadManifest propagation
Added focused tests:
- empty bin key in object form is rejected
- directories.bin filters files whose basename fails the URL-safe
guard (file with a space)
- directories.bin handles ./<dir> via the CurDir branch
- ownership_breaks_bin_conflicts_when_existing_owns covers the
(true, false) pick_winner arm
- DI fakes for ProbeShimSource and ReadManifest error variants
Coverage on cmd-shim went 93.99% -> 95.32% line, 93.58% -> 94.57%
region. Remaining gaps are platform-conditional or compiler-elided
empty match arms (Component::CurDir => {}, Windows .cmd absolute-path
branch) that aren't fruitful targets without changing the
implementation shape.
Test count: cmd-shim 63 -> 69.
This comment was marked as resolved.
This comment was marked as resolved.
…main neutrality Renames the generic parameter and concrete provider type from \`Fs\`/ \`RealFs\` to \`Api\`/\`RealApi\` across pacquet-cmd-shim and the pacquet-package-manager call sites. Also renames the module \`fs_capabilities.rs\` -> \`capabilities.rs\`. The trait names keep their \`Fs\` prefix (\`FsReadDir\`, \`FsReadFile\`, \`FsReadString\`, \`FsReadHead\`, \`FsCreateDirAll\`, \`FsWriteAtomic\`, \`FsSetPermissions\`) — domain prefixes belong on traits, not on the provider. Reasoning, per the revised DI reference at #332 (comment 4345054524): The reference uses \`RealApi\` to model a single provider that implements every capability the install pipeline needs, regardless of domain. Crates that today only have fs-side capabilities (this one, modules-yaml) will likely cross domains in the next milestones — lifecycle scripts need \`GetNodeExecPath\` or similar, fixture- based install tests need \`GetCurrentTime\` for \`prunedAt\`. Doing the rename now avoids a churning rename when the first non-\`Fs*\` capability lands. No semantic changes; tests all still pass (cmd-shim 69, package- manager-lib 38).
The latest revision of the DI reference comment (#332 issuecomment-4345054524) added principle 6: > Provider names are domain-neutral; trait names are domain-scoped. > The generic and its production impl are \`Api\` and \`RealApi\` > regardless of how many domains they cover. That principle is about the *production* provider. Test fakes follow a different rule the same revision spells out: > Test fakes describe what behaviour they fake, not their provider role. The upstream example (\`BadYamlFs\`) names a single-trait fake by behaviour + domain of the trait it fakes. My earlier rename had turned \`PermissionDeniedFs\` / \`EmptyReadFs\` into \`PermissionDeniedApi\` / \`EmptyReadApi\` — that was wrong: those fakes implement only \`FsReadHead\`, not the full provider role, so the \`Api\` suffix overstates what they do. Revert to the behaviour+domain shape that matches \`BadYamlFs\`. Multi-trait fakes (\`FailingProbe\`, \`DenyManifestRead\`, etc.) already follow behaviour-only naming and don't need a domain suffix — they describe the failure scenario directly.
…aviour + trait domain" This reverts commit d234176. The DI reference principle is being revised to keep the \`Api\` suffix on single-trait test fakes for consistency with the production provider naming, so restore \`PermissionDeniedApi\` / \`EmptyReadApi\` until the principle update lands.
0bab2a5 to
37fd996
Compare
Two doc fixes from the Copilot review on PR #333 (pullrequestreview-4267779783 / -4267945452): 1. `normalize_bundled_manifest` claimed `{ name, version }`-shaped manifests typically return `None` because those fields are "NOT in the pick list". Wrong on both counts: `name` is in `BUNDLED_MANIFEST_FIELDS` and `version` is special-cased first. A real npm-published `package.json` always carries both, so the typical case is `Some(...)`, not `None`. Updated to describe when `None` actually fires (non-object input, all fields absent or `null`). 2. `PrefetchedManifests` claimed `None` entries are stored to distinguish "no manifest stored" from "package wasn't prefetched at all", but the type is `HashMap<String, Arc<Value>>` — there's no way to express `None`. Rewrote: a missing key is either case, and callers cross-reference with [`PrefetchedCasPaths`] from the same [`PrefetchResult`] when they need to tell them apart. No code change.
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
There was a problem hiding this comment.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/tarball/src/tests.rs (1)
160-230:⚠️ Potential issue | 🟠 Major | ⚡ Quick winAvoid live registry calls in unit tests.
packages_under_orgs_should_workandshould_throw_error_on_checksum_mismatchnow depend onregistry.npmjs.org, so DNS/rate-limit/outage noise can fail this suite even when the tarball code is fine. This file already has a hermetic pattern viaFASTIFY_ERROR_TARBALLandmockito; please switch these two tests to that fixture too.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/tarball/src/tests.rs` around lines 160 - 230, Both tests call out to registry.npmjs.org; switch them to use the existing hermetic fixture and mockito instead. For packages_under_orgs_should_work and should_throw_error_on_checksum_mismatch, replace the literal package_url with the FASTIFY_ERROR_TARBALL fixture and register a mockito::mock that serves that tarball (matching path and response used elsewhere in this file), and ensure the http client used by DownloadTarballToStore points to the mockito server (as other tests do). Keep the same package_id, integrity, and size; reuse the existing FASTIFY_ERROR_TARBALL + mock setup pattern so the tests are fully hermetic.
🧹 Nitpick comments (2)
crates/tarball/src/tests.rs (1)
405-530: ⚡ Quick winAdd coverage for
PrefetchResult.manifests.These updated prefetch tests only assert
prefetched.cas_paths. A regression in manifest extraction/splitting would still pass here even though downstream bin-linking now consumesprefetched.manifests. A small fixture row withmanifest: Some(...)would close that gap.As per coding guidelines, "When porting behavior from pnpm, port the relevant pnpm tests too as Rust tests whenever they translate to match test coverage and prove behavioral parity."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/tarball/src/tests.rs` around lines 405 - 530, Add a small manifest fixture to the seeded index rows and assert PrefetchResult.manifests is populated: in the tests that call prefetch_cas_paths (e.g., prefetch_cas_paths_omits_failed_integrity_entries and prefetch_cas_paths_skips_filesystem_checks_when_verify_disabled) set PackageFilesIndex.manifest = Some(your_manifest_string_or_bytes) when building `entry`, then after `prefetched` is returned check `prefetched.manifests.get(&index_key)` exists and contains the expected manifest value; this verifies prefetch_cas_paths' manifest-extraction/splitting path alongside cas_paths.crates/package-manager/src/link_bins.rs (1)
193-202: ⚡ Quick winDon't treat
packages: {}the same as “metadata unavailable”.
build_has_bin_set()returns an empty set both whenpackagesisNoneand when the lockfile actually has apackages:section but no entry advertiseshasBin: true.run_lockfile_driven()then interprets that empty set as “disable filtering”, so a no-bin workspace falls back to walking every child manifest even though the lockfile already proved no shims can be produced.Returning
Option<HashSet<PackageKey>>or a separatehas_package_metadataflag would keep the fast-path active for that case.Also applies to: 230-234
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/package-manager/src/link_bins.rs` around lines 193 - 202, The function build_has_bin_set currently conflates "no package metadata available" with "there are packages but none advertise has_bin", so change build_has_bin_set to return Option<HashSet<PackageKey>> (None = metadata unavailable, Some(set) = known set possibly empty), update its callers (e.g., run_lockfile_driven and any other uses) to treat Some(empty) as the fast-path "no shims" case and only disable filtering when the function returns None, and adjust types/signatures where it was previously assumed to be HashSet to propagate the Option and handle the None vs Some(empty) distinction.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/cmd-shim/src/link_bins.rs`:
- Around line 306-320: The current no-op check uses cmd_exists/ps1_exists which
only tests file existence and lets stale/corrupt .cmd/.ps1 siblings bypass
repair; change those to validate file contents like sh_marker_ok does: replace
cmd_exists and ps1_exists with content checks (e.g.
matches!(Api::read_to_string(&cmd_path), Ok(existing) if
is_cmd_pointing_at(&existing, target_path)) and similarly for ps1_path) or, if
helpers aren’t present, read the files with Api::read_to_string and verify they
contain/resolve to target_path (or match the expected template); then compute
already_correct = sh_marker_ok && cmd_marker_ok && ps1_marker_ok so we only skip
when all three sh/cmd/ps1 actually point at target_path.
In `@crates/cmd-shim/src/shim.rs`:
- Around line 101-104: The current parse_shebang_from_bytes uses trim_start(),
which incorrectly accepts shebangs not at byte 0; change it so the first line is
taken exactly as-is (no leading-trim) and only passed to parse_shebang when it
begins with "#!" at byte 0. Concretely, in parse_shebang_from_bytes remove
trim_start(), extract the first line (e.g.,
head.split('\n').next().unwrap_or("") and trim_end_matches('\r')), then check
first_line.starts_with("#!") — if true call parse_shebang(first_line) otherwise
return None; keep function names parse_shebang_from_bytes and parse_shebang to
locate the change.
In `@crates/package-manager/src/link_bins.rs`:
- Around line 397-421: find_slot_own_package_dir now returns a syntactic PathBuf
even when the actual package dir (slot/node_modules/<own>) doesn’t exist,
causing link_bins_excluding to create phantom <own>/node_modules/.bin shims;
modify link_bins_excluding to track whether the existing walk actually
encountered the exclude (i.e. found a real own package directory) and if no
exclude was found, bail out before writing shims; specifically, inside
link_bins_excluding (where it currently calls find_slot_own_package_dir and
collects sibling bins) add a boolean flag (e.g. saw_exclude) that is set only
when the directory exists on disk or when an exclude entry was observed during
the traversal, and check that flag before creating the <own>/node_modules/.bin
directory or writing any shims so slots without a real own package are skipped.
- Around line 462-465: The code currently swallows all errors from
Api::read_dir(&path) inside the name_str.starts_with('@') branch by using an
unconditional continue; change this so only a NotFound (concurrent delete) is
skipped while other errors are propagated. Specifically, replace the current let
Ok(scope_entries) = Api::read_dir(&path) else { continue; } with a match on
Api::read_dir(&path) that: binds Ok(scope_entries) as before; if Err(e) and
e.kind() == std::io::ErrorKind::NotFound then continue; otherwise propagate or
return the error (e.g., return Err(e) or use ?), so failures like
PermissionDenied/EIO are surfaced instead of dropping the whole scope.
---
Outside diff comments:
In `@crates/tarball/src/tests.rs`:
- Around line 160-230: Both tests call out to registry.npmjs.org; switch them to
use the existing hermetic fixture and mockito instead. For
packages_under_orgs_should_work and should_throw_error_on_checksum_mismatch,
replace the literal package_url with the FASTIFY_ERROR_TARBALL fixture and
register a mockito::mock that serves that tarball (matching path and response
used elsewhere in this file), and ensure the http client used by
DownloadTarballToStore points to the mockito server (as other tests do). Keep
the same package_id, integrity, and size; reuse the existing
FASTIFY_ERROR_TARBALL + mock setup pattern so the tests are fully hermetic.
---
Nitpick comments:
In `@crates/package-manager/src/link_bins.rs`:
- Around line 193-202: The function build_has_bin_set currently conflates "no
package metadata available" with "there are packages but none advertise
has_bin", so change build_has_bin_set to return Option<HashSet<PackageKey>>
(None = metadata unavailable, Some(set) = known set possibly empty), update its
callers (e.g., run_lockfile_driven and any other uses) to treat Some(empty) as
the fast-path "no shims" case and only disable filtering when the function
returns None, and adjust types/signatures where it was previously assumed to be
HashSet to propagate the Option and handle the None vs Some(empty) distinction.
In `@crates/tarball/src/tests.rs`:
- Around line 405-530: Add a small manifest fixture to the seeded index rows and
assert PrefetchResult.manifests is populated: in the tests that call
prefetch_cas_paths (e.g., prefetch_cas_paths_omits_failed_integrity_entries and
prefetch_cas_paths_skips_filesystem_checks_when_verify_disabled) set
PackageFilesIndex.manifest = Some(your_manifest_string_or_bytes) when building
`entry`, then after `prefetched` is returned check
`prefetched.manifests.get(&index_key)` exists and contains the expected manifest
value; this verifies prefetch_cas_paths' manifest-extraction/splitting path
alongside cas_paths.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: f6f0f425-a087-48b3-951b-524ee483e6de
⛔ Files ignored due to path filters (5)
Cargo.lockis excluded by!**/*.lockcrates/cli/tests/snapshots/add__should_install_all_dependencies.snapis excluded by!**/*.snapcrates/cli/tests/snapshots/add__should_symlink_correctly.snapis excluded by!**/*.snapcrates/cli/tests/snapshots/install__should_install_dependencies.snapis excluded by!**/*.snapcrates/package-manager/src/install/snapshots/pacquet_package_manager__install__tests__should_install_dependencies.snapis excluded by!**/*.snap
📒 Files selected for processing (26)
.typos.tomlCargo.tomlcrates/cmd-shim/Cargo.tomlcrates/cmd-shim/src/bin_resolver.rscrates/cmd-shim/src/bin_resolver/tests.rscrates/cmd-shim/src/capabilities.rscrates/cmd-shim/src/lib.rscrates/cmd-shim/src/link_bins.rscrates/cmd-shim/src/link_bins/tests.rscrates/cmd-shim/src/path_util.rscrates/cmd-shim/src/shim.rscrates/cmd-shim/src/shim/tests.rscrates/package-manager/Cargo.tomlcrates/package-manager/src/create_virtual_store.rscrates/package-manager/src/install_frozen_lockfile.rscrates/package-manager/src/install_without_lockfile.rscrates/package-manager/src/lib.rscrates/package-manager/src/link_bins.rscrates/package-manager/src/link_bins/tests.rscrates/package-manager/src/symlink_direct_dependencies.rscrates/store-dir/src/msgpackr_records.rscrates/store-dir/src/msgpackr_records/tests.rscrates/store-dir/src/store_index.rscrates/tarball/Cargo.tomlcrates/tarball/src/lib.rscrates/tarball/src/tests.rs
📜 Review details
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs
📄 CodeRabbit inference engine (AGENTS.md)
**/*.rs: Log emissions are part of 'match pnpm'—when porting a function that firespnpm:<channel>events throughglobalLogger,logger.debug(...), orstreamParser.write(...), mirror the call site, payload, and ordering so@pnpm/cli.default-reporterparses pacquet's NDJSON the same way
Declare a newtype wrapper for branded string types instead of collapsing into plainStringor&strto preserve the distinction between validating and non-validating brands
If upstream always validates before construction of a branded string type, implement validation viaTryFrom<String>and/orFromStrwithout providing an infallible public constructor
If upstream never validates a branded string type, expose an infallibleFrom<String>constructor to provide type-safety without runtime validation
When upstream occasionally constructs a branded string type without validation via bareasassertion, expose afrom_str_uncheckedconstructor so callers can opt into the same unchecked path explicitly
Match upstream serde behavior for branded types—use#[serde(try_from = "String")]for deserialization to go through the validator and#[serde(into = "String")]for serialization
Use#[derive(derive_more::From)]and#[derive(derive_more::Into)]for mechanical conversion impls on branded types rather than handwritingimplblocks
Model string-literal unions (like'auto' | 'always' | 'never') as Rustenums instead of newtype wrappers
Template literal types (like${string}@${string}) should be treated as branded string types with the same validation discipline applied
Follow the code style guide inCODE_STYLE_GUIDE.mdcovering code-level conventions not enforced by tooling: imports, modules, naming, ownership and borrowing, parameter type selection, trait bounds, pattern matching,pipe-trait, error handling, test layout, and cloning ofArcandRc
Warnings are errors incargo clippy --locked -- --deny warnings, and should not be silenced with `#[allow(....
Files:
crates/cmd-shim/src/path_util.rscrates/cmd-shim/src/lib.rscrates/package-manager/src/lib.rscrates/tarball/src/tests.rscrates/package-manager/src/symlink_direct_dependencies.rscrates/store-dir/src/store_index.rscrates/cmd-shim/src/shim/tests.rscrates/package-manager/src/link_bins/tests.rscrates/package-manager/src/install_without_lockfile.rscrates/store-dir/src/msgpackr_records/tests.rscrates/package-manager/src/link_bins.rscrates/store-dir/src/msgpackr_records.rscrates/tarball/src/lib.rscrates/cmd-shim/src/bin_resolver.rscrates/package-manager/src/create_virtual_store.rscrates/package-manager/src/install_frozen_lockfile.rscrates/cmd-shim/src/link_bins.rscrates/cmd-shim/src/capabilities.rscrates/cmd-shim/src/shim.rscrates/cmd-shim/src/link_bins/tests.rscrates/cmd-shim/src/bin_resolver/tests.rs
🧠 Learnings (3)
📚 Learning: 2026-05-07T23:19:08.272Z
Learnt from: KSXGitHub
Repo: pnpm/pacquet PR: 401
File: tasks/integrated-benchmark/src/work_env.rs:343-344
Timestamp: 2026-05-07T23:19:08.272Z
Learning: When reviewing Rust code in pnpm/pacquet for deprecated API usage, do not automatically treat `serde_saphyr::to_string` as deprecated. In `serde-saphyr` v0.0.25, `serde_saphyr::to_string` has no `#[deprecated]` attribute (the `#[deprecated]` later in `serde-saphyr-0.0.25/src/lib.rs` applies to a different function). Only flag `serde_saphyr::to_string` as deprecated if the resolved dependency version’s source shows `#[deprecated]` on that specific function.
Applied to files:
crates/cmd-shim/src/path_util.rscrates/cmd-shim/src/lib.rscrates/package-manager/src/lib.rscrates/tarball/src/tests.rscrates/package-manager/src/symlink_direct_dependencies.rscrates/store-dir/src/store_index.rscrates/cmd-shim/src/shim/tests.rscrates/package-manager/src/link_bins/tests.rscrates/package-manager/src/install_without_lockfile.rscrates/store-dir/src/msgpackr_records/tests.rscrates/package-manager/src/link_bins.rscrates/store-dir/src/msgpackr_records.rscrates/tarball/src/lib.rscrates/cmd-shim/src/bin_resolver.rscrates/package-manager/src/create_virtual_store.rscrates/package-manager/src/install_frozen_lockfile.rscrates/cmd-shim/src/link_bins.rscrates/cmd-shim/src/capabilities.rscrates/cmd-shim/src/shim.rscrates/cmd-shim/src/link_bins/tests.rscrates/cmd-shim/src/bin_resolver/tests.rs
📚 Learning: 2026-05-01T10:01:33.766Z
Learnt from: zkochan
Repo: pnpm/pacquet PR: 349
File: crates/reporter/src/tests.rs:121-121
Timestamp: 2026-05-01T10:01:33.766Z
Learning: In Rust test code, follow the repo’s CODE_STYLE_GUIDE test-logging rule: add logging (e.g., `eprintln!`/`eprintln!(...)`) so that useful diagnostic values are printed when a test fails, unless the assertion is `assert_eq!` (where the differing values are already included). Concretely, if you use assertions like `assert!`, `assert_ne!`, etc., ensure the test logs the relevant actual/expected values (or context) before/around the assertion so failures can be diagnosed without rerunning.
Applied to files:
crates/tarball/src/tests.rscrates/cmd-shim/src/shim/tests.rscrates/package-manager/src/link_bins/tests.rscrates/store-dir/src/msgpackr_records/tests.rscrates/cmd-shim/src/link_bins/tests.rscrates/cmd-shim/src/bin_resolver/tests.rs
📚 Learning: 2026-05-01T10:01:33.766Z
Learnt from: zkochan
Repo: pnpm/pacquet PR: 349
File: crates/reporter/src/tests.rs:121-121
Timestamp: 2026-05-01T10:01:33.766Z
Learning: In this repo’s Rust tests (run via `cargo nextest`), stdout/stderr from passing tests is captured and is only shown when a test fails. As a result, `eprintln!` (and similar stderr logging) in these tests should generally not be flagged as CI “noise” on the happy path; CI output should only appear for failing tests.
Applied to files:
crates/tarball/src/tests.rs
🔇 Additional comments (8)
.typos.toml (1)
10-10: Looks good.
Allowlistingpnis appropriate for pnpm-related terminology and avoids noisy typo false positives.crates/tarball/Cargo.toml (1)
26-26: Dependency update is appropriate.
Addingrayonhere matches the parallelization work and keeps the manifest consistent with usage.crates/cmd-shim/Cargo.toml (1)
1-23: Nice crate manifest setup.
The dependency set and workspace wiring look clean and consistent for the newpacquet-cmd-shimcrate.Cargo.toml (1)
17-17: Workspace dependency registration looks correct.
This cleanly exposespacquet-cmd-shimto member crates viaworkspace = true.crates/package-manager/Cargo.toml (1)
14-14: Integration dependency is correctly declared.
pacquet-cmd-shimis wired in the right place for package-manager bin-linking integration.Also applies to: 36-36
crates/cmd-shim/src/path_util.rs (1)
28-50: Path normalization implementation is solid.
The anchored/unanchored..handling inlexical_normalizeandis_anchoredis well-structured and matches the intended lexical containment behavior.crates/store-dir/src/msgpackr_records.rs (1)
639-850: Great extension of the encoder formanifestsupport.
The JSON record-encoding path andwrite_intaddition fit the existing msgpackr records model and keep slot-collision handling consistent.Also applies to: 1019-1043
crates/store-dir/src/msgpackr_records/tests.rs (1)
597-730: Excellent coverage for the new manifest encoding path.
These tests exercise the right interop-critical behaviors (round-trip, nested-object records, shape reuse, and mixed-field payloads).
Two fixes from CodeRabbit's full review on PR #333 (review 4267999547 at 23:23): 1. **Restore the slot's-own-pkg-dir probe in the readdir fallback** (`r3222744362`). The lockfile-driven path doesn't need it because `create_virtual_dir_by_snapshot` enforces the `<slot>/node_modules/<pkg>` invariant before bin-linking — that was the rationale for dropping the probe in 6e079bf (1267 wasted readdirs on warm install). The readdir fallback path (`InstallWithoutLockfile` only) has no such guarantee: if a slot's `node_modules/<pkg>` is missing but sibling deps are still present, `link_bins_excluding` would `create_dir_all` the phantom `<pkg>` chain to hold shims, leaving an orphan package directory on disk. Re-add `Api::read_dir(&self_pkg_dir).is_err()` gating before calling `link_bins_excluding` *only* in this fallback path — visits ~direct-deps slots (small N), so the probe cost is trivial here. 2. **Surface non-`NotFound` errors on scope read_dir** (`r3222744370`). `link_bins_excluding` was swallowing every error on `Api::read_dir(<scope-dir>)` with a bare `let Ok(...) else continue`. A `PermissionDenied`, `EIO`, or AppArmor deny there would silently drop every bin under the scope. Match the `NotFound`-only-skip pattern the rest of the file uses (`modules_dir` read, manifest read). All 13 link_bins tests still pass. fmt + clippy clean.
CodeRabbit's review (4267999547) flagged two pre-existing issues in the cmd-shim crate that the perf work didn't touch but are worth fixing alongside the others (and @zkochan explicitly authorized edits in this file). 1. **Byte-compare `.cmd` / `.ps1` against the generated body before skipping the rewrite** (`r3222744353`). `link_bins.rs::write_shim`'s `already_correct` gate was checking only that the `.cmd` and `.ps1` files existed (`.is_ok()` on a `read_to_string`), so a stale or hand-edited `.cmd` pointing at the wrong target would slip through and never get repaired on warm reinstalls. Generated bodies are stable across pacquet versions (only the target-path segment moves), so byte equality is a sound equivalence check — same `matches!` shape the `.sh` flavor uses but comparing against the freshly generated body instead of a marker. 2. **Don't accept shebangs after leading whitespace** (`r3222744356`). `shim.rs::parse_shebang_from_bytes` was running `String::from_utf8_lossy(bytes).trim_start()` before looking at the first line. The kernel and upstream cmd-shim only honour `#!` at byte 0 of the file, so a script that just happens to start with whitespace and then has `#!` later would have been misclassified as a valid shebang and select the wrong runtime. Drop the `trim_start()`; let `parse_shebang`'s `strip_prefix("#!")` match at column 0 of the first line, as upstream's `^#!\s*...` regex implicitly does. All 444 workspace tests still pass.
The doc comment I added in 1e3232e linked `[\`parse_shebang\`]` from `parse_shebang_from_bytes` (public) to `parse_shebang` (private), which trips `rustdoc::private_intra_doc_links` under `-D warnings` when CI runs `cargo doc` *without* `--document-private-items`. The link only resolved because my local run used the private-items flag. Replace the intra-doc link with plain backticks. Same meaning to the reader, no broken rustdoc target.
Mirror site of the same fix `5194d9bb` made in `crates/package-manager/src/link_bins.rs` — Copilot flagged the parallel pattern in `collect_packages_in_modules_dir` (<#333 (comment)>). Was: let Ok(scope_entries) = Api::read_dir(&path) else { continue; }; Swallowed every error on a scope directory. `PermissionDenied`, `EIO`, or AppArmor deny would silently drop every bin under that scope. Match the file's own `modules_dir` read policy: skip only `NotFound` (concurrent delete), surface everything else as `LinkBinsError::ReadModulesDir`.
Two Copilot findings on the latest review: 1. **Bundled manifest must be last-wins on duplicate `package.json` tar entries** (`r3222807498`). pnpm's `addFilesFromTarball` at [`store/cafs/src/addFilesFromTarball.ts:41-43`](https://github.com/pnpm/pnpm/blob/4750fd370c/store/cafs/src/addFilesFromTarball.ts#L41-L43) unconditionally re-assigns `manifestBuffer = fileBuffer` on every `package.json` entry — same overwrite semantics as `filesIndex.set(...)`, so `manifest` and `files` stay consistent when a tarball publishes duplicate paths. Pacquet was first-wins via `pkg_files_idx.manifest.is_none()`, so a divergent later `package.json` would leave a stale earlier manifest in the index row. Drop the guard, parse + assign on every `package.json` entry, clear to `None` on parse failure (matches pnpm's "parse once at end, undefined on failure" via early degrade). 2. **`unfiltered = has_bin_set.is_empty()` conflated two semantically different cases** (`r3222807548`): - lockfile `packages:` section missing — we have no info, fall back to processing all children. - lockfile present but no entry has `hasBin: true` — the lockfile *says* no package has a bin and every slot should short-circuit. Pnpm's filter at `during-install/src/index.ts:283` returns `dep.hasBin && dep.installable !== false`; on a lockfile with no `hasBin: true` entries pnpm rejects every child. Change `build_has_bin_set` to return `Option<HashSet<PackageKey>>`: `None` = info absent (process all), `Some(set)` = info present and authoritative (filter via the set, even when empty). All 444 workspace tests pass.
zkochan
left a comment
There was a problem hiding this comment.
The performance regression has been fixed.
…op-level bin re-link (#342) (#523) Two related bin-linking behaviors deferred from #333. Behavior 1 — hoisted-bin precedence: - Add `BinOrigin { Direct, Hoisted }` discriminator to `PackageBinSource`. - New top tier in `pick_winner`: Direct wins outright over Hoisted regardless of ownership / lexical order. Mirrors upstream's `preferDirectCmds` partition at https://github.com/pnpm/pnpm/blob/4750fd370c/bins/linker/src/index.ts#L92. - New `link_top_level_bins(modules_dir, direct, hoisted)` helper in `pacquet-package-manager` mixes both candidate lists into one `link_bins_of_packages` call so the new tier resolves conflicts in a single pass — previously the two passes (SymlinkDirectDependencies for direct + hoist pass for publicly-hoisted) wrote shims independently and a hoisted bin could shadow a direct one when its package name was lexically smaller. Behavior 2 — lifecycle-script-created bins: - Add a post-`BuildModules` per-importer top-level bin link pass. Re-reading each direct dep's `package.json` after lifecycle scripts run picks up bins generated by `postinstall` (the `@pnpm.e2e/generated-bins` upstream fixture). Idempotent for unchanged shims via `is_shim_pointing_at`. - Mirrors upstream's `linkBinsOfImporter` pass that runs after `buildModules` at https://github.com/pnpm/pnpm/blob/4750fd370c/installing/deps-installer/src/install/index.ts#L1539. Supporting changes: - `PackageBinSource::new(location, manifest)` constructor + `with_origin` builder so existing call sites don't have to spell out the new field. - Public `direct_dep_names_for_importer` helper extracted from `SymlinkDirectDependencies` so the post-build pass uses the same filter (skipped / first-wins / link_only) as the symlink phase. - `InstallFrozenLockfileError::TopLevelBinLink` for the new failure surface. Tests: - `direct_origin_wins_over_hoisted_regardless_of_lexical` — pins the new tier overrides lexical ordering. - `hoisted_origin_loses_to_existing_direct` — pins both arms of the new tier (Direct incumbent vs Hoisted candidate).
Resolves #330.
Mirrors pnpm v11's
@pnpm/bins.resolver,@pnpm/bins.linker, and@zkochan/cmd-shim(resolved at pnpm4750fd370cand cmd-shim0d79ca9534):pacquet-cmd-shimcrate parses thebinfield of a package manifest into commands. Bothbin: string | objectand thedirectories.binglob fallback are supported, with the same path-traversal and URL-safe-name guards as upstream. Conflict resolution followspkgOwnsBin+ lexical compare./bin/sh(byte-compatible with pnpm'sgenerateShShim), Windows.cmd(CRLF,%~dp0paths), and PowerShell.ps1($basedir/$exeheader with Windows/POSIX-pwsh detection). MirrorsgenerateCmdShimandgeneratePwshShimminus the unusednodePath/prependToPath/progArgsbranches.FsReadHead,FsReadFile,FsReadString,FsReadDir,FsWalkFiles,FsCreateDirAll,FsWrite,FsSetExecutable,FsEnsureExecutableBits— following the pattern documented in #332.FsWriteis a thin shim overstd::fs::writeand is not atomic; a future atomic-write capability can build on it without renaming. Production callers turbofish the unit-struct provider::<RealApi>; tests inject unit-struct fakes to cover IO error paths real fs can't trigger portably. Generic param is namedApi(notFs) so the same provider can grow non-fs capabilities without a rename.LinkVirtualStoreBinsstep walks each virtual-store slot and links its child packages' bins into the slot's ownnode_modules/.bin, matchinglinkBinsOfDependenciesin pnpm'sbuilding/during-install. Slot iteration, per-slot child reads, and shim writes parallelised on rayon.SymlinkDirectDependenciesandInstallWithoutLockfilecall the bin linker after the symlink layout is in place. Direct-dep linking and the symlink loop are exposed as freelink_direct_dep_bins/symlink_direct_deps_into_node_modulesfunctions so they're unit-testable without the mock-registry scaffold.Tests port the relevant
bins/resolverandbins/linkercases from pnpm (directories.bindiscovery, scoped bin names, dangerous locations, idempotent shim skip, conflict resolution) and add fakes-injected coverage for the IO error variants.Not in this PR: hoisted-bin precedence and lifecycle-script-created bins. Both depend on subsystems pacquet hasn't built yet — hoisting (
hoistPattern/publicHoistPatternresolution + the lift step) and lifecycle scripts (exec/lifecycle/runner +allowBuildsplumbing). The bin-linking side already does the right thing for both (shims are paths, so they pick up files generated post-extract; the precedence rule is a small addition once hoisting tags candidates as hoisted). Tracked separately in with full porting coordinates.Performance
When the bin-linking step landed it added ~7% to the warm-cache Frozen Lockfile install (benchmark). Closing that gap took a sequence of changes that mirror how pnpm v11 avoids the same costs in its own
linkBinsOfDependencies(4750fd370c):index.db.extract_tarball_entriescaptures the parsedpackage.json, picks the subset pnpm caches via a port ofnormalizeBundledManifest(name,version,bin,directories,dependencies, lifecyclescripts, …), and stuffs it intoPackageFilesIndex.manifest.prefetch_cas_pathssurfaces it back as aHashMap<cache_key, Arc<Value>>so the bin linker doesn't have to re-readpackage.jsonper child off disk.serde_json::Value. The encoder previously errored withManifestNotSupportedif anything ever setmanifest; now it records-encodes nested JSON objects (slot-allocated per distinct key set) souseRecords: truereaders see them as JSObjects rather thanMaps — necessary for pnpm'smanifest.bin/manifest.directories?.binproperty accesses to resolve.Arc<Value>forPackageBinSource.manifest. The lockfile-driven bin-link path producesslots × childrenclones of the parsed manifest (~6k on the integrated-benchmark fixture). Sharing viaArcturns each push into a refcount bump.hasBinfilter via lockfile.LinkVirtualStoreBinsnow takes the lockfilepackages:section, pre-computes anOption<HashSet<PackageKey>>ofhasBin: truekeys at install start, and skips snapshot children that aren't in the set before any path-building or manifest lookup. ~95% of a real lockfile's packages don't declare a bin, so this short-circuits the bulk of the per-slot work.Noneis reserved for the pathological case where the lockfile lacks apackages:section (fall back to processing every child);Some(empty)is authoritative — lockfile says no package has a bin, every slot short-circuits. Mirrors pnpm's filter atbuilding/during-install/src/index.ts:283.run_lockfile_drivenwalkssnapshotsdirectly and builds slot paths lexically. The previousrun_with_readdirpath enumerated every slot viaApi::read_dir(virtual_store_dir)and probed each slot's own package directory with a wastedread_dir(~1267 wastedopen(O_DIRECTORY) + closeper warm install). The readdir-driven path survives as a fallback forinstall_without_lockfileand retains the existence probe there (small N, no virtual-store invariant fromcreate_virtual_dir_by_snapshotto lean on).StoreIndex::get_many_rawreturns undecoded row bytes; the SQLite mutex releases beforeprefetch_cas_pathsfans the msgpackr decode + integrity check across rayon.Latest CI benchmark on this branch (job 75462998251):
The follow-up #417 splits manifests into a dedicated
package_manifestsSQLite table so the unfilteredpackage_indexprefetch payload stays at its original size and the manifest fetch becomes a targeted SELECT against just thehasBin: truesubset.Summary by CodeRabbit
New Features
Bug Fixes / Improvements
Chores
Summary by CodeRabbit
New Features
Tests
Chores