Skip to content
This repository was archived by the owner on May 14, 2026. It is now read-only.

feat: .modules.yaml#332

Merged
KSXGitHub merged 48 commits into
mainfrom
claude/modules-yaml-write-verify-I9yCq
May 7, 2026
Merged

feat: .modules.yaml#332
KSXGitHub merged 48 commits into
mainfrom
claude/modules-yaml-write-verify-I9yCq

Conversation

@KSXGitHub

@KSXGitHub KSXGitHub commented Apr 29, 2026

Copy link
Copy Markdown
Contributor

Resolves #331.

Summary

Implement read_modules_manifest and write_modules_manifest in pacquet-modules-yaml to mirror pnpm v11's installing/modules-yaml package. The manifest lives at <node_modules>/.modules.yaml; the on-disk format is JSON (which YAML accepts), so reads use serde_saphyr and writes emit serde_json::to_string_pretty to match pnpm byte-for-byte.

Behavior parity covers: empty/missing file → Ok(None); legacy shamefullyHoist + hoistedAliases translation; defaults for prunedAt (HTTP-date) and virtualStoreDirMaxLength (120); pre-write skipped sort; non-Windows virtualStoreDir relativization via pathdiff::diff_paths (matches Node path.relative output for non-descendant targets, where Windows preserves absolute due to junctions); legacy hoistedAliases cleanup on write; recursive modules_dir creation. Errors flow through pacquet-diagnostics/miette.

Modules (named to match upstream's exported Modules type) is a serde-derived struct mirroring upstream's ModulesRaw (with sub-types IncludedDependencies, NodeLinker, HoistKind, AllowBuildValue, LayoutVersion, DepPath); every field carries #[serde(default)] so legacy fixtures still parse, and optional fields use Option + skip_serializing_if = "Option::is_none" so the on-disk JSON drops absent values. LayoutVersion is a unit type pinned to upstream's LAYOUT_VERSION = 5 via #[serde(try_from = "u32", into = "u32")]; reading a manifest with any other version fails at parse time. DepPath is a transparent newtype around String mirroring upstream's DepPath branded type — no validation (upstream has none either), private inner field with From<String> / Into<String> via derive_more, used as the key type for hoisted_aliases. ignored_builds is Option<IndexSet<DepPath>> so the array↔set conversion (upstream's new Set(array) and Array.from(set)) happens at the field level — IndexSet (insertion-ordered) is the only set type that round-trips byte-for-byte against JS Set's iteration order. Filesystem and clock I/O go through four per-capability traits (FsReadToString, FsCreateDirAll, FsWrite, Clock) with a domain-neutral RealApi production impl — zero-cost DI so each function binds only the capabilities it consumes and tests inject fakes for I/O error paths and the prunedAt fill-in branch. Production callers turbofish ::<RealApi>. New workspace deps: httpdate = "1.0.3", indexmap = "2.14.0" (with the serde feature), pathdiff = "0.2.3", and dev-only chrono = "0.4.44".

Test plan

  • 18/18 in cargo test -p pacquet-modules-yaml, split across three integration-test files: tests/index.rs (5 ports of upstream installing/modules-yaml/test/index.ts), tests/real_fs.rs (5 pacquet-side tests using a real tempfile::tempdir()), and tests/fakes.rs (8 capability-trait DI tests for I/O error variants, LayoutVersion rejection, IndexSet dedup, and Clock fill-in).
  • Per-test sanity check (per plans/TEST_PORTING.md): broke each behavior in turn and confirmed only the targeted test fails. Bound minimality verified.
  • cargo fmt --check, cargo check --locked, cargo clippy --locked --all-targets -- --deny warnings clean.

Written by an agent (Claude Code, claude-opus-4-7).

Summary by CodeRabbit

  • New Features

    • Full support for reading and writing pnpm .modules.yaml manifests with legacy→modern field mapping, defaults filling, and normalized virtual-store directory handling; outputs virtualStoreDir deterministically.
    • Manifest writes now deterministically sort skipped entries.
  • Bug Fixes

    • Improved error mapping for read/write failures and rejection of incompatible layout versions.
  • Tests

    • Added extensive integration and unit tests covering round-trip IO, legacy formats, error cases, and deterministic serialization.

Implement `read_modules_manifest` and `write_modules_manifest` to mirror
pnpm v11's `installing/modules-yaml` package. The on-disk format is JSON
(which YAML accepts), so reads use `serde_saphyr` and writes emit
`serde_json::to_string_pretty` to match pnpm exactly.

Also drops the test stubs in `tests/index.rs` so the previously
known-failure ports run against the real API.

Refs: https://github.com/pnpm/pnpm/blob/1819226b51/installing/modules-yaml/src/index.ts
@codecov

codecov Bot commented Apr 29, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 81.94%. Comparing base (b79cc1c) to head (e62670d).
⚠️ Report is 6 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #332      +/-   ##
==========================================
+ Coverage   81.03%   81.94%   +0.91%     
==========================================
  Files          64       65       +1     
  Lines        3517     3728     +211     
==========================================
+ Hits         2850     3055     +205     
- Misses        667      673       +6     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@github-actions

github-actions Bot commented Apr 29, 2026

Copy link
Copy Markdown

Micro-Benchmark Results

Linux

group                          main                                   pr
-----                          ----                                   --
tarball/download_dependency    1.01     16.0±0.47ms   271.5 KB/sec    1.00     15.8±0.56ms   274.5 KB/sec

@KSXGitHub KSXGitHub marked this pull request as draft April 29, 2026 14:05
@KSXGitHub KSXGitHub requested review from a team, anthonyshew and zkochan April 29, 2026 14:05
@github-actions

github-actions Bot commented Apr 29, 2026

Copy link
Copy Markdown

Integrated-Benchmark Report (Linux)

Scenario: Frozen Lockfile

Command Mean [s] Min [s] Max [s] Relative
pacquet@HEAD 2.062 ± 0.066 1.986 2.156 1.01 ± 0.05
pacquet@main 2.039 ± 0.066 1.949 2.146 1.00
pnpm 5.724 ± 0.080 5.539 5.803 2.81 ± 0.10
BENCHMARK_REPORT.json
{
  "results": [
    {
      "command": "pacquet@HEAD",
      "mean": 2.0617330580800006,
      "stddev": 0.06577358199924335,
      "median": 2.0470777894800003,
      "user": 2.57920462,
      "system": 2.0274487999999997,
      "min": 1.9861607399800003,
      "max": 2.15649621798,
      "times": [
        2.14940763298,
        2.15649621798,
        1.9861607399800003,
        2.14567728598,
        2.05869895098,
        1.9966027089800003,
        2.04008915698,
        2.0013417009800003,
        2.05406642198,
        2.02878976398
      ]
    },
    {
      "command": "pacquet@main",
      "mean": 2.03914653008,
      "stddev": 0.06584161461398033,
      "median": 2.0396610289800003,
      "user": 2.55565562,
      "system": 1.9893866,
      "min": 1.9494154329800002,
      "max": 2.14626059198,
      "times": [
        2.04579306298,
        2.14626059198,
        1.9494154329800002,
        2.1079924339800002,
        1.9890555379800001,
        2.1028092159800003,
        2.04844672998,
        1.9527554669800002,
        2.03352899498,
        2.0154078329800003
      ]
    },
    {
      "command": "pnpm",
      "mean": 5.72428982428,
      "stddev": 0.07990654980640748,
      "median": 5.74380308748,
      "user": 9.063627120000001,
      "system": 2.8184161999999993,
      "min": 5.53851621798,
      "max": 5.80271572098,
      "times": [
        5.795031148980001,
        5.76080125198,
        5.76874519098,
        5.80271572098,
        5.72680492298,
        5.53851621798,
        5.69347092198,
        5.67996285998,
        5.68758830198,
        5.78926170498
      ]
    }
  ]
}

Scenario: Frozen Lockfile (Hot Cache)

Command Mean [ms] Min [ms] Max [ms] Relative
pacquet@HEAD 533.4 ± 40.2 500.0 638.0 1.01 ± 0.09
pacquet@main 529.5 ± 25.2 492.4 573.4 1.00
pnpm 2337.9 ± 110.1 2195.2 2553.5 4.42 ± 0.30
BENCHMARK_REPORT.json
{
  "results": [
    {
      "command": "pacquet@HEAD",
      "mean": 0.53344543314,
      "stddev": 0.040216264911030995,
      "median": 0.5235111248400001,
      "user": 0.23533621999999998,
      "system": 0.82520136,
      "min": 0.50001644284,
      "max": 0.6379567098400001,
      "times": [
        0.6379567098400001,
        0.53436023184,
        0.5522087938400001,
        0.5210873448400001,
        0.5206081058400001,
        0.5071741648400001,
        0.50032110184,
        0.50001644284,
        0.53478653084,
        0.52593490484
      ]
    },
    {
      "command": "pacquet@main",
      "mean": 0.5295100428399999,
      "stddev": 0.025204054985440507,
      "median": 0.52631512534,
      "user": 0.23013111999999997,
      "system": 0.83940556,
      "min": 0.49242567684000005,
      "max": 0.57335544084,
      "times": [
        0.57335544084,
        0.55162994484,
        0.50633852584,
        0.5315526478400001,
        0.50641098484,
        0.5192640108400001,
        0.55361601784,
        0.52107760284,
        0.49242567684000005,
        0.5394295758400001
      ]
    },
    {
      "command": "pnpm",
      "mean": 2.3378864743399994,
      "stddev": 0.1100942024076884,
      "median": 2.31087939684,
      "user": 2.9919679199999996,
      "system": 1.3580606600000003,
      "min": 2.19516838784,
      "max": 2.5534627468399997,
      "times": [
        2.5534627468399997,
        2.39134887184,
        2.2397893608399997,
        2.2302069678399996,
        2.3111069938399997,
        2.29870842384,
        2.43585911384,
        2.31065179984,
        2.41256207684,
        2.19516838784
      ]
    }
  ]
}

Adds three Rust-side tests for branches pnpm only exercises through
install-level integration tests (`pnpm/test/monorepo/index.ts:1467-1545`,
etc.), which depend on the install pipeline that has not yet been
ported:

- `read_preserves_absolute_virtual_store_dir` covers L66-L70
- `write_sorts_skipped_array` covers L117
- `write_removes_null_public_hoist_pattern` covers L123-L125

Each test was verified to fail when its targeted behavior is removed.

Copy link
Copy Markdown
Contributor Author

Note from coverage follow-up.

While verifying that read_preserves_absolute_virtual_store_dir actually catches breakage in resolve_virtual_store_dir, the first attempt — replacing the is_absolute() branch with the relative arm so it always runs modules_dir.join(stored_path) — did not fail the test. Reason: Rust's Path::join replaces the receiver when the argument is absolute (/a/b.join(/c/d) → /c/d), whereas pnpm's path.join('/a/b', '/c/d') returns /a/b/c/d. So in Rust the explicit is_absolute() check is logically redundant — both branches produce the same value for the inputs we care about.

Kept the explicit check for structural parity with upstream L66-L70 (per the porting cardinal rule). The test verifies semantic parity (absolute path survives the read-time normalization) which is what we actually want to guard, even if it doesn't distinguish the two structurally-different but observationally-equivalent code paths.

Replacing the whole match arm with a fixed sibling path (modules_dir.join(".pnpm")) does fail the test, so the test is meaningful — just not for that specific simplification.


Generated by Claude Code

Introduce an `FsApi` trait with static methods (`read_to_string`,
`create_dir_all`, `write`) and a unit-struct `RealFs` production
impl. Make `read_modules_manifest` and `write_modules_manifest`
generic over `<Fs: FsApi>`; production callers turbofish `RealFs`.

This mirrors the pattern at
https://github.com/KSXGitHub/parallel-disk-usage/blob/2aa39917f9/src/app/hdd.rs#L25-L74
and lets tests substitute fakes that return specific `io::Error`s,
unlocking the previously-unreachable error variants:

- `ReadFile` other-error
- `ParseYaml` parse failure
- null-document → `Ok(None)`
- `CreateDir` failure
- `WriteFile` failure

Each fake-driven test was verified to fail when its targeted
behavior is removed. Patch coverage on the file rises from 89.17%
to 92.35%; remaining gaps are defensive fallbacks for malformed
inputs (non-object `hoistedAliases`, non-string sort entries) that
upstream does not unit-test either.
@KSXGitHub

Copy link
Copy Markdown
Contributor Author

About the Dependency Injection pattern

parallel-disk-usage has 3 methods (read_to_string, create_dir_all, write) in one FsApi because its codebase was small and its branching was uncomplicated.

But logically speaking, shouldn't they be 3 traits instead?

parallel-disk-usage has 2 generic types implementing 2 mockable traits (Fs: FsApi, Disk: DiskApi). But I think the more elegant pattern should have been 1 generic type implementing all traits (Api: Canonicalize + PathExists + ReadLink + GetDiskKind + GetDiskName + GetMountPoint).

parallel-disk-usage has DiskApi methods taking &self. But that's stupid. The more logical structure should have been to declare associate types (GetDiskKind::Disk, GetDiskName::Disk, GetMountPoint::Disk).

Questions

Claude Code,

Do you understand my critiques?

Can you picture the pattern I want to have?

What do you think about my critiques?

Do they make sense?

Can they be applied to this PR?

Would there be any downsides?

Would they cause complexity to explode?

What is your suggestions?

Are there existing libraries or framework for such a pattern?

Tasks

Take the critiques as suggestions, and implement the pattern.

Apply interface segregation to the `FsApi` trait introduced in the
previous commit. Each filesystem capability gets its own trait
(`FsReadToString`, `FsCreateDirAll`, `FsWrite`) so each function
declares only what it actually consumes:

- `read_modules_manifest<Fs: FsReadToString>`
- `write_modules_manifest<Fs: FsCreateDirAll + FsWrite>`

Tests now implement only the trait(s) the function actually calls,
which removes three of the five `unreachable!()` guards from the
fakes. The bounds are verified to be minimal — replacing
`FsReadToString` with `FsCreateDirAll` on `read_modules_manifest`
correctly fails to compile because `Fs::read_to_string` is no
longer in scope.

Refines the pattern from
https://github.com/KSXGitHub/parallel-disk-usage/blob/2aa39917f9/src/app/hdd.rs#L25-L35
where one combined `FsApi` covers all three capabilities.

KSXGitHub commented Apr 29, 2026

Copy link
Copy Markdown
Contributor Author

Dependency injection in pacquet — pattern reference

Posting this as a permalink for future PRs that need DI. Synthesizes the critiques from the comment above with what landed in this PR.

Principles

  1. One trait per capability. Each side-effecting operation gets its own trait. Functions bind only what they consume; test fakes implement only what gets called. No lumped FsApi { read, mkdir, write, ... } with unreachable!() filling the unused methods.

  2. One generic parameter satisfying multiple bounds. When a function needs several capabilities, compose them: <Api: FsCreateDirAll + FsWrite>. Resist the multi-parameter form (<Disk: DiskApi, Fs: FsApi>) — the generic is a capability provider, not a capability domain, so it should never be named after one domain.

  3. No &self on capability methods unless instance state is genuinely needed. Production impls are unit structs (struct RealApi;); fakes are unit structs scoped to a test fn. Use statics declared inside the test fn for per-test scenario data — that keeps fakes stateless from the trait's perspective.

  4. Associated types instead of &self when a capability operates over a data type. If a capability would otherwise force callers to pass an instance just to extract data, lift the data type to an associated type. The capability stays a static-method namespace; the associated type lets one provider describe how to operate over a chosen data shape.

  5. Capability traits live on the implementor, not the data. The trait FsReadToString is implemented by RealApi (a capability provider), not by Path (a value). This keeps the capability impl swappable without changing call sites.

  6. Provider names are domain-neutral; trait names are domain-scoped. The generic and its production impl are Api and RealApi regardless of how many domains they cover. The traits keep a domain prefix (Fs*, GetDisk*, etc.) so a reader can see which domain a bound belongs to without chasing definitions.

  7. Production callers turbofish the real impl explicitly. read_modules_manifest::<RealApi>(dir). Defaults on free-fn type parameters are unstable; an explicit turbofish is the price of zero-cost DI on stable Rust.

Filesystem capabilities (this PR)

The pacquet-modules-yaml crate exposes three per-capability traits and one production impl:

pub trait FsReadToString {
    fn read_to_string(path: &Path) -> io::Result<String>;
}

pub trait FsCreateDirAll {
    fn create_dir_all(path: &Path) -> io::Result<()>;
}

pub trait FsWrite {
    fn write(path: &Path, contents: &[u8]) -> io::Result<()>;
}

// One provider for the whole codebase. Today it carries only the three
// filesystem capabilities; future PRs add disk, network, time, etc. impls
// to the *same* struct so callers never juggle multiple providers.
pub struct RealApi;
impl FsReadToString for RealApi { /* delegates to std::fs::read_to_string */ }
impl FsCreateDirAll for RealApi { /* delegates to std::fs::create_dir_all */ }
impl FsWrite        for RealApi { /* delegates to std::fs::write           */ }

Public API binds minimal capabilities:

pub fn read_modules_manifest<Api: FsReadToString>(...) -> Result<...>
pub fn write_modules_manifest<Api: FsCreateDirAll + FsWrite>(...) -> Result<...>

Test fakes describe what behaviour they fake, not their provider role:

struct BadYamlFs;
impl FsReadToString for BadYamlFs {
    fn read_to_string(_: &Path) -> io::Result<String> {
        Ok("{ this is not valid yaml or json".to_string())
    }
}
let err = read_modules_manifest::<BadYamlFs>(modules_dir).expect_err("expected");
assert!(matches!(err, ReadModulesManifestError::ParseYaml { .. }));

The fake declares only FsReadToString because read_modules_manifest's bound is Api: FsReadToString. No FsWrite or FsCreateDirAll impls are needed; no unreachable!() guards.

Hypothetical disk capabilities (future)

When pacquet needs disk inspection (e.g., HDD detection for thread tuning), the same principles apply. The reference implementation in parallel-disk-usage lumps three methods into one trait DiskApi and uses &self, which couples the capability namespace to a sysinfo::Disk instance. The pacquet-style refinement:

// One trait per capability. Each uses an associated `Disk` type so the
// capability provider doesn't have to be the data itself.
pub trait GetDiskKind {
    type Disk;
    fn get_disk_kind(disk: &Self::Disk) -> DiskKind;
}

pub trait GetDiskName {
    type Disk;
    fn get_disk_name(disk: &Self::Disk) -> &OsStr;
}

pub trait GetMountPoint {
    type Disk;
    fn get_mount_point(disk: &Self::Disk) -> &Path;
}

// The same `RealApi` from the filesystem section gains more impls.
// Production code never imports a second provider type.
impl GetDiskKind    for RealApi { type Disk = sysinfo::Disk; fn get_disk_kind(d: &Self::Disk)   -> DiskKind { d.kind()        } }
impl GetDiskName    for RealApi { type Disk = sysinfo::Disk; fn get_disk_name(d: &Self::Disk)   -> &OsStr   { d.name()        } }
impl GetMountPoint  for RealApi { type Disk = sysinfo::Disk; fn get_mount_point(d: &Self::Disk) -> &Path    { d.mount_point() } }
impl FsCanonicalize for RealApi { fn canonicalize(p: &Path) -> io::Result<PathBuf> { fs::canonicalize(p) } }

A function that needs all these capabilities binds them under one generic parameter and ties the disk-capability associated types together with equality bounds:

fn any_path_is_in_hdd<Api>(paths: &[PathBuf], disks: &[<Api as GetDiskKind>::Disk]) -> bool
where
    Api: GetDiskKind
       + GetDiskName<Disk = <Api as GetDiskKind>::Disk>
       + GetMountPoint<Disk = <Api as GetDiskKind>::Disk>
       + FsCanonicalize, // a future capability trait
{ /* ... */ }

The <Disk = ...> bound is what locks all three disk-capability type Disk selections together so the function operates on a single &[Self::Disk] slice — without re-introducing a lumped super-trait. Tests provide their own FakeApi that implements every capability the function needs, sets type Disk = FakeDisk;, and defines FakeDisk inside the test — no &self, no per-test instance plumbing, no second generic for filesystem.

If a capability genuinely is an instance method (e.g., a network client carrying a connection pool), pass it as a separate argument — DI is for capabilities, not for stateful runtime objects.

When DI is not needed

These tests should not use DI; a real fixture is simpler:

  • Anything where the real filesystem can produce the input cheaply (fixture files in tests/fixtures/).
  • Anything that exercises only pure logic — is_present_string, derive_hoisted_dependencies, etc.

DI is for unreachable-via-real-fs error paths and for behavior whose triggering conditions are awkward to set up on disk.

Existing crates surveyed

  • mockall: generates per-trait mock impls. Good fit for this pattern; overkill at our current scale (3 traits, 5 fakes).
  • shaku: runtime DI container. Not idiomatic; we want compile-time, zero-cost dispatch.
  • The pattern itself is mtl-style capability typeclasses from Haskell, ported to Rust with traits + generics.

Generated by Claude Code

KSXGitHub

This comment was marked as resolved.

@KSXGitHub KSXGitHub changed the title feat(modules-yaml): port read/write of .modules.yaml feat: .modules.yaml Apr 29, 2026
KSXGitHub pushed a commit that referenced this pull request Apr 29, 2026
Patch coverage on the bin-linking PR was 83.41% (70 lines missing) per
codecov. Three sources of gaps:

1. Edge cases I'd skipped: every fallback extension in
   `extension_program`, the `prog: None` arm of `generate_sh_shim`, the
   `path::isAbsolute(shTarget)` branch upstream uses, the lexical
   normalize `..`-past-root and `CurDir` paths, and `parse_shebang`
   empty-prog.

2. pnpm tests not yet ported. Port the ones that translate to pacquet's
   no-`directories.bin` first iteration:
   - `bins/resolver/test/index.ts`: \$-as-command-name, dangerous bin
     names, dangerous bin locations, scoped bin name strip, scoped bin
     name with traversal.
   - `bins/linker/test/index.ts`: idempotent skip when shim already
     points at the same target.

3. Unreachable IO error paths. `search_script_runtime`'s non-`NotFound`
   error path can't be triggered with real fs portably. Apply the
   per-capability DI pattern documented in
   <#332 (comment)>:
   - New `FsReadHead` trait + `RealFs` provider.
   - `search_script_runtime<Fs: FsReadHead>` and `read_shebang<Fs>`
     bind the capability; production callers (`write_shim`) turbofish
     `::<RealFs>`.
   - Tests inject `PermissionDeniedFs` and `EmptyReadFs` to cover the
     propagate-error and zero-bytes-fallthrough paths.
   - New pure `parse_shebang_from_bytes` is independently testable.

Also extract `link_direct_dep_bins` from `SymlinkDirectDependencies::run`
into a free function under `package-manager::link_bins`, so the direct-
dep bin-linking step can be unit-tested without spinning up the
mock-registry-backed install pipeline. Adds five tests for it covering
empty deps, missing manifest, and symlink-followed manifest read.

Test count: cmd-shim 19 → 49, package-manager (lib) 28 → 35.
@KSXGitHub KSXGitHub mentioned this pull request Apr 29, 2026
KSXGitHub pushed a commit that referenced this pull request Apr 29, 2026
Future AI agents will routinely hit codecov failures on patches that
add new error-handling code (IO errors, environment-specific branches).
Without a written-down reference they tend to either silence the gap
or add elaborate one-off mocks instead of applying the per-capability
DI pattern this codebase already uses.

Cross-link the two #332 comments that define the pattern:

- Critique of an earlier lumped-trait shape:
  #332 (comment 4344962816)
- Reference implementation and principles:
  #332 (comment 4345054524)

Spell out when the pattern applies (gaps that real-fs fixtures or a
free-function extraction can't close) and when it doesn't (when the
patch already has perfect coverage).
@KSXGitHub KSXGitHub mentioned this pull request Apr 29, 2026
KSXGitHub pushed a commit that referenced this pull request Apr 29, 2026
Extends the per-capability DI pattern from \`FsReadHead\` (PR #333) to
the rest of the bin-linking IO surface. New traits in
\`crates/cmd-shim/src/fs_capabilities.rs\`:

- \`FsReadDir\` — list a directory's entries
- \`FsReadFile\` — read a file into Vec<u8>  (package.json reads)
- \`FsReadString\` — read a file into String (warm-skip shim probe)
- \`FsCreateDirAll\` — mkdir -p
- \`FsWriteAtomic\` — write a file replacing if it exists
- \`FsSetPermissions\` — chmod 0o755 / OR-in 0o111

\`RealFs\` provides all of them; \`link_bins\`, \`link_bins_of_packages\`,
\`write_shim\`, \`collect_packages_in_modules_dir\`, and \`read_package\`
take a generic \`Fs: ...\` bound binding only the capabilities they
consume. Production callers (\`InstallWithoutLockfile\`,
\`SymlinkDirectDependencies\`, \`LinkVirtualStoreBins::run\`) turbofish
\`::<RealFs>\`. \`LinkVirtualStoreBins\` keeps a non-generic
\`run\` and exposes \`run_with::<Fs>\` for tests.

Adds five new fakes-injected tests covering error paths real fs can't
trigger portably:

- \`link_bins\` propagates non-NotFound \`read_dir\` failure on
  modules_dir → \`CreateBinDir\` variant
- \`link_bins_of_packages\` propagates \`create_dir_all\` failure →
  \`CreateBinDir\`
- \`link_bins_of_packages\` propagates write failure on the .sh shim →
  \`WriteShim\`
- \`link_bins_of_packages\` propagates chmod failure → \`Chmod\`
- \`LinkVirtualStoreBins::run_with\` propagates non-NotFound read_dir
  failure on the virtual store → \`ReadVirtualStore\`

Each fake is the per-test unit-struct shape documented at
#332 (comment 4345054524) — no \`&self\`, only the
capability methods that get called are non-trivial; the rest are
\`unreachable!()\` to assert the call path.

Test count: cmd-shim 59 → 63, package-manager (lib) 37 → 38.
@KSXGitHub

This comment was marked as resolved.

This comment was marked as resolved.

KSXGitHub pushed a commit that referenced this pull request Apr 29, 2026
…main neutrality

Renames the generic parameter and concrete provider type from \`Fs\`/
\`RealFs\` to \`Api\`/\`RealApi\` across pacquet-cmd-shim and the
pacquet-package-manager call sites. Also renames the module
\`fs_capabilities.rs\` -> \`capabilities.rs\`.

The trait names keep their \`Fs\` prefix (\`FsReadDir\`, \`FsReadFile\`,
\`FsReadString\`, \`FsReadHead\`, \`FsCreateDirAll\`, \`FsWriteAtomic\`,
\`FsSetPermissions\`) — domain prefixes belong on traits, not on the
provider.

Reasoning, per the revised DI reference at
#332 (comment 4345054524):

The reference uses \`RealApi\` to model a single provider that
implements every capability the install pipeline needs, regardless
of domain. Crates that today only have fs-side capabilities (this
one, modules-yaml) will likely cross domains in the next milestones
— lifecycle scripts need \`GetNodeExecPath\` or similar, fixture-
based install tests need \`GetCurrentTime\` for \`prunedAt\`. Doing the
rename now avoids a churning rename when the first non-\`Fs*\`
capability lands.

No semantic changes; tests all still pass (cmd-shim 69, package-
manager-lib 38).

This comment was marked as resolved.

…al `Api`/`RealApi`

The capability-trait DI principle says the generic parameter is a
*capability provider*, not a *capability domain* — naming it `Fs` would
become a lie the moment a non-filesystem capability (disk, network, time)
joins the same provider type.

Rename:

- `pub struct RealFs` → `pub struct RealApi`
- `<Fs: FsReadToString>` → `<Api: FsReadToString>`
- `<Fs: FsCreateDirAll + FsWrite>` → `<Api: FsCreateDirAll + FsWrite>`
- `Fs::method(...)` → `Api::method(...)` at call sites
- All turbofish in tests: `::<RealFs>` → `::<RealApi>`

Trait names keep the `Fs*` prefix because they ARE domain-scoped, which
helps a reader see at a glance which domain a bound covers. Test fakes
keep their scenario-descriptive names (`BadYamlFs`, `FailingMkdirFs`,
etc.) per the carve-out for fakes describing what behaviour they fake.
KSXGitHub pushed a commit that referenced this pull request Apr 29, 2026
The latest revision of the DI reference comment
(#332 issuecomment-4345054524) added principle 6:

> Provider names are domain-neutral; trait names are domain-scoped.
> The generic and its production impl are \`Api\` and \`RealApi\`
> regardless of how many domains they cover.

That principle is about the *production* provider. Test fakes
follow a different rule the same revision spells out:

> Test fakes describe what behaviour they fake, not their provider role.

The upstream example (\`BadYamlFs\`) names a single-trait fake by
behaviour + domain of the trait it fakes. My earlier rename had
turned \`PermissionDeniedFs\` / \`EmptyReadFs\` into
\`PermissionDeniedApi\` / \`EmptyReadApi\` — that was wrong: those
fakes implement only \`FsReadHead\`, not the full provider role,
so the \`Api\` suffix overstates what they do. Revert to the
behaviour+domain shape that matches \`BadYamlFs\`.

Multi-trait fakes (\`FailingProbe\`, \`DenyManifestRead\`, etc.)
already follow behaviour-only naming and don't need a domain
suffix — they describe the failure scenario directly.
KSXGitHub pushed a commit that referenced this pull request Apr 29, 2026
Adopt the trait-per-capability dependency-injection pattern from
[`#332`'s reference comment][1] for environment variable
access used by `${VAR}` substitution in `.npmrc`. Replaces the
`Fn(&str) -> Option<String>` closure parameter on `env_replace`,
`NpmrcAuth::from_ini`, and `Npmrc::current` with a single
`Api: EnvVar` bound; production callers turbofish `RealApi`
explicitly.

Why: keep one DI shape across the codebase as filesystem, disk, and
other process-global capabilities follow. Tests carry their fakes as
zero-sized unit structs (per-test scenario data goes in `static`s),
so coverage of every branch still drops out of pure logic without
mutating the real environment.

Also document the principle in `plans/PORTING_GUIDE.md` so future
ports start from the same template.

[1]: #332 (comment)
Test fakes are capability providers (just partial ones), so the
domain-neutral provider naming principle applies to them too — they
shouldn't carry a domain-tagged suffix. The previous commit renamed
the production `RealFs` → `RealApi` but left the fakes' `*Fs`
suffix in place; that exemption was not in the principle, only in
the prose around it.

Rename:

- `FailingFs`        → `FailingRead`     (clarifies which capability fails)
- `BadYamlFs`        → `BadYamlContent`  (describes the input shape)
- `NullDocFs`        → `NullDocContent`
- `FailingMkdirFs`   → `FailingMkdir`
- `FailingWriteFs`   → `FailingWrite`

Each name still describes the scenario the fake stages, without
pre-committing the type to a domain.
@KSXGitHub KSXGitHub mentioned this pull request Apr 29, 2026
5 tasks
@KSXGitHub KSXGitHub marked this pull request as ready for review May 7, 2026 19:49
Copilot AI review requested due to automatic review settings May 7, 2026 19:49

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Implements .modules.yaml manifest support in pacquet-modules-yaml, adding a typed Modules model plus read_modules_manifest/write_modules_manifest APIs intended to mirror pnpm v11’s installing/modules-yaml behavior.

Changes:

  • Added Modules + supporting types (DepPath, LayoutVersion, etc.) and DI capability traits (FsReadToString, FsCreateDirAll, FsWrite, Clock) with a RealApi production implementation.
  • Implemented manifest read/write logic: missing/empty/null handling, legacy shamefullyHoist translation, defaulting prunedAt/virtualStoreDirMaxLength, skipped sorting, and (non-Windows) virtualStoreDir relativization.
  • Replaced known-failure tests with a full test suite and updated fixtures/dependencies to support the new behavior.

Reviewed changes

Copilot reviewed 6 out of 7 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
crates/modules-yaml/src/lib.rs Core implementation of .modules.yaml model, read/write functions, error types, and DI traits.
crates/modules-yaml/tests/index.rs Migrated from known-failure stubs to real tests covering parity + extra edge/error-paths.
crates/modules-yaml/tests/fixtures/old-shamefully-hoist/.modules.yaml Updated legacy fixture layoutVersion.
crates/modules-yaml/tests/fixtures/old-no-shamefully-hoist/.modules.yaml Updated legacy fixture layoutVersion.
crates/modules-yaml/Cargo.toml Added required deps (serde, serde-saphyr, diagnostics, indexmap, httpdate, etc.) and new test deps.
Cargo.toml Added workspace dependencies for httpdate, indexmap, and chrono (dev).
Cargo.lock Lockfile updates for the new dependencies.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +246 to +286
/// The unit type carries no data: its existence is the value. It serializes
/// as the integer `5` and deserializes only when the on-disk value is
/// exactly `5`. Any other version causes a deserialization error, mirroring
/// upstream's `checkCompatibility` reaction at
/// <https://github.com/pnpm/pnpm/blob/1819226b51/installing/deps-installer/src/install/checkCompatibility/index.ts#L18-L22>,
/// which throws `ModulesBreakingChangeError` for a missing or mismatched
/// `layoutVersion`. Wrapping this in [`Option`] on [`Modules`]
/// distinguishes "missing" (legacy, breaking change) from "present and
/// matching".
///
/// The `#[serde(try_from = "u32", into = "u32")]` proxy lets us reuse
/// serde's number deserializer, while the [`TryFrom`] impl owns the
/// "is this version supported" decision and returns
/// [`UnsupportedLayoutVersionError`].
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default, Serialize, Deserialize)]
#[serde(try_from = "u32", into = "u32")]
pub struct LayoutVersion;

impl LayoutVersion {
/// The single layout version pacquet supports.
const VALUE: u32 = 5;
}

impl From<LayoutVersion> for u32 {
fn from(_: LayoutVersion) -> u32 {
LayoutVersion::VALUE
}
}

impl TryFrom<u32> for LayoutVersion {
type Error = UnsupportedLayoutVersionError;

fn try_from(value: u32) -> Result<Self, Self::Error> {
if value == LayoutVersion::VALUE {
Ok(Self)
} else {
Err(UnsupportedLayoutVersionError { found: value })
}
}
}

devDependencies: true
optionalDependencies: true
layoutVersion: 4
layoutVersion: 5
devDependencies: true
optionalDependencies: true
layoutVersion: 4
layoutVersion: 5

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 4 was from old pacquet code. Not from upstream. Upstream was 1. But 1 is rejected by checkCompatibility regardless. It is fine for pacquet to be stricter earlier.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (2)
Cargo.toml (1)

44-45: ⚡ Quick win

Keep these new dependencies crate-local unless a second crate actually uses them.

In this PR, httpdate, indexmap, and chrono only appear to support pacquet-modules-yaml, so promoting them to workspace scope grows shared surface area without a visible second consumer. Prefer versioning them in crates/modules-yaml/Cargo.toml until another crate needs the same dependency.

As per coding guidelines, "Add new dependencies to the specific crate that needs them, not to the workspace root or shared crates, unless multiple crates actually depend on the dependency."

Also applies to: 88-88

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Cargo.toml` around lines 44 - 45, The workspace Cargo.toml currently promotes
httpdate, indexmap (with serde feature), and chrono to workspace scope though
they are only used by the modules-yaml crate; remove those entries from the
workspace Cargo.toml and add them as regular dependencies in the modules-yaml
crate's Cargo.toml (crates/modules-yaml/Cargo.toml) with the same versions and
features (httpdate = "1.0.3", indexmap = { version = "2.14.0", features =
["serde"] }, chrono = "…") so the dependencies remain crate-local until another
crate needs them; ensure you delete the duplicate workspace entries (including
the chrono entry referenced on the other line) to avoid version conflicts.
crates/modules-yaml/src/lib.rs (1)

110-121: ⚡ Quick win

Use the repo’s explicit branded-string serde boundary for DepPath.

DepPath crosses manifest JSON/YAML boundaries, but #[serde(transparent)] skips the String boundary pattern the repo asks for on branded wrappers. Please switch this to explicit String conversion on serde, and add From<&str> alongside From<String> so the public construction API matches the guideline too.

As per coding guidelines, "When upstream never validates a branded string type, expose an infallible From<String> and From<&str> constructor for type-safety without validation" and "For branded string types that cross JSON, YAML, or INI boundaries (manifest files, lockfiles, state files, config files), wire serde validation using #[serde(try_from = "String")] for deserialization and #[serde(into = "String")] for serialization."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/modules-yaml/src/lib.rs` around lines 110 - 121, Replace the
#[serde(transparent)] pattern on DepPath with the repo branded-string serde
boundary by changing the attribute to #[serde(try_from = "String", into =
"String")], then add infallible public constructors and serde conversions:
implement From<String> for DepPath, From<&str> for DepPath, and From<DepPath>
for String (or From<DepPath> -> String) and implement TryFrom<String> for
DepPath that simply returns Ok(DepPath(s)) so serde deserialization uses the
try_from path; keep the existing as_str() impl and derives but remove the
transparent attribute so DepPath crosses manifest boundaries using the explicit
String conversions.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/modules-yaml/src/lib.rs`:
- Around line 256-283: Remove the serde proxy deserialization that rejects
unknown numbers during YAML parse: drop #[serde(try_from = "u32", into = "u32")]
and make LayoutVersion hold the raw u32 (e.g., pub struct LayoutVersion(u32) or
a field like value: u32) so serde will deserialize any numeric value; update
From<LayoutVersion> for u32 to return the inner value and keep/adjust
TryFrom<u32> or add an explicit validate/check_compatibility(&self) ->
Result<(), UnsupportedLayoutVersionError> that enforces LayoutVersion::VALUE
after parsing (call that validator in the post-parse compatibility check path
instead of during Deserialize).
- Around line 440-447: The current rewrite_virtual_store_dir_relative function
uses strip_prefix and falls back to the absolute path for non-descendant cases;
change it to compute a proper relative path (like Node's path.relative) when
strip_prefix fails by using a path-diffing utility (e.g., pathdiff::diff_paths)
to produce relative segments from modules_dir to manifest.virtual_store_dir,
falling back to the original stored_path if diff_paths returns None; update the
manifest assignment in rewrite_virtual_store_dir_relative to use that result and
add a unit test that sets manifest.virtual_store_dir to a sibling/parent
absolute path (e.g., "../.pnpm-store") to assert the serialized value is the
relative form; if needed, add pathdiff to Cargo.toml.

---

Nitpick comments:
In `@Cargo.toml`:
- Around line 44-45: The workspace Cargo.toml currently promotes httpdate,
indexmap (with serde feature), and chrono to workspace scope though they are
only used by the modules-yaml crate; remove those entries from the workspace
Cargo.toml and add them as regular dependencies in the modules-yaml crate's
Cargo.toml (crates/modules-yaml/Cargo.toml) with the same versions and features
(httpdate = "1.0.3", indexmap = { version = "2.14.0", features = ["serde"] },
chrono = "…") so the dependencies remain crate-local until another crate needs
them; ensure you delete the duplicate workspace entries (including the chrono
entry referenced on the other line) to avoid version conflicts.

In `@crates/modules-yaml/src/lib.rs`:
- Around line 110-121: Replace the #[serde(transparent)] pattern on DepPath with
the repo branded-string serde boundary by changing the attribute to
#[serde(try_from = "String", into = "String")], then add infallible public
constructors and serde conversions: implement From<String> for DepPath,
From<&str> for DepPath, and From<DepPath> for String (or From<DepPath> ->
String) and implement TryFrom<String> for DepPath that simply returns
Ok(DepPath(s)) so serde deserialization uses the try_from path; keep the
existing as_str() impl and derives but remove the transparent attribute so
DepPath crosses manifest boundaries using the explicit String conversions.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 0bd068c8-7d9e-423b-984d-7eed4787864a

📥 Commits

Reviewing files that changed from the base of the PR and between b79cc1c and f1807ce.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (6)
  • Cargo.toml
  • crates/modules-yaml/Cargo.toml
  • crates/modules-yaml/src/lib.rs
  • crates/modules-yaml/tests/fixtures/old-no-shamefully-hoist/.modules.yaml
  • crates/modules-yaml/tests/fixtures/old-shamefully-hoist/.modules.yaml
  • crates/modules-yaml/tests/index.rs

Comment thread crates/modules-yaml/src/lib.rs
Comment thread crates/modules-yaml/src/lib.rs
claude added 2 commits May 7, 2026 19:59
…URLs

Three concerns addressed in one commit:

- Split `tests/index.rs` (~525 lines) along the natural seam between
  upstream ports and pacquet-side additions, then split the pacquet
  tail by data source. Three files now:

    - `tests/index.rs` keeps the five direct ports of upstream's
      `installing/modules-yaml/test/index.ts` and preserves the
      `index.ts` ↔ `index.rs` filename mapping.
    - `tests/real_fs.rs` holds the four pacquet-side tests that
      exercise behavior branches via real `tempfile::tempdir()` writes
      and reads (absolute `virtualStoreDir`, `skipped` sort, null
      `publicHoistPattern`, `DepPath` transparent serialization).
    - `tests/fakes.rs` holds the eight pacquet-side tests that drive
      behavior through the capability-trait DI fakes (the five I/O
      error paths, `LayoutVersion` rejection, `ignoredBuilds` dedup,
      `prunedAt` clock fill-in).

  `manifest_from_json` is duplicated in `tests/index.rs` and
  `tests/real_fs.rs` (each file uses it independently); `tests/fakes.rs`
  doesn't need it. The recent pipe-trait refactors on
  `read_legacy_shamefully_hoist_*`,
  `read_preserves_absolute_virtual_store_dir`, and the round-trip test
  are preserved verbatim and `read_empty_modules_manifest_returns_none`
  picks up the same chain shape for consistency.

- Pin the workspace `chrono` dependency to `0.4.44` (the version the
  lockfile resolved). Other workspace deps follow the same patch-level
  pinning convention (`tar = "0.4.45"`, `dunce = "1.0.5"`, etc.).

- Wrap every bare `https://...` URL in a Rust comment or doc comment in
  angle brackets so they render as autolinks. Five `// Ported from <URL>`
  markers, the install-pipeline link in `tests/real_fs.rs`'s preamble,
  the parallel-disk-usage link in `tests/fakes.rs`'s preamble, and the
  upstream permalinks inside per-test docstrings all use the
  angle-bracketed form now.
…dant virtualStoreDir

Address two valid AI-bot review findings (rejecting four others as
moot, decided, or contradicting prior maintainer guidance):

- **Demote `httpdate`, `indexmap`, and `chrono` from workspace
  `[workspace.dependencies]` to `pacquet-modules-yaml`'s direct
  dependencies.** AGENTS.md says "Add a new dependency to the
  specific crate that needs it, not to the workspace root or to a
  shared crate unless multiple crates actually depend on it." Each
  of these is referenced by exactly one crate today
  (`pacquet-modules-yaml`); promotion to workspace scope was
  premature. Versions and feature lists move with them
  (`indexmap` keeps `serde`, `chrono` keeps
  `default-features = false` + `clock`).

- **Use `pathdiff::diff_paths` instead of `Path::strip_prefix` for
  `rewrite_virtual_store_dir_relative`.** Upstream's writer at
  installing/modules-yaml/src/index.ts:132-135 calls
  `path.relative(modulesDir, saveModules.virtualStoreDir)`, which
  produces `..` segments for non-descendant targets (sibling, parent,
  etc.). `Path::strip_prefix` only handles descendants and falls back
  to the absolute path otherwise, which would diverge from upstream's
  output for non-default `virtualStoreDir` values. `pathdiff` is the
  Rust ecosystem's `path.relative` equivalent and is now a
  crate-local runtime dep (`0.2.3`).

  Add `write_relativizes_non_descendant_virtual_store_dir` in
  `tests/real_fs.rs`: a sibling absolute `virtualStoreDir` two
  directories up from `modules_dir` should serialize as
  `../../.pnpm-store`, not as the absolute path. Test gated on
  `#[cfg(not(windows))]` because Windows preserves absolute paths to
  avoid junction breakage (see the `cfg!(windows)` guard in
  `write_modules_manifest`).
Copilot AI review requested due to automatic review settings May 7, 2026 20:18

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
crates/modules-yaml/Cargo.toml (1)

23-23: 💤 Low value

serde_json appears in both [dependencies] and [dev-dependencies].

Line 23 adds serde_json as a regular dependency, and line 30 repeats it in dev-dependencies. The dev-dependency entry is redundant since the crate is already a compile-time dependency. Consider removing line 30.

♻️ Proposed fix
 [dev-dependencies]
 pacquet-testing-utils = { workspace = true }

 chrono            = { version = "0.4.44", default-features = false, features = ["clock"] }
 pretty_assertions = { workspace = true }
-serde_json        = { workspace = true }
 tempfile          = { workspace = true }
 text-block-macros = { workspace = true }

Also applies to: 30-30

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/modules-yaml/Cargo.toml` at line 23, The Cargo.toml currently lists
serde_json in both [dependencies] and [dev-dependencies]; remove the duplicate
dev-dependencies entry so serde_json only appears once (keep the existing
serde_json = { workspace = true } in [dependencies]) and ensure no other
dev-only crates are accidentally removed; update the [dev-dependencies] section
to drop the serde_json line.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@crates/modules-yaml/Cargo.toml`:
- Line 23: The Cargo.toml currently lists serde_json in both [dependencies] and
[dev-dependencies]; remove the duplicate dev-dependencies entry so serde_json
only appears once (keep the existing serde_json = { workspace = true } in
[dependencies]) and ensure no other dev-only crates are accidentally removed;
update the [dev-dependencies] section to drop the serde_json line.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 1f61dc1d-3248-4e88-a6af-c64db5b61aeb

📥 Commits

Reviewing files that changed from the base of the PR and between 1451723 and 389fd55.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (3)
  • crates/modules-yaml/Cargo.toml
  • crates/modules-yaml/src/lib.rs
  • crates/modules-yaml/tests/real_fs.rs

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 8 changed files in this pull request and generated 3 comments.

) -> Result<(), WriteModulesError>
where
Api: FsCreateDirAll + FsWrite,
{
Comment on lines +369 to +383
let parsed: Option<Modules> =
content.pipe_as_ref(serde_saphyr::from_str).map_err(|source| {
ReadModulesError::ParseYaml { path: manifest_path.clone(), source: Box::new(source) }
})?;
let Some(mut manifest) = parsed else { return Ok(None) };
apply_legacy_shamefully_hoist(&mut manifest);
resolve_virtual_store_dir(&mut manifest, modules_dir);
if manifest.pruned_at.is_empty() {
manifest.pruned_at = httpdate::fmt_http_date(Api::now());
}
if manifest.virtual_store_dir_max_length == 0 {
manifest.virtual_store_dir_max_length = DEFAULT_VIRTUAL_STORE_DIR_MAX_LENGTH;
}
Ok(Some(manifest))
}
Comment on lines 26 to +32
pacquet-testing-utils = { workspace = true }

pipe-trait = { workspace = true }
chrono = { version = "0.4.44", default-features = false, features = ["clock"] }
pretty_assertions = { workspace = true }
serde_json = { workspace = true }
tempfile = { workspace = true }
text-block-macros = { workspace = true }
`serde_json` was listed in both [dependencies] and [dev-dependencies]
with identical `{ workspace = true }` specs; the dev entry adds
nothing because the regular dependency is already visible to test
code. Drop the duplicate.
The earlier "scope deps crate-local" commit demoted `httpdate`,
`indexmap`, `chrono`, and `pathdiff` to crate-local declarations on
the strength of AGENTS.md's literal text ("Add a new dependency to
the specific crate that needs it... unless multiple crates actually
depend on it") and a CodeRabbit nit.

Audit afterwards: five existing single-consumer deps in pacquet
already live in `[workspace.dependencies]` (`gethostname`,
`os_display`, `reflink-copy`, `rusqlite`, `zune-inflate`). The
codebase's actual convention is "workspace-scoped by default,
regardless of consumer count" — stricter than the AGENTS.md text.
The earlier demotion broke that consistency.

Move all four deps back to `[workspace.dependencies]` and reference
them via `{ workspace = true }` from `pacquet-modules-yaml`.
`pathdiff` keeps its actual usage in `rewrite_virtual_store_dir_relative`
because the *behavior* it provides (Node-style `path.relative` with
`..` segments for non-descendant targets) is correct and small enough
not to be worth hand-rolling — pathdiff is ~150 lines, pure std, no
transitive deps, and the algorithm has subtle edge cases (mixed
absolute/relative, `..` components, current-dir handling) that the
crate has already tested.
Copilot AI review requested due to automatic review settings May 7, 2026 20:34
@KSXGitHub KSXGitHub enabled auto-merge (squash) May 7, 2026 20:38
@KSXGitHub KSXGitHub merged commit c4a439d into main May 7, 2026
13 checks passed
@KSXGitHub KSXGitHub deleted the claude/modules-yaml-write-verify-I9yCq branch May 7, 2026 20:38

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 9 changed files in this pull request and generated 3 comments.

Comment on lines +400 to +404
pub fn write_modules_manifest<Api>(
modules_dir: &Path,
mut manifest: Modules,
) -> Result<(), WriteModulesError>
where
Comment on lines +320 to +324
pub enum ReadModulesError {
#[display("Failed to read {path:?}: {source}")]
#[diagnostic(code(pacquet_modules_yaml::read_io))]
ReadFile { path: PathBuf, source: io::Error },

Comment on lines +336 to +344
CreateDir { path: PathBuf, source: io::Error },

#[display("Failed to serialize manifest: {_0}")]
#[diagnostic(code(pacquet_modules_yaml::serialize_json))]
SerializeJson(serde_json::Error),

#[display("Failed to write {path:?}: {source}")]
#[diagnostic(code(pacquet_modules_yaml::write_io))]
WriteFile { path: PathBuf, source: io::Error },
@KSXGitHub

Copy link
Copy Markdown
Contributor Author

Responses by Claude Code to Copilot's and CodeRabbit's reviews

First pass

Confirmed: only pacquet-modules-yaml declares those three. CodeRabbit #4 is valid per AGENTS.md.

Now scope the assessment. Verdicts:

# Source Topic Verdict
1 Copilot LayoutVersion strict-on-parse Reject — decided design, documented
2 Copilot fixture layoutVersion: 4 → 5 Reject — coverage moved, not lost
3 Copilot same as #2 Reject — same
4 CodeRabbit demote httpdate/indexmap/chrono to crate-local Apply — AGENTS.md explicit, no other consumer
5 CodeRabbit replace transparent with try_from/into on DepPath Reject — contradicts maintainer's earlier explicit instruction (remove From<&str> for allocation visibility); transparent is the non-validating-brand idiom
6 CodeRabbit same as #1 Reject — same
7 CodeRabbit implement path.relative() for non-descendant virtualStoreDir Apply — genuine pnpm divergence; cardinal rule says match upstream

Second pass

Three new Copilot comments + the CodeRabbit nit. Verdicts:

# Source Topic Verdict
1 CodeRabbit duplicate serde_json dev-dep Apply — pure noise (both { workspace = true })
2 Copilot writer should auto-fill layout_version: Some(LayoutVersion) Reject
3 Copilot reader should reject missing layoutVersion Reject
4 Copilot drop chrono; use SystemTime::UNIX_EPOCH + Duration::from_secs(...) Reject — contradicts maintainer's explicit instruction

I personally rejected the moving of deps from workspace scope to crate scope.

zkochan added a commit that referenced this pull request May 12, 2026
Resolves #330.

Mirrors pnpm v11's `@pnpm/bins.resolver`, `@pnpm/bins.linker`, and `@zkochan/cmd-shim` (resolved at pnpm `4750fd370c` and cmd-shim `0d79ca9534`):

- New `pacquet-cmd-shim` crate parses the `bin` field of a package manifest into commands. Both `bin: string | object` and the `directories.bin` glob fallback are supported, with the same path-traversal and URL-safe-name guards as upstream. Conflict resolution follows `pkgOwnsBin` + lexical compare.
- Shim generator writes all three flavors per bin: Unix `/bin/sh` (byte-compatible with pnpm's `generateShShim`), Windows `.cmd` (CRLF, `%~dp0` paths), and PowerShell `.ps1` (`$basedir`/`$exe` header with Windows/POSIX-pwsh detection). Mirrors `generateCmdShim` and `generatePwshShim` minus the unused `nodePath`/`prependToPath`/`progArgs` branches.
- All filesystem IO behind per-capability DI traits — `FsReadHead`, `FsReadFile`, `FsReadString`, `FsReadDir`, `FsWalkFiles`, `FsCreateDirAll`, `FsWrite`, `FsSetExecutable`, `FsEnsureExecutableBits` — following the pattern documented in [#332](#332 (comment)). `FsWrite` is a thin shim over `std::fs::write` and is **not atomic**; a future atomic-write capability can build on it without renaming. Production callers turbofish the unit-struct provider `::<RealApi>`; tests inject unit-struct fakes to cover IO error paths real fs can't trigger portably. Generic param is named `Api` (not `Fs`) so the same provider can grow non-fs capabilities without a rename.
- New `LinkVirtualStoreBins` step walks each virtual-store slot and links its child packages' bins into the slot's own `node_modules/.bin`, matching `linkBinsOfDependencies` in pnpm's `building/during-install`. Slot iteration, per-slot child reads, and shim writes parallelised on rayon.
- `SymlinkDirectDependencies` and `InstallWithoutLockfile` call the bin linker after the symlink layout is in place. Direct-dep linking and the symlink loop are exposed as free `link_direct_dep_bins` / `symlink_direct_deps_into_node_modules` functions so they're unit-testable without the mock-registry scaffold.

Tests port the relevant `bins/resolver` and `bins/linker` cases from pnpm (`directories.bin` discovery, scoped bin names, dangerous locations, idempotent shim skip, conflict resolution) and add fakes-injected coverage for the IO error variants.

**Not in this PR**: hoisted-bin precedence and lifecycle-script-created bins. Both depend on subsystems pacquet hasn't built yet — hoisting (`hoistPattern` / `publicHoistPattern` resolution + the lift step) and lifecycle scripts (`exec/lifecycle/` runner + `allowBuilds` plumbing). The bin-linking side already does the right thing for both (shims are paths, so they pick up files generated post-extract; the precedence rule is a small addition once hoisting tags candidates as hoisted). Tracked separately in  with full porting coordinates.



## Performance

When the bin-linking step landed it added ~7% to the warm-cache Frozen Lockfile install ([benchmark](#333 (comment))). Closing that gap took a sequence of changes that mirror how pnpm v11 avoids the same costs in its own `linkBinsOfDependencies` ([4750fd370c](https://github.com/pnpm/pnpm/blob/4750fd370c/building/during-install/src/index.ts#L258-L309)):

- **Bundled manifests in `index.db`.** `extract_tarball_entries` captures the parsed `package.json`, picks the subset pnpm caches via a port of `normalizeBundledManifest` (`name`, `version`, `bin`, `directories`, `dependencies`, lifecycle `scripts`, …), and stuffs it into `PackageFilesIndex.manifest`. `prefetch_cas_paths` surfaces it back as a `HashMap<cache_key, Arc<Value>>` so the bin linker doesn't have to re-read `package.json` per child off disk.
- **msgpackr-records encoder learns `serde_json::Value`.** The encoder previously errored with `ManifestNotSupported` if anything ever set `manifest`; now it records-encodes nested JSON objects (slot-allocated per distinct key set) so `useRecords: true` readers see them as JS `Object`s rather than `Map`s — necessary for pnpm's `manifest.bin` / `manifest.directories?.bin` property accesses to resolve.
- **`Arc<Value>` for `PackageBinSource.manifest`.** The lockfile-driven bin-link path produces `slots × children` clones of the parsed manifest (~6k on the integrated-benchmark fixture). Sharing via `Arc` turns each push into a refcount bump.
- **`hasBin` filter via lockfile.** `LinkVirtualStoreBins` now takes the lockfile `packages:` section, pre-computes an `Option<HashSet<PackageKey>>` of `hasBin: true` keys at install start, and skips snapshot children that aren't in the set *before* any path-building or manifest lookup. ~95% of a real lockfile's packages don't declare a bin, so this short-circuits the bulk of the per-slot work. `None` is reserved for the pathological case where the lockfile lacks a `packages:` section (fall back to processing every child); `Some(empty)` is authoritative — lockfile says no package has a bin, every slot short-circuits. Mirrors pnpm's filter at `building/during-install/src/index.ts:283`.
- **Lockfile-driven slot iteration.** `run_lockfile_driven` walks `snapshots` directly and builds slot paths lexically. The previous `run_with_readdir` path enumerated every slot via `Api::read_dir(virtual_store_dir)` and probed each slot's own package directory with a wasted `read_dir` (~1267 wasted `open(O_DIRECTORY) + close` per warm install). The readdir-driven path survives as a fallback for `install_without_lockfile` and retains the existence probe there (small N, no virtual-store invariant from `create_virtual_dir_by_snapshot` to lean on).
- **Parallel prefetch decode.** `StoreIndex::get_many_raw` returns undecoded row bytes; the SQLite mutex releases before `prefetch_cas_paths` fans the msgpackr decode + integrity check across rayon.

Latest CI benchmark on this branch ([job 75462998251](https://github.com/pnpm/pacquet/actions/runs/25701679172/job/75462998251)):

| Scenario | pacquet@HEAD | pacquet@main | Relative |
|---|---|---|---|
| Frozen Lockfile | 2.074s ± 102ms | 2.026s ± 70ms | +2.4% (within noise) |
| Frozen Lockfile (Hot Cache) | **479.6ms ± 38ms** | **535.8ms ± 37ms** | **~10% faster than main** |

The follow-up #417 splits manifests into a dedicated `package_manifests` SQLite table so the unfiltered `package_index` prefetch payload stays at its original size and the manifest fetch becomes a targeted SELECT against just the `hasBin: true` subset.

---------

Co-authored-by: Zoltan Kochan <z@kochan.io>
zkochan added a commit that referenced this pull request May 13, 2026
)

Slice 2 of #438 (the `nodeLinker: 'hoisted'` umbrella). Smaller than expected — when I looked, the schema field was already in place from #332's original `.modules.yaml` port, so the work reduced to pinning behavior rather than building plumbing.

Two round-trip tests + a doc-comment on the field.

## What was actually missing

The umbrella's §"Pacquet's current state" table claimed `hoistedLocations` was "Missing entirely" — that was wrong. The field is at `crates/modules-yaml/src/lib.rs:212` and serde-handles read/write correctly. What was missing:

- **A test pinning the wire shape.** Nothing was stopping a future edit from breaking the camelCase mapping, the optional-with-skip-on-None behavior, or the `Record<string, string[]>` shape upstream relies on.
- **A doc-comment** explaining what the field is for. Other Slice-related fields (`hoisted_dependencies`, `hoisted_aliases`) have full docstrings; this one was silent.

## Tests added

Both live in `crates/modules-yaml/tests/real_fs.rs` (pacquet-side, no upstream counterpart — upstream's `installing/modules-yaml/test/index.ts` doesn't exercise `hoistedLocations` directly):

- `hoisted_locations_round_trips` — a manifest with `hoistedLocations` populated survives write+read; the raw on-disk JSON keeps the per-depPath array shape (multi-entry arrays included, to pin the nested-conflict layout).
- `absent_hoisted_locations_is_omitted_on_write` — `None` (the state every pacquet install writes today) serializes as the field *absent* from the file, matching upstream's `JSON.stringify(undefined)` behavior. Guards against accidental `Option::is_some` regressions on the `skip_serializing_if`.

## Regression catch verified

Per `plans/TEST_PORTING.md`'s "break the subject to verify the test catches it" convention, I temporarily added `#[serde(rename = "wrongName")]` to the field. `hoisted_locations_round_trips` failed at the `expect("present")` on the deserialized value (the camelCase key no longer mapped). Reverted before commit.

## Type-shape decision

Upstream's actual `ModulesRaw.hoistedLocations` is `Record<string, string[]> | undefined` — *not* `Record<DepPath, string[]>` despite the values being populated from depPaths internally. The umbrella's scope item ("`Option<BTreeMap<DepPath, Vec<String>>>`") was inconsistent with the upstream schema; I kept the existing `BTreeMap<String, Vec<String>>` because:

1. It already matches the upstream wire shape exactly.
2. Same pattern as the `hoisted_dependencies` field below, which deliberately keeps `String` keys (per its doc-comment at lines 148-153) because pnpm's `DepPath | ProjectId` union can't be statically disambiguated.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

.modules.yaml write and verify

4 participants