feat(git-fetcher): install git-hosted tarballs via preparePackage + packlist (#436 §C)#451
Conversation
…acklist (#436 §C) Builds on §B+D (#446). Adds the second half of the git-hosted install path: `TarballResolution { gitHosted: true }` lockfile entries now run through a new `GitHostedTarballFetcher` after the existing `DownloadTarballToStore` settles their bytes into CAS. Mirrors pnpm's [`gitHostedTarballFetcher.ts`](https://github.com/pnpm/pnpm/blob/94240bc046/fetching/tarball-fetcher/src/gitHostedTarballFetcher.ts): 1. Materialize the CAS-resident files into a writable temp dir (`fs::copy` per file — using hardlinks here would let prepare scripts mutate shared CAS entries). 2. Run the existing `prepare_package` port — same `<pm>-install` + `prepublish` / `prepack` / `publish` dispatch §B+D shipped, same `GIT_DEP_PREPARE_NOT_ALLOWED` gate. 3. Run the existing minimal `packlist` over the prepared tree to drop source maps, test fixtures, etc. that ship in the raw archive but not in the published file set. 4. Re-import the filtered file set back into CAS and hand the `HashMap<String, PathBuf>` to `CreateVirtualDirBySnapshot`. The CAS-import helpers (`import_into_cas`, `is_file_executable`, `map_write_cas`) and the new `materialize_into` lift out of `fetcher.rs` into a shared `cas_io` module so both fetchers consume the same code path. No public-API change there — the helpers stay `pub(crate)`. Supporting changes: - `TarballResolution.path: Option<String>` mirrors pnpm's `TarballResolution.path` so a lockfile pinning a git-hosted tarball from a monorepo (`path: packages/sub`) deserializes without tripping `deny_unknown_fields`. The fetcher threads this through to `prepare_package`'s `safe_join_path` so the prepare and packlist run inside the right sub-directory. - The install dispatcher hoists the `allow_build` closure and the `ScriptsPrependNodePath` conversion above the resolution match so both the `Git` arm and the new `gitHosted: true` post-pass borrow from the same locals — no temp closures across `.await`. Out of scope (still open on #436): - Warm prefetch for git-hosted slots — neither fetcher writes `gitHostedStoreIndexKey` rows to `index.db` yet, so a re-install re-runs the clone + prepare + packlist. Correct, but slow on the second run. - Two-slot CAS layout (`${filesIndexFile}\traw` + final). Upstream optimizes "raw file set == prepared file set" by promoting the raw entry in place. Pacquet always re-imports today; hash-dedup means duplicate work but not duplicate storage. - §E — full integration-test matrix in `plans/TEST_PORTING.md`. This PR ships four crate-level tests for `GitHostedTarballFetcher` (pass-through, files-field filter, `GIT_DEP_PREPARE_NOT_ALLOWED`, CAS-isolation regression).
📝 WalkthroughWalkthroughThis PR adds ChangesShared CAS I/O and Git-Hosted Tarball Fetcher
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related issues
Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
crates/git-fetcher/src/tarball_fetcher/tests.rs (1)
61-64: ⚡ Quick winAdd one diagnostic log before non-
assert_eq!membership assertions.These
assert!checks would be easier to debug with the actual key set printed once before asserting.Proposed tweak
let keys: Vec<&str> = received.cas_paths.keys().map(String::as_str).collect(); + eprintln!("PACKLIST OUTPUT KEYS: {keys:?}"); assert!(keys.contains(&"dist/index.js")); assert!(keys.contains(&"package.json"), "package.json always included"); assert!(!keys.contains(&"src/index.ts"), "src excluded by files field"); assert!(!keys.contains(&"test/foo.test.js"), "test excluded by files field");Based on learnings: “In Rust test code… if you use assertions like
assert!,assert_ne!, etc., ensure the test logs the relevant actual/expected values (or context) before/around the assertion.”Also applies to: 109-113
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/git-fetcher/src/tarball_fetcher/tests.rs` around lines 61 - 64, Add a diagnostic log that prints the actual key set before each non-assert_eq! membership assertion in the test so failures show context; for example, before the assertions that check received.cas_paths.contains_key("package.json"), contains_key("index.js"), contains_key("README.md") (and the similar group around lines 109-113), emit a single debug line like eprintln!("cas_keys = {:?}", received.cas_paths.keys().collect::<Vec<_>>()) or use dbg!(&received.cas_paths) so the test output shows the actual keys prior to the assert! calls.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/git-fetcher/src/cas_io.rs`:
- Around line 42-43: The code currently joins untrusted CAS-relative paths
(variable rel) into target_dir (creating target via target_dir.join(...)) which
allows path traversal (.., absolute paths, or prefixes) to escape the intended
tree; fix by validating/sanitizing rel before joining: parse
Path::new(rel).components() and reject or error if any component is ParentDir,
RootDir, or Prefix, and ensure every component is a normal segment, then
construct the path by iteratively pushing those Normal components onto
target_dir (or use a safe join loop) before using target.parent(); apply the
same validation to the other occurrence around lines 77-78 so both joins only
use vetted path segments.
---
Nitpick comments:
In `@crates/git-fetcher/src/tarball_fetcher/tests.rs`:
- Around line 61-64: Add a diagnostic log that prints the actual key set before
each non-assert_eq! membership assertion in the test so failures show context;
for example, before the assertions that check
received.cas_paths.contains_key("package.json"), contains_key("index.js"),
contains_key("README.md") (and the similar group around lines 109-113), emit a
single debug line like eprintln!("cas_keys = {:?}",
received.cas_paths.keys().collect::<Vec<_>>()) or use dbg!(&received.cas_paths)
so the test output shows the actual keys prior to the assert! calls.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 9698cdd4-e788-4f15-a960-c37269eacf12
📒 Files selected for processing (8)
crates/git-fetcher/src/cas_io.rscrates/git-fetcher/src/fetcher.rscrates/git-fetcher/src/lib.rscrates/git-fetcher/src/tarball_fetcher.rscrates/git-fetcher/src/tarball_fetcher/tests.rscrates/lockfile/src/resolution.rscrates/lockfile/src/resolution/tests.rscrates/package-manager/src/install_package_by_snapshot.rs
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: Agent
- GitHub Check: Run benchmark on ubuntu-latest
- GitHub Check: Run benchmark on ubuntu-latest
- GitHub Check: Code Coverage
- GitHub Check: Lint and Test (macos-latest)
- GitHub Check: Lint and Test (windows-latest)
- GitHub Check: Lint and Test (ubuntu-latest)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs
📄 CodeRabbit inference engine (AGENTS.md)
**/*.rs: Preserve existing method chains andpipe-traitchains; do not break them into intermediateletbindings unless there is a concrete justification such as a compilation failure, borrow checker rejection, meaningful performance improvement, or other technical necessity. Refactoring for style alone is not sufficient justification.
Choose owned vs. borrowed parameters to minimize copies; prefer borrowed types (&Pathover&PathBuf,&strover&String) when it does not force extra copies.
PreferArc::clone(&x)andRc::clone(&x)overx.clone()for reference-counted types to make the cost visible at the call site.
Do not use star imports inside module bodies. Writeuse super::{Foo, bar}instead ofuse super::*;for any glob whose target is a module you control. External-crate preludes (e.g.,use rayon::prelude::*;) and root-of-module re-exports (e.g.,pub use submodule::*;inlib.rs) are exceptions.
Follow Rust API Guidelines for naming, as documented in https://rust-lang.github.io/api-guidelines/naming.html.
Declare a newtype wrapper for any branded string type being ported from TypeScript pnpm. Do not collapse the brand into a plainStringor&str; give the type its own struct so misuse is a type error.
When porting branded string types where upstream TypeScript always validates before construction, validate in the Rust port too. Construct the wrapper only viaTryFrom<String>and/orFromStr; do not provide an infallible public constructor that takes an arbitrary string.
For branded string types where upstream TypeScript never validates (used purely for type-safety to prevent confusion between string slots), expose an infallibleFrom<String>andFrom<&str>constructor in the Rust wrapper.
When upstream TypeScript occasionally constructs a branded type without validation (via bareasassertion), add afrom_str_unchecked(or similarly named) constructor on the Rust side. Keep the validating constructor as well; `from_str_u...
Files:
crates/lockfile/src/resolution.rscrates/git-fetcher/src/lib.rscrates/git-fetcher/src/tarball_fetcher.rscrates/git-fetcher/src/fetcher.rscrates/package-manager/src/install_package_by_snapshot.rscrates/git-fetcher/src/tarball_fetcher/tests.rscrates/lockfile/src/resolution/tests.rscrates/git-fetcher/src/cas_io.rs
🧠 Learnings (2)
📚 Learning: 2026-05-07T23:19:08.272Z
Learnt from: KSXGitHub
Repo: pnpm/pacquet PR: 401
File: tasks/integrated-benchmark/src/work_env.rs:343-344
Timestamp: 2026-05-07T23:19:08.272Z
Learning: When reviewing Rust code in pnpm/pacquet for deprecated API usage, do not automatically treat `serde_saphyr::to_string` as deprecated. In `serde-saphyr` v0.0.25, `serde_saphyr::to_string` has no `#[deprecated]` attribute (the `#[deprecated]` later in `serde-saphyr-0.0.25/src/lib.rs` applies to a different function). Only flag `serde_saphyr::to_string` as deprecated if the resolved dependency version’s source shows `#[deprecated]` on that specific function.
Applied to files:
crates/lockfile/src/resolution.rscrates/git-fetcher/src/lib.rscrates/git-fetcher/src/tarball_fetcher.rscrates/git-fetcher/src/fetcher.rscrates/package-manager/src/install_package_by_snapshot.rscrates/git-fetcher/src/tarball_fetcher/tests.rscrates/lockfile/src/resolution/tests.rscrates/git-fetcher/src/cas_io.rs
📚 Learning: 2026-05-01T10:01:33.766Z
Learnt from: zkochan
Repo: pnpm/pacquet PR: 349
File: crates/reporter/src/tests.rs:121-121
Timestamp: 2026-05-01T10:01:33.766Z
Learning: In Rust test code, follow the repo’s CODE_STYLE_GUIDE test-logging rule: add logging (e.g., `eprintln!`/`eprintln!(...)`) so that useful diagnostic values are printed when a test fails, unless the assertion is `assert_eq!` (where the differing values are already included). Concretely, if you use assertions like `assert!`, `assert_ne!`, etc., ensure the test logs the relevant actual/expected values (or context) before/around the assertion so failures can be diagnosed without rerunning.
Applied to files:
crates/git-fetcher/src/tarball_fetcher/tests.rscrates/lockfile/src/resolution/tests.rs
🔇 Additional comments (6)
crates/lockfile/src/resolution.rs (1)
26-32: LGTM!crates/lockfile/src/resolution/tests.rs (1)
25-25: LGTM!Also applies to: 40-40, 58-58, 78-78, 91-91, 104-104, 117-117, 132-132, 144-144, 159-159, 171-189, 197-197
crates/git-fetcher/src/fetcher.rs (1)
16-16: LGTM!Also applies to: 272-274
crates/git-fetcher/src/lib.rs (1)
1-2: LGTM!Also applies to: 4-18, 20-20, 26-26, 33-33
crates/package-manager/src/install_package_by_snapshot.rs (1)
9-9: LGTM!Also applies to: 107-124, 130-181
crates/git-fetcher/src/tarball_fetcher.rs (1)
94-94: LGTM!Also applies to: 115-118, 142-143
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #451 +/- ##
==========================================
- Coverage 87.45% 87.11% -0.34%
==========================================
Files 100 103 +3
Lines 7865 8206 +341
==========================================
+ Hits 6878 7149 +271
- Misses 987 1057 +70 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Micro-Benchmark ResultsLinux |
There was a problem hiding this comment.
Pull request overview
This PR extends pacquet’s frozen-lockfile install pipeline to handle git-hosted tarball lockfile entries (TarballResolution { gitHosted: true }) by running a prepare+packlist pass after the tarball bytes are downloaded into the CAS, mirroring pnpm’s gitHostedTarballFetcher behavior. It also updates lockfile parsing to accept monorepo subdirectory packing via a new TarballResolution.path field.
Changes:
- Route
gitHosted: truetarball resolutions through a newGitHostedTarballFetcherthat materializes CAS files to a temp dir, runsprepare_package, applies packlist filtering, and re-imports into CAS. - Add
TarballResolution.path: Option<String>to lockfile types + tests, mirroring pnpm’sTarballResolution.path. - Factor shared CAS I/O helpers into
git-fetcher::cas_iofor reuse across git and git-hosted-tarball fetchers.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| crates/package-manager/src/install_package_by_snapshot.rs | Hoists shared locals and adds the git-hosted tarball post-pass via GitHostedTarballFetcher. |
| crates/lockfile/src/resolution.rs | Adds TarballResolution.path with serde support/documentation. |
| crates/lockfile/src/resolution/tests.rs | Updates expectations for new path field and adds a deserialize test for tarball path. |
| crates/git-fetcher/src/lib.rs | Exposes GitHostedTarballFetcher and documents the two git-hosted fetch paths. |
| crates/git-fetcher/src/fetcher.rs | Moves CAS import helpers into cas_io for reuse. |
| crates/git-fetcher/src/cas_io.rs | New shared CAS materialization/import helpers for git-hosted workflows. |
| crates/git-fetcher/src/tarball_fetcher.rs | Implements the GitHostedTarballFetcher prepare+packlist+reimport pipeline. |
| crates/git-fetcher/src/tarball_fetcher/tests.rs | Adds crate-level tests covering pass-through, files filtering, not-allowed build, and CAS isolation regression. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
…C review) CodeRabbit flagged that `materialize_into` and `import_into_cas` both joined a caller-supplied `rel` onto a trusted root without checking the components. A malicious tarball entry like `../etc/passwd` would have escaped the working tree on extraction. Pacquet's tarball extractor is supposed to sanitise on the way in, but defense-in-depth at this layer means a bug there can't turn into a write outside the target directory. Add a private `join_checked(root, rel)` helper in `cas_io` that: - Rejects absolute paths (`/etc/passwd`, `C:\foo`). - Rejects `..`, root, and drive-prefix components. - Strips `.` components silently. - Pushes only `Component::Normal` segments onto `root`. Both `materialize_into` and `import_into_cas` route their joins through it. Errors surface as `GitFetcherError::Io` with `io::ErrorKind::InvalidInput` so the miette source chain remains unchanged. A new test covers the rejection paths (absolute, parent- dir at start, parent-dir mid-path) plus a regression test for `materialize_into` that asserts no file lands outside the target on a poisoned `cas_paths` entry.
Integrated-Benchmark Report (Linux)Scenario: Frozen Lockfile
BENCHMARK_REPORT.json{
"results": [
{
"command": "pacquet@HEAD",
"mean": 2.6085972815200007,
"stddev": 0.17489809876437543,
"median": 2.56699586052,
"user": 2.6067669599999994,
"system": 3.5257157999999995,
"min": 2.43359172902,
"max": 3.02857954602,
"times": [
2.56486350302,
2.48442612402,
2.7188438940200004,
2.67923168502,
2.4942660560200003,
2.6394791410200003,
3.02857954602,
2.43359172902,
2.4735629190200004,
2.56912821802
]
},
{
"command": "pacquet@main",
"mean": 2.51702915162,
"stddev": 0.06594608148206567,
"median": 2.5122714220200004,
"user": 2.5612052600000004,
"system": 3.5102591000000003,
"min": 2.41106919702,
"max": 2.6115159030200004,
"times": [
2.43830381602,
2.56671974502,
2.48094128002,
2.41106919702,
2.59744798502,
2.4952035280200002,
2.52933931602,
2.6115159030200004,
2.4890188010200003,
2.5507319450200003
]
},
{
"command": "pnpm",
"mean": 6.127188401119999,
"stddev": 0.13051544563858503,
"median": 6.117481376019999,
"user": 9.10327846,
"system": 4.368406499999999,
"min": 5.96607421602,
"max": 6.35371301902,
"times": [
6.195679225019999,
6.020747613019999,
6.24331147002,
6.155468078019999,
6.079494674019999,
6.02849367202,
5.98523688502,
5.96607421602,
6.24366515902,
6.35371301902
]
}
]
}Scenario: Frozen Lockfile (Hot Cache)
BENCHMARK_REPORT.json{
"results": [
{
"command": "pacquet@HEAD",
"mean": 0.6963826587,
"stddev": 0.04146780632427631,
"median": 0.6852439346000001,
"user": 0.37347067999999994,
"system": 1.4609696799999998,
"min": 0.6669509231,
"max": 0.8119884221,
"times": [
0.8119884221,
0.6837126691000001,
0.6867752001,
0.6669509231,
0.6809052741,
0.6942995791000001,
0.6909613651000001,
0.6912112621000001,
0.6836897741,
0.6733321181
]
},
{
"command": "pacquet@main",
"mean": 0.7619125279000001,
"stddev": 0.0696532220781103,
"median": 0.7300466361,
"user": 0.36175198000000003,
"system": 1.48992328,
"min": 0.6938067381,
"max": 0.9031553001,
"times": [
0.8231760611000001,
0.7374727681000001,
0.8436988861,
0.7288577901000001,
0.7303197931000001,
0.7002142861,
0.9031553001,
0.7297734791,
0.6938067381,
0.7286501771
]
},
{
"command": "pnpm",
"mean": 2.6510351446,
"stddev": 0.1376221718298184,
"median": 2.6043153886,
"user": 3.2584477799999996,
"system": 2.14143338,
"min": 2.5020697780999996,
"max": 2.9468712891,
"times": [
2.7888165100999998,
2.5634968461,
2.9468712891,
2.5020697780999996,
2.6110981441,
2.5185219531,
2.5975326330999997,
2.7040738160999997,
2.5719305681,
2.7059399081
]
}
]
} |
… path/round-trip tests (#436 §C review round 2) Addresses the second batch of CodeRabbit comments on #451: - `install_package_by_snapshot.rs`: expand the doc comment on `InstallPackageBySnapshotError::GitFetch` to cover both fetchers (the git-CLI path for `type: git` and the git-hosted-tarball post-pass for `gitHosted: true`). The variant already wrapped `GitFetcherError`, which spans both — the doc was just out of date. - `tarball_fetcher.rs`: drop the `wrap_prepare_error` helper. Its docstring claimed it stamped a "Failed to prepare git-hosted package" prefix, but the body was a one-line `Prepare(err)` passthrough. Inline the `map_err` at the call site and explain the upstream-vs-pacquet difference (we lean on the miette source chain + dispatcher log line for the prefix rather than mutating `err.message`) in a comment next to the call. - `tarball_fetcher/tests.rs`: add `path_field_packs_only_subdirectory` pinning that a `path: Some("packages/sub")` resolution returns files relative to the sub-dir, never including sibling-package files or carrying the monorepo-root prefix. - `lockfile/resolution/tests.rs`: add `serialize_tarball_resolution_with_path` so the new `TarballResolution.path` field has a serialization round-trip alongside its deserialization assertion. Mirrors the pattern `GitResolution.path` already follows.
There was a problem hiding this comment.
🧹 Nitpick comments (2)
crates/git-fetcher/src/tarball_fetcher/tests.rs (2)
108-112: ⚡ Quick winAdd diagnostic logging before assertions.
The assertions check key presence/absence without logging the actual
keysvector beforehand. If these assertions fail, the error won't show what keys were actually inreceived.cas_paths.Based on learnings, test code should include logging for non-
assert_eq!assertions so diagnostic values are printed when a test fails.📋 Proposed fix to add diagnostic logging
let keys: Vec<&str> = received.cas_paths.keys().map(String::as_str).collect(); + eprintln!("keys = {keys:?}"); assert!(keys.contains(&"dist/index.js")); assert!(keys.contains(&"package.json"), "package.json always included"); assert!(!keys.contains(&"src/index.ts"), "src excluded by files field"); assert!(!keys.contains(&"test/foo.test.js"), "test excluded by files field");🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/git-fetcher/src/tarball_fetcher/tests.rs` around lines 108 - 112, Add diagnostic logging of the actual keys before the assertions so failing tests show the contents of received.cas_paths; specifically, compute the keys Vec as currently done (let keys: Vec<&str> = received.cas_paths.keys().map(String::as_str).collect();) and then log or print the keys (e.g., using dbg!, println!, or eprintln!) right before the assert! lines that reference keys (the assertions checking "dist/index.js", "package.json", "src/index.ts", "test/foo.test.js") so test failures reveal the actual keys present.
61-64: ⚡ Quick winAdd diagnostic logging before assertions.
The assertions use
assert!without logging the actual values beforehand. If these assertions fail, the diagnostic output won't show what keys were actually present inreceived.cas_pathsor the value ofreceived.built.Based on learnings, test code should include logging (e.g.,
eprintln!) for non-assert_eq!assertions so diagnostic values are printed when a test fails.📋 Proposed fix to add diagnostic logging
+ eprintln!("received.built = {}", received.built); + eprintln!("received.cas_paths.keys() = {:?}", received.cas_paths.keys().collect::<Vec<_>>()); assert!(!received.built, "no `prepare` script → not built"); assert!(received.cas_paths.contains_key("package.json")); assert!(received.cas_paths.contains_key("index.js")); assert!(received.cas_paths.contains_key("README.md"));🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/git-fetcher/src/tarball_fetcher/tests.rs` around lines 61 - 64, Add diagnostic printing before the assertions in the test to aid failures: print the value of received.built and the keys present in received.cas_paths (e.g., using eprintln! with received.built and received.cas_paths.keys().collect::<Vec<_>>() or similar) immediately before the assert! calls that check received.built and the presence of "package.json", "index.js", and "README.md" so test failure output shows the actual values; update the assertions in tests.rs around the checks that reference received and received.cas_paths accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@crates/git-fetcher/src/tarball_fetcher/tests.rs`:
- Around line 108-112: Add diagnostic logging of the actual keys before the
assertions so failing tests show the contents of received.cas_paths;
specifically, compute the keys Vec as currently done (let keys: Vec<&str> =
received.cas_paths.keys().map(String::as_str).collect();) and then log or print
the keys (e.g., using dbg!, println!, or eprintln!) right before the assert!
lines that reference keys (the assertions checking "dist/index.js",
"package.json", "src/index.ts", "test/foo.test.js") so test failures reveal the
actual keys present.
- Around line 61-64: Add diagnostic printing before the assertions in the test
to aid failures: print the value of received.built and the keys present in
received.cas_paths (e.g., using eprintln! with received.built and
received.cas_paths.keys().collect::<Vec<_>>() or similar) immediately before the
assert! calls that check received.built and the presence of "package.json",
"index.js", and "README.md" so test failure output shows the actual values;
update the assertions in tests.rs around the checks that reference received and
received.cas_paths accordingly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 82d0d226-84ac-4ef2-b5af-e0371b4a9d03
📒 Files selected for processing (5)
crates/git-fetcher/src/cas_io.rscrates/git-fetcher/src/tarball_fetcher.rscrates/git-fetcher/src/tarball_fetcher/tests.rscrates/lockfile/src/resolution/tests.rscrates/package-manager/src/install_package_by_snapshot.rs
🚧 Files skipped from review as they are similar to previous changes (2)
- crates/git-fetcher/src/cas_io.rs
- crates/package-manager/src/install_package_by_snapshot.rs
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Code Coverage
- GitHub Check: Lint and Test (macos-latest)
- GitHub Check: Lint and Test (windows-latest)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs
📄 CodeRabbit inference engine (AGENTS.md)
**/*.rs: Preserve existing method chains andpipe-traitchains; do not break them into intermediateletbindings unless there is a concrete justification such as a compilation failure, borrow checker rejection, meaningful performance improvement, or other technical necessity. Refactoring for style alone is not sufficient justification.
Choose owned vs. borrowed parameters to minimize copies; prefer borrowed types (&Pathover&PathBuf,&strover&String) when it does not force extra copies.
PreferArc::clone(&x)andRc::clone(&x)overx.clone()for reference-counted types to make the cost visible at the call site.
Do not use star imports inside module bodies. Writeuse super::{Foo, bar}instead ofuse super::*;for any glob whose target is a module you control. External-crate preludes (e.g.,use rayon::prelude::*;) and root-of-module re-exports (e.g.,pub use submodule::*;inlib.rs) are exceptions.
Follow Rust API Guidelines for naming, as documented in https://rust-lang.github.io/api-guidelines/naming.html.
Declare a newtype wrapper for any branded string type being ported from TypeScript pnpm. Do not collapse the brand into a plainStringor&str; give the type its own struct so misuse is a type error.
When porting branded string types where upstream TypeScript always validates before construction, validate in the Rust port too. Construct the wrapper only viaTryFrom<String>and/orFromStr; do not provide an infallible public constructor that takes an arbitrary string.
For branded string types where upstream TypeScript never validates (used purely for type-safety to prevent confusion between string slots), expose an infallibleFrom<String>andFrom<&str>constructor in the Rust wrapper.
When upstream TypeScript occasionally constructs a branded type without validation (via bareasassertion), add afrom_str_unchecked(or similarly named) constructor on the Rust side. Keep the validating constructor as well; `from_str_u...
Files:
crates/git-fetcher/src/tarball_fetcher/tests.rscrates/git-fetcher/src/tarball_fetcher.rscrates/lockfile/src/resolution/tests.rs
🧠 Learnings (2)
📚 Learning: 2026-05-01T10:01:33.766Z
Learnt from: zkochan
Repo: pnpm/pacquet PR: 349
File: crates/reporter/src/tests.rs:121-121
Timestamp: 2026-05-01T10:01:33.766Z
Learning: In Rust test code, follow the repo’s CODE_STYLE_GUIDE test-logging rule: add logging (e.g., `eprintln!`/`eprintln!(...)`) so that useful diagnostic values are printed when a test fails, unless the assertion is `assert_eq!` (where the differing values are already included). Concretely, if you use assertions like `assert!`, `assert_ne!`, etc., ensure the test logs the relevant actual/expected values (or context) before/around the assertion so failures can be diagnosed without rerunning.
Applied to files:
crates/git-fetcher/src/tarball_fetcher/tests.rscrates/lockfile/src/resolution/tests.rs
📚 Learning: 2026-05-07T23:19:08.272Z
Learnt from: KSXGitHub
Repo: pnpm/pacquet PR: 401
File: tasks/integrated-benchmark/src/work_env.rs:343-344
Timestamp: 2026-05-07T23:19:08.272Z
Learning: When reviewing Rust code in pnpm/pacquet for deprecated API usage, do not automatically treat `serde_saphyr::to_string` as deprecated. In `serde-saphyr` v0.0.25, `serde_saphyr::to_string` has no `#[deprecated]` attribute (the `#[deprecated]` later in `serde-saphyr-0.0.25/src/lib.rs` applies to a different function). Only flag `serde_saphyr::to_string` as deprecated if the resolved dependency version’s source shows `#[deprecated]` on that specific function.
Applied to files:
crates/git-fetcher/src/tarball_fetcher/tests.rscrates/git-fetcher/src/tarball_fetcher.rscrates/lockfile/src/resolution/tests.rs
🔇 Additional comments (2)
crates/lockfile/src/resolution/tests.rs (1)
171-208: LGTM!crates/git-fetcher/src/tarball_fetcher.rs (1)
1-155: LGTM!
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (1)
crates/git-fetcher/src/cas_io.rs:119
- Same as above:
rel.replace('/', std::path::MAIN_SEPARATOR_STR)creates an allocation per file. Consider passingreldirectly intojoin_checkedto avoid extra work in the per-file CAS import loop.
let mut out = HashMap::with_capacity(files.len());
for rel in files {
let source = join_checked(pkg_dir, &rel.replace('/', std::path::MAIN_SEPARATOR_STR))?;
let bytes = fs::read(&source).map_err(GitFetcherError::Io)?;
§C review round 3) `materialize_into` and `import_into_cas` both ran `rel.replace('/', std::path::MAIN_SEPARATOR_STR)` before handing the path to `join_checked`, allocating a fresh `String` per file. CodeRabbit pointed out that `Path::components()` (which `join_checked` already uses) recognises both `/` and `\` as separators on Windows, so the replace is unnecessary on every platform. For a 1352-package install that's 1352+ avoided allocations per fetcher pass. Pass `rel` straight to `join_checked` instead, with a short comment at each call site explaining why.
Origin/main picked up #451 (`feat(git-fetcher): install git-hosted tarballs via preparePackage + packlist`) which adds a required `path: Option<String>` field to `TarballResolution`. The `synthetic_metadata` helper in `installability/tests.rs` builds the struct literally and so needs the new field. Set it to `None` — the installability check doesn't look at `path`, so an unset value preserves the test's existing assertion subject (engines / cpu / os / libc filtering). --- Written by an agent (Claude Code, claude-opus-4-7).
Origin/main picked up #451 (`feat(git-fetcher): install git-hosted tarballs via preparePackage + packlist`) which adds a required `path: Option<String>` field to `TarballResolution`. The `synthetic_metadata` helper in `installability/tests.rs` builds the struct literally and so needs the new field. Set it to `None` — the installability check doesn't look at `path`, so an unset value preserves the test's existing assertion subject (engines / cpu / os / libc filtering). --- Written by an agent (Claude Code, claude-opus-4-7).
… fixture `crates/package-manager/src/installability/tests.rs` initialises a `TarballResolution` without the `path` field that #451 added (git- hosted tarball sub-directory selection). The omission landed when #439 (installability check, #434 slice 1) merged just before #451 — neither PR touched the other's file, so the bug shipped on `origin/main`. CI was failing with: ``` error[E0063]: missing field `path` in initializer of `TarballResolution` --> crates/package-manager/src/installability/tests.rs:56:49 ``` Fix is one line: `path: None`. The installability check ignores the resolution shape entirely (see the comment a few lines above the init site), so `None` is correct.
Origin/main grew a `path: Option<String>` field on `TarballResolution` (PR #451, `feat(git-fetcher): install git-hosted tarballs via preparePackage + packlist`). That PR landed after #439 (`feat(package-is-installable): platform + engine check`) and missed updating the `TarballResolution` literal in `crates/package-manager/src/installability/tests.rs`, so any PR opened against current main inherits a build break in the `package-is-installable` test build (#445 hit the same failure). My own `peer_dependency_in_lockfile_surfaces_unsupported` test in `crates/real-hoist/src/tests.rs` had the same gap. Add `path: None` to both literals so the test builds compile against the post-#451 `TarballResolution` shape. Includes the rebase of feat/438-slice-3b onto current origin/main.
… fixture (#455) Landrace between #439 (installability tests fixture) and #451 (added `path: Option<String>` to `TarballResolution`). Both merged green against their own bases but main's tip won't compile: error[E0063]: missing field `path` in initializer of `TarballResolution` --> crates/package-manager/src/installability/tests.rs:56:49 Add the missing field with `path: None` so the struct literal matches the current `TarballResolution` shape.
Origin/main grew a `path: Option<String>` field on `TarballResolution` (PR #451, `feat(git-fetcher): install git-hosted tarballs via preparePackage + packlist`). That PR landed after #439 (`feat(package-is-installable): platform + engine check`) and missed updating the `TarballResolution` literal in `crates/package-manager/src/installability/tests.rs`, so any PR opened against current main inherits a build break in the `package-is-installable` test build (#445 hit the same failure). My own `peer_dependency_in_lockfile_surfaces_unsupported` test in `crates/real-hoist/src/tests.rs` had the same gap. Add `path: None` to both literals so the test builds compile against the post-#451 `TarballResolution` shape. Includes the rebase of feat/438-slice-3b onto current origin/main.
Pull in 28 commits from upstream main, including the `pacquet-npmrc` → `pacquet-config` rename (#420) plus features: - supportedArchitectures + --cpu/--os/--libc (#456) - frozen-lockfile (#442, #443, #447, #450) - git-fetcher (#436 / #446, #451, #454) - side-effects cache (#421 / #422, #423, #424) - real-hoist + global-virtual-store (#432 / #438 / #444, #449, #452) - patchedDependencies + allow-builds (#425, #427, #428) - engine/platform installability (#434 / #439) Conflicts resolved: - `crates/npmrc/` files migrated under the renamed `crates/config/` directory; `Npmrc` → `Config` everywhere except `NpmrcAuth` (which keeps the `.npmrc`-domain name). - `Config::current` reads the env-var DI generic `Api: EnvVar` for ${VAR}-substitution in `.npmrc`. Production turbofish in `cli_args.rs` is `Config::current::<RealApi, _, _, _, _>(...)`. - Two-phase `NpmrcAuth::apply_*` retained so default-registry creds key at the yaml-resolved registry URL. - New `Config::auth_headers` field plumbed through `install_package_by_snapshot`'s `DownloadTarballToStore`. - Tests under `crates/config/src/workspace_yaml/tests.rs` pick up the new ParseYaml unit test added on this branch.
Summary
Builds on §B+D (#446). Adds the second half of the git-hosted install path:
TarballResolution { gitHosted: true }lockfile entries — the ones whose tarballs land atcodeload.github.com/gitlab.com/bitbucket.orgarchive endpoints — now run through a newGitHostedTarballFetcherafter the existingDownloadTarballToStoresettles their bytes into CAS.Mirrors pnpm's
gitHostedTarballFetcher.ts:fs::copyper file — hardlinking here would let prepare scripts mutate shared CAS entries).prepare_packageport that §B+D shipped — same<pm>-install+prepublish/prepack/publishdispatch, sameGIT_DEP_PREPARE_NOT_ALLOWEDgate.packlistover the prepared tree to drop source maps, test fixtures, etc. that ship in the raw archive but not in the published file set.HashMap<String, PathBuf>toCreateVirtualDirBySnapshot.Shared
cas_iomoduleThe CAS-import helpers (
import_into_cas,is_file_executable,map_write_cas) plus a newmaterialize_intolift out offetcher.rsinto a sharedcas_iomodule so both fetchers consume the same code path. No public-API change — helpers staypub(crate).cas_ioalso gains ajoin_checked(root, rel)helper that rejects absolute paths,..components, root, and drive-prefix components before joining, so a malicious tarball entry can't escapetarget_dir/pkg_dir. Errors surface asGitFetcherError::Io(InvalidInput). Bothmaterialize_intoandimport_into_casroute through it, takingreldirectly (forward slashes are valid on every platform —Path::components()recognises both/and\on Windows — so no per-fileStringallocation is needed).Supporting changes
TarballResolution.path: Option<String>mirrors pnpm'sTarballResolution.pathso a lockfile pinning a git-hosted tarball from a monorepo (path: packages/sub) deserializes without trippingdeny_unknown_fields. The fetcher threads this through toprepare_package'ssafe_join_pathso prepare and packlist run inside the right sub-directory.allow_buildclosure and theScriptsPrependNodePathconversion above the resolution match so both theGitarm and the newgitHosted: truepost-pass borrow from the same named locals — no temp closures across.await.InstallPackageBySnapshotError::GitFetch's doc now spells out that the variant covers failures from both fetchers (the git-CLI path and the git-hosted-tarball post-pass), since both share the sameGitFetcherErrortaxonomy.Out of scope (follow-ups for #436)
gitHostedStoreIndexKeyrows toindex.dbyet, so a re-install re-runs clone + prepare + packlist. Correct, but slow on the second run.${filesIndexFile}\traw+ final). Upstream optimizes "raw == prepared" by promoting the raw entry in place. Pacquet always re-imports today; hash-dedup means duplicate work but not duplicate storage.plans/TEST_PORTING.md563-572.npm-packlistsemantics (.npmignore,.gitignorelayering,bundleDependencieswalking).Test plan
cargo nextest run -p pacquet-git-fetcher— 41 tests pass:- 31 from §B+D (preferred-pm sniffer, MVP packlist, preparePackage decision logic, git-CLI fetcher integration).
- 5 new
cas_iotests forjoin_checked(accepts normal segments, strips., rejects absolute paths, rejects mid-path.., and amaterialize_intoregression that asserts nothing lands outside the target on a poisonedcas_pathsentry).- 5 new
GitHostedTarballFetchertests (pass-through, files-field filter,GIT_DEP_PREPARE_NOT_ALLOWED, CAS-isolation regression, monorepo sub-path).cargo nextest run -p pacquet-lockfile—TarballResolution { path }deserialize and serialize round-trip.cargo nextest run -p pacquet-package-manager— 148 tests still green after the dispatcher hoist.just ready— 713 tests, 713 pass.cargo doc --no-deps --workspace --document-private-itemswithRUSTDOCFLAGS=-D warnings— clean.taplo format --checkjust dylintWritten by an agent (Claude Code, claude-opus-4-7).