test(git-fetcher): end-to-end shallow-fetch argv assertion via PATH-shim (#436 §E follow-up)#487
Conversation
…him (#436 §E follow-up) Ports the last open §E item — upstream's [`fetching/git-fetcher/test/index.ts:183`](https://github.com/pnpm/pnpm/blob/94240bc046/fetching/git-fetcher/test/index.ts#L183) `still able to shallow fetch for allowed hosts` test, which jest-mocks `execa` to spy on the git argv. Pacquet can't intercept `Command::new("git")` without changing production code, so the test instead: 1. Writes a `/bin/sh` shim at `<tempdir>/shim/git` that appends each invocation (tab-separated argv) to a log file and fakes `rev-parse HEAD` so the fetcher's commit-match check passes. 2. Prepends the shim directory to `PATH` for the duration of the test body via `unsafe { std::env::set_var(...) }`. Edition 2024 requires `unsafe` here; the project's `cargo nextest` runner isolates each test in its own process, so no sibling test in the same binary can race the modified env. 3. Parses the log and asserts the shallow sequence: `git init` → `git remote add origin <url>` → `git fetch --depth 1 origin <commit>`, with `git clone` absent. `fetcher_clones_when_host_not_in_shallow_list` is the mirror — same shim, empty `git_shallow_hosts`, asserts `git clone <url> <dir>` appears and `init` / `fetch` don't. Together the two tests pin the gate's polarity end-to-end (the existing `should_use_shallow_matches_known_host` unit test only covers the predicate, not the argv shape). Unix-only via `#[cfg(unix)]` — a Windows mirror would need a `.cmd` shim and a different launcher. The `should_use_shallow_matches_known_host` cross-platform unit test still covers the predicate on Windows. Manually verified the tests catch a regression: temporarily inverting the `should_use_shallow` gate flipped both tests to red with the exact argv mismatch in the panic message.
|
Warning Rate limit exceeded
You’ve run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughThis pull request adds Unix-only end-to-end test coverage for ChangesGit Fetcher Shallow-Fetch End-to-End Tests
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Pull request overview
Adds Unix-only end-to-end coverage for git-fetcher shallow-fetch behavior using a PATH shim that records git argv, and updates the test-porting plan to mark the upstream shallow-fetch test as ported.
Changes:
- Adds a
/bin/shgit shim helper and parser for captured invocations. - Adds shallow and non-shallow branch tests for
GitFetcher. - Updates
plans/TEST_PORTING.mdfor the completed upstream test port.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
crates/git-fetcher/src/fetcher/tests.rs |
Adds Unix-only shim-based tests for shallow fetch vs clone behavior. |
plans/TEST_PORTING.md |
Marks the upstream shallow-fetch argv test as ported and documents the new coverage. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #487 +/- ##
==========================================
- Coverage 90.42% 88.35% -2.07%
==========================================
Files 121 121
Lines 11475 12894 +1419
==========================================
+ Hits 10376 11393 +1017
- Misses 1099 1501 +402 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Micro-Benchmark ResultsLinux |
Integrated-Benchmark Report (Linux)Scenario: Frozen Lockfile
BENCHMARK_REPORT.json{
"results": [
{
"command": "pacquet@HEAD",
"mean": 2.66101510222,
"stddev": 0.07299444928905274,
"median": 2.66236365412,
"user": 2.5637572599999996,
"system": 3.7225015,
"min": 2.53098525762,
"max": 2.78632088262,
"times": [
2.78632088262,
2.66320743162,
2.72796467662,
2.53098525762,
2.70948548962,
2.68179871362,
2.65291957062,
2.66151987662,
2.60354263262,
2.59240649062
]
},
{
"command": "pacquet@main",
"mean": 2.63025521072,
"stddev": 0.07949312047856921,
"median": 2.59445988712,
"user": 2.5625613599999992,
"system": 3.6604769000000004,
"min": 2.54570666862,
"max": 2.76492160362,
"times": [
2.76376061762,
2.59828516762,
2.56786722462,
2.58867031762,
2.54570666862,
2.5675087096199998,
2.59063460662,
2.76492160362,
2.64987415562,
2.66532303562
]
},
{
"command": "pnpm",
"mean": 6.19991700312,
"stddev": 0.07159398981596043,
"median": 6.2209018916200005,
"user": 9.22382396,
"system": 4.449930199999999,
"min": 6.056127825620001,
"max": 6.308616828620001,
"times": [
6.24928509662,
6.237459446620001,
6.23155378662,
6.19976706962,
6.22200801762,
6.056127825620001,
6.308616828620001,
6.219795765620001,
6.14286017962,
6.13169601462
]
}
]
}Scenario: Frozen Lockfile (Hot Cache)
BENCHMARK_REPORT.json{
"results": [
{
"command": "pacquet@HEAD",
"mean": 0.7322376294399999,
"stddev": 0.057956674130559,
"median": 0.71326355614,
"user": 0.37818184,
"system": 1.57402906,
"min": 0.7000074261400001,
"max": 0.8931803801400001,
"times": [
0.8931803801400001,
0.74389647114,
0.7000074261400001,
0.7100878221400001,
0.7132115691400001,
0.7031384261400001,
0.7133155431400001,
0.72420710714,
0.70356267514,
0.71776887414
]
},
{
"command": "pacquet@main",
"mean": 0.7776610736400001,
"stddev": 0.07866006290159108,
"median": 0.7647847506400001,
"user": 0.38002713999999993,
"system": 1.59474386,
"min": 0.7025277591400001,
"max": 0.9573538541400001,
"times": [
0.82387463914,
0.7848215311400001,
0.77768257214,
0.7197139421400001,
0.7163704561400001,
0.9573538541400001,
0.7518869291400001,
0.7025277591400001,
0.7091303021400001,
0.8332487511400001
]
},
{
"command": "pnpm",
"mean": 2.64366342804,
"stddev": 0.10336434969006977,
"median": 2.67266057264,
"user": 3.20104734,
"system": 2.19024536,
"min": 2.5222465591399996,
"max": 2.84568552414,
"times": [
2.5641724721399997,
2.70763073414,
2.84568552414,
2.53898558114,
2.5222465591399996,
2.68480529714,
2.66105791714,
2.5307713711399997,
2.68426322814,
2.69701559614
]
}
]
} |
… guard) CodeRabbit / Copilot on #487: - **Shell injection.** The shim's `format!("printf '%s\\n' {commit:?}")` baked the temp paths and fake commit into the script body via Rust debug-quoting, which wraps in `"..."` and leaves `$` / backticks / `\` open to /bin/sh re-interpretation. A `TMPDIR` containing those characters would either fail the test or execute attacker-controlled shell. Fixed by reading both paths/values from env vars (`PACQUET_GIT_SHIM_LOG`, `PACQUET_GIT_SHIM_FAKE_COMMIT`) at run time — the shim body is now a static string. - **PATH race under `cargo test`.** The previous version relied on `cargo nextest`'s one-process-per-test isolation, leaving `cargo test` susceptible to two concurrent tests racing each other's `unsafe { set_var("PATH", ...) }`. Extract `EnvGuard` from `pacquet-config`'s private `test_env_guard` into `pacquet_testing_utils::env_guard` and use it here — process-wide `Mutex` holds the env-mutation lock for the test body and `Drop` restores all three named variables. Three call sites in `pacquet-config` updated to the new path; one fewer copy of the same helper. - **Ordering not pinned.** Positive assertions used `.any()`, so a regression that emitted `init` / `remote add` / `fetch` in the wrong order still passed. Switched to `position_of(&invocations, argv)` + a strict `init_at < remote_at < fetch_at` check. - **Missing `remote add` absence in non-shallow test.** The clone test only rejected `init` and `fetch`; a regression that ran both the clone *and* the shallow `remote add origin` sequence would have slipped through. Now the test rejects all three shallow-branch commands. - **Helper doc claim.** Updated `write_git_shim`'s rustdoc — it's used by both the shallow and non-shallow tests, and the non-shallow `git clone` *is* exercised through it. Regression-catch verified for each fix (inverted gate, swapped argv order, extra `remote add` in clone branch — all flipped the relevant assertion red with the exact mismatch in the panic).
There was a problem hiding this comment.
🧹 Nitpick comments (1)
crates/testing-utils/src/env_guard.rs (1)
84-90: ⚡ Quick winFail fast when
setis called for a non-snapshotted variable
EnvGuard::setcurrently trusts the caller; a typo innamesilently leaks env state beyond guard drop. Add a guard assertion so misuse is caught immediately in tests.Proposed change
pub fn set(&self, name: &'static str, value: impl AsRef<OsStr>) { + debug_assert!( + self.saved.iter().any(|(saved_name, _)| *saved_name == name), + "EnvGuard::set called for '{name}' which was not included in EnvGuard::snapshot(...)" + ); // SAFETY: the guard holds `env_mutex`, so no other // `EnvGuard`-using test in this process is running // concurrently. The variable was named at `snapshot` time, // so `Drop` will restore it. unsafe { env::set_var(name, value) }; }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/testing-utils/src/env_guard.rs` around lines 84 - 90, EnvGuard::set currently allows setting any env var; add a runtime assertion that the provided name was recorded at snapshot time to fail fast on typos: before calling unsafe { env::set_var(...) } check that the guard’s snapshot contains the name (e.g. assert!(self.snapshot.contains_key(name) || self.snapshot.contains(name), "EnvGuard::set called for non-snapshotted variable: {}", name)); if the actual snapshot field has a different name, reference that field instead, then proceed with the existing unsafe env::set_var call.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@crates/testing-utils/src/env_guard.rs`:
- Around line 84-90: EnvGuard::set currently allows setting any env var; add a
runtime assertion that the provided name was recorded at snapshot time to fail
fast on typos: before calling unsafe { env::set_var(...) } check that the
guard’s snapshot contains the name (e.g.
assert!(self.snapshot.contains_key(name) || self.snapshot.contains(name),
"EnvGuard::set called for non-snapshotted variable: {}", name)); if the actual
snapshot field has a different name, reference that field instead, then proceed
with the existing unsafe env::set_var call.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 17cdd8ae-c3cf-4f3a-a08f-02c2ed5b0ddf
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (8)
crates/config/Cargo.tomlcrates/config/src/defaults/tests.rscrates/config/src/lib.rscrates/config/src/npmrc_auth/tests.rscrates/git-fetcher/Cargo.tomlcrates/git-fetcher/src/fetcher/tests.rscrates/testing-utils/src/env_guard.rscrates/testing-utils/src/lib.rs
✅ Files skipped from review due to trivial changes (3)
- crates/config/Cargo.toml
- crates/testing-utils/src/lib.rs
- crates/git-fetcher/Cargo.toml
🚧 Files skipped from review as they are similar to previous changes (1)
- crates/git-fetcher/src/fetcher/tests.rs
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Lint and Test (ubuntu-latest)
- GitHub Check: Lint and Test (windows-latest)
- GitHub Check: Run benchmark on ubuntu-latest
- GitHub Check: Code Coverage
- GitHub Check: Run benchmark on ubuntu-latest
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs
📄 CodeRabbit inference engine (AGENTS.md)
**/*.rs: Preserve existing method chains andpipe-traitchains; do not break them into intermediateletbindings unless there is a concrete justification such as a compilation failure, borrow checker rejection, meaningful performance improvement, or other technical necessity. Refactoring for style alone is not sufficient justification.
Choose owned vs. borrowed parameters to minimize copies; prefer borrowed types (&Pathover&PathBuf,&strover&String) when it does not force extra copies.
PreferArc::clone(&x)andRc::clone(&x)overx.clone()for reference-counted types to make the cost visible at the call site.
Do not use star imports inside module bodies. Writeuse super::{Foo, bar}instead ofuse super::*;for any glob whose target is a module you control. External-crate preludes (e.g.,use rayon::prelude::*;) and root-of-module re-exports (e.g.,pub use submodule::*;inlib.rs) are exceptions.
Follow Rust API Guidelines for naming, as documented in https://rust-lang.github.io/api-guidelines/naming.html.
Declare a newtype wrapper for any branded string type being ported from TypeScript pnpm. Do not collapse the brand into a plainStringor&str; give the type its own struct so misuse is a type error.
When porting branded string types where upstream TypeScript always validates before construction, validate in the Rust port too. Construct the wrapper only viaTryFrom<String>and/orFromStr; do not provide an infallible public constructor that takes an arbitrary string.
For branded string types where upstream TypeScript never validates (used purely for type-safety to prevent confusion between string slots), expose an infallibleFrom<String>andFrom<&str>constructor in the Rust wrapper.
When upstream TypeScript occasionally constructs a branded type without validation (via bareasassertion), add afrom_str_unchecked(or similarly named) constructor on the Rust side. Keep the validating constructor as well; `from_str_u...
Files:
crates/config/src/npmrc_auth/tests.rscrates/config/src/defaults/tests.rscrates/config/src/lib.rscrates/testing-utils/src/env_guard.rs
🧠 Learnings (2)
📚 Learning: 2026-05-01T10:01:33.766Z
Learnt from: zkochan
Repo: pnpm/pacquet PR: 349
File: crates/reporter/src/tests.rs:121-121
Timestamp: 2026-05-01T10:01:33.766Z
Learning: In Rust test code, follow the repo’s CODE_STYLE_GUIDE test-logging rule: add logging (e.g., `eprintln!`/`eprintln!(...)`) so that useful diagnostic values are printed when a test fails, unless the assertion is `assert_eq!` (where the differing values are already included). Concretely, if you use assertions like `assert!`, `assert_ne!`, etc., ensure the test logs the relevant actual/expected values (or context) before/around the assertion so failures can be diagnosed without rerunning.
Applied to files:
crates/config/src/npmrc_auth/tests.rscrates/config/src/defaults/tests.rs
📚 Learning: 2026-05-07T23:19:08.272Z
Learnt from: KSXGitHub
Repo: pnpm/pacquet PR: 401
File: tasks/integrated-benchmark/src/work_env.rs:343-344
Timestamp: 2026-05-07T23:19:08.272Z
Learning: When reviewing Rust code in pnpm/pacquet for deprecated API usage, do not automatically treat `serde_saphyr::to_string` as deprecated. In `serde-saphyr` v0.0.25, `serde_saphyr::to_string` has no `#[deprecated]` attribute (the `#[deprecated]` later in `serde-saphyr-0.0.25/src/lib.rs` applies to a different function). Only flag `serde_saphyr::to_string` as deprecated if the resolved dependency version’s source shows `#[deprecated]` on that specific function.
Applied to files:
crates/config/src/npmrc_auth/tests.rscrates/config/src/defaults/tests.rscrates/config/src/lib.rscrates/testing-utils/src/env_guard.rs
🔇 Additional comments (3)
crates/config/src/lib.rs (1)
798-801: LGTM!Also applies to: 1229-1229, 1259-1259, 1304-1304
crates/config/src/defaults/tests.rs (1)
6-6: LGTM!crates/config/src/npmrc_auth/tests.rs (1)
10-10: LGTM!Also applies to: 65-65
…-init GitHub's macOS runner image started shipping `rustup-init` symlinked as `cargo` / `rustc` in `/usr/local/bin/` ahead of the rustup proxies that live in `~/.cargo/bin/`. The "Clippy" step in the macOS matrix then ran `cargo clippy --locked -- -D warnings` and got back `error: unexpected argument 'clippy' found / Usage: rustup-init[EXE]` — rustup-init doesn't understand cargo subcommands. `rustup show` / `rustup component add` calls still work because rustup proper is on PATH at its normal location; only the `cargo` / `rustc` proxies are stomped by the runner image's preinstalled `rustup-init`. Prepending `~/.cargo/bin` to `GITHUB_PATH` once, at the top of the shared rustup composite action, ensures every later step picks the rustup proxies first. Started failing on `main` between commits 25823596870 (success) and 25823996109 (failure) on 2026-05-13, which lines up with the GitHub-hosted macos-14 image bump that shipped with the new rustup-init layout.
…n shim tests Copilot review on #487 (3237379604, 3237379656): the previous shim tests mutated process-global `PATH` under an `EnvGuard` that only serialized the *two* shim tests against each other — sibling tests in the same binary that call `Command::new("git")` (fixture-setup helpers in `make_bare_repo` / `make_bare_repo_with_prepare_script`, ad-hoc `skip_if_no_git` spawns, the other fetcher tests' end-to-end flows) had no lock, so a parallel `cargo test -p pacquet-git-fetcher` run could let those tests resolve `git` to the shim mid-flight and fail unpredictably. The fix swaps process-global `PATH` for a per-fetcher binary override: - New `GitFetcher::git_bin: Option<&Path>` field. Production callers leave it `None` (unchanged behavior: resolve `git` through `PATH`, matching upstream `execa('git', …)`). Tests pass a shim path directly, so only that one fetcher resolves through the shim. - `exec_git` is now a `#[cfg(test)]` convenience wrapper over the new `exec_git_with(bin, …)` which takes the binary path explicitly. Production `GitFetcher::run_sync` calls `exec_git_with` with `self.git_bin.unwrap_or(Path::new("git"))`. - The two shim tests now pass `git_bin: Some(&shim_path)` instead of prepending the shim dir to `PATH`. The only env mutation left is setting `PACQUET_GIT_SHIM_LOG` / `PACQUET_GIT_SHIM_FAKE_COMMIT` — those are read only by the shim itself, so a sibling test's real `git --version` invocation can't see or care about them. `EnvGuard` still serializes the two shim tests against each other so their log paths don't cross-contaminate. Also reverts the speculative `~/.cargo/bin` PATH prepend in the shared rustup composite action — it didn't fix the macOS Clippy failure (proxies were never the issue) and conflated CI plumbing with this PR's scope. 11 `GitFetcher { … }` struct literals updated to add `git_bin: None` (10 in `fetcher/tests.rs`, 1 in `install_package_by_snapshot.rs`).
`cargo doc --document-private-items` in CI runs with `RUSTDOCFLAGS=-D warnings`, which treats `rustdoc::private-intra-doc-links` as an error: the previous doc-string linked from the `pub git_bin` field to the `#[cfg(test)]`-gated `exec_git_with`, and the `private-intra-doc-links` lint flagged the broken external link. Replaced the intra-doc link with a prose description of the `Command::new(path)` behavior, which is the user-visible contract this field exposes.
Summary
Ports the last open §E test — upstream's
fetching/git-fetcher/test/index.ts:183still able to shallow fetch for allowed hosts, which jest-mocksexecato spy on thegitargv.Pacquet can't intercept
Command::new("git")without touching production code, so the test uses a/bin/shshim onPATH:<tempdir>/shim/git: a tiny shell script that tab-logs each invocation to a file and fakesrev-parse HEADso the fetcher's commit-match check passes.PATHfor the test body viaunsafe { std::env::set_var(...) }. Edition 2024 requiresunsafe; the project'scargo nextestrunner isolates each test in its own process so no sibling can race the modified env. PATH is restored before assertions.git init→git remote add origin <url>→git fetch --depth 1 origin <commit>, withgit cloneabsent.fetcher_clones_when_host_not_in_shallow_listis the mirror — same shim, emptygit_shallow_hosts, assertsgit clone <url> <dir>appears andinit/fetchdon't. Together the two tests pin the gate's polarity end-to-end. The existingshould_use_shallow_matches_known_hostunit test only covers the predicate cross-platform; the new tests add Unix end-to-end argv coverage.Unix-only via
#[cfg(unix)]. A Windows mirror would need a.cmdshim and a different launcher; the predicate-level test still covers Windows.Test plan
cargo nextest run -p pacquet-git-fetcher— 74 tests pass (+2 new shim tests, the previously-[ ]§E item now[x]).just ready— workspace clean.just dylint— perfectionist lints clean.should_use_shallowbranch infetcher.rs:102flipped both tests to red with the exact argv mismatch printed in the panic message. Reverted before commit.Why a
unsafeset_varEdition 2024 marks
std::env::set_varunsafebecause thread-internal env mutation can race other threads reading env. The project's standard test runner iscargo nextest run(= one test per process), so a sibling test can't race the modifiedPATH. The test restoresPATHbefore any assertion runs so a failure surfaces with the original PATH intact. Per-test-body env mutation under nextest is the same pattern pnpm'sjest.mock('execa')uses upstream — both isolate the side-effect to the test process.Written by an agent (Claude Code, claude-opus-4-7).
Summary by CodeRabbit
Tests
Documentation
Chores