Skip to content

Commit 3d1fd20

Browse files
sijie-Zzkochan
andauthored
fix(bins.linker): skip redundant .exe warning when hardlink is already correct (#12284)
Closes [#12203](#12203). On Windows, `node` is linked by hardlinking `node.exe` directly rather than through a cmd-shim. The early-exit in `linkBin()` only recognized an existing cmd-shim, so on every warm install the existing (already correct) `node.exe` was treated as a conflict: pnpm warned `The target bin directory already contains an exe called node` and then removed and re-linked it. Because many commands re-link `node`, the warning was spammed. ### `@pnpm/bins.linker` Before warning and replacing an existing `.exe`, check whether it already refers to the link target via a new `isSameFile()` helper: - Compares the OS file identity (inode/device), read as `BigInt` to avoid the precision loss NTFS 64-bit file IDs suffer when cast to a `Number`. A zero/unreliable inode (common on Windows) is not treated as a match. - Falls back to a streaming, chunked (64 KB) content comparison when identity can't be established, so a byte-identical copy still counts as the same file without ever buffering a whole executable. A read error during the comparison is treated as "not the same file" so it can never abort bin linking. The early-return is scoped to the `node.exe` path, so non-`node` commands still fall through to the existing warn + remove + `cmdShim()` regeneration and never end up with a partially populated bins directory. ### pacquet pacquet never emitted this warning, but its Windows `link_node_bin` unconditionally removed and re-linked `node.exe` on every install. Ported the same same-file early-return to `cmd-shim` so warm installs skip that churn, using `same_file::Handle` for the cheap identity check (promoted from a transitive to a workspace dependency) with the same chunked content fallback. --------- Co-authored-by: Zoltan Kochan <z@kochan.io>
1 parent 4ca9247 commit 3d1fd20

8 files changed

Lines changed: 269 additions & 5 deletions

File tree

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,6 @@
1+
---
2+
"@pnpm/bins.linker": patch
3+
"pnpm": patch
4+
---
5+
6+
Skip the redundant "target bin directory already contains an exe called node" warning on Windows when the existing `node.exe` already matches the target (same hard link or identical content) [pnpm/pnpm#12203](https://github.com/pnpm/pnpm/issues/12203).

Cargo.lock

Lines changed: 1 addition & 0 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

Cargo.toml

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -135,6 +135,7 @@ pgp = { version = "0.19.0", default-features = false }
135135
rayon = { version = "1.12.0" }
136136
rmp-serde = { version = "1.3.0" }
137137
rusqlite = { version = "0.39.0", features = ["bundled"] }
138+
same-file = { version = "1.0.6" }
138139
serde = { version = "1.0.228", features = ["derive"] }
139140
serde_json = { version = "1.0.150", features = ["preserve_order", "raw_value"] }
140141
serde-saphyr = { version = "0.0.27" }

bins/linker/src/index.ts

Lines changed: 74 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -280,15 +280,23 @@ async function linkBin (cmd: CommandInfo, binsDir: string, opts?: LinkBinOptions
280280
} catch {}
281281
if (IS_WINDOWS) {
282282
const exePath = path.join(binsDir, `${cmd.name}${getExeExtension()}`)
283+
// node.exe is the only bin pnpm links directly as a real executable rather
284+
// than through a cmd-shim, so the existing-exe handling only applies to it.
285+
// We could update our own cmd shims to support node.cmd, but we can't
286+
// control npm's cmd shims, which break when node resolves to node.cmd.
287+
// npm's cmd shims use `IF EXIST "%~dp0\node.exe"` to find the node binary.
288+
const isNodeExe = cmd.name === 'node' && cmd.path.toLowerCase().endsWith('.exe')
283289
if (existsSync(exePath)) {
290+
// Skip warning and re-linking when the existing node.exe already matches
291+
// the target, otherwise every command that re-links node would spam the
292+
// warning below on warm installs.
293+
if (isNodeExe && await isSameFile(exePath, cmd.path)) {
294+
return
295+
}
284296
globalWarn(`The target bin directory already contains an exe called ${cmd.name}, so removing ${exePath}`)
285297
await rimraf(exePath)
286298
}
287-
// node.exe must exist as a real executable, not a cmd-shim wrapper.
288-
// We could update our own cmd shims to support node.cmd, but we can't
289-
// control npm's cmd shims, which break when node resolves to node.cmd.
290-
// npm's cmd shims use `IF EXIST "%~dp0\node.exe"` to find the node binary.
291-
if (cmd.name === 'node' && cmd.path.toLowerCase().endsWith('.exe')) {
299+
if (isNodeExe) {
292300
try {
293301
await fs.link(cmd.path, exePath)
294302
} catch {
@@ -361,6 +369,67 @@ async function linkBin (cmd: CommandInfo, binsDir: string, opts?: LinkBinOptions
361369
}
362370
}
363371

372+
// Reports whether two paths refer to the same file. A matching inode/device
373+
// pair (read as BigInts to avoid the precision loss of NTFS 64-bit file IDs)
374+
// proves a hard link cheaply. Whenever identity can't be established that way —
375+
// because the inodes genuinely differ or because Windows reports an unreliable
376+
// zero inode — we fall back to comparing the file contents after a quick size
377+
// check, which also treats a byte-identical copy as the same file.
378+
async function isSameFile (pathA: string, pathB: string): Promise<boolean> {
379+
const [statA, statB] = await Promise.all([
380+
fs.stat(pathA, { bigint: true }).catch(() => null),
381+
fs.stat(pathB, { bigint: true }).catch(() => null),
382+
])
383+
if (statA == null || statB == null) return false
384+
if (statA.ino && statB.ino && statA.ino === statB.ino && statA.dev === statB.dev) {
385+
return true
386+
}
387+
if (statA.size !== statB.size) return false
388+
return haveEqualContents(pathA, pathB)
389+
}
390+
391+
const FILE_COMPARE_CHUNK_SIZE = 64 * 1024
392+
393+
// Compares two equally-sized files chunk by chunk, so an executable is never
394+
// fully buffered in memory and a mismatch returns as early as possible.
395+
async function haveEqualContents (pathA: string, pathB: string): Promise<boolean> {
396+
const [fhA, fhB] = await Promise.all([
397+
fs.open(pathA, 'r').catch(() => null),
398+
fs.open(pathB, 'r').catch(() => null),
399+
])
400+
if (fhA == null || fhB == null) {
401+
await fhA?.close().catch(() => {})
402+
await fhB?.close().catch(() => {})
403+
return false
404+
}
405+
try {
406+
const bufA = Buffer.alloc(FILE_COMPARE_CHUNK_SIZE)
407+
const bufB = Buffer.alloc(FILE_COMPARE_CHUNK_SIZE)
408+
let position = 0
409+
for (;;) {
410+
// Reading sequentially is intentional: each iteration compares one chunk
411+
// and stops early on a mismatch or EOF.
412+
const [readA, readB] = await Promise.all([ // eslint-disable-line no-await-in-loop
413+
fhA.read(bufA, 0, FILE_COMPARE_CHUNK_SIZE, position),
414+
fhB.read(bufB, 0, FILE_COMPARE_CHUNK_SIZE, position),
415+
])
416+
if (readA.bytesRead !== readB.bytesRead) return false
417+
if (readA.bytesRead === 0) return true
418+
if (!bufA.subarray(0, readA.bytesRead).equals(bufB.subarray(0, readB.bytesRead))) {
419+
return false
420+
}
421+
position += readA.bytesRead
422+
}
423+
} catch {
424+
// A transient read error must not abort bin linking: treat the files as
425+
// different so the caller falls back to the warn + remove + relink path.
426+
return false
427+
} finally {
428+
await fhA.close().catch(() => {})
429+
await fhB.close().catch(() => {})
430+
}
431+
}
432+
364433
// `fixBin` chmods the bin's source file (which lives inside the store) to make
365434
// it executable and rewrites a Windows CRLF shebang to LF. Under the global
366435
// virtual store that source is `{storeDir}/links/...`, so on a read-only store

bins/linker/test/index.ts

Lines changed: 64 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -733,6 +733,70 @@ describe('node binary linking', () => {
733733
// No cmd-shim should be created since we return early
734734
expect(fs.existsSync(path.join(binTarget, `node${CMD_EXTENSION}`))).toBe(false)
735735
})
736+
737+
testOnWindows('linkBinsOfPackages() does not warn when node.exe is already the correct hardlink', async () => {
738+
const binTarget = temporaryDirectory()
739+
const nodeDir = temporaryDirectory()
740+
741+
fs.writeFileSync(path.join(nodeDir, 'node.exe'), 'fake-node-binary', 'utf8')
742+
743+
const pkgs = [
744+
{
745+
location: nodeDir,
746+
manifest: {
747+
name: 'node',
748+
version: '20.0.0',
749+
bin: { node: 'node.exe' },
750+
},
751+
},
752+
]
753+
754+
// First call creates the hardlink
755+
await linkBinsOfPackages(pkgs, binTarget)
756+
jest.mocked(globalWarn).mockClear()
757+
758+
// Second call should not warn because the .exe is already the correct hardlink
759+
await linkBinsOfPackages(pkgs, binTarget)
760+
761+
// The absence of a warning is the regression signal: the warn and the
762+
// `rimraf` that recreates node.exe live in the same branch, so no warning
763+
// means node.exe was left untouched. We don't assert on inode identity
764+
// because node.exe may legitimately be a copy rather than a hardlink (the
765+
// implementation falls back to copyFile when hardlinking fails).
766+
expect(globalWarn).not.toHaveBeenCalled()
767+
const exePath = path.join(binTarget, 'node.exe')
768+
expect(fs.existsSync(exePath)).toBe(true)
769+
expect(fs.readFileSync(exePath, 'utf8')).toBe('fake-node-binary')
770+
})
771+
772+
testOnWindows('linkBinsOfPackages() does not warn when node.exe has identical content but is not a hardlink', async () => {
773+
const binTarget = temporaryDirectory()
774+
const nodeDir = temporaryDirectory()
775+
776+
fs.writeFileSync(path.join(nodeDir, 'node.exe'), 'fake-node-binary', 'utf8')
777+
// Pre-place an independent copy with identical content (different inode), as
778+
// happens on filesystems where Windows reports a zero inode and the previous
779+
// link fell back to a copy.
780+
fs.writeFileSync(path.join(binTarget, 'node.exe'), 'fake-node-binary', 'utf8')
781+
782+
const pkgs = [
783+
{
784+
location: nodeDir,
785+
manifest: {
786+
name: 'node',
787+
version: '20.0.0',
788+
bin: { node: 'node.exe' },
789+
},
790+
},
791+
]
792+
793+
await linkBinsOfPackages(pkgs, binTarget)
794+
795+
expect(globalWarn).not.toHaveBeenCalled()
796+
const exePath = path.join(binTarget, 'node.exe')
797+
expect(fs.existsSync(exePath)).toBe(true)
798+
expect(fs.readFileSync(exePath, 'utf8')).toBe('fake-node-binary')
799+
})
736800
})
737801

738802
test('linkBins() resolves conflicts using BIN_OWNER_OVERRIDES (npx owned by npm)', async () => {

pacquet/crates/cmd-shim/Cargo.toml

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -17,6 +17,7 @@ derive_more = { workspace = true }
1717
miette = { workspace = true }
1818
pipe-trait = { workspace = true }
1919
rayon = { workspace = true }
20+
same-file = { workspace = true }
2021
serde_json = { workspace = true }
2122
walkdir = { workspace = true }
2223

pacquet/crates/cmd-shim/src/link_bins.rs

Lines changed: 73 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -588,6 +588,13 @@ fn link_node_bin(target_path: &Path, shim_path: &Path) -> Result<bool, LinkBinsE
588588
return Ok(false);
589589
}
590590
let exe_path = with_extension_appended(shim_path, "exe");
591+
// Skip the remove + relink churn on warm installs when `node.exe`
592+
// already refers to the source binary. Mirrors pnpm's same-file
593+
// early-return in
594+
// [`bins/linker/src/index.ts`](https://github.com/pnpm/pnpm/blob/06d2d3deb2/bins/linker/src/index.ts#L281-L308).
595+
if is_same_file(&exe_path, target_path) {
596+
return Ok(true);
597+
}
591598
remove_stale_bin(&exe_path)?;
592599
if fs::hard_link(target_path, &exe_path).is_err() {
593600
fs::copy(target_path, &exe_path).map_err(|error| LinkBinsError::LinkNodeBin {
@@ -599,6 +606,72 @@ fn link_node_bin(target_path: &Path, shim_path: &Path) -> Result<bool, LinkBinsE
599606
Ok(true)
600607
}
601608

609+
/// Whether `a` and `b` are the same file. [`same_file::Handle`] proves a hard
610+
/// link cheaply via the OS file identity (device + inode on Unix, file index +
611+
/// volume serial on Windows). When that identity can't be obtained — a missing
612+
/// file, or a filesystem that doesn't expose a stable index — we fall back to
613+
/// comparing the file contents after a quick size check, which also treats a
614+
/// byte-identical copy as the same file. Mirrors `isSameFile` in pnpm's
615+
/// `bins.linker`.
616+
#[cfg(windows)]
617+
fn is_same_file(a: &Path, b: &Path) -> bool {
618+
if let (Ok(handle_a), Ok(handle_b)) =
619+
(same_file::Handle::from_path(a), same_file::Handle::from_path(b))
620+
&& handle_a == handle_b
621+
{
622+
return true;
623+
}
624+
match (std::fs::metadata(a), std::fs::metadata(b)) {
625+
(Ok(meta_a), Ok(meta_b)) => meta_a.len() == meta_b.len() && have_equal_contents(a, b),
626+
_ => false,
627+
}
628+
}
629+
630+
/// Compare two equally-sized files chunk by chunk, so an executable is never
631+
/// fully buffered in memory and a mismatch returns as early as possible.
632+
#[cfg(windows)]
633+
fn have_equal_contents(a: &Path, b: &Path) -> bool {
634+
const CHUNK_SIZE: usize = 64 * 1024;
635+
let (Ok(mut file_a), Ok(mut file_b)) = (std::fs::File::open(a), std::fs::File::open(b)) else {
636+
return false;
637+
};
638+
let mut buf_a = vec![0u8; CHUNK_SIZE];
639+
let mut buf_b = vec![0u8; CHUNK_SIZE];
640+
loop {
641+
let (Ok(read_a), Ok(read_b)) =
642+
(read_chunk(&mut file_a, &mut buf_a), read_chunk(&mut file_b, &mut buf_b))
643+
else {
644+
return false;
645+
};
646+
if read_a != read_b {
647+
return false;
648+
}
649+
if read_a == 0 {
650+
return true;
651+
}
652+
if buf_a[..read_a] != buf_b[..read_b] {
653+
return false;
654+
}
655+
}
656+
}
657+
658+
/// Read up to `buf.len()` bytes, looping over short reads so a full chunk is
659+
/// only short at end of file. Like [`std::io::Read::read_exact`] but tolerant
660+
/// of EOF.
661+
#[cfg(windows)]
662+
fn read_chunk(reader: &mut impl std::io::Read, buf: &mut [u8]) -> io::Result<usize> {
663+
let mut filled = 0;
664+
while filled < buf.len() {
665+
match reader.read(&mut buf[filled..]) {
666+
Ok(0) => break,
667+
Ok(n) => filled += n,
668+
Err(error) if error.kind() == io::ErrorKind::Interrupted => {}
669+
Err(error) => return Err(error),
670+
}
671+
}
672+
Ok(filled)
673+
}
674+
602675
/// Remove an existing dirent at `path`, swallowing `NotFound`. Used by
603676
/// [`link_node_bin`] to clear any prior shim / symlink / hardlink
604677
/// before laying down the new one. Any other IO error (`PermissionDenied`,

pacquet/crates/cmd-shim/src/link_bins/tests.rs

Lines changed: 49 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1272,6 +1272,55 @@ fn link_node_bin_hardlinks_node_exe_on_windows() {
12721272
);
12731273
}
12741274

1275+
/// Windows-only: when `node.exe` already has identical content to the
1276+
/// source binary (e.g. a prior copy-fallback install), the linker must
1277+
/// leave it in place instead of removing and relinking it. Exercises the
1278+
/// content-comparison fallback, since the pre-existing copy has a different
1279+
/// file identity than the source. Mirrors pnpm's same-file early-return at
1280+
/// <https://github.com/pnpm/pnpm/blob/06d2d3deb2/bins/linker/src/index.ts#L281-L308>.
1281+
#[cfg(windows)]
1282+
#[test]
1283+
fn link_node_bin_skips_relink_when_node_exe_already_correct() {
1284+
use same_file::Handle;
1285+
let tmp = tempdir().unwrap();
1286+
let bin_target = tmp.path().join("bin_target");
1287+
create_dir_all(&bin_target).unwrap();
1288+
let node_dir = tmp.path().join("node_pkg");
1289+
create_dir_all(&node_dir).unwrap();
1290+
write_file(node_dir.join("node.exe"), "fake-node-binary").unwrap();
1291+
// Pre-place an independent copy with identical content (a different file
1292+
// identity), as an earlier copy-fallback install would leave behind.
1293+
write_file(bin_target.join("node.exe"), "fake-node-binary").unwrap();
1294+
1295+
write_file(
1296+
node_dir.join("package.json"),
1297+
json!({"name": "node", "version": "20.0.0", "bin": {"node": "node.exe"}}).to_string(),
1298+
)
1299+
.unwrap();
1300+
let manifest: Value =
1301+
serde_json::from_slice(&read_file(node_dir.join("package.json")).unwrap()).unwrap();
1302+
link_bins_of_packages::<Host>(
1303+
&[PackageBinSource::new(node_dir.clone(), Arc::new(manifest))],
1304+
&bin_target,
1305+
)
1306+
.unwrap();
1307+
1308+
let exe = bin_target.join("node.exe");
1309+
assert_eq!(read_to_string(&exe).unwrap(), "fake-node-binary");
1310+
// The pre-existing copy is left in place rather than removed and relinked:
1311+
// node.exe must not become a hardlink to the source binary. When file
1312+
// identity can't be obtained (the production code tolerates this), treat
1313+
// them as distinct rather than panicking on a failed handle lookup.
1314+
let relinked_to_source = matches!(
1315+
(Handle::from_path(&exe), Handle::from_path(node_dir.join("node.exe"))),
1316+
(Ok(exe_handle), Ok(source_handle)) if exe_handle == source_handle,
1317+
);
1318+
assert!(
1319+
!relinked_to_source,
1320+
"node.exe must stay the independent copy, not be relinked to the source",
1321+
);
1322+
}
1323+
12751324
/// Windows-only: when the node manifest declares a non-`.exe` source
12761325
/// (uncommon but possible — e.g. a wrapper script), pnpm falls through
12771326
/// to the regular cmd-shim path. Pacquet must too.

0 commit comments

Comments
 (0)