Skip to content

perf(ext/node): move fs.cpSync implementation to rust#32687

Merged
bartlomieju merged 22 commits intodenoland:mainfrom
Tango992:perf-node-fs-cpSync
Mar 17, 2026
Merged

perf(ext/node): move fs.cpSync implementation to rust#32687
bartlomieju merged 22 commits intodenoland:mainfrom
Tango992:perf-node-fs-cpSync

Conversation

@Tango992
Copy link
Copy Markdown
Contributor

@Tango992 Tango992 commented Mar 13, 2026

Moves most of the implementation to the rust side, reducing roundtrips between js and rust, especially when copying lots of files.

Also did some fixes/improvements over #32580:

  • Make OpenAccessKind differ between lstat and stat here
  • Use bigint with bitwise operations for more compact flag handling here.

Performance insights based on my benchmark:

Linux

Deno 2.7.5

➜ deno --v8-flags="--expose-gc" -A index.js sync
clk: ~3.70 GHz
cpu: AMD Ryzen 7 3700X 8-Core Processor
runtime: deno 2.7.5 (x86_64-unknown-linux-gnu)

benchmark                                    avg (min … max) p75 / p99    (min … top 1%)
------------------------------------------------------------ -------------------------------
fs.cpSync recursive with deep tree + symlinks 146.21 ms/iter 147.91 ms         █   █        
                                     (140.63 ms … 150.73 ms) 150.21 ms █▁▁▁▁▁███▁▁▁██▁█▁▁█▁█
                                   gc(  2.28 ms …   2.98 ms)  20.09 mb ( 19.69 mb… 20.34 mb)
                                    1.48 ipc ( 57.47% cache)   1.65M branch misses
                         371.53M cycles 551.34M instructions  58.12M c-refs  24.72M c-misses

Deno (This PR)

➜ rdeno --v8-flags="--expose-gc" -A index.js sync
clk: ~3.75 GHz
cpu: AMD Ryzen 7 3700X 8-Core Processor
runtime: deno 2.7.5 (x86_64-unknown-linux-gnu)

benchmark                                    avg (min … max) p75 / p99    (min … top 1%)
------------------------------------------------------------ -------------------------------
fs.cpSync recursive with deep tree + symlinks  54.09 ms/iter  55.44 ms ▅            █▅      
                                       (50.90 ms … 58.30 ms)  57.59 ms █▇▁▄▁▁▁▁▄▁▁▁▁██▁▄▁▁▁▄
                                   gc(  2.21 ms …   3.60 ms)   5.92 kb (  1.45 kb… 20.34 kb)
                                    1.21 ipc ( 92.28% cache) 232.84k branch misses
                          22.55M cycles  27.34M instructions   3.81M c-refs 294.48k c-misses
  • ~2.7× faster
  • ~94–95% fewer CPU cycles and instructions
  • ~86% fewer branch mispredictions
  • ~98% fewer cache misses
  • ~99% less memory collected during GC

Node.js 25.8.1

➜ node --expose-gc index.js sync                 
clk: ~3.87 GHz
cpu: AMD Ryzen 7 3700X 8-Core Processor
runtime: node 25.8.0 (x64-linux)

benchmark                                    avg (min … max) p75 / p99    (min … top 1%)
------------------------------------------------------------ -------------------------------
fs.cpSync recursive with deep tree + symlinks  45.35 ms/iter  45.45 ms ▃    █   ▃▃▃ ▃    ▃  
                                       (45.07 ms … 45.78 ms)  45.65 ms █▆▆▆▆█▁▆▁███▆█▁▆▁▆█▁▆
                                   gc(  1.66 ms …   2.97 ms)  11.63 kb (  6.65 kb… 34.77 kb)
                                    1.39 ipc ( 93.84% cache) 108.89k branch misses
                          15.04M cycles  20.95M instructions   2.55M c-refs 156.86k c-misses

Bun 1.3.10

➜ bun index.js
clk: ~3.81 GHz
cpu: AMD Ryzen 7 3700X 8-Core Processor
runtime: bun 1.3.10 (x64-linux)

benchmark                                    avg (min … max) p75 / p99    (min … top 1%)
------------------------------------------------------------ -------------------------------
fs.cpSync recursive with deep tree + symlinks  67.88 ms/iter  68.22 ms ███  █  █            
                                       (65.53 ms … 73.05 ms)  71.81 ms ███▁██▁▁██▁█▁▁▁▁▁▁▁██
                                   gc(  1.53 ms …   1.94 ms) 440.89 kb (  0.00  b…  1.63 mb)
                                    1.50 ipc ( 76.91% cache) 276.98k branch misses
                          63.84M cycles  95.97M instructions  12.73M c-refs   2.94M c-misses
macOS

Deno 2.7.5

➜ sudo deno --v8-flags="--expose-gc" -A index.js                                                 
clk: ~3.11 GHz
cpu: Apple M1 Pro
runtime: deno 2.7.5 (aarch64-apple-darwin)

benchmark                                    avg (min … max) p75 / p99    (min … top 1%)
------------------------------------------------------------ -------------------------------
fs.cpSync recursive with deep tree + symlinks 333.99 ms/iter 331.37 ms  ██ ██               
                                     (324.32 ms … 358.19 ms) 352.03 ms ███▁██▁▁▁▁▁▁▁▁▁█▁▁▁▁█
                                   gc(  1.73 ms …   2.11 ms)  20.10 mb ( 19.92 mb… 20.34 mb)
                                   3.52 ipc (  0.67% stalls)    NaN% L1 data cache
                           1.03G cycles   3.63G instructions   0.00% retired LD/ST (   0.00)

Deno (This PR)

➜ sudo rdeno --v8-flags="--expose-gc" -A index.js
clk: ~3.11 GHz
cpu: Apple M1 Pro
runtime: deno 2.7.4 (aarch64-apple-darwin)

benchmark                                    avg (min … max) p75 / p99    (min … top 1%)
------------------------------------------------------------ -------------------------------
fs.cpSync recursive with deep tree + symlinks 267.72 ms/iter 265.04 ms  █▃ ▃                
                                     (253.01 ms … 309.59 ms) 295.93 ms ▆██▁█▁▆▁▁▁▁▁▁▁▁▁▆▁▁▁▆
                                   gc(  1.74 ms …   2.24 ms)  10.37 kb (  1.45 kb… 20.32 kb)
                                   3.68 ipc (  0.62% stalls)    NaN% L1 data cache
                         808.79M cycles   2.98G instructions   0.00% retired LD/ST (   0.00)
  • ~20% faster
  • ~21% fewer CPU cycles
  • ~18% fewer instructions
  • ~99% less memory collected during GC

Node.js 25.8.1

➜ sudo node --expose-gc index.js                
clk: ~3.18 GHz
cpu: Apple M1 Pro
runtime: node 25.8.1 (arm64-darwin)

benchmark                                    avg (min … max) p75 / p99    (min … top 1%)
------------------------------------------------------------ -------------------------------
fs.cpSync recursive with deep tree + symlinks 283.50 ms/iter 277.64 ms ▃▃█                  
                                     (270.27 ms … 343.19 ms) 305.95 ms ███▆▆▁▁▁▁▁▁▁▆▁▁▁▁▁▁▁▆
                                   gc(  1.32 ms …   1.88 ms)  16.98 kb (  6.92 kb… 27.80 kb)
                                   3.61 ipc (  0.07% stalls)    NaN% L1 data cache
                         855.07M cycles   3.09G instructions   0.00% retired LD/ST (   0.00)

Bun 1.3.10

➜ sudo bun index.js             
clk: ~3.16 GHz
cpu: Apple M1 Pro
runtime: bun 1.3.10 (arm64-darwin)

benchmark                                    avg (min … max) p75 / p99    (min … top 1%)
------------------------------------------------------------ -------------------------------
fs.cpSync recursive with deep tree + symlinks 283.56 ms/iter 285.84 ms   █                  
                                     (272.27 ms … 309.28 ms) 302.86 ms █▅█▁▁▁▁▁▅▅▁▁▁▁▁▁▁▁▁▅▅
                                   gc(  1.10 ms …   1.20 ms)  52.00 kb (  0.00  b…464.00 kb)
                                   3.65 ipc (  0.22% stalls)    NaN% L1 data cache
                         859.63M cycles   3.14G instructions   0.00% retired LD/ST (   0.00)

@Tango992 Tango992 marked this pull request as ready for review March 14, 2026 06:16
@Tango992 Tango992 requested a review from bartlomieju March 14, 2026 08:07
Copy link
Copy Markdown
Member

@bartlomieju bartlomieju left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review: perf(ext/node): move fs.cpSync implementation to rust

Overview

This PR moves the fs.cpSync implementation from JavaScript to Rust, reducing JS↔Rust roundtrips during recursive copy operations. The JS-side cp_sync.ts goes from ~430 lines to ~50, with all the logic (path validation, directory traversal, file/symlink/special-file handling) now in a single op_node_cp_sync op. The async cp path is also improved with bit-packed stat info instead of a serialized object.

Benchmarks are impressive: ~2.7x faster on Linux, ~20% on macOS, with dramatically lower memory/GC pressure.

Strengths

  • Massive net deletion: +1160/-572, but the JS side shrinks by ~375 lines. The Rust additions are justified by the performance gains.
  • Bit-packed stat info (compact_stat_info): Clean approach — mode in low 32 bits, flags in upper bits. Well-documented layout.
  • Filter function exception handling: The CpSyncRunStatus::JsException propagation is correct — uses v8::tc_scope! and rethrow(), and the exception cleanly unwinds through the recursive copy.
  • Good test: The new spec test covers filter functions that throw mid-copy, verifying partial copy state and proper error propagation.
  • OpenAccessKind fix: The async path now correctly uses ReadNoFollow for lstat (when !dereference), fixing a pre-existing bug.

Issues & Suggestions

1. Duplicated check_parent_paths_impl (async) — significant copy-paste

The async version of check_parent_paths_impl appears to be fully duplicated rather than just having the sync version extracted. The original async function's body was copied into both a new async version and a _sync version. The async version now has stat_async but otherwise identical logic. This is ~60 lines of duplication. Consider extracting shared logic or at least adding a comment noting the intentional duplication.

2. Repeated fs acquisition pattern

Almost every sync helper does:

let fs = {
    let state = state.borrow_mut();
    state.borrow::<FileSystemRc>().clone()
};

This is repeated in cp_sync_copy_dir, cp_sync_mkdir_and_copy, op_node_cp_on_file_sync, op_node_cp_on_link_sync, check_paths_impl_sync, op_node_cp_check_paths_recursive_sync, op_node_cp_validate_and_prepare_sync. Consider passing &FileSystemRc as a parameter to reduce boilerplate and avoid repeatedly borrowing state.

3. borrow_mut() vs borrow() inconsistency

Some sync functions use state.borrow_mut() to get FileSystemRc (e.g., op_node_cp_on_file_sync, op_node_cp_on_link_sync) while others use state.borrow() (e.g., cp_sync_copy_dir). Since FileSystemRc is only being read (.clone()), borrow() should be sufficient everywhere.

4. Missing opts.mode handling in sync path

The original JS cpSync supported opts.mode (the copyFile mode flag, e.g., COPYFILE_EXCL). The new op_node_cp_on_file_sync calls copy_file_sync without a mode parameter. Is opts.mode intentionally dropped? The async path already ignores it too (via op_node_cp_on_file), so this might be pre-existing.

5. Filter returning non-boolean truthy values

The test (main.ts:28) returns 1 (truthy) and undefined (falsy) from the filter. The Rust side uses result.unwrap().boolean_value(tc_scope) which correctly coerces to boolean. But cpSyncFn in JS still does the top-level filter check before calling the op — so the first filter invocation is in JS and subsequent ones in Rust. Both should behave identically, but worth a comment.

Minor nits

  • ext/node/ops/fs.rs:109: Doc comment has a double backtick: `cp ``
  • cp_sync_copy_dir allocates String for every src_item/dest_item via .to_string_lossy().to_string(). For large trees this is a lot of allocations. Could use Cow but probably not worth optimizing further right now.

Verdict

Good PR with real performance wins. The main concern is the code duplication between async/sync variants, but that's a common pattern in this codebase. Ship it. 🚢

@Tango992 Tango992 marked this pull request as draft March 14, 2026 08:20
@Tango992
Copy link
Copy Markdown
Contributor Author

Will do the reviews first

@Tango992 Tango992 marked this pull request as ready for review March 15, 2026 04:20
@Tango992 Tango992 requested a review from bartlomieju March 15, 2026 05:28
@bartlomieju
Copy link
Copy Markdown
Member

A few more comments flagged from Claude:

 1. CpSyncRunStatus::JsException silently returns Ok(())

  match cp_sync_dispatch(&state, &fs, scope, &stat_info, src, dest, opts)? {
      CpSyncRunStatus::Completed | CpSyncRunStatus::JsException => Ok(()),
  }

  This relies on V8 propagating the pending exception after the reentrant op returns. It works, but deserves a comment — it looks
   like a bug at first glance.

  2. check_cp_path called up to 3 times for the same dest in op_node_cp_on_file_sync

  Once for force unlink, once for EXCL lstat check, once for the actual copy. Each re-does the permission check. The checked path
   could be cached.

  3. No new tests for several new Rust code paths

  The new CpError::Socket, CpError::Fifo, CpError::Unknown variants, COPYFILE_EXCL mode handling in op_node_cp_on_file_sync, and
  preserveTimestamps in the sync path are only covered if existing tests happen to exercise them. Worth confirming coverage or
  adding targeted tests.

  4. Large sync/async duplication

  op_node_cp_on_link_sync (~100 lines), check_paths_impl_sync, ensure_parent_dir_impl_sync, and check_parent_paths_impl_sync are
  near-exact copies of their async counterparts. The maybe_spawn_blocking! macro already exists for this pattern — could it be
  leveraged to reduce duplication?

I think no 1 is somewhat important, no 3 is a bit concerning, no 2 and 4 seem fine.

@Tango992
Copy link
Copy Markdown
Contributor Author

Addresses point number 1 with extra comments. Addresses some things on point number 3 with extra tests. There are things worth mentioning:

  • CpError::Unknown seems unreachable to test https://github.com/denoland/deno/pull/32687/changes#diff-3e7d3b6612b7b8b1b0db630e38113b30116906d56d2da75be6d5e962bbb0d06cR1023-R1029

  • Our opts.mode still doesn't cover all cases. As documented on the Node.js site:

    • fs.constants.COPYFILE_EXCL: The copy operation will fail if dest already exists.
    • fs.constants.COPYFILE_FICLONE: The copy operation will attempt to create a copy-on-write reflink. If the platform does not support copy-on-write, then a fallback copy mechanism is used.
    • fs.constants.COPYFILE_FICLONE_FORCE: The copy operation will attempt to create a copy-on-write reflink. If the platform does not support copy-on-write, then the operation will fail.

    I think we might need a separate PR to address these different modes for the copyfile op

Copy link
Copy Markdown
Contributor

@kajukitli kajukitli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR adds synchronous cp (copy) operations for Node.js compatibility. The review identified significant security concerns: two high-severity permission bypass issues in symlink handling where unsafe_new is used after only checking blanket read/write permissions instead of path-specific checks, multiple medium-severity TOCTOU race conditions in file copy operations that could lead to unexpected behavior with COPYFILE_EXCL flag, and potential issues with error handling that could enable path traversal. These security issues should be addressed before merging.

[HIGH] ext/node/ops/fs.rs:2303: Permission bypass in op_node_cp_on_link_sync: Uses check_write_all and check_read_all followed by CheckedPathBuf::unsafe_new, which bypasses proper path-specific permission checks. This allows creating symlinks to arbitrary locations without proper path validation.

Suggestion: Use check_cp_path with appropriate OpenAccessKind instead of check_write_all/check_read_all. Compare with the async version op_node_cp_on_link which should have proper path-based permission checks.

[HIGH] ext/node/ops/fs.rs:2330: Similar permission bypass in the 'else' branch of op_node_cp_on_link_sync when dest_exists is true. Uses check_write_all/check_read_all with CheckedPathBuf::unsafe_new for symlink operations, bypassing path-specific security checks.

Suggestion: Use proper path-based permission checks via check_cp_path instead of blanket check_write_all/check_read_all permissions with unsafe_new paths.

[MEDIUM] ext/node/ops/fs.rs:1620: In check_parent_paths_impl_sync, canonicalize() and std::path::absolute() errors are silently ignored with unwrap_or_else, potentially allowing path traversal attacks where malformed paths bypass parent directory checks.

Suggestion: Handle canonicalization failures more carefully, potentially returning an error instead of falling back to the original path, which could have security implications for the subdirectory check.

[MEDIUM] ext/node/ops/fs.rs:1990: TOCTOU race condition in op_node_cp_on_file_sync: The function checks if dest_exists, removes dest if force is true, then does copy_file_sync. Between the remove and the copy, another process could create a file at the destination.

Suggestion: Consider using atomic file operations or appropriate file creation flags that handle the exclusive creation atomically.

[MEDIUM] ext/node/ops/fs.rs:2016: TOCTOU race condition with COPYFILE_EXCL check: The code performs lstat to check if dest exists, then proceeds to copy. Between the lstat check and copy_file_sync, another process could create a file at the destination, violating COPYFILE_EXCL semantics.

Suggestion: Pass COPYFILE_EXCL mode flag to the underlying filesystem copy operation to ensure atomic exclusive creation at the OS level.

[MEDIUM] ext/node/ops/fs.rs:827: Filter function callback creates V8 strings from user-controlled paths without handling potential failures. The unwrap() on v8::String::new could panic if the string contains invalid data.

Suggestion: Handle the potential failure of v8::String::new gracefully instead of using unwrap().

@Tango992
Copy link
Copy Markdown
Contributor Author

I think symlink always uses check_write_all and check_read_all based on our code patterns (CMIIW):

deno/ext/fs/ops.rs

Lines 897 to 904 in 60edd78

let permissions =
state.borrow_mut::<deno_permissions::PermissionsContainer>();
permissions.check_write_all("Deno.symlinkSync()")?;
permissions.check_read_all("Deno.symlinkSync()")?;
// PERMISSIONS: ok because we verified --allow-write and --allow-read above
let oldpath = CheckedPath::unsafe_new(Cow::Borrowed(Path::new(oldpath)));
let newpath = CheckedPath::unsafe_new(Cow::Borrowed(Path::new(newpath)));

@Tango992
Copy link
Copy Markdown
Contributor Author

Also, kaju mentioned 'TOCTOU race conditions in file copy operations.' I think the current approach is acceptable because our implementation is designed to directly mimic Node.js behavior.

@Tango992
Copy link
Copy Markdown
Contributor Author

I asked claude regarding

[MEDIUM] ext/node/ops/fs.rs:1620: In check_parent_paths_impl_sync, canonicalize() and std::path::absolute() errors are silently ignored with unwrap_or_else, potentially allowing path traversal attacks where malformed paths bypass parent directory checks.

Here's the answer:

Kaju's concern is directionally good, but I would not label this as a clear path-traversal security vulnerability in Deno as-is.

What I see in the code:

  • The fallback they pointed out is real in both async and sync helpers.
  • Paths used for filesystem ops still go through permission/path checks via check_cp_path.
  • Permission query parsing normalizes non-absolute/relative paths lexically.
  • There is also inode-based self-subdirectory detection in the loop, not just string comparison.

I think this is mostly a correctness-hardening issue, not a direct traversal bypass:

  • The dangerous operations still require passing check_cp_path and later filesystem checks.
  • The parent-walk check is an extra guard for cp semantics (prevent copy into own subtree), not the primary permission boundary.
  • So “malformed path bypasses parent directory checks” can be true for this guard in edge cases, but that does not automatically mean permission escape.

Copy link
Copy Markdown
Member

@bartlomieju bartlomieju left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@bartlomieju bartlomieju merged commit 5e966f1 into denoland:main Mar 17, 2026
112 checks passed
@Tango992 Tango992 deleted the perf-node-fs-cpSync branch March 17, 2026 12:56
bartlomieju added a commit that referenced this pull request Mar 24, 2026
…ons (#32927)

## Summary

- Fix `node:fs.cp` failing with `ENOENT` when copying between allowed
directories while read permissions outside those directories are ignored
  (`"ignore": true` in deno.json permissions config)
- The `check_parent_paths` walk-up now handles `NotFound` from
permission checks (ignored denials) the same way it handles `NotFound` from `stat`
by stopping the ancestor traversal

**Root cause**: When `fs.cp` was moved from JS to Rust (#32580, #32687),
the parent path validation started propagating ignored-permission `ENOENT`
as a hard error instead of treating it as a stop signal. The old JS code's
`safeStatFn` caught `NotFound` and returned `undefined`, gracefully
stopping the walk. The Rust code's `check_cp_path()?` propagated it before
reaching the stat call's own `NotFound` handler.

Closes #32923

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants