perf(ext/node): move fs.cpSync implementation to rust#32687
perf(ext/node): move fs.cpSync implementation to rust#32687bartlomieju merged 22 commits intodenoland:mainfrom
fs.cpSync implementation to rust#32687Conversation
bartlomieju
left a comment
There was a problem hiding this comment.
Code Review: perf(ext/node): move fs.cpSync implementation to rust
Overview
This PR moves the fs.cpSync implementation from JavaScript to Rust, reducing JS↔Rust roundtrips during recursive copy operations. The JS-side cp_sync.ts goes from ~430 lines to ~50, with all the logic (path validation, directory traversal, file/symlink/special-file handling) now in a single op_node_cp_sync op. The async cp path is also improved with bit-packed stat info instead of a serialized object.
Benchmarks are impressive: ~2.7x faster on Linux, ~20% on macOS, with dramatically lower memory/GC pressure.
Strengths
- Massive net deletion: +1160/-572, but the JS side shrinks by ~375 lines. The Rust additions are justified by the performance gains.
- Bit-packed stat info (
compact_stat_info): Clean approach — mode in low 32 bits, flags in upper bits. Well-documented layout. - Filter function exception handling: The
CpSyncRunStatus::JsExceptionpropagation is correct — usesv8::tc_scope!andrethrow(), and the exception cleanly unwinds through the recursive copy. - Good test: The new spec test covers filter functions that throw mid-copy, verifying partial copy state and proper error propagation.
OpenAccessKindfix: The async path now correctly usesReadNoFollowfor lstat (when!dereference), fixing a pre-existing bug.
Issues & Suggestions
1. Duplicated check_parent_paths_impl (async) — significant copy-paste
The async version of check_parent_paths_impl appears to be fully duplicated rather than just having the sync version extracted. The original async function's body was copied into both a new async version and a _sync version. The async version now has stat_async but otherwise identical logic. This is ~60 lines of duplication. Consider extracting shared logic or at least adding a comment noting the intentional duplication.
2. Repeated fs acquisition pattern
Almost every sync helper does:
let fs = {
let state = state.borrow_mut();
state.borrow::<FileSystemRc>().clone()
};This is repeated in cp_sync_copy_dir, cp_sync_mkdir_and_copy, op_node_cp_on_file_sync, op_node_cp_on_link_sync, check_paths_impl_sync, op_node_cp_check_paths_recursive_sync, op_node_cp_validate_and_prepare_sync. Consider passing &FileSystemRc as a parameter to reduce boilerplate and avoid repeatedly borrowing state.
3. borrow_mut() vs borrow() inconsistency
Some sync functions use state.borrow_mut() to get FileSystemRc (e.g., op_node_cp_on_file_sync, op_node_cp_on_link_sync) while others use state.borrow() (e.g., cp_sync_copy_dir). Since FileSystemRc is only being read (.clone()), borrow() should be sufficient everywhere.
4. Missing opts.mode handling in sync path
The original JS cpSync supported opts.mode (the copyFile mode flag, e.g., COPYFILE_EXCL). The new op_node_cp_on_file_sync calls copy_file_sync without a mode parameter. Is opts.mode intentionally dropped? The async path already ignores it too (via op_node_cp_on_file), so this might be pre-existing.
5. Filter returning non-boolean truthy values
The test (main.ts:28) returns 1 (truthy) and undefined (falsy) from the filter. The Rust side uses result.unwrap().boolean_value(tc_scope) which correctly coerces to boolean. But cpSyncFn in JS still does the top-level filter check before calling the op — so the first filter invocation is in JS and subsequent ones in Rust. Both should behave identically, but worth a comment.
Minor nits
ext/node/ops/fs.rs:109: Doc comment has a double backtick:`cp``cp_sync_copy_dirallocatesStringfor everysrc_item/dest_itemvia.to_string_lossy().to_string(). For large trees this is a lot of allocations. Could useCowbut probably not worth optimizing further right now.
Verdict
Good PR with real performance wins. The main concern is the code duplication between async/sync variants, but that's a common pattern in this codebase. Ship it. 🚢
|
Will do the reviews first |
|
A few more comments flagged from Claude: I think no 1 is somewhat important, no 3 is a bit concerning, no 2 and 4 seem fine. |
|
Addresses point number 1 with extra comments. Addresses some things on point number 3 with extra tests. There are things worth mentioning:
|
kajukitli
left a comment
There was a problem hiding this comment.
This PR adds synchronous cp (copy) operations for Node.js compatibility. The review identified significant security concerns: two high-severity permission bypass issues in symlink handling where unsafe_new is used after only checking blanket read/write permissions instead of path-specific checks, multiple medium-severity TOCTOU race conditions in file copy operations that could lead to unexpected behavior with COPYFILE_EXCL flag, and potential issues with error handling that could enable path traversal. These security issues should be addressed before merging.
[HIGH] ext/node/ops/fs.rs:2303: Permission bypass in op_node_cp_on_link_sync: Uses check_write_all and check_read_all followed by CheckedPathBuf::unsafe_new, which bypasses proper path-specific permission checks. This allows creating symlinks to arbitrary locations without proper path validation.
Suggestion: Use check_cp_path with appropriate OpenAccessKind instead of check_write_all/check_read_all. Compare with the async version op_node_cp_on_link which should have proper path-based permission checks.
[HIGH] ext/node/ops/fs.rs:2330: Similar permission bypass in the 'else' branch of op_node_cp_on_link_sync when dest_exists is true. Uses check_write_all/check_read_all with CheckedPathBuf::unsafe_new for symlink operations, bypassing path-specific security checks.
Suggestion: Use proper path-based permission checks via check_cp_path instead of blanket check_write_all/check_read_all permissions with unsafe_new paths.
[MEDIUM] ext/node/ops/fs.rs:1620: In check_parent_paths_impl_sync, canonicalize() and std::path::absolute() errors are silently ignored with unwrap_or_else, potentially allowing path traversal attacks where malformed paths bypass parent directory checks.
Suggestion: Handle canonicalization failures more carefully, potentially returning an error instead of falling back to the original path, which could have security implications for the subdirectory check.
[MEDIUM] ext/node/ops/fs.rs:1990: TOCTOU race condition in op_node_cp_on_file_sync: The function checks if dest_exists, removes dest if force is true, then does copy_file_sync. Between the remove and the copy, another process could create a file at the destination.
Suggestion: Consider using atomic file operations or appropriate file creation flags that handle the exclusive creation atomically.
[MEDIUM] ext/node/ops/fs.rs:2016: TOCTOU race condition with COPYFILE_EXCL check: The code performs lstat to check if dest exists, then proceeds to copy. Between the lstat check and copy_file_sync, another process could create a file at the destination, violating COPYFILE_EXCL semantics.
Suggestion: Pass COPYFILE_EXCL mode flag to the underlying filesystem copy operation to ensure atomic exclusive creation at the OS level.
[MEDIUM] ext/node/ops/fs.rs:827: Filter function callback creates V8 strings from user-controlled paths without handling potential failures. The unwrap() on v8::String::new could panic if the string contains invalid data.
Suggestion: Handle the potential failure of v8::String::new gracefully instead of using unwrap().
|
I think symlink always uses Lines 897 to 904 in 60edd78 |
|
Also, kaju mentioned 'TOCTOU race conditions in file copy operations.' I think the current approach is acceptable because our implementation is designed to directly mimic Node.js behavior. |
|
I asked claude regarding
Here's the answer: Kaju's concern is directionally good, but I would not label this as a clear path-traversal security vulnerability in Deno as-is. What I see in the code:
I think this is mostly a correctness-hardening issue, not a direct traversal bypass:
|
…ons (#32927) ## Summary - Fix `node:fs.cp` failing with `ENOENT` when copying between allowed directories while read permissions outside those directories are ignored (`"ignore": true` in deno.json permissions config) - The `check_parent_paths` walk-up now handles `NotFound` from permission checks (ignored denials) the same way it handles `NotFound` from `stat` by stopping the ancestor traversal **Root cause**: When `fs.cp` was moved from JS to Rust (#32580, #32687), the parent path validation started propagating ignored-permission `ENOENT` as a hard error instead of treating it as a stop signal. The old JS code's `safeStatFn` caught `NotFound` and returned `undefined`, gracefully stopping the walk. The Rust code's `check_cp_path()?` propagated it before reaching the stat call's own `NotFound` handler. Closes #32923 --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Moves most of the implementation to the rust side, reducing roundtrips between js and rust, especially when copying lots of files.
Also did some fixes/improvements over #32580:
OpenAccessKinddiffer betweenlstatandstatherebigintwith bitwise operations for more compact flag handling here.Performance insights based on my benchmark:
Linux
Deno 2.7.5
Deno (This PR)
Node.js 25.8.1
Bun 1.3.10
macOS
Deno 2.7.5
Deno (This PR)
Node.js 25.8.1
Bun 1.3.10