perf(ext/napi): use threadpool for async work instead of spawning threads #32776

Merged
bartlomieju merged 1 commit into denoland:main from bartlomieju:perf/napi-async-work-threadpool
Mar 16, 2026

@bartlomieju
Member

Summary

  • Replace std::thread::spawn with deno_core::unsync::spawn_blocking (tokio's blocking threadpool) in napi_queue_async_work
  • Threads are reused from a pool instead of creating a new OS thread per call, reducing overhead especially on Linux where clone() is expensive
  • This is analogous to how Node.js uses libuv's uv_queue_work threadpool
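
The difference between the two strategies can be sketched with the standard library alone. This is not the code from the PR, which calls `deno_core::unsync::spawn_blocking`; `run_on_fresh_thread` and `TinyPool` are hypothetical names used only for illustration, and the pool here is a deliberately minimal fixed-size one.

```rust
use std::sync::mpsc::{channel, Sender};
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

/// A unit of blocking work, standing in for a NAPI `execute` callback.
type Job = Box<dyn FnOnce() + Send + 'static>;

/// Thread-per-call: every job pays for creating and destroying an OS thread.
pub fn run_on_fresh_thread<T, F>(f: F) -> T
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    thread::spawn(f).join().expect("worker panicked")
}

/// Hypothetical fixed-size pool: workers are created once and then reused.
pub struct TinyPool {
    sender: Option<Sender<Job>>,
    workers: Vec<JoinHandle<()>>,
}

impl TinyPool {
    pub fn new(size: usize) -> Self {
        let (sender, receiver) = channel::<Job>();
        let receiver = Arc::new(Mutex::new(receiver));
        let workers = (0..size)
            .map(|_| {
                let receiver = Arc::clone(&receiver);
                thread::spawn(move || loop {
                    // The lock guard is dropped at the end of this statement,
                    // so the job itself runs without holding the lock.
                    let job = receiver.lock().unwrap().recv();
                    match job {
                        Ok(job) => job(),
                        Err(_) => break, // channel closed: pool is shutting down
                    }
                })
            })
            .collect();
        TinyPool { sender: Some(sender), workers }
    }

    /// Queue `f` on an existing worker and block until its result is ready.
    pub fn run<T, F>(&self, f: F) -> T
    where
        T: Send + 'static,
        F: FnOnce() -> T + Send + 'static,
    {
        let (tx, rx) = channel();
        let job: Job = Box::new(move || {
            let _ = tx.send(f());
        });
        self.sender.as_ref().unwrap().send(job).expect("pool closed");
        rx.recv().expect("worker dropped result")
    }
}

impl Drop for TinyPool {
    fn drop(&mut self) {
        // Dropping the sender closes the channel so the workers leave their loop.
        self.sender.take();
        for worker in self.workers.drain(..) {
            let _ = worker.join();
        }
    }
}
```

With `let pool = TinyPool::new(1);`, two consecutive `pool.run(|| std::thread::current().id())` calls report the same thread id, whereas each `run_on_fresh_thread` call reports a new one. A real blocking pool such as tokio's also grows and shrinks on demand, which this sketch leaves out.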

Context

PR #32560 fixed a deadlock by moving execute callbacks to worker threads via std::thread::spawn. That introduced a performance regression (#32773), because every async NAPI call now creates and tears down a fresh OS thread. The tokio blocking threadpool keeps idle threads alive for reuse (10s keep-alive by default), avoiding the per-call thread creation overhead.
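
The keep-alive idea can be illustrated with a single worker that exits only after it has been idle for a configurable duration. This is a rough std-only sketch, not tokio's implementation; `spawn_keep_alive_worker` is a hypothetical helper, and unlike a real pool it does not start a replacement worker once the idle one has exited.

```rust
use std::sync::mpsc::{channel, RecvTimeoutError, Sender};
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// A unit of blocking work.
type Job = Box<dyn FnOnce() + Send + 'static>;

/// Spawn one worker that stays alive between jobs and exits after
/// `keep_alive` of idleness. The join handle yields how many jobs it ran.
pub fn spawn_keep_alive_worker(keep_alive: Duration) -> (Sender<Job>, JoinHandle<usize>) {
    let (tx, rx) = channel::<Job>();
    let handle = thread::spawn(move || {
        let mut completed = 0;
        loop {
            match rx.recv_timeout(keep_alive) {
                // A job arrived in time: run it on this already-running thread.
                Ok(job) => {
                    job();
                    completed += 1;
                }
                // Idle for the whole keep-alive window: let the thread exit.
                Err(RecvTimeoutError::Timeout) => break,
                // Every sender is gone: nothing more can arrive.
                Err(RecvTimeoutError::Disconnected) => break,
            }
        }
        completed
    });
    (tx, handle)
}
```

Jobs sent within the keep-alive window all run on the same thread, so a burst of async calls pays the thread creation cost once rather than once per call.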

Benchmark (macOS aarch64, napi-rs async add)

| Runtime | µs/call |
| --- | --- |
| Node.js | ~8.8 |
| Deno 2.7.5 (std::thread::spawn) | ~8.5-9 |
| Deno dev (spawn_blocking) | ~8.5 |

On macOS the steady-state cost is similar because thread creation is lightweight. On Linux (the reporter's platform), std::thread::spawn is significantly more expensive due to the cost of the clone() syscall: the reporter measured a regression from 62µs to 258µs, roughly 4.2x.
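
A per-call figure like the ones above can be obtained by timing a loop and dividing by the iteration count. The following is a minimal sketch of such a harness, not the benchmark used for this PR; `micros_per_call` and `slowdown` are hypothetical helpers, and the numbers it produces depend heavily on the platform and load.

```rust
use std::time::Instant;

/// Average wall-clock microseconds per call of `f` over `iterations` runs.
pub fn micros_per_call<F: FnMut()>(iterations: u32, mut f: F) -> f64 {
    assert!(iterations > 0, "need at least one iteration");
    let start = Instant::now();
    for _ in 0..iterations {
        f();
    }
    start.elapsed().as_secs_f64() * 1e6 / f64::from(iterations)
}

/// How many times slower `after_us` is than `before_us`.
pub fn slowdown(before_us: f64, after_us: f64) -> f64 {
    after_us / before_us
}
```

For example, `micros_per_call(1_000, || { std::thread::spawn(|| {}).join().unwrap(); })` estimates the cost of one spawn-and-join round trip on the current machine, and `slowdown(62.0, 258.0)` gives the ratio for the reported Linux figures.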

Test plan

  • cargo test --test integration -- napi — all NAPI tests pass (debug + release)
  • Benchmarked with napi-rs async functions — performance on par with Node.js

Closes #32773

🤖 Generated with Claude Code

…eads

Replace `std::thread::spawn` with `deno_core::unsync::spawn_blocking`
(tokio's blocking threadpool) in `napi_queue_async_work`. This reuses
threads from a pool instead of creating a new OS thread per call,
reducing overhead especially on Linux where thread creation via clone()
is expensive.

Fixes denoland#32773

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

bartlomieju requested a review from littledivy on March 16, 2026 17:59
bartlomieju enabled auto-merge (squash) on March 16, 2026 18:09
Member

nathanwhit left a comment

LGTM

bartlomieju merged commit 216126e into denoland:main on Mar 16, 2026
112 checks passed
bartlomieju deleted the perf/napi-async-work-threadpool branch on March 16, 2026 18:46
Linked issue: 2.7.5 napi async perf regression
