Skip to content

fix: make this.emitFile chunk path synchronous to avoid deadlock#9031

Merged
shulaoda merged 7 commits intorolldown:mainfrom
lazarv:fix/emit-chunk-deadlock
Apr 26, 2026
Merged

fix: make this.emitFile chunk path synchronous to avoid deadlock#9031
shulaoda merged 7 commits intorolldown:mainfrom
lazarv:fix/emit-chunk-deadlock

Conversation

@lazarv
Copy link
Copy Markdown
Contributor

@lazarv lazarv commented Apr 8, 2026

Closes the emit_chunk item from the sync-NAPI deadlock audit in #7311.

Symptom

Plugins that call this.emitFile({ type: 'chunk', ... }) from a transform (or any hook, really) hang rolldown indefinitely once enough emits accumulate. The hang is deterministic for tight emit loops (~1025 emits from a single hook) and
non-deterministic under parallelism — it can trigger with as few as ~400 emits spread across concurrent transforms. There is no error, no log, no stack trace from rolldown — the build just stops making progress. The main JS thread ends up
parked inside _pthread_cond_wait under napi_call_function → ... → block_on, and every tokio worker is parked at the same pthread_cond_wait.

This was reproducible against rolldown@1.0.0-rc.13 on a real RSC plugin (@lazarv/react-server) building a Mantine-sized app (~3300 "use client" modules, each emitted as an entry chunk), and then
minimised to 30 lines of plugin code.

Root cause

PluginContext.emitFile({ type: 'chunk' }) is a sync napi binding. Until this PR it called napi::bindgen_prelude::block_on(...) on the JS thread to drive an async fn emit_chunk whose two await points were:

  1. self.tx.lock().await on a tokio::sync::Mutex<Option<Sender<ModuleLoaderMsg>>>
  2. send(AddEntryModule(...)).await on a bounded mpsc::channel(1024) shared with the module loader

The bounded channel is the trap. Once it fills, the send future can only be unblocked by the module loader draining it. But draining each AddEntryModule message dispatches plugin hooks (resolveId, load, further transforms) back to
the JS thread via TSFN
— and the JS thread is pinned inside block_on servicing the current emit_chunk. The only consumer that can free capacity is waiting on the only thread that is blocked producing. Classic producer ⇄ consumer deadlock
through TSFN.

A couple of subtleties worth calling out, because they explain why the bug was easy to miss:

  • The effective capacity is much lower than 1024. Under parallelism, in-flight module tasks spawned for earlier emits are already waiting on TSFN responses from the blocked JS thread. Those pending tasks keep the loader effectively frozen
    while the channel stays saturated, so the producer hits tx.send().await and hangs at a much smaller emit count. In local testing against unpatched rc.13:
    • 200 parallel inputs × 1 emit each → ✓ 62 ms
    • 500 → ✓ 154 ms
    • 600 → hang (after ~400 transforms processed)
    • 2000 → hang (never even reaches buildStart)
  • Parallel transforms without emitFile scale fine. A control experiment — 12 000 virtual inputs each going through a transform hook that does not call emitFile — completes cleanly. So the bottleneck is not transform dispatch, TSFN
    throughput, fetch_modules, or the loader loop. It is specifically emit_chunk's block_on + bounded-channel interaction, exactly as audit: Review sync NAPI bindings for deadlock safety #7311 predicted when it flagged emit_chunk as MEDIUM risk.
  • Yielding inside the JS transform does not help. setImmediate / process.nextTick / Promise.resolve yields let libuv service TSFN callbacks, but at the point the producer is stuck in block_on the consumer cannot hand the JS thread
    anything to do — the loader is either blocked on the current transform's TSFN response, or has no work ready that does not depend on the blocked thread. The deadlock is structural, not a scheduling race.
  • This is not an "insanely large build" problem. Each "use client" file in a normal RSC project emits exactly one chunk from its own transform. 3300 client components is a reasonable size for an app. No plugin is doing anything
    pathological; the rolldown primitive simply does not survive its documented usage at scale.

Fix

Collapse the entire emit_chunk path to synchronous code and drop block_on from the napi binding. There is no reason any of it was async:

  • FileEmitter::emit_chunk is now a sync fn. Its tx is a std::sync::Mutex<Option<UnboundedSender<ModuleLoaderMsg>>>. The critical section is Option::clone() — the Sender is cheaply cloneable — then the lock is dropped before send.
    The install/clear path runs only at scan boundaries and is never contended with build traffic; the per-emit path's contention is bounded to nanoseconds (lock-free CAS on the clone).
  • BindingPluginContext::emit_chunk no longer enters the tokio runtime. It is now a plain sync binding with the same shape as the adjacent emit_file, marked // SYNC-SAFE per the convention introduced by fix(dev): make register_modules async #7289 / audit: Review sync NAPI bindings for deadlock safety #7311.
  • PluginContext::emit_chunk and NativePluginContextImpl::emit_chunk are de-async'd to match.
  • PluginDriver.tx is unified on std::sync::Mutex for consistency with FileEmitter.tx — the load hook holds it only to clone the sender and drop the lock before awaiting module-load completion.
  • As defense in depth, the module loader's message channel is switched to unbounded_channel(). UnboundedSender::send is synchronous and infallible, so even if a future refactor reintroduces a sync wait on this path, there is no .await
    that can park the JS thread. All tx.send(...).await call sites in module_task.rs, external_module_task.rs, runtime_module_task.rs, and native_plugin_context.rs::load are updated accordingly.

JS API impact

None. this.emitFile({ type: 'chunk' }) remains synchronous and returns string directly, matching Rollup's PluginContext.emitFile contract. No plugin needs to change.

Tests

Two new deterministic regression fixtures under packages/rolldown/tests/fixtures/plugin/context/:

  • emit-chunk-many-from-transform — a single transform hook emits 2000 chunks in a tight loop. Exercises the "tight emit loop" path. Deadlocks on main, passes in ~470 ms with this fix.
  • emit-chunk-many-parallel-inputs — 1500 virtual inputs, each with its own transform emitting exactly one chunk. Exercises the realistic "large plugin at scale" path. Deadlocks on main, passes in ~600 ms with this fix.

Both are marked sequential: true and live alongside the existing plugin/context/emit-file fixtures. Full fixture.test.ts run is green (99/99).

Relation to #7311

#7311 audited sync NAPI bindings after the dev-engine deadlock fixed in #7289. It explicitly flagged emit_chunk as a MEDIUM-risk sync binding with deadlock potential, listed two possible resolutions (Option A: make async; Option B: add
SYNC-SAFE comment with justification), and left the task unchecked when the issue was closed as completed. This PR resolves that task by taking a third path: make the entire underlying implementation synchronous so the binding can stay
sync without block_on
. The SYNC-SAFE comment is added per convention. This avoids the breaking-change implications of making this.emitFile return a Promise.

`PluginContext.emitFile({ type: 'chunk' })` is exposed to JS as a sync
napi binding. It used to `block_on` a future that sent `AddEntryModule`
over a bounded `mpsc::channel(1024)`, then await capacity. Once the
channel filled, the send future could only be unblocked by the module
loader draining it — but draining each message dispatched plugin hooks
back to the JS thread via TSFN, which was pinned inside `block_on`.
Classic producer ⇄ consumer deadlock.

Under parallelism the cycle could trigger well below 1024 emits
(~400 in local testing), because in-flight tasks waiting on JS hooks
keep the channel saturated.

Fix the root cause by making the entire emit_chunk path synchronous:

- `FileEmitter::emit_chunk` is now sync; its `tx` is a
  `std::sync::Mutex` (critical section: clone the `Option<Sender>`,
  drop the lock, send — all lock-free).
- `BindingPluginContext::emit_chunk` no longer enters the runtime.
- `PluginContext::emit_chunk` and `NativePluginContextImpl::emit_chunk`
  are de-async'd to match.
- `PluginDriver.tx` is unified on `std::sync::Mutex` for consistency
  with `FileEmitter.tx`.

Switch the module loader's message channel to `unbounded_channel()` as
defense in depth, so any future bounded queue on this path cannot
re-introduce the same cycle.

Add two regression fixtures:
- `emit-chunk-many-from-transform`: 2000 emits in a tight loop from a
  single `transform` hook.
- `emit-chunk-many-parallel-inputs`: 1500 inputs each emitting one
  chunk from their own `transform`, exercising the
  parallelism-dependent variant.
@codspeed-hq
Copy link
Copy Markdown

codspeed-hq Bot commented Apr 8, 2026

Merging this PR will not alter performance

✅ 4 untouched benchmarks
⏩ 10 skipped benchmarks1


Comparing lazarv:fix/emit-chunk-deadlock (000d73d) with main (1d79b02)

Open in CodSpeed

Footnotes

  1. 10 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes a deterministic/non-deterministic deadlock when JS plugins synchronously call this.emitFile({ type: 'chunk' }) at scale by removing async/blocking behavior from the emit-chunk pipeline and eliminating bounded-channel backpressure that could park the JS thread.

Changes:

  • Make emit_chunk fully synchronous end-to-end (plugin context → file emitter → module-loader message send) and remove block_on from the NAPI binding.
  • Switch module-loader messaging to tokio::sync::mpsc::unbounded_channel() and update all producers/consumers to use UnboundedSender.
  • Add regression fixtures that previously deadlocked (tight loop emits; many parallel transforms each emitting once).

Reviewed changes

Copilot reviewed 20 out of 20 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
packages/rolldown/tests/fixtures/plugin/context/emit-chunk-many-parallel-inputs/_config.ts Adds parallel-input regression fixture that emits many chunks across transforms.
packages/rolldown/tests/fixtures/plugin/context/emit-chunk-many-from-transform/_config.ts Adds tight-loop regression fixture emitting 2000 chunks from one transform.
packages/rolldown/tests/fixtures/plugin/context/emit-chunk-many-from-transform/main.js Trigger module for the tight-loop regression fixture.
crates/rolldown_common/src/file_emitter.rs Converts emit_chunk and sender installation to sync + UnboundedSender.
crates/rolldown_binding/src/options/plugin/binding_plugin_context.rs Removes block_on from sync NAPI emit_chunk binding; marks SYNC-SAFE.
crates/rolldown_plugin/src/plugin_context/plugin_context.rs Changes PluginContext::emit_chunk API from async to sync.
crates/rolldown_plugin/src/plugin_context/native_plugin_context.rs Updates native context to use UnboundedSender and sync emit_chunk.
crates/rolldown_plugin/src/plugin_driver/mod.rs Switches plugin driver tx storage to std::sync::Mutex + UnboundedSender.
crates/rolldown_plugin/src/plugin_driver/plugin_driver_factory.rs Constructs plugin driver with std::sync::Mutex<Option<UnboundedSender>>.
crates/rolldown/src/module_loader/module_loader.rs Changes loader channel to unbounded and documents deadlock rationale.
crates/rolldown/src/module_loader/task_context.rs Updates shared task context tx to UnboundedSender.
crates/rolldown/src/module_loader/module_task.rs Updates message sends to non-async UnboundedSender::send.
crates/rolldown/src/module_loader/external_module_task.rs Updates message sends to non-async UnboundedSender::send.
crates/rolldown/src/module_loader/runtime_module_task.rs Updates error/result sends to UnboundedSender::send.
crates/rolldown/src/stages/scan_stage.rs Removes awaits for setting context sender on plugin driver/file emitter.
crates/rolldown/tests/rolldown/issues/4895/mod.rs Updates test plugin to use sync emit_chunk (no .await).
crates/rolldown/tests/rolldown/issues/5011/mod.rs Updates test plugin to use sync emit_chunk (no .await).
crates/rolldown/tests/rolldown/issues/7833/mod.rs Updates test plugin to use sync emit_chunk (no .await).
crates/rolldown/tests/rolldown/optimization/chunk_merging/allow_extension_exports/mod.rs Updates test plugin to use sync emit_chunk (no .await).
crates/rolldown/tests/rolldown/optimization/chunk_merging/allow_extension_merge_same_exports/mod.rs Updates test plugin to use sync emit_chunk (no .await).

Comment thread crates/rolldown_common/src/file_emitter.rs Outdated
Comment thread crates/rolldown/src/module_loader/module_loader.rs Outdated
Comment thread crates/rolldown/src/module_loader/module_loader.rs Outdated
Comment thread crates/rolldown_plugin/src/plugin_context/plugin_context.rs
Comment thread crates/rolldown_plugin/src/plugin_driver/mod.rs Outdated
@pkg-pr-new
Copy link
Copy Markdown

pkg-pr-new Bot commented Apr 23, 2026

Open in StackBlitz

@rolldown/browser

npm i https://pkg.pr.new/@rolldown/browser@9031

@rolldown/debug

npm i https://pkg.pr.new/@rolldown/debug@9031

@rolldown/pluginutils

npm i https://pkg.pr.new/@rolldown/pluginutils@9031

rolldown

npm i https://pkg.pr.new/rolldown@9031

@rolldown/binding-android-arm64

npm i https://pkg.pr.new/@rolldown/binding-android-arm64@9031

@rolldown/binding-darwin-arm64

npm i https://pkg.pr.new/@rolldown/binding-darwin-arm64@9031

@rolldown/binding-darwin-x64

npm i https://pkg.pr.new/@rolldown/binding-darwin-x64@9031

@rolldown/binding-freebsd-x64

npm i https://pkg.pr.new/@rolldown/binding-freebsd-x64@9031

@rolldown/binding-linux-arm-gnueabihf

npm i https://pkg.pr.new/@rolldown/binding-linux-arm-gnueabihf@9031

@rolldown/binding-linux-arm64-gnu

npm i https://pkg.pr.new/@rolldown/binding-linux-arm64-gnu@9031

@rolldown/binding-linux-arm64-musl

npm i https://pkg.pr.new/@rolldown/binding-linux-arm64-musl@9031

@rolldown/binding-linux-ppc64-gnu

npm i https://pkg.pr.new/@rolldown/binding-linux-ppc64-gnu@9031

@rolldown/binding-linux-s390x-gnu

npm i https://pkg.pr.new/@rolldown/binding-linux-s390x-gnu@9031

@rolldown/binding-linux-x64-gnu

npm i https://pkg.pr.new/@rolldown/binding-linux-x64-gnu@9031

@rolldown/binding-linux-x64-musl

npm i https://pkg.pr.new/@rolldown/binding-linux-x64-musl@9031

@rolldown/binding-openharmony-arm64

npm i https://pkg.pr.new/@rolldown/binding-openharmony-arm64@9031

@rolldown/binding-wasm32-wasi

npm i https://pkg.pr.new/@rolldown/binding-wasm32-wasi@9031

@rolldown/binding-win32-arm64-msvc

npm i https://pkg.pr.new/@rolldown/binding-win32-arm64-msvc@9031

@rolldown/binding-win32-x64-msvc

npm i https://pkg.pr.new/@rolldown/binding-win32-x64-msvc@9031

commit: 8c30464

@nicholaspufal
Copy link
Copy Markdown

This has fixed the issue I reported here: vitejs/vite#21957

Thank you @lazarv

@IWANABETHATGUY
Copy link
Copy Markdown
Member

@shulaoda, would you mind having a look when you are free?

@shulaoda shulaoda marked this pull request as draft April 24, 2026 06:59
@shulaoda shulaoda marked this pull request as ready for review April 24, 2026 07:34
@shulaoda
Copy link
Copy Markdown
Member

Thanks for the contribution, great fix! LGTM! I pushed a few small tweaks on top.

@shulaoda shulaoda changed the title fix(plugin): make emitFile chunk path synchronous to prevent deadlock fix: make this.emitFile chunk path synchronous to avoid deadlock Apr 24, 2026
@shulaoda shulaoda requested a review from h-a-n-a April 24, 2026 07:45
@shulaoda shulaoda requested a review from hyf0 April 24, 2026 07:45
@shulaoda shulaoda merged commit 6f6cf5a into rolldown:main Apr 26, 2026
29 checks passed
@github-project-automation github-project-automation Bot moved this from Backlog to Done in Vite 8 migration blockers Apr 26, 2026
shulaoda added a commit that referenced this pull request Apr 27, 2026
… data (#9176)

## Summary

- Replace `tokio::sync::Mutex` with `std::sync::Mutex` for two places that only guard plain data and never hold the lock across an `.await` point:
  - `PluginDriver.tx` / `NativePluginContextImpl.tx` — `Arc<Mutex<Option<UnboundedSender<ModuleLoaderMsg>>>>`
  - `BundleCoordinator.watcher` — `Mutex<DynFsWatcher>`
- De-async `PluginDriver::set_context_load_modules_tx` as a consequence, and drop the redundant `.await` at its two call sites in `ScanStage`.

## Why

Per tokio's own guidance, the async mutex should be reserved for protecting IO resources where the lock genuinely needs to be held across `.await`. For plain data the blocking `std::sync::Mutex` is both cheaper and less prone to latent deadlocks when used through sync NAPI bindings. This continues the migration started in #9031 (which collapsed the `FileEmitter.tx` path to sync for the `emit_chunk` deadlock fix).

Scope of this change:

- `PluginDriver.tx` is only ever acquired to **clone an `Option<UnboundedSender>`** and to swap a new sender in at scan boundaries. The existing code in `NativePluginContextImpl::load` already drops the guard before awaiting `sender.send(...)`; the only reason it used `tokio::sync::Mutex` was incidental. Switching to `std::sync::Mutex` makes that invariant syntactically enforced (no `.await` possible while the guard is live) and lets `set_context_load_modules_tx` shed its unnecessary `async` marker.
- `BundleCoordinator.watcher` is used exclusively inside `update_watch_paths`, where the guard is held only over synchronous `paths_mut.add(...)` / `paths_mut.commit()` calls. No `.await` exists in that critical section, so async mutex is pure overhead here.

No behavioral change for plugin authors or bundler consumers, the JS APIs and message flows are unchanged.

## Refs

- Closes part of #9114 (tokio → std mutex migration)
- Related: #9031 (sync-NAPI deadlock fix for `emit_chunk`; established the `std::sync::Mutex` pattern for the tx slot on `FileEmitter`)
This was referenced Apr 29, 2026
shulaoda added a commit that referenced this pull request Apr 29, 2026
## [1.0.0-rc.18] - 2026-04-29

### 💥 BREAKING CHANGES

- optimization: default unspecified inlineConst.mode to smart (#9248) by @IWANABETHATGUY

### 🐛 Bug Fixes

- rolldown_plugin_vite_import_glob: return error instead of panicking when virtual module uses a relative glob (#9241) by @shulaoda
- binding: treat empty inlineConst object as omitted (#9247) by @IWANABETHATGUY
- rolldown: keep enum declaration for optional-chain access (#9229) by @Dunqing
- link_stage: restore inline let-else in exports-kind filter (#9237) by @IWANABETHATGUY
- dev/lazy: avoid module reinitialization in lazy compilation patches (#9179) by @h-a-n-a
- dev: visit identifier references for runtime rewrites in HMR finalizer (#9191) by @h-a-n-a
- chunk-optimizer: pick dominator for runtime placement to avoid cycles (#9164) by @IWANABETHATGUY
- make `this.emitFile` chunk path synchronous to avoid deadlock (#9031) by @lazarv
- use sentinel id for `browser: false` ignored modules (#9192) by @shulaoda
- prevent chunk optimizer from creating import cycles (#9228) by @IWANABETHATGUY

### 🚜 Refactor

- replace tokio::sync::Mutex with std::sync::Mutex for non-IO data (#9176) by @shulaoda
- rolldown_plugin_vite_import_glob: do not rewrite import path for absolute base (#9195) by @shulaoda
- runtime_helper: wrap DependedRuntimeHelperMap in a struct (#9215) by @IWANABETHATGUY
- drop redundant clear() in determine_safely_merge_cjs_ns (#9206) by @IWANABETHATGUY
- clean up generate_lazy_export (#9208) by @IWANABETHATGUY
- bitset: return bool from set_bit to fuse guard-and-set (#9207) by @IWANABETHATGUY
- link_stage: simplify exports-kind filter and clarify safety comments (#9205) by @IWANABETHATGUY

### 📚 Documentation

- determine_module_exports_kind (#9252) by @IWANABETHATGUY
- fix dead link to esbuild ESM/CJS interop tests (#9230) by @Copilot
- remove CSS bundling references (#9234) by @shulaoda
- correct IncrementalFullBuild row in BundleMode table (#9214) by @IWANABETHATGUY
- design: add bundler data lifecycle design doc (#9212) by @hyf0
- remove minifier alpha status notices (#9202) by @sapphi-red

### ⚙️ Miscellaneous Tasks

- upgrade oxc to 0.128.0 (#9260) by @shulaoda
- deps: bump rolldown-ariadne to 0.6.0 (#9254) by @IWANABETHATGUY
- deps: update github actions (#9259) by @renovate[bot]
- deps: update github actions (#9258) by @renovate[bot]
- remove renovate overrides (#9257) by @Boshen
- use ubuntu-latest for security workflow (#9256) by @Boshen
- notify Discord around release publish (#9251) by @Boshen
- add release environment to npm publish workflow (#9250) by @Boshen
- justfile: drop the `--` separator before forwarded args in `vp run` (#9246) by @shulaoda
- deps: update test262 submodule for tests (#9243) by @sapphi-red
- add more tracing instrumentations (#9220) by @sapphi-red
- rolldown_plugin_vite_import_glob: remove outdated sourcemap doc comment (#9213) by @shulaoda
- update security workflow (#9201) by @Boshen

### ❤️ New Contributors

* @lazarv made their first contribution in [#9031](#9031)

Co-authored-by: shulaoda <165626830+shulaoda@users.noreply.github.com>
@rolldown-guard rolldown-guard Bot mentioned this pull request Apr 29, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

7 participants