fix(runtime): reduce memory retention after web worker termination #32617

Merged
bartlomieju merged 4 commits into main from fix/worker-memory-leak-26058
Mar 12, 2026
Member

@bartlomieju bartlomieju commented Mar 10, 2026

Summary

Addresses #26058: Web Workers use significantly more RSS than they do in Chrome, and terminating them doesn't release the memory back to the OS.

Two targeted changes:

  • Call malloc_trim(0) on Linux after each worker thread exits. When a worker's V8 isolate and tokio runtime are dropped, glibc's allocator holds onto the fragmented heap pages rather than returning them to the OS. This explicitly asks glibc to release them. Follows the same pattern already used in the SIGUSR2 memory trim handler (runtime/worker.rs:81).

  • Remove the delayed termination hack entirely. The 2-second timer that spawned threads/tasks to force-terminate workers is no longer needed — the upstream V8 issue that required it has been fixed. Workers now terminate cooperatively via the termination signal and event loop wakeup, which also eliminates the ~100 lingering OS threads during rapid worker churn.
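
A minimal sketch of the first change, assuming a plain `std::thread` as a stand-in for the worker thread. The helper names `trim_heap` and `run_worker_then_trim` are hypothetical and are not taken from the PR, and the exact point at which the runtime calls `malloc_trim(0)` is an assumption here. The glibc function is declared by hand so the example needs no external crates.

```rust
use std::thread;

// glibc's `int malloc_trim(size_t pad)`, declared by hand for this sketch.
#[cfg(all(target_os = "linux", target_env = "gnu"))]
unsafe extern "C" {
    fn malloc_trim(pad: usize) -> std::ffi::c_int;
}

/// Asks glibc to return free heap pages to the OS.
/// Returns true if glibc reports that memory was released.
#[cfg(all(target_os = "linux", target_env = "gnu"))]
fn trim_heap() -> bool {
    // SAFETY: malloc_trim takes no pointers and may be called from any thread.
    unsafe { malloc_trim(0) != 0 }
}

/// No-op on platforms without glibc (macOS, Windows, musl).
#[cfg(not(all(target_os = "linux", target_env = "gnu")))]
fn trim_heap() -> bool {
    false
}

/// Hypothetical stand-in for a worker thread: allocate, drop everything,
/// then trim as the last step before the thread exits.
fn run_worker_then_trim(alloc_bytes: usize) -> usize {
    let handle = thread::spawn(move || {
        // Stands in for the isolate and async runtime owned by the worker.
        let heap = vec![1u8; alloc_bytes];
        let len = heap.len();
        drop(heap);
        // The freed pages may still be held by the allocator at this point.
        trim_heap();
        len
    });
    handle.join().expect("worker thread panicked")
}
```

Calling `run_worker_then_trim(1 << 20)` allocates and frees 1 MiB on a short-lived thread and returns the allocation size. Whether RSS actually drops depends on the allocator state, so the sketch makes no claim about the amount released.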

What this doesn't fix

  • The ~7-8MB per-isolate overhead from V8's lack of shared read-only heap (upstream V8 issue)
  • macOS/Windows RSS behavior (malloc_trim is Linux-only)

bartlomieju and others added 3 commits March 10, 2026 16:02
Two changes to help RSS go down after workers are destroyed (#26058):

1. Call `malloc_trim(0)` on Linux after each worker thread exits. When a
   worker's V8 isolate and tokio runtime are dropped, the freed memory
   often isn't returned to the OS because glibc's allocator holds onto
   fragmented heap pages. This matches the existing SIGUSR2 handler
   pattern already used in the main worker.

2. Replace the OS thread used for the 2-second termination fallback timer
   with a tokio task. Previously, each worker termination spawned a raw
   `std::thread` just to sleep(2s) and call `terminate_execution()`. With
   rapid worker churn (e.g. 100 workers at a time), this created many
   short-lived OS threads. A tokio timer is much lighter weight.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
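
To illustrate why the thread-per-termination fallback described in this commit message was costly, here is a rough sketch of that pattern using only the standard library. `WorkerHandle`, `terminate_with_fallback`, `churn`, and `LIVE_TIMERS` are hypothetical names, and the atomic flag merely stands in for a call to `terminate_execution()`; this is not the runtime's actual code.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Number of fallback timer threads currently alive (illustration only).
static LIVE_TIMERS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical worker handle; the flag stands in for forced termination.
struct WorkerHandle {
    force_terminated: Arc<AtomicBool>,
}

impl WorkerHandle {
    fn new() -> Self {
        WorkerHandle {
            force_terminated: Arc::new(AtomicBool::new(false)),
        }
    }

    /// One OS thread per termination, parked in `sleep` until the deadline.
    fn terminate_with_fallback(&self, delay: Duration) -> JoinHandle<()> {
        let flag = Arc::clone(&self.force_terminated);
        LIVE_TIMERS.fetch_add(1, Ordering::SeqCst);
        thread::spawn(move || {
            thread::sleep(delay);
            flag.store(true, Ordering::SeqCst);
            LIVE_TIMERS.fetch_sub(1, Ordering::SeqCst);
        })
    }
}

/// Terminates `workers` workers back to back and reports how many timer
/// threads were alive right after the last one was spawned.
fn churn(workers: usize, delay: Duration) -> usize {
    let handles: Vec<JoinHandle<()>> = (0..workers)
        .map(|_| WorkerHandle::new().terminate_with_fallback(delay))
        .collect();
    let alive = LIVE_TIMERS.load(Ordering::SeqCst);
    for handle in handles {
        handle.join().expect("timer thread panicked");
    }
    alive
}
```

With a delay much longer than the time it takes to spawn the threads, `churn(100, delay)` typically reports close to 100 live timer threads, which matches the lingering-thread behavior the PR description mentions.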
The 2-second timer fallback for terminating workers is no longer needed
since the upstream V8 issue has been fixed. Workers now terminate
cooperatively via the termination signal and event loop wakeup.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
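
Cooperative termination through a signal plus an event loop wakeup can be sketched roughly as follows. This assumes a toy event loop built on `std::sync::mpsc`; `Event`, `Worker`, `spawn_worker`, and `terminate` are hypothetical names and do not reflect the runtime's real types. The point is only that no timer or extra thread is involved.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc::{self, Sender};
use std::sync::Arc;
use std::thread::{self, JoinHandle};

/// Messages a hypothetical worker event loop can receive.
enum Event {
    Task(u64),
    /// Carries no data; it only wakes a loop blocked in `recv()`.
    Wake,
}

/// Hypothetical handle held by the parent of the worker.
struct Worker {
    terminated: Arc<AtomicBool>,
    tx: Sender<Event>,
    thread: JoinHandle<u64>,
}

fn spawn_worker() -> Worker {
    let terminated = Arc::new(AtomicBool::new(false));
    let (tx, rx) = mpsc::channel();
    let flag = Arc::clone(&terminated);
    let thread = thread::spawn(move || {
        let mut processed: u64 = 0;
        // Block until the next event; stop when all senders are gone.
        while let Ok(event) = rx.recv() {
            // Check the termination signal after every wakeup.
            // Events still queued at that point are discarded.
            if flag.load(Ordering::SeqCst) {
                break;
            }
            if let Event::Task(n) = event {
                processed += n;
            }
        }
        processed
    });
    Worker { terminated, tx, thread }
}

impl Worker {
    /// Sets the termination signal, wakes the loop, and waits for exit.
    fn terminate(self) -> u64 {
        self.terminated.store(true, Ordering::SeqCst);
        // Sending fails only if the loop has already exited, which is fine.
        let _ = self.tx.send(Event::Wake);
        self.thread.join().expect("worker thread panicked")
    }
}
```

In this sketch `terminate` returns as soon as the loop observes the flag, so the worker thread exits on its own rather than being force-terminated after a delay.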
@bartlomieju bartlomieju requested a review from nathanwhit March 12, 2026 07:57
Contributor

@kajukitli kajukitli left a comment

lgtm

removing the 2s forced-termination hack is the right call if the upstream V8 issue is actually gone. that path was always gross: extra thread per terminate, weird delayed kill semantics, and easy to leak thread churn under worker-heavy workloads.

malloc_trim(0) after the worker runtime/isolate drops also makes sense as a targeted Linux-only mitigation for the RSS retention problem.

one caveat is that malloc_trim(0) is process-global and can be a little expensive if workers churn hard, but given this is specifically trying to reduce post-worker RSS and it's Linux-only, i think the tradeoff is reasonable.

@bartlomieju bartlomieju merged commit b31680f into main Mar 12, 2026
221 of 224 checks passed
@bartlomieju bartlomieju deleted the fix/worker-memory-leak-26058 branch March 12, 2026 17:01

3 participants