Skip to content

add gc support to Rust owned objects#6272

Merged
anonrig merged 6 commits intomainfrom
yagiz/rust-gc-v2
Mar 20, 2026
Merged

add gc support to Rust owned objects#6272
anonrig merged 6 commits intomainfrom
yagiz/rust-gc-v2

Conversation

@anonrig
Copy link
Copy Markdown
Member

@anonrig anonrig commented Mar 6, 2026

Adds gc support to Rust owned objects using the existing GC implementation, rather than using cppgc heap, which the draft implementation used: #5638.

@anonrig anonrig requested review from guybedford and mikea March 6, 2026 20:01
@anonrig anonrig requested review from a team as code owners March 6, 2026 20:01
ask-bonk[bot]

This comment was marked as outdated.

@ask-bonk

This comment was marked as outdated.

@ask-bonk ask-bonk bot requested a review from a team as a code owner March 6, 2026 20:10
@anonrig anonrig requested a review from harrishancock March 6, 2026 20:11
@anonrig anonrig force-pushed the yagiz/rust-gc-v2 branch 2 times, most recently from 4938422 to 170b02a Compare March 6, 2026 20:24
@anonrig
Copy link
Copy Markdown
Member Author

anonrig commented Mar 6, 2026

/bigbonk roast this pr

@ask-bonk

This comment was marked as outdated.

@anonrig anonrig force-pushed the yagiz/rust-gc-v2 branch 2 times, most recently from ba7b471 to 8a6f84a Compare March 6, 2026 23:08
@anonrig
Copy link
Copy Markdown
Member Author

anonrig commented Mar 6, 2026

/bigbonk roast this pr

@ask-bonk

This comment has been minimized.

@anonrig anonrig force-pushed the yagiz/rust-gc-v2 branch from 57bfc1c to 1d356e2 Compare March 6, 2026 23:22
@codspeed-hq

This comment has been minimized.

@anonrig anonrig force-pushed the yagiz/rust-gc-v2 branch from 1d356e2 to 1525e8e Compare March 6, 2026 23:54
@anonrig
Copy link
Copy Markdown
Member Author

anonrig commented Mar 7, 2026

/bigbonk roast this pr

@ask-bonk

This comment was marked as outdated.

@anonrig anonrig force-pushed the yagiz/rust-gc-v2 branch 3 times, most recently from d64dc80 to 2a09295 Compare March 9, 2026 15:59
@anonrig anonrig added the rust Pull requests that update rust code label Mar 9, 2026
@anonrig anonrig requested a review from kentonv March 9, 2026 23:14
@anonrig
Copy link
Copy Markdown
Member Author

anonrig commented Mar 10, 2026

/bigbonk roast this pr

@ask-bonk
Copy link
Copy Markdown
Contributor

ask-bonk bot commented Mar 10, 2026

@anonrig Bonk workflow was cancelled.

View workflow run · To retry, trigger Bonk again.

@guybedford
Copy link
Copy Markdown
Contributor

Wow, it appears this really does resolve the AI generated concern... nice work!

One last comment here - instead of manually doing macro generations why aren't we just using a trait here for all types? Here's the full generated suggestion:

Suggestion: trait-based GC tracing instead of macro field-type pattern matching
The current approach in jsg-macros pattern-matches on specific field types (jsg::Rc<T>, jsg::v8::Global<T>, Cell<Option<...>>, etc.) to generate visit_ref/visit_global calls. This requires the macro to be updated every time a new traceable type is added, and silently skips unrecognised field types — so a user-defined wrapper around a Global<T> would compile fine but leak.
A trait-based design fixes both problems:

/// # Safety
/// The implementation must report ALL GC-reachable references to the visitor.
/// A missing reference allows the GC to collect a live object, causing
/// use-after-free via cppgc.
pub unsafe trait Trace {
    fn trace(&self, visitor: &mut GcVisitor);
}

The macro then generates a uniform field walk with no type inspection:

fn trace(&self, visitor: &mut GcVisitor) {
    Trace::trace(&self.field_a, visitor);
    Trace::trace(&self.field_b, visitor);
}

If a field type doesn't implement Trace, it's a compile error — the developer must either add an unsafe impl Trace for their type, or wrap it in a NotTraced<T> to explicitly declare it has no GC edges:

unsafe impl<T> Trace for NotTraced<T> {
    fn trace(&self, _visitor: &mut GcVisitor) {} // explicitly opted out
}

Provided impls in the jsg crate cover all the common cases:

unsafe impl<R: Resource> Trace for jsg::Rc<R> {
    fn trace(&self, visitor: &mut GcVisitor) { visitor.visit_ref(self); }
}
unsafe impl<R: Resource> Trace for jsg::Weak<R> {
    fn trace(&self, _visitor: &mut GcVisitor) {} // weak refs not traced
}
unsafe impl<T> Trace for jsg::v8::Global<T> {
    fn trace(&self, visitor: &mut GcVisitor) { visitor.visit_global(self); }
}
unsafe impl<T: Trace> Trace for Option<T> {
    fn trace(&self, visitor: &mut GcVisitor) {
        if let Some(inner) = self { inner.trace(visitor); }
    }
}
unsafe impl<T: Trace> Trace for Cell<Option<T>> {
    fn trace(&self, visitor: &mut GcVisitor) {
        // SAFETY: GC tracing is single-threaded within a V8 isolate and
        // never re-entrant on the same object. Read-only access through
        // as_ptr() is sound under this contract.
        if let Some(inner) = unsafe { &*self.as_ptr() } {
            inner.trace(visitor);
        }
    }
}

Plain types with no GC edges (String, u32, HashMap, etc.) get a trivial impl:

unsafe impl Trace for String {
    fn trace(&self, _visitor: &mut GcVisitor) {}
}
// ... and so on for all primitives/stdlib types

These are all inlined and eliminated by LLVM — there is no runtime cost for non-GC fields. The only overhead is the actual visit_ref/visit_global calls, which are already paid today.
Adding a new traceable type — e.g. a future jsg::Function — only requires an unsafe impl Trace in the jsg crate with no macro changes.
This comment was written with AI assistance.

@github-actions
Copy link
Copy Markdown

github-actions bot commented Mar 20, 2026

The generated output of @cloudflare/workers-types matches the snapshot in types/generated-snapshot 🎉

Copy link
Copy Markdown
Contributor

@guybedford guybedford left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed this with @anonrig. Would be great to see a trait design as mentioned if possible, but that can be a follow-up. Regarding the unwind safety it would be nice to still hear the full story on that as well in terms of how panics propagate and terminate the isolate / get reported.

@anonrig anonrig requested a review from jasnell March 20, 2026 19:19
@anonrig anonrig force-pushed the yagiz/rust-gc-v2 branch 2 times, most recently from c42d78b to 677011e Compare March 20, 2026 19:51
@anonrig anonrig enabled auto-merge (rebase) March 20, 2026 19:52
@jasnell
Copy link
Copy Markdown
Collaborator

jasnell commented Mar 20, 2026

Something that can be addressed later but is worth mentioning. A common pattern with the GC-tracing mechanism currently in the C++ side of things is:

class Foo: public jsg::Object {
 public:
  Foo() = default;

 private:
  struct PrivateData {
    void visitForGc(jsg::GcVisitor& visitor) {
      // Visit any member variables that are jsg::Object or jsg::Value here.
    }
  };

  PrivateData privateData_;

  void visitForGc(jsg::GcVisitor& visitor) override {
    privateData_.visitForGc(visitor);
  }
};

Specifically, the items to be traced are held by some other child structure. Given the way the macros are currently set up, this isn't possible in the rust design without a fair amount of manual pain... I asked opencode to write up the details...


The #[jsg_resource] macro on a struct always generates a GarbageCollected impl. The generate_trace_statements function (line 836 of the diff) walks the struct's fields and classifies each one via get_traceable_type:

  • jsg::Rc<T>TraceableType::Ref → generates visitor.visit_rc(&self.field)
  • jsg::v8::Global<T>TraceableType::Global → generates visitor.visit_global(&self.field)
  • jsg::Weak<T>TraceableType::Weak → no-op
  • Everything else → TraceableType::None → silently skipped

A field of type PrivateData falls into None. The macro has no concept of "this field itself implements GarbageCollected or has a trace method — recurse into it." So if you wrote:

struct PrivateData {
    child: jsg::Rc<Child>,
    callback: Cell<Option<jsg::v8::Global<jsg::v8::Value>>>,
}

#[jsg_resource]
pub struct Foo {
    private_data: PrivateData,  // ← NOT traced! Classified as TraceableType::None
}

The generated GarbageCollected impl for Foo would be an empty no-op — the child and callback inside PrivateData would never be visited.

The manual escape hatch

The jsg-macros README documents a manual path (line 402 of the diff):

For complex cases or custom tracing logic, you can manually implement GarbageCollected without using the jsg_resource macro.

But there's a catch: #[jsg_resource] on the struct generates four impls at once — Type, ToJS, FromJS, and GarbageCollected. There's no way to say "generate the first three but let me write GarbageCollected myself." So to use the manual path, you'd need to either:

  1. Skip #[jsg_resource] on the struct entirely and hand-write all four trait impls (painful — lots of boilerplate), or

  2. Put all GC-visible fields directly on the resource struct and restructure to avoid nested structs (works but changes the design pattern).

What would need to change to support this pattern

Two potential approaches, not mutually exclusive:

Option A: Add a #[trace] field attribute. The macro would recognize fields annotated with #[trace] and generate self.field.trace(visitor) in the trace body. The nested struct would implement GarbageCollected (either manually or via its own derive):

struct PrivateData { /* ... */ }
impl jsg::GarbageCollected for PrivateData {
    fn trace(&self, visitor: &mut jsg::GcVisitor) {
        visitor.visit_rc(&self.child);
        visitor.visit_global(&self.callback);
    }
    fn memory_name(&self) -> &'static CStr { c"PrivateData" }
}

#[jsg_resource]
pub struct Foo {
    #[trace]  // ← tells macro to call self.private_data.trace(visitor)
    private_data: PrivateData,
}

This is a small change to generate_trace_statements — add a check for a #[trace] attribute before the type-based classification, and emit self.#field_name.trace(visitor) for those fields.

Option B: Add a #[jsg_resource(custom_trace)] escape hatch. This would suppress the auto-generated GarbageCollected impl while still generating Type, ToJS, and FromJS:

#[jsg_resource(custom_trace)]
pub struct Foo {
    private_data: PrivateData,
}

impl jsg::GarbageCollected for Foo {
    fn trace(&self, visitor: &mut jsg::GcVisitor) {
        self.private_data.trace(visitor);
    }
    fn memory_name(&self) -> &'static CStr { c"Foo" }
}

This is also a small change — check for custom_trace in the attribute and skip #gc_impl in the output.

Recommendation

Option A (#[trace] attribute) is cleaner and less error-prone — it keeps the auto-generation working while supporting delegation, and the GcVisitor already has the right API surface (it's just a matter of calling trace() on the field). Option B is a good safety valve for truly custom cases but puts the burden on the developer to trace everything correctly.

Both are missing from the current PR. Given that this nested-struct pattern is common in the C++ codebase, I'd flag this as a gap worth tracking before Rust resources start holding references to other resources through intermediate structs. For the current use case (simple resources like DnsUtil with no GC-visible fields), it doesn't matter.


This pattern is pervasive — there are at least 38 instances in src/workerd/api/ alone.

Categories include:

Category Examples Count
Nested structs with visitForGc ReadableImpl::Algorithms, WritableImpl::WriteRequest, PendingAbort ~12
External classes with visitForGc CfProperty, ReaderImpl, ReaderLocked, WriterLocked ~8
Virtual base classes CryptoKey::Impl, ReadableStreamController, WritableStreamController ~3
Iterator states (via JSG_ITERATOR) FormData::IteratorState, URLSearchParams::IteratorState ~5
State machine variants StreamStates::Erroring, AllReader ~5
Direct field reach-in (no delegation method) ErrorEvent::init.error, FetchEvent::respondWithCalled.promise ~5

Notable examples:

  • EventTargetJavaScriptHandler, NativeHandler — traced via visitor.visit(js) on each handler variant
  • CryptoKeyImpl (virtual base) — traced via impl->visitForGc(visitor) through a kj::Own<Impl>
  • Request/ResponseCfProperty — traced via visitor.visit(cf), external class in cf-property.h
  • ReadableStreamDefaultControllerReadableImpl<Self> — traced via visitor.visit(impl), contains Algorithms sub-struct with jsg::Function fields
  • HTMLRewriterUnregisteredElementHandlers/UnregisteredDocumentHandlers — traced via visitor.visit(elementHandlers)
  • ReadableStreamReadableStreamController — traced via virtual dispatch through visitor.visit(getController())

@jasnell jasnell disabled auto-merge March 20, 2026 20:03
@jasnell
Copy link
Copy Markdown
Collaborator

jasnell commented Mar 20, 2026

I disabled auto-merge. I'm currently doing a deep dive review on this PR again.

@jasnell
Copy link
Copy Markdown
Collaborator

jasnell commented Mar 20, 2026

There's a similar gc tracing gap with collection, e.g. in C++, kj::Vector<jsg::Ref<...>> is gc-traceable... tracing collections in the rust equivalent is not yet implemented:


Problem: Vec<jsg::Rc<T>>, Vec<jsg::v8::Global<T>>, HashMap<K, jsg::Rc<V>>, and similar collection types are classified as TraceableType::None and silently skipped. Every Rc<T> inside the collection would be invisible to the GC.

C++ equivalent: The C++ GcVisitor provides visitAll(collection) which iterates any range and calls visit() on each element. Manual for loops over collections are also common.

We see this pattern in Performance, TraceItem, QueueEvent, DiagnosticsChannelModule, Channel, WritableImpl, EventTarget, HTMLRewriter, and WritableStreamInternalController


Doesn't need to be addressed in this PR per se but will need to be done before we can migrate a large number of APIs to rust.

@jasnell
Copy link
Copy Markdown
Collaborator

jasnell commented Mar 20, 2026

Also noting for reference... the following gc traceable types have no equivalent rust types yet and cannot be traced: jsg::Function<Ret(Args...)>, jsg::Promise<T>, jsg::Promise<T>::Resolver, jsg::Name, jsg::JsRef<T>, jsg::MemoizedIdentity<T>, jsg::HashableV8Ref<T>, jsg::BufferSource, jsg::PromiseResolverPair<T> ... again, don't need to be done in this PR but will need to be covered as a follow up before we're able to migrate a fair number of APIs to rust

@jasnell
Copy link
Copy Markdown
Collaborator

jasnell commented Mar 20, 2026

Another gap that needs to be verified and likely documented:

Problem: Proc macros operate on the syntactic AST, not the resolved type system. A field declared as type MyCallback = jsg::v8::Global<jsg::v8::Function>; and used as callback: MyCallback will be seen by the macro as a single path segment MyCallback, which does not match jsg::v8::Global. The field is silently ignored.

Impact: Any crate-level type alias, use rename, or newtype wrapper around a GC-visible type will be invisible to the macro. This is a fundamental limitation of proc macros — they cannot resolve types.

Mitigations:

  • Document that GC-visible fields must use the canonical type paths
  • A #[trace] field attribute (from Gap 2) would also solve this, since the user explicitly marks the field for tracing
  • A custom_trace escape hatch allows manual implementation

@jasnell
Copy link
Copy Markdown
Collaborator

jasnell commented Mar 20, 2026

I'm documenting these gaps as individual comments so they can be separately referenced.. here's another:

Problem: The C++ codebase uses kj::OneOf<T...> extensively for state machines where different states hold different GC-visible fields. The Rust equivalent would be a enum with GC-visible fields in some variants. The macro has no support for tracing enum variants.

C++ example (streams):

kj::OneOf<StreamStates::Closed, StreamStates::Errored, Readable> state;
void visitForGc(jsg::GcVisitor& visitor) {
  KJ_SWITCH_ONEOF(state) {
    KJ_CASE_ONEOF(closed, StreamStates::Closed) {}
    KJ_CASE_ONEOF(errored, StreamStates::Errored) {
      visitor.visit(errored.reason);
    }
    KJ_CASE_ONEOF(readable, Readable) {
      readable.visitForGc(visitor);
    }
  }
}

Impact: Stream controllers, event targets, and other stateful resources use this pattern heavily. Porting any of these to Rust would require either manual GarbageCollected implementations or macro support for enum tracing.

Copy link
Copy Markdown
Collaborator

@jasnell jasnell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Addressesing the tracing gaps I commented on will be a critical necessary next step. We won't be able to migrate a large number of APIs until those are fully addressed.

@jasnell
Copy link
Copy Markdown
Collaborator

jasnell commented Mar 20, 2026

I'd still prefer @kentonv to take a look at well as the whole gc/tracing path in jsg carries some fairly sharp edges if we get details wrong, but won't insist on it.

@kentonv
Copy link
Copy Markdown
Member

kentonv commented Mar 20, 2026

I'd still prefer @kentonv to take a look at well as the whole gc/tracing path in jsg carries some fairly sharp edges if we get details wrong, but won't insist on it.

Sorry, I am not going to be able to help here. It'd take me weeks just to spin up enough context to be able to meaningfully review this -- TBH I don't really even know Rust yet.

My main concern is I want to make sure we're following the same patterns in Rust and C++. No inventing a new way of doing things, unless the change of language specifically calls for it. E.g. obviously Rust macros are very different from C++ macros, so macro design may be different. But the way we do GC tracing should basically be transliterated from C++.

Hoping we can stay on that track without me personally reviewing all the details.

@anonrig anonrig enabled auto-merge (rebase) March 20, 2026 23:24
@anonrig anonrig merged commit cea70af into main Mar 20, 2026
22 checks passed
@anonrig anonrig deleted the yagiz/rust-gc-v2 branch March 20, 2026 23:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

rust Pull requests that update rust code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants