Skip to content

Make it possible to put service bindings into dynamic worker env#4834

Merged
kentonv merged 19 commits intomainfrom
kenton/serializable-fetcher
Sep 3, 2025
Merged

Make it possible to put service bindings into dynamic worker env#4834
kentonv merged 19 commits intomainfrom
kenton/serializable-fetcher

Conversation

@kentonv
Copy link
Copy Markdown
Member

@kentonv kentonv commented Aug 19, 2025

This implements a few things:

First, ctx.exports self-referential bindings now allow you to customize the props, by invoking the binding:

let stub = ctx.exports.MyEntrypoint({props: {foo: 123}})

This gives you back a ServiceStub (aka Fetcher) which will invoke the entrypoint with the given props delivered as ctx.props.

This becomes interesting due to:

Second, you can now take a service binding and put it into the env for a dynamically-loaded worker. You just plop it right in there:

env.LOADER.get("foo", () => {
  return {
    compatibilityDate: "2025-06-01",
    mainModule: "foo.js",
    modules: {...},
    env: {
      MY_SERVICE_BINDING: this.ctx.exports.MyEntrypoint({props: {...}})
    },
  };
});

And that'll just work. Now the dynamic worker has a binding that points back to an entrypoint in the parent, with the props you specified.

Also, as a bonus, you can embed service bindings into props bundled into other service bindings.

NOTE: I'm still working on the internal implementation, so the internal build will fail for the moment. It's not too complicated though.

@kentonv kentonv requested a review from harrishancock August 19, 2025 22:34
@kentonv kentonv requested review from a team as code owners August 19, 2025 22:34
@kentonv kentonv force-pushed the kenton/serializable-fetcher branch from a4bb856 to 844a6cc Compare August 22, 2025 02:02
@kentonv
Copy link
Copy Markdown
Member Author

kentonv commented Aug 22, 2025

Rebased and added two commits that were needed for the internal PR (which is now posted).

Have not addressed @jasnell comments yet, will do so tomorrow probably.

@github-actions
Copy link
Copy Markdown

github-actions bot commented Aug 22, 2025

The generated output of @cloudflare/workers-types matches the snapshot in types/generated-snapshot 🎉

@kentonv kentonv force-pushed the kenton/serializable-fetcher branch from 844a6cc to b1e8d3c Compare August 22, 2025 12:51
# contexts. Frakenvalue::toCapnp() and fromCapnp() don't handle this at all -- the caller is
# expected to deal with it.
#
# TODO(cleanup): Consider making `Frakenvalue` a generic over the capTable type? Maybe even make
Copy link
Copy Markdown
Member Author

@kentonv kentonv Aug 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having written all the code (especially the internal code), I'm pretty convinced I need to go back and do this: Make both the capnp struct and C++ type Frankenvalue into generics. However, it would be too painful to retcon back into this PR, so I'd rather leave it for a future PR.

@jasnell
Copy link
Copy Markdown
Collaborator

jasnell commented Aug 22, 2025

Overall don't see any futher issues myself but will leave it for @harrishancock to review and sign off.

@kentonv kentonv force-pushed the kenton/serializable-fetcher branch from 34aca4a to 8ac4137 Compare August 23, 2025 14:59
Copy link
Copy Markdown
Collaborator

@harrishancock harrishancock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, most of my comments are FrakenValue

@kentonv
Copy link
Copy Markdown
Member Author

kentonv commented Sep 2, 2025

(just a rebase)

.wrapMetrics = !isInHouse,
.operationName = kj::mv(operationName),
});
return ClientWithTracing{kj::mv(client), kj::mv(traceContext)};
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jmorrell-cloudflare I'm not sure if I need a traceContext here or not since the comments below don't explain why OutgoingFactory doesn't need one.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I do need one because a SubrequestChannel is semantically no different from an integer channel. I actually would like to refactor this later so that the only thing you do with an integer channel is obtain a SubrequestChannel object, then you make the request on that object. So I think it needs a trace context, if integer channels do.

But at the same time, I'd also like to get rid of OutgoingFactory in favor of SubrequestChannel, since they are almost the same thing...

Copy link
Copy Markdown
Contributor

@jmorrell-cloudflare jmorrell-cloudflare Sep 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't explain why OutgoingFactory doesn't need one

I am not sure either. My changes were focused on adding attributes but not modifying the existing logic, which was structured this way. Maybe @fhanau has more context?

It would certainly be simpler if we didn't need to distinguish between cases here.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we don't have a trace context here, then we won't create spans here. I guess that means the question then becomes: Do we want to expose spans for the given operation to users in the SubrequestChannel case? I would err on the side of caution and not add them for now (assuming that we're only adding new features here and not transitioning existing fetches to the SubrequestChannel yet). Expanding span coverage is usually simple, we should just make sure we don't expose spans that should remain internal and document them properly, this can be handled separately. If those requirements are met then adding spans shouldn't be an issue, it just feels out of scope here

As Jeremy said, this function merely maintains the behavior where we only created spans for uint channels so far. I have limited context here too, but it seems like tracing has been working fine without spans being created in the OutgoingFactory/CrossContextOutgoingFactory cases.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, the "new feature" is just a new way of calling service bindings. It would be weird if some ways of calling service bindings are traced and others aren't. So I think the tracing is needed.

But what I really wanted to understand was why tracing is not needed with OutgoingFactory. The comment below simply says // For outgoing factories, no trace context needed. But why? Maybe there's a good reason, maybe there isn't. Maybe the reason is simply "we just haven't ever traced these before and don't want to change that without a specific reason". But that should be stated in the comment.

@kentonv kentonv force-pushed the kenton/serializable-fetcher branch from f5f3be5 to 15f79d6 Compare September 2, 2025 13:08
@kentonv
Copy link
Copy Markdown
Member Author

kentonv commented Sep 2, 2025

@kentonv kentonv requested a review from a team as a code owner September 2, 2025 15:49
readonly name?: string;
}
export interface DurableObjectNamespace<
export declare abstract class DurableObjectNamespace<
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apparently, the fact that LoopbackDurableObjectNamespace inherits DurableObjectNamespace caused DurableObjectNamespace to change from interface to abstract class.

That's probably harmless but also seems unnecessary -- LoopbackDurableObjectNamespace is itself an interface, so there's no reason it can't extend another interface.

@penalosa Any idea why this happens?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure off the top of my head why this would be happening, but it seems mostly harmless. A few other bindings are defined like that too (D1, Email, Vectorize, for instance). That said, this should be fairly easy to override by adding interface DurableObjectNamespace to the beginning of the type overrides:

JSG_TS_OVERRIDE(<T extends Rpc.DurableObjectBranded | undefined = undefined> {
get(id: DurableObjectId, options?: DurableObjectNamespaceGetDurableObjectOptions): DurableObjectStub<T>;
&
JSG_TS_OVERRIDE(<T extends Rpc.DurableObjectBranded | undefined = undefined> {

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding interface DurableObjectNamespace to these overrides gives me diffs like:

   readonly name?: string;
   readonly jurisdiction?: string;
 }
-declare abstract class DurableObjectNamespace<
+interface DurableObjectNamespace<
   T extends Rpc.DurableObjectBranded | undefined = undefined,
 > {
-  newUniqueId(
-    options?: DurableObjectNamespaceNewUniqueIdOptions,
-  ): DurableObjectId;
-  idFromName(name: string): DurableObjectId;
-  idFromString(id: string): DurableObjectId;
   get(
     id: DurableObjectId,
     options?: DurableObjectNamespaceGetDurableObjectOptions,

I guess writing interface DurableObjectNamespace explicitly also inhibits the behavior of merging in methods from C++ that aren't explicitly overridden in the override block.

I think I'll just leave it alone.

}
interface DurableObjectState {
waitUntil(promise: Promise<any>): void;
props: any;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like it should probably be a generic on DurableObjectState?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, however the same is still needed on ExecutionContext. We hadn't solved this yet because Glen wanted to solve it for both ctx.props and ctx.exports at the same time. In any case, it should be a separate PR.

@kentonv kentonv force-pushed the kenton/serializable-fetcher branch from 2614d5e to 588b9db Compare September 2, 2025 20:20
A JSG_STRUCT cannot have a simple handle as a field -- it must be wrapped in `jsg::JsRef`. Otherwise the handle ends up being discarded by a HandleScope before being delivered.

The tests were missing a test for props.
I found myself writing multiple implementations identical to `WorkerStubEntrypointOutgoingFactory`. Let's just bake it in.

Note: The careful observer may notice that this change no longer sets the `isInHouse` flag dynamically-loaded worker requests (by omitting it, thus accepting the default `false`, whereas previously it was explicitly set `true`). This is intentional, as I remembered what "in house" is meant to mean, and realized it ought to be false here. The `inHouse` flag is meant to apply to subrequests where the application isn't really aware it's making a subrequest, such as Workers KV requests. It is NOT normally applied to service binding requests, which are entirely visible to the app. Requests to dynamically-loaded workers seem like they should behave like service bindings.
This had been left as a TODO previously, but obviously we want it.
This can only actually be non-empty when using facets.
With this change, the loopback entrypoint stubs in `ctx.exports` are now callable. Calling them lets you specify additional options, in particular the value of `props`, like so:

```
let stub = ctx.exports.MyEntrypoint({props: {foo: 123}})
await stub.someRpc()
```
Same as previous commit -- but for actor classes in ctx.exports.
When an exported actor class has storage configured, then its entry in `ctx.exports` becomes a namespace binding. However, we would like to allow it to *also* be used as an actor class binding, in case you wanted to use the same class as a facet.

This change introduces loopback variants of the namespace binding types which simultaneously behave like `LoopbackDurableObjectClass`, so they can be used in both contexts.
These methods will no longer be thread-safe when they possibly refer to refcounted resources.

In practice, no current use of `Frankenvalue` actually relies on constness.
We define `Fetcher` as serializable in `Frakenvalue`s, with `SubrequestChannel` as the capability type.
In production, when we transfer a Fetcher binding to another worker, the second worker might run on an entirely different machine, maybe on the other side of the world. It's important that any bindings passed to it can be loaded from scratch locally, without calling back to the original Worker that made the introduction. But we can't possibly support this for dynamically-loaded workers because we don't know how to reload them without the app's help. So, let's enforce that entrypoint stubs to dynamically-loaded workers cannot be transferred.

We'll take the opportunity to enforce no transfers on DO stubs for now as well, though this is temporary.
This just follows all the same patterns we used for Fetcher.
It's an `AnyPointer` because it has different types in different contexts.

This is used in the production runtime, but nont (currently) workerd.
Depending on the cap table type, it may or may not be possible to clone const caps. We can do it when the caps just refer to I/O channel numbers or, in the production runtime, when they are "dehydrated" references.
This is more consistent with `props`, looks a little nicer, and is slightly more efficient as the move constructor for JsRef (particularly the underlying jsg::Data) is not completely trivial.
And a couple other typos in frankenvalue.h.
@kentonv kentonv force-pushed the kenton/serializable-fetcher branch from 588b9db to 67ef62e Compare September 3, 2025 02:48
@kentonv kentonv merged commit 7483d40 into main Sep 3, 2025
21 of 22 checks passed
@kentonv kentonv deleted the kenton/serializable-fetcher branch September 3, 2025 13:18
@kentonv kentonv restored the kenton/serializable-fetcher branch December 5, 2025 22:10
@kentonv kentonv deleted the kenton/serializable-fetcher branch December 6, 2025 21:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants