
Implement Durable Object alarms in workerd#605

Merged
xortive merged 3 commits into main from malonso/workerd-alarms on May 11, 2023
Conversation

@xortive
Contributor

@xortive xortive commented May 3, 2023

Workerd now supports running alarms, storing them either in-memory or in a local SQLite database so they persist across workerd process restarts.

I still need to hook up the alarms API to the scheduler, but I wanted to put this up sooner rather than later.
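As a sketch of the storage alarm surface this PR wires up (`setAlarm()`, `getAlarm()`, `deleteAlarm()`), here is a minimal in-memory stand-in. Only the method names mirror the real Workers API; the `AlarmStorageStub` class and its `setTimeout`-based firing are illustrative assumptions, not workerd's actual scheduler:

```javascript
// Minimal in-memory stand-in for the DurableObjectStorage alarm API.
// AlarmStorageStub is a hypothetical name; only setAlarm/getAlarm/
// deleteAlarm mirror the real Workers API surface.
class AlarmStorageStub {
  constructor(onAlarm) {
    this.alarmTime = null; // epoch ms of the pending alarm, or null
    this.onAlarm = onAlarm;
    this.timer = null;
  }
  async setAlarm(scheduledTime) {
    // The real API accepts a Date or epoch milliseconds.
    this.alarmTime =
      scheduledTime instanceof Date ? scheduledTime.getTime() : scheduledTime;
    clearTimeout(this.timer);
    this.timer = setTimeout(() => {
      this.alarmTime = null; // the alarm is cleared before the handler runs
      this.onAlarm();
    }, Math.max(0, this.alarmTime - Date.now()));
  }
  async getAlarm() {
    return this.alarmTime; // null when nothing is scheduled
  }
  async deleteAlarm() {
    clearTimeout(this.timer);
    this.alarmTime = null;
  }
}

// Usage: schedule, observe, cancel.
(async () => {
  const storage = new AlarmStorageStub(() => console.log("alarm fired"));
  await storage.setAlarm(Date.now() + 50);
  console.log((await storage.getAlarm()) !== null); // true
  await storage.deleteAlarm(); // cancels; the callback never runs
})();
```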

@xortive xortive requested review from bcaimano, kentonv and mrbbot May 3, 2023 19:53
@xortive xortive force-pushed the malonso/actor-loopback branch from 189e535 to 29980e0 on May 3, 2023 19:57
@xortive xortive force-pushed the malonso/workerd-alarms branch from 6e89938 to 9f93337 on May 3, 2023 21:34
@xortive xortive changed the base branch from malonso/actor-loopback to main May 3, 2023 21:35
Contributor

@bcaimano bcaimano left a comment


Seems reasonable!

@xortive xortive force-pushed the malonso/workerd-alarms branch 2 times, most recently from 0d1b894 to f920788 on May 4, 2023 16:15
@xortive
Contributor Author

xortive commented May 4, 2023

I implemented the rest of the missing functionality and added a basic test. Without infrastructure in workerd for more complex tests that can perform process restarts, it will be hard to test all of the behavior.

@xortive xortive force-pushed the malonso/workerd-alarms branch from f920788 to b5eb6b8 on May 4, 2023 18:27
@xortive xortive changed the title from [WIP] Implement Durable Object alarms in workerd to Implement Durable Object alarms in workerd May 4, 2023
@xortive xortive marked this pull request as ready for review May 4, 2023 18:28
@xortive xortive force-pushed the malonso/workerd-alarms branch 3 times, most recently from f21f3bc to 28a7aef on May 4, 2023 22:22
let alarm = await this.state.storage.getAlarm();
if (alarm != null) {
throw new Error(`alarm time not cleared when handler ends. ${alarm}`);
}
Collaborator

@jasnell jasnell May 4, 2023


Suggestion: you can use

assert.ok(alarm == null, `alarm time not cleared when handler ends. ${alarm}`);

to simplify these checks.


const time = Date.now() + 100;
await this.state.storage.setAlarm(time);
assert.equal(await this.state.storage.getAlarm(), time);
Collaborator


Suggested change
assert.equal(await this.state.storage.getAlarm(), time);
assert.strictEqual(await this.state.storage.getAlarm(), time);

let obj = env.ns.get(id);
let res = await obj.fetch("http://foo/test");
let text = await res.text();
assert.equal(text, "OK");
Collaborator


Suggested change
assert.equal(text, "OK");
assert.strictEqual(text, "OK");

Contributor

@bcaimano bcaimano left a comment


Seems reasonable!

@xortive xortive force-pushed the malonso/workerd-alarms branch from 84f6152 to a97cb8a on May 5, 2023 19:01
# Extensions provide capabilities to all workers. Extensions are usually prepared separately
# and are late-linked with the app using this config field.

alarmScheduler @4 :AlarmConfig;
Member


Hmm, do we really want this to be top-level config?

It feels confusing that I could configure a durable object namespace that stores to disk, but get in-memory-only alarms. Vice-versa is even more confusing I think.

Also, this means that if someone is serving multiple DO namespaces from the same workerd instance, but later decides that they want to split them up, they can't really do that -- they'll lose their alarms.

Could we instead create a separate alarms database per DO namespace? If the namespace is in-memory, use in-memory handling. If the DO namespace is on-disk, then we store the alarms on-disk in the same directory. I'd call the file something like metadata.sqlite or namespace.sqlite so that if we find there's per-namespace stuff other than alarms we want to put in there in the future, we can.
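The per-namespace layout being proposed could look something like the following sketch. All names here (`diskPath`, `alarmStoreFor`, the two store classes) are hypothetical illustrations of the idea, not workerd code:

```javascript
// Hypothetical sketch: alarm storage follows the namespace's own
// storage mode, rather than living in one global database.
class InMemoryAlarmStore {
  constructor() {
    this.alarms = new Map(); // alarms vanish when the process exits
  }
}

class SqliteAlarmStore {
  constructor(path) {
    this.path = path; // real code would open the SQLite file here
  }
}

function alarmStoreFor(namespace) {
  return namespace.diskPath != null
    // On-disk namespace: persist alarms next to its data, so the
    // namespace can be split out or moved without losing alarms.
    ? new SqliteAlarmStore(`${namespace.diskPath}/metadata.sqlite`)
    // In-memory namespace: alarms are in-memory and lost on restart.
    : new InMemoryAlarmStore();
}

const store = alarmStoreFor({ diskPath: "/data/do/MY_NAMESPACE" });
console.log(store.path); // "/data/do/MY_NAMESPACE/metadata.sqlite"
```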

@kentonv
Member

kentonv commented May 6, 2023

Sorry for the late review here -- as you know I'm also trying to meet a dev week deadline so it's been tough keeping up.

I'm really happy you implemented this! This was sorely-missing functionality.

I think, though, that we really need to have a separate alarms database for each DO namespace, per my comment above. Having a global alarms database severely limits the ability to combine and split workerd configs, and means that part of a namespace's data is stored separately from the namespace.

If there's not time to resolve this for dev week, I think as a compromise we could remove the configuration and say for now that alarms are always configured for in-memory use only. That should be OK for local testing purposes. It might even be desirable -- it could be confusing if you start up your test instance and suddenly a bunch of alarms from a past session are delivered?

@xortive
Contributor Author

xortive commented May 8, 2023

> If there's not time to resolve this for dev week, I think as a compromise we could remove the configuration and say for now that alarms are always configured for in-memory use only. That should be OK for local testing purposes. It might even be desirable -- it could be confusing if you start up your test instance and suddenly a bunch of alarms from a past session are delivered?

The desired behavior here is persistent alarms -- the idea was that wrangler/miniflare would delete the alarms db at the beginning/end of the session, so your alarms would persist across code updates, but you wouldn't have alarms from previous sessions hanging around. In-memory alarms wouldn't work for this.

@kentonv
Member

kentonv commented May 10, 2023

OK, well... how do we proceed here? I really think it's important that alarms are stored along with the namespace data and not centralized. It also makes the config simpler (no new config is actually needed, I think).

We could name the config property in a way that indicates it's a temporary solution that will change soon, and shouldn't be used in production...

@xortive
Contributor Author

xortive commented May 10, 2023

> OK, well... how do we proceed here? I really think it's important that alarms are stored along with the namespace data and not centralized. It also makes the config simpler (no new config is actually needed, I think).
>
> We could name the config property in a way that indicates it's a temporary solution that will change soon, and shouldn't be used in production...

@mrbbot has said he's OK with in-memory for the short term -- I think I'll rip out the ability to configure the alarm scheduler to be backed by localDisk and otherwise land this as-is.

@xortive xortive force-pushed the malonso/workerd-alarms branch from a639681 to f3a11f9 on May 11, 2023 16:25
Workerd now supports running alarms, storing them either in-memory or in
a local sqlite database to persist across workerd process restarts.
@xortive xortive force-pushed the malonso/workerd-alarms branch from f3a11f9 to 4d3c606 on May 11, 2023 16:53
@xortive xortive merged commit 3516458 into main May 11, 2023
kentonv added a commit that referenced this pull request Apr 29, 2025
In a Durable Object, the type of `ctx.id` is supposed to be of type `DurableObjectId`, and in production, it is.

Way way back in #605, a bug was introduced such that `ctx.id` became a string for Durable Objects in workerd. But `ctx.id` is only supposed to be a string for "ephemeral objects" (aka colo-local actors).

Somehow, nobody ever noticed? Probably because the only thing most people would do with it anyway is stringify it, which JavaScript is really good at doing automatically.

Anyway, this fixes the bug, restoring the proper type.

(Also this tidies up ownership of the key strings in the actor map. The key is now owned by the ActorContainer.)
kentonv added a commit that referenced this pull request Apr 29, 2025
Worker::Lock lock(*service.worker, asyncLock);
auto newActor = kj::refcounted<Worker::Actor>(
-    *service.worker, nullptr, kj::mv(id), true, kj::mv(makeActorCache),
+    *service.worker, nullptr, kj::str(id), true, kj::mv(makeActorCache),
Member


Prior to this change, id was a Worker::Actor::Id, but this downgraded it to a string. Oops. This is visible in ctx.id in JavaScript having the wrong type (a string, rather than a DO ID), but it seems nobody ever noticed. Fixed in #4041.
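To illustrate why the downgrade went unnoticed: a real `DurableObjectId` stringifies to its hex form via `toString()`, so template-string usage looks identical whether `ctx.id` is the ID object or a plain string. `DurableObjectIdStub` below is a hypothetical stand-in, not the real class:

```javascript
// Hypothetical stand-in for DurableObjectId; real IDs also stringify
// to their hex representation via toString().
class DurableObjectIdStub {
  constructor(hex) { this.hex = hex; }
  toString() { return this.hex; }
  equals(other) { return this.hex === String(other); }
}

const asId = new DurableObjectIdStub("abc123"); // the correct type
const asString = "abc123";                      // what the bug produced

// Stringifying hides the difference, which is why nobody noticed...
console.log(`${asId}` === `${asString}`); // true

// ...but anything beyond stringification breaks on the plain string.
console.log(typeof asId.equals);     // "function"
console.log(typeof asString.equals); // "undefined"
```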

kentonv added a commit that referenced this pull request Apr 29, 2025
kentonv added a commit that referenced this pull request May 7, 2025
threepointone added a commit that referenced this pull request Feb 18, 2026
Previously, all alarms were stored in a single global in-memory SQLite
database, meaning they were lost on process restart. This made it
impossible to test alarm resiliency scenarios.

Move alarm scheduler ownership from Server into ActorNamespace, so each
DO namespace gets its own AlarmScheduler backed by metadata.sqlite in
the namespace's storage directory. On-disk namespaces get persistent
alarms; in-memory namespaces get in-memory alarms. No new configuration
is needed -- alarm storage follows the existing durableObjectStorage
setting.

Ref: #605 (comment)
Co-authored-by: Cursor <cursoragent@cursor.com>
threepointone added a commit that referenced this pull request Feb 27, 2026
kentonv pushed a commit that referenced this pull request Mar 15, 2026
kentonv pushed a commit that referenced this pull request Mar 18, 2026