Slack latency: 3× smaller worker script, append-only webhook ack, eyes at the routing hop, pre-warmed hosts #1494
…e worker script

The deployed os-prd script measured 89.1 MB: alchemy's noBundle upload globs everything under dist/server, which includes 50 MB of sourcemaps and browser-only modules (web workers, wasm) that the Vite SSR build emits but the server graph never imports. Cold Durable Object isolates pay for total script size, and the Slack webhook path chains 3-4 DOs, measured in prd as ~6.5s + 3.0s + 1.4s of sequential DO cold starts (14s from message to the eyes reaction, 5ms CPU).

prune-server-bundle.ts runs between build and asset preupload. It deletes every module unreachable from the entrypoint via import/new URL literals, plus all sourcemaps except the entrypoint's own (small, and the one Cloudflare can use to symbolicate worker stack traces; chunk maps are mostly browser code and pure ballast inside a worker script).

Validated against the extracted prd bundle: keeps exactly the 186 live modules, deletes the 3 browser-only ones and the chunk maps (~52 MB).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
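The reachability rule described in this commit message can be sketched as a graph walk. This is a minimal illustration, not the contents of prune-server-bundle.ts: the in-memory `ModuleGraph` type and the names `reachableFrom` and `planPrune` are assumptions made for the example, and the real script presumably builds its graph by reading dist/server from disk.

```typescript
// Hypothetical in-memory model: module path -> paths it references
// (via import / new URL literals). Not the real script's data structure.
type ModuleGraph = Map<string, string[]>;

/** Collect every module reachable from the entrypoint with an iterative DFS. */
function reachableFrom(entry: string, graph: ModuleGraph): Set<string> {
  const seen = new Set<string>();
  const stack: string[] = [entry];
  while (stack.length > 0) {
    const current = stack.pop() as string;
    if (seen.has(current)) continue;
    seen.add(current);
    for (const dep of graph.get(current) ?? []) {
      if (!seen.has(dep)) stack.push(dep);
    }
  }
  return seen;
}

/**
 * Split files into keep/remove lists: reachable modules stay, and the only
 * sourcemap kept is the entrypoint's own (assumed to be named `<entry>.map`).
 */
function planPrune(
  entry: string,
  files: string[],
  graph: ModuleGraph,
): { keep: string[]; remove: string[] } {
  const live = reachableFrom(entry, graph);
  const keep: string[] = [];
  const remove: string[] = [];
  for (const file of files) {
    const isMap = file.endsWith(".map");
    const keepFile = isMap ? file === `${entry}.map` : live.has(file);
    (keepFile ? keep : remove).push(file);
  }
  return { keep, remove };
}
```

Separating the plan from the deletion keeps the step easy to validate against an extracted bundle before anything is removed, which is the kind of check the commit message reports.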
…pre-warmed hosts

Three serial cold-start legs stood between a Slack message and any visible response in prd:

1. The webhook handler awaited SlackIntegrationDO.initialize() before appending, serializing a cold DO ahead of Slack's 200 (observed at 8s, with Slack's 3s retry queueing behind the same gate). Now only the durable append gates the response; initialize + catch-up run in waitUntil. Order-independent: existing integrations already have their subscription on the stream, and a new integration picks the webhook up via replay once the subscription lands.
2. The eyes reaction lived in the slack-agent processor, three DO cold starts downstream. The router now reports routed webhooks to the host (acknowledgeRoutedWebhook) and SlackIntegrationDO adds the reaction immediately, gated by the same payload-only rules the slack-agent applies (no bot messages, no reaction events, no bot-user actions). The slack-agent still adds it on catch-up; already_reacted makes the pair idempotent.
3. A newly routed thread stream cold-started serially: stream DO, then its dial woke the slack-agent host, then the agent host. The router now pre-warms both hosts (prewarmRoutedStreamHosts) concurrently with the bootstrap append. Everything either side appends is idempotency-keyed and order-independent.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
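The first item, where only the durable append gates Slack's 200, can be sketched as follows. This is an illustration under stated assumptions rather than the handler in this PR: `Ctx` is a minimal stand-in for the Workers execution context, and `Deps`, `appendToStream`, and `initializeIntegration` are hypothetical names introduced for the example.

```typescript
// Minimal stand-in for the Workers execution context (only waitUntil is used).
interface Ctx {
  waitUntil(promise: Promise<unknown>): void;
}

// Hypothetical dependencies; the real handler's wiring is not shown in the PR text.
interface Deps {
  appendToStream(payload: unknown): Promise<void>; // durable append
  initializeIntegration(): Promise<void>;          // may hit a cold DO
}

async function handleSlackWebhook(
  payload: unknown,
  ctx: Ctx,
  deps: Deps,
): Promise<{ ok: true }> {
  // Only the durable append is awaited before responding.
  await deps.appendToStream(payload);

  // Initialization continues in the background. Errors are logged so a
  // rejected promise cannot surface as an unhandled rejection.
  ctx.waitUntil(
    deps.initializeIntegration().catch((err: unknown) => {
      console.error("initialize failed", err);
    }),
  );

  return { ok: true };
}
```

The ordering matters because of the argument made in the commit message: the append is the durable fact, and a subscription that lands later picks the webhook up via replay.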
Verification record
Script size (the cold-start tax every DO isolate pays):
E2E against
Dev smoke:
After this deploys to prd, it is worth re-measuring message→👀 on a cold thread (was 14s; the eyes now depend only on the first hop, and each cold DO start should drop from seconds to sub-second).
🤖 Generated with Claude Code
Measured latency traces on this PR's preview ( Method: real root message posted to
Reproduced twice cold (second run: eyes at ≈+1.5s after webhook commit, route-configured at +1.28s) — consistent. Observations:
🤖 Generated with Claude Code
The problem
A Slack message in prd took ~14s to get the 👀 reaction and ~20s to get a reply (example: iterate project, thread ts-1781170058-112929). Hop-by-hop, from the message's Slack ts:

Two multiplying causes: the deployed script was 89.1 MB (50 MB of sourcemaps plus browser-only modules, uploaded as worker modules by alchemy's noBundle glob over dist/server; the live server graph is ~34 MB, the entrypoint 1.75 MB), and every cold DO isolate loads all of it, times 3–4 distinct DOs chained serially on the webhook path. The warm path was always fine (webhook 1–6ms, appends 20–100ms): this is cold-start tax, not stream-architecture tax.

The fixes (no change to the streams/processors idea)
1. 3× smaller worker script. prune-server-bundle.ts (runs between build and asset preupload) deletes every dist/server module unreachable from the entrypoint via import/new URL literals (browser web workers + their wasm that the SSR build emits), plus all sourcemaps except the entrypoint's own (small, and the one Cloudflare can symbolicate worker stack traces with; chunk maps are browser code and pure ballast inside a worker script). Validated against the extracted prd bundle: keeps exactly the 186-module live graph, deletes the 3 browser-only modules + chunk maps.
2. Append-only webhook ack. The webhook handler no longer awaits SlackIntegrationDO.initialize() before responding: only the durable append gates the 200; initialize + catch-up moved to waitUntil. Order-independent (existing integrations have their subscription on the stream; new ones pick the webhook up via replay). Stops the >3s Slack retry storm.
3. Eyes at the routing hop. The router reports routed webhooks to the host (acknowledgeRoutedWebhook) and SlackIntegrationDO adds the reaction immediately, one hop from ingress instead of three cold DO hops downstream, gated by the same payload-only rules the slack-agent applies (no bot messages, no reaction events, no bot-user actions). slack-agent still adds it on catch-up; already_reacted makes the pair idempotent.
4. Pre-warmed hosts (prewarmRoutedStreamHosts): for a newly routed thread, the SLACK_AGENT and AGENT host DOs initialize() concurrently with the bootstrap append instead of serially after each dial. Everything either side appends is idempotency-keyed and order-independent (the anchor-skip recovery from "agent: recover triggering inputs skipped by the side-effect anchor" #1481 covers trigger ordering).

Measured
Dev-stage deploys of this branch (os-dev-jonas):
kept 186 modules, deleted 3 unreachable modules + 180 sourcemaps (55.0 MB)
Expected effect: each cold DO instantiation drops from multi-second to sub-second, and the eyes ack stops depending on the deepest part of the chain. Worth re-measuring the full message→eyes timing in prd after this deploys.
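Part of the expected effect on a newly routed thread comes from starting host initialization alongside the bootstrap append instead of after it. The sketch below illustrates that concurrency pattern only. `bootstrapWithPrewarm` and `Task` are hypothetical names, and swallowing pre-warm failures as best effort is an assumption of this example, not something the PR text states.

```typescript
type Task = () => Promise<void>;

/**
 * Start every pre-warm and the bootstrap append together, then wait for all.
 * Assumption: a failed pre-warm is logged and ignored, while a failed append
 * rejects, since the append is the part that must be durable.
 */
async function bootstrapWithPrewarm(append: Task, prewarmHosts: Task[]): Promise<void> {
  const warmups = prewarmHosts.map((warm) =>
    warm().catch((err: unknown) => {
      console.warn("prewarm failed (best effort)", err);
    }),
  );
  await Promise.all([append(), ...warmups]);
}
```

With serial starts the cold-start costs add up; started together, the wall-clock cost of this step is bounded by the slowest of the three.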
Trade-offs / notes
from, import(), export from, new URL) stays. The unreachable set on the real bundle is exactly the browser-only web workers + wasm.
🤖 Generated with Claude Code
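The four reference forms named in the trade-offs note (from, import(), export from, new URL) could be collected with a conservative text scan. This is a sketch under assumptions: the regular expressions and the name `extractSpecifiers` are illustrative and are not taken from prune-server-bundle.ts, which may parse modules differently.

```typescript
// One pattern per reference form. A text scan like this can over-match
// (e.g. inside comments), which only errs toward keeping a file.
const SPECIFIER_PATTERNS: RegExp[] = [
  /\bfrom\s*["']([^"']+)["']/,              // import ... from "x" / export ... from "x"
  /\bimport\s*\(\s*["']([^"']+)["']\s*\)/,  // dynamic import("x")
  /\bimport\s*["']([^"']+)["']/,            // side-effect import "x"
  /\bnew\s+URL\(\s*["']([^"']+)["']/,       // new URL("x", import.meta.url)
];

/** Return the sorted, de-duplicated relative specifiers found in a module's source. */
function extractSpecifiers(source: string): string[] {
  const found = new Set<string>();
  for (const pattern of SPECIFIER_PATTERNS) {
    // Fresh global regex per scan so no lastIndex state leaks between calls.
    const re = new RegExp(pattern.source, "g");
    let match: RegExpExecArray | null;
    while ((match = re.exec(source)) !== null) {
      const spec = match[1];
      // Only relative specifiers can point at files inside the bundle directory.
      if (spec !== undefined && spec.startsWith(".")) found.add(spec);
    }
  }
  return Array.from(found).sort();
}
```

A specifier built at runtime from non-literal pieces would be invisible to a scan like this, which is why validating the unreachable set against a real bundle, as the note describes, is a useful safeguard.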
Note
Medium Risk
Changes production Slack webhook timing, adds best-effort Slack API calls on the routing path, and alters deploy artifacts via bundle pruning; behavior is designed to be idempotent but affects a critical user-visible path.
Overview
Cuts Slack cold-path latency by shrinking the deployed worker and parallelizing work on the webhook path.
Deploy: Adds prune-server-bundle to the Alchemy build (after Vite, before asset preupload). It strips unreachable dist/server modules and most sourcemaps so each cold Durable Object isolate loads a much smaller script.
Webhook ingress: The Slack webhook handler now returns { ok: true } after the durable stream append only; SlackIntegrationDO.initialize() / ensureReady() run in waitUntil, avoiding >3s acks and Slack retries.
Routing hop: SlackProcessor gains optional acknowledgeRoutedWebhook and prewarmRoutedStreamHosts. The integration DO adds the 👀 reaction at route time (via eyesReactionTargetFromWebhookPayload + reactions.add) and pre-initializes SLACK_AGENT and AGENT DOs in parallel with new-thread bootstrap. Downstream slack-agent behavior stays idempotent (already_reacted).
Reviewed by Cursor Bugbot for commit 8cf05b1.
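The route-time reaction combines a payload-only gate with an idempotent add. The sketch below illustrates both under assumptions: `SlackEvent` is a simplified subset of a Slack event, `eyesTargetSketch` and `addEyesIdempotent` are hypothetical names (this is not the PR's eyesReactionTargetFromWebhookPayload), and the `reactionsAdd` parameter stands in for a call to Slack's reactions.add method.

```typescript
// Simplified subset of a Slack Events API event; only fields used below.
interface SlackEvent {
  type: string; // e.g. "message", "app_mention", "reaction_added"
  channel?: string;
  ts?: string;
  user?: string;
  bot_id?: string;
  subtype?: string;
}

interface ReactionTarget {
  channel: string;
  timestamp: string;
}

type ReactionsAdd = (
  args: ReactionTarget & { name: string },
) => Promise<{ ok: boolean; error?: string }>;

/** Hypothetical payload-only gate: returns where to react, or null to skip. */
function eyesTargetSketch(event: SlackEvent, botUserId: string): ReactionTarget | null {
  if (event.type.startsWith("reaction_")) return null; // no reaction events
  if (event.bot_id !== undefined || event.subtype === "bot_message") return null; // no bot messages
  if (event.user === botUserId) return null; // no bot-user actions
  if (event.channel === undefined || event.ts === undefined) return null;
  return { channel: event.channel, timestamp: event.ts };
}

/** Treat Slack's already_reacted error as success so two callers can both try. */
async function addEyesIdempotent(target: ReactionTarget, reactionsAdd: ReactionsAdd): Promise<boolean> {
  const res = await reactionsAdd({ ...target, name: "eyes" });
  return res.ok || res.error === "already_reacted";
}
```

Because already_reacted counts as success, the integration DO and the slack-agent can both attempt the reaction in either order without coordinating.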
Environment Config Lease
No active environment config lease.
OS
Status: released
Commit: 8cf05b1
Preview: https://os.iterate-preview-3.com
Summary: Preview app released.
Workflow run
Updated: 2026-06-11T12:53:44.296Z
Semaphore
Status: released
Commit: 8cf05b1
Preview: https://semaphore.iterate-preview-3.com
Summary: Preview app released.
Workflow run
Updated: 2026-06-11T12:53:35.576Z