Declarative workflow primitives for Cloudflare Workflows.
```sh
npm install workflais
```
| | Native CF Workflows | workflais |
|---|---|---|
| Step definition | `step.do("name", { retries: { limit: 3, delay: "10s", backoff: "exponential" }, timeout: "30s" }, async () => {...})` | `step("name", fn).retry(3).timeout("30s")` |
| Result chaining | Manual variables between steps | Automatic `ctx.prev` pipeline |
| Saga compensation | Manual implementation | `.compensate(fn)` with automatic LIFO rollback via `step.do("⟲ name")` |
| Parallel execution | Manual self-spawn + `waitForEvent` orchestration | `parallel(step1, step2, step3)` |
| `waitForEvent` | Imperative `step.waitForEvent()` call | `waitForEvent("name", opts)` in pipeline with `ctx.prev` |
| Compile-time validation | Runtime errors only | Duplicate step names, step count limits, timeout limits, event type validation |
| Error types | Generic errors | `NonRetryableError`, `TimeoutTooLongError`, `DuplicateStepNameError`, etc. |
```ts
import { step, compile, execute } from "workflais";
import { WorkflowEntrypoint, WorkflowStep, WorkflowEvent } from "cloudflare:workers";
import type { WorkflowStep as WfStep, WorkflowEvent as WfEvent } from "workflais";

export class MyWorkflow extends WorkflowEntrypoint {
  async run(event: WorkflowEvent, cfStep: WorkflowStep) {
    const plan = compile([
      step("fetch-data", async (ctx) => {
        return { userId: ctx.event.payload.userId, name: "Alice" };
      }),
      step("process", async (ctx) => {
        const data = ctx.prev; // automatic chaining
        return { ...data, processed: true };
      })
        .retry(3, "1 minute")
        .timeout("30 seconds"),
      step("save", async (ctx) => {
        return { ...ctx.prev, saved: true };
      }).compensate(async () => {
        // runs automatically on failure, wrapped in step.do for CF durability
      }),
    ]);
    return execute(plan, cfStep as unknown as WfStep, event as unknown as WfEvent, this.env);
  }
}
```

```ts
step(name, fn)                          // durable step
  .retry(limit, delay?)                 // retry config (default: exponential backoff)
  .timeout(duration)                    // step timeout (max 30 min)
  .compensate(fn)                       // saga rollback handler

parallel(step1, step2, ...)             // fan-out/fan-in execution
waitForEvent(name, { type, timeout })   // pause for external event

compile(nodes)                          // validate + build execution plan
execute(plan, step, event, env)         // run against CF Workflows runtime
```

Every step callback receives `ctx`:

- `ctx.prev` — previous step's return value (or `undefined` for the first step)
- `ctx.event` — workflow event (frozen, immutable)
- `ctx.env` — CF bindings
After `parallel()`, `ctx.prev` is a tuple of results in declaration order.
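The chaining contract is small enough to model in plain TypeScript. The sketch below illustrates the semantics only (it is not workflais internals, and `runChain` is a hypothetical name):

```typescript
// A minimal model of workflais result chaining: each step receives the
// previous step's return value as ctx.prev. Illustration only.
type Ctx = { prev: unknown };
type StepFn = (ctx: Ctx) => Promise<unknown>;

// Runs steps in order, feeding each step the previous step's return value.
async function runChain(steps: StepFn[]): Promise<unknown> {
  let prev: unknown = undefined; // first step sees ctx.prev === undefined
  for (const fn of steps) {
    prev = await fn({ prev });
  }
  return prev;
}

runChain([
  async () => ({ userId: "u1" }),
  async (ctx) => ({ ...(ctx.prev as object), processed: true }),
]).then((r) => console.log(JSON.stringify(r))); // {"userId":"u1","processed":true}
```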
Each example is a standalone, deploy-ready CF Workers project. Pick one and run:

```sh
cd examples/ecommerce-checkout
npm install
npx wrangler dev
```

Then test it:

```sh
# Start a checkout workflow
curl -X POST http://localhost:8787/checkout \
  -H "Content-Type: application/json" \
  -d '{"cartId": "cart-42"}'

# Check status
curl http://localhost:8787/status?id=<instanceId>
```

| Example | Pattern | Test command |
|---|---|---|
| `ecommerce-checkout` | Cart → Payment → Invoice | `curl -X POST localhost:8787/checkout -d '{"cartId":"42"}'` |
| `user-onboarding` | Saga compensation | `curl -X POST localhost:8787/onboard -d '{"email":"a@b.com"}'` |
| `image-tagging` | Human-in-the-loop | `curl -X POST localhost:8787/upload -d '{"imageKey":"photo.jpg"}'` |
| `parallel-fan-out` | Parallel fan-out/fan-in (child DO isolation) | `curl -X POST localhost:8787/notify -d '{"userId":"u1","message":"hi"}'` |
All examples include `console.log` at every step — use `npx wrangler tail` to see execution flow in real time. Hit `GET /` on any example to see available endpoints.
`parallel-fan-out` is the only example that demonstrates child workflow DO isolation. Each `parallel()` branch spawns a separate Durable Object with its own 128 MB memory, CPU budget, and retry policy. See the Resource Isolation Problem section for why this matters.
CF Workflows runs each instance inside a single Durable Object. A DO has hard limits:
| Resource | Limit |
|---|---|
| Memory | 128 MB per DO |
| CPU | 5 min per invocation |
| Retry budget | Shared across the entire instance |
If you run three heavy steps with `Promise.all` inside one DO, you get:

```
┌─ Single Durable Object (128 MB shared) ──────────────┐
│   Promise.all([                                       │
│     mlInference(),     ← 80 MB   ← 3 min CPU          │
│     imageProcess(),    ← 60 MB   ← 2 min CPU          │
│     videoTranscode(),  ← 50 MB   ← 4 min CPU          │
│   ])                                                  │
│   Total: 190 MB  → OOM CRASH                          │
│   Total CPU: 9 min → TIMEOUT                          │
│   If imageProcess fails → all three die               │
└───────────────────────────────────────────────────────┘
```
The problem is threefold:

- Memory — All branches share 128 MB. Two 80 MB allocations = OOM crash, killing the entire workflow.
- CPU — All branches share 5 min. Three 2-minute tasks = timeout, even though each one is well under the limit.
- Blast radius — One branch throwing an unhandled error kills `Promise.all`, terminating siblings mid-execution. No partial results, no independent retry.
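The blast-radius failure mode is plain `Promise.all` behavior and can be reproduced outside CF entirely. A standalone sketch (the task names are hypothetical):

```typescript
// One rejected branch rejects the whole Promise.all group, so the
// successful siblings' values are lost; retrying means rerunning all three.
const after = <T>(ms: number, fn: () => T) =>
  new Promise<T>((res, rej) =>
    setTimeout(() => {
      try { res(fn()); } catch (e) { rej(e); }
    }, ms),
  );

async function demo(): Promise<string> {
  try {
    await Promise.all([
      after(10, () => "ml done"),
      after(5, () => { throw new Error("img failed"); }),
      after(10, () => "vid done"),
    ]);
    return "all succeeded";
  } catch (e) {
    // "ml done" and "vid done" are unreachable here despite having succeeded
    return (e as Error).message;
  }
}

demo().then((m) => console.log(m)); // img failed
```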
`parallel()` compiles to the self-spawn pattern — each branch becomes a separate workflow instance running in its own DO:

```
Parent DO                       Child DO #1         Child DO #2         Child DO #3
──────────                      ───────────         ───────────         ───────────
step.do("⊕ spawn") ─────────►   128 MB own memory   128 MB own memory   128 MB own memory
  binding.create(child1)        5 min own CPU       5 min own CPU       5 min own CPU
  binding.create(child2)        own retry budget    own retry budget    own retry budget
  binding.create(child3)
                                step.do("ml", fn)   step.do("img", fn)  step.do("vid", fn)
waitForEvent("ml:cb")  ◄─ $0 ─  sendEvent(result)
waitForEvent("img:cb") ◄─ $0 ─                      sendEvent(result)
waitForEvent("vid:cb") ◄─ $0 ─                                          sendEvent(result)

ctx.prev = [mlResult, imgResult, vidResult]   // tuple in declaration order
```
| | `Promise.all` (single DO) | `parallel()` (child DOs) |
|---|---|---|
| Memory | 128 MB shared | 128 MB each |
| CPU | 5 min shared | 5 min each |
| Retry | All-or-nothing | Per-branch |
| Failure | One kills all | Isolated |
| Parent cost while waiting | N/A | $0 (hibernated) |
```ts
// workflais — each branch gets its own DO
parallel(
  step("ml-inference", mlFn).retry(5, "exponential").timeout("25m"),
  step("image-process", imgFn).retry(3).timeout("10m"),
  step("video-transcode", vidFn).retry(2).timeout("20m"),
)
```

The parent spawns all children in a single `step.do`, then hibernates via `waitForEvent`. Zero CPU, zero memory, zero cost. When all children report back, the parent wakes up with `ctx.prev = [result1, result2, result3]`.

If any branch fails after the parallel group completes, workflais runs `.compensate()` for every child in the group — each compensation wrapped in its own `step.do` for CF-durable retry.
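The LIFO rollback order itself can be simulated in a few lines. This is a model of the saga semantics under hypothetical step names, not workflais source (the real library wraps each compensation in `step.do("⟲ name")` for durability):

```typescript
// Simulates saga compensation: when a later step fails, completed steps'
// compensations run in reverse (LIFO) completion order.
type SagaStep = { name: string; run: () => void; compensate?: () => void };

function runSaga(steps: SagaStep[]): string[] {
  const done: SagaStep[] = [];
  const log: string[] = [];
  try {
    for (const s of steps) {
      s.run();
      log.push(s.name);
      done.push(s);
    }
  } catch {
    // roll back in reverse completion order
    for (const s of done.reverse()) {
      if (s.compensate) {
        s.compensate();
        log.push(`⟲ ${s.name}`);
      }
    }
  }
  return log;
}

const log = runSaga([
  { name: "reserve", run: () => {}, compensate: () => {} },
  { name: "charge", run: () => {}, compensate: () => {} },
  { name: "ship", run: () => { throw new Error("carrier down"); } },
]);
console.log(log.join(" → ")); // reserve → charge → ⟲ charge → ⟲ reserve
```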
The `parallel-fan-out` example is deployed and verified on Cloudflare Workers.
Result:
```json
{
  "status": "complete",
  "output": {
    "notified": true,
    "channelCount": 3,
    "results": [
      { "channel": "email", "sent": true, "to": "user@example.com" },
      { "channel": "sms", "sent": true, "to": "+1234567890" },
      { "channel": "crm", "updated": true, "userId": "u1" }
    ]
  }
}
```

| Metric | HTTP Trigger | Workflow DO (parent) |
|---|---|---|
| `executionModel` | `stateless` | `stateless` |
| `wallTime` | 517ms | ~3s (complete) |
| `cpuTime` | 0 | 0 |
| `outcome` | `ok` | `ok` |
Key observations:

- Parent hibernation confirmed — `cpuTime: 0` while `wallTime: 300168ms` (5 min) on a stale instance proves the parent DO genuinely hibernates at $0 cost during `waitForEvent`
- Each step is a separate DO invocation — `executionModel: "stateless"` on every log entry confirms CF treats each step as an independent call
- Results arrive in declaration order — the `[email, sms, crm]` tuple matches the `parallel()` child order, regardless of which child finishes first
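The declaration-order property rests on standard `Promise.all` behavior: results follow input order, not completion order. A standalone check, with no CF runtime involved:

```typescript
// Promise.all resolves to results in input (declaration) order even when
// branches finish out of order, which is what the ctx.prev tuple relies on.
const delayed = <T>(value: T, ms: number) =>
  new Promise<T>((res) => setTimeout(() => res(value), ms));

Promise.all([
  delayed("email", 30), // slowest, declared first
  delayed("sms", 10),
  delayed("crm", 1),    // fastest, declared last
]).then((tuple) => console.log(JSON.stringify(tuple))); // ["email","sms","crm"]
```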
```
step("a", fn).retry(3)  →  compile([...])  →  execute(plan, cfStep, event, env)
        DSL                  Validation         CF step.do() / waitForEvent()
```
- DSL — Declarative step definitions with chainable config
- Compiler — Validates names, limits, timeouts; builds execution plan
- Runtime — Translates plan into CF Workflows API calls with saga compensation