workflais

Declarative workflow primitives for Cloudflare Workflows.

npm install workflais

Native CF Workflows vs workflais

| Feature | Native CF Workflows | workflais |
| --- | --- | --- |
| Step definition | `step.do("name", { retries: { limit: 3, delay: "10s", backoff: "exponential" }, timeout: "30s" }, async () => {...})` | `step("name", fn).retry(3).timeout("30s")` |
| Result chaining | Manual variables between steps | Automatic `ctx.prev` pipeline |
| Saga compensation | Manual implementation | `.compensate(fn)` with automatic LIFO rollback via `step.do("⟲ name")` |
| Parallel execution | Manual self-spawn + waitForEvent orchestration | `parallel(step1, step2, step3)` |
| waitForEvent | Imperative `step.waitForEvent()` call | `waitForEvent("name", opts)` in pipeline with `ctx.prev` |
| Compile-time validation | Runtime errors only | Duplicate step names, step count limits, timeout limits, event type validation |
| Error types | Generic errors | `NonRetryableError`, `TimeoutTooLongError`, `DuplicateStepNameError`, etc. |

Quick Start

import { step, compile, execute } from "workflais";
import { WorkflowEntrypoint, WorkflowStep, WorkflowEvent } from "cloudflare:workers";
import type { WorkflowStep as WfStep, WorkflowEvent as WfEvent } from "workflais";

export class MyWorkflow extends WorkflowEntrypoint {
  async run(event: WorkflowEvent, cfStep: WorkflowStep) {
    const plan = compile([
      step("fetch-data", async (ctx) => {
        return { userId: ctx.event.payload.userId, name: "Alice" };
      }),

      step("process", async (ctx) => {
        const data = ctx.prev; // automatic chaining
        return { ...data, processed: true };
      })
        .retry(3, "1 minute")
        .timeout("30 seconds"),

      step("save", async (ctx) => {
        return { ...ctx.prev, saved: true };
      }).compensate(async () => {
        // runs automatically on failure, wrapped in step.do for CF durability
      }),
    ]);

    return execute(plan, cfStep as unknown as WfStep, event as unknown as WfEvent, this.env);
  }
}

API

DSL

step(name, fn)                    // durable step
  .retry(limit, delay?)           // retry config (default: exponential backoff)
  .timeout(duration)              // step timeout (max 30 min)
  .compensate(fn)                 // saga rollback handler

parallel(step1, step2, ...)       // fan-out/fan-in execution
waitForEvent(name, { type, timeout })  // pause for external event

compile(nodes)                    // validate + build execution plan
execute(plan, step, event, env)   // run against CF Workflows runtime
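
To make the relationship between the chainable DSL and the native options object concrete, here is a minimal sketch of how a fluent builder could accumulate `retry` and `timeout` settings into the `{ retries, timeout }` shape that native `step.do()` takes. It is a model under stated assumptions, not the workflais source: `sketchStep`, `SketchStep`, and `toNativeConfig` are hypothetical names, and the default delay of "10 seconds" is an assumption (the reference above only states that backoff defaults to exponential).

```typescript
// Illustrative model only: `sketchStep` and `SketchStep` are hypothetical
// names, not part of the workflais API.
type Backoff = "constant" | "linear" | "exponential";

// Shape of the options object that native step.do() accepts.
interface NativeStepConfig {
  retries?: { limit: number; delay: string; backoff: Backoff };
  timeout?: string;
}

class SketchStep<T> {
  readonly name: string;
  readonly fn: () => Promise<T>;
  compensation?: () => Promise<void>;
  private config: NativeStepConfig = {};

  constructor(name: string, fn: () => Promise<T>) {
    this.name = name;
    this.fn = fn;
  }

  // Assumption: delay defaults to "10 seconds"; backoff is always exponential.
  retry(limit: number, delay: string = "10 seconds"): this {
    this.config.retries = { limit, delay, backoff: "exponential" };
    return this;
  }

  timeout(duration: string): this {
    this.config.timeout = duration;
    return this;
  }

  compensate(fn: () => Promise<void>): this {
    this.compensation = fn;
    return this;
  }

  // Returns a copy so callers cannot mutate the builder's internal state.
  toNativeConfig(): NativeStepConfig {
    const copy: NativeStepConfig = {};
    if (this.config.retries) copy.retries = { ...this.config.retries };
    if (this.config.timeout !== undefined) copy.timeout = this.config.timeout;
    return copy;
  }
}

function sketchStep<T>(name: string, fn: () => Promise<T>): SketchStep<T> {
  return new SketchStep(name, fn);
}

const processStep = sketchStep("process", async () => ({ processed: true }))
  .retry(3, "1 minute")
  .timeout("30 seconds");

console.log(JSON.stringify(processStep.toNativeConfig()));
```

Each chained method returns `this`, which is what lets the calls be composed in any order before the step is handed to a compiler.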

Context

Every step callback receives ctx:

ctx.prev — previous step's return value (or undefined for the first step)
ctx.event — workflow event (frozen, immutable)
ctx.env — CF bindings

After parallel(), ctx.prev is a tuple of results in declaration order.
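
The chaining rules above can be modelled in a few lines. The sketch below is a simplified, synchronous stand-in for the real runtime (actual steps are async and run through `step.do()`, and a real parallel group runs its branches in separate workflow instances). `runPipeline`, `SketchContext`, and `SketchNode` are hypothetical names introduced only for illustration.

```typescript
// Hypothetical, synchronous model of ctx threading; not the workflais runtime.
interface SketchContext<E, Env> {
  prev: unknown;      // previous node's return value (undefined for the first)
  event: Readonly<E>; // frozen workflow event
  env: Env;           // bindings
}

type SketchFn<E, Env> = (ctx: SketchContext<E, Env>) => unknown;

// A node is a single step, or an array of steps standing in for parallel().
type SketchNode<E, Env> = SketchFn<E, Env> | SketchFn<E, Env>[];

function runPipeline<E extends object, Env>(
  nodes: SketchNode<E, Env>[],
  event: E,
  env: Env,
): unknown {
  const frozenEvent = Object.freeze({ ...event });
  let prev: unknown = undefined;
  for (const node of nodes) {
    const ctx: SketchContext<E, Env> = { prev, event: frozenEvent, env };
    // A group yields a tuple of results in declaration order.
    prev = Array.isArray(node) ? node.map((fn) => fn(ctx)) : node(ctx);
  }
  return prev;
}

type DemoEvent = { userId: string };
type DemoEnv = Record<string, never>;
type DemoCtx = SketchContext<DemoEvent, DemoEnv>;

const demoNodes: SketchNode<DemoEvent, DemoEnv>[] = [
  (ctx: DemoCtx) => ({ userId: ctx.event.userId }),
  [
    (ctx: DemoCtx) => `email:${(ctx.prev as DemoEvent).userId}`,
    (ctx: DemoCtx) => `sms:${(ctx.prev as DemoEvent).userId}`,
  ],
  (ctx: DemoCtx) => ({ sent: ctx.prev }),
];

const demoResult = runPipeline<DemoEvent, DemoEnv>(demoNodes, { userId: "u1" }, {});
console.log(JSON.stringify(demoResult)); // {"sent":["email:u1","sms:u1"]}
```

Every member of a group sees the same `ctx.prev` (the output of the node before the group), and the node after the group receives the whole tuple.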

Examples

Each example is a standalone, deploy-ready CF Workers project. Pick one and run:

cd examples/ecommerce-checkout
npm install
npx wrangler dev

Then test it:

# Start a checkout workflow
curl -X POST http://localhost:8787/checkout \
  -H "Content-Type: application/json" \
  -d '{"cartId": "cart-42"}'

# Check status
curl "http://localhost:8787/status?id=<instanceId>"

| Example | Pattern | Test command |
| --- | --- | --- |
| `ecommerce-checkout` | Cart → Payment → Invoice | `curl -X POST localhost:8787/checkout -d '{"cartId":"42"}'` |
| `user-onboarding` | Saga compensation | `curl -X POST localhost:8787/onboard -d '{"email":"a@b.com"}'` |
| `image-tagging` | Human-in-the-loop | `curl -X POST localhost:8787/upload -d '{"imageKey":"photo.jpg"}'` |
| `parallel-fan-out` | Parallel fan-out/fan-in (child DO isolation) | `curl -X POST localhost:8787/notify -d '{"userId":"u1","message":"hi"}'` |

All examples include console.log at every step — use npx wrangler tail to see execution flow in real time. Hit GET / on any example to see available endpoints.

parallel-fan-out is the only example that demonstrates child workflow DO isolation. Each parallel() branch spawns a separate Durable Object with its own 128 MB memory, CPU budget, and retry policy. See the Resource Isolation Problem section for why this matters.

Why Child Workflows? The Resource Isolation Problem

CF Workflows runs each instance inside a single Durable Object. A DO has hard limits:

| Resource | Limit |
| --- | --- |
| Memory | 128 MB per DO |
| CPU | 5 min per invocation |
| Retry budget | Shared across the entire instance |

If you run three heavy steps with Promise.all inside one DO, you get:

┌─ Single Durable Object (128 MB shared) ──────────────┐
│  Promise.all([                                        │
│    mlInference(),    ← 80 MB   ← 3 min CPU           │
│    imageProcess(),   ← 60 MB   ← 2 min CPU           │
│    videoTranscode(), ← 50 MB   ← 4 min CPU           │
│  ])                                                   │
│  Total: 190 MB → OOM CRASH                            │
│  Total CPU: 9 min → TIMEOUT                           │
│  If imageProcess fails → all three die                │
└───────────────────────────────────────────────────────┘

The problem is threefold:

Memory — All branches share 128 MB. Two 80 MB allocations = OOM crash, killing the entire workflow.
CPU — All branches share 5 min. Three 2-minute tasks = timeout, even though each one is well under the limit.
Blast radius — One branch throwing an unhandled error kills Promise.all, terminating siblings mid-execution. No partial results, no independent retry.
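
The arithmetic behind the memory and CPU points can be checked directly. The sketch below uses the per-branch figures from the diagram and the limits from the table, and treats usage as additive (the worst case, where all branches peak together). `sharedBudget` and the `Branch` type are illustrative names only.

```typescript
// Limits from the DO limits table; branch figures from the diagram.
const MEMORY_LIMIT_MB = 128;
const CPU_LIMIT_MIN = 5;

interface Branch {
  name: string;
  memoryMb: number;
  cpuMin: number;
}

const branches: Branch[] = [
  { name: "mlInference", memoryMb: 80, cpuMin: 3 },
  { name: "imageProcess", memoryMb: 60, cpuMin: 2 },
  { name: "videoTranscode", memoryMb: 50, cpuMin: 4 },
];

// Sums the demand of every branch that shares one Durable Object.
function sharedBudget(group: Branch[]) {
  const memoryMb = group.reduce((sum, b) => sum + b.memoryMb, 0);
  const cpuMin = group.reduce((sum, b) => sum + b.cpuMin, 0);
  return {
    memoryMb,
    cpuMin,
    fitsMemory: memoryMb <= MEMORY_LIMIT_MB,
    fitsCpu: cpuMin <= CPU_LIMIT_MIN,
  };
}

// All three branches in one DO: 190 MB and 9 min, both over the limit.
const sharedDo = sharedBudget(branches);

// One DO per branch: each branch is measured against its own limits.
const isolatedDos = branches.map((b) => sharedBudget([b]));

console.log(sharedDo);
console.log(isolatedDos.every((r) => r.fitsMemory && r.fitsCpu)); // true
```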

The Solution: Child Workflow Spawning

parallel() compiles to the self-spawn pattern — each branch becomes a separate workflow instance running in its own DO:

Parent DO                          Child DO #1           Child DO #2           Child DO #3
──────────                        ───────────           ───────────           ───────────
step.do("⊕ spawn") ─────────►   128 MB own memory     128 MB own memory     128 MB own memory
  binding.create(child1)          5 min own CPU         5 min own CPU         5 min own CPU
  binding.create(child2)          own retry budget      own retry budget      own retry budget
  binding.create(child3)
                                  step.do("ml", fn)     step.do("img", fn)    step.do("vid", fn)
waitForEvent("ml:cb")  ◄─ $0 ─  sendEvent(result)
waitForEvent("img:cb") ◄─ $0 ─                        sendEvent(result)
waitForEvent("vid:cb") ◄─ $0 ─                                              sendEvent(result)

ctx.prev = [mlResult, imgResult, vidResult]  // tuple in declaration order

| Aspect | `Promise.all` (single DO) | `parallel()` (child DOs) |
| --- | --- | --- |
| Memory | 128 MB shared | 128 MB each |
| CPU | 5 min shared | 5 min each |
| Retry | All-or-nothing | Per-branch |
| Failure | One kills all | Isolated |
| Parent cost while waiting | N/A | $0 (hibernated) |

// workflais — each branch gets its own DO
parallel(
  step("ml-inference", mlFn).retry(5).timeout("25m"), // exponential backoff is the default
  step("image-process", imgFn).retry(3).timeout("10m"),
  step("video-transcode", vidFn).retry(2).timeout("20m"),
)

The parent spawns all children in a single step.do, then hibernates via waitForEvent. Zero CPU, zero memory, zero cost. When all children report back, the parent wakes up with ctx.prev = [result1, result2, result3].

If any branch fails after the parallel group completes, workflais runs .compensate() for every child in the group — each compensation wrapped in its own step.do for CF-durable retry.
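
The rollback behaviour can be pictured as a stack of completed steps that is unwound in reverse when something later fails. The following is a minimal, synchronous sketch under assumptions: `runSaga` and `SagaStep` are hypothetical names, the failing step itself is not compensated, and the children of a parallel group are simply treated as consecutive entries on the stack. The real runtime wraps each compensation in its own `step.do`; that durability layer is not modelled here.

```typescript
// Hypothetical model of LIFO saga rollback; not the workflais runtime.
interface SagaStep {
  name: string;
  run: () => void;         // throws to signal failure
  compensate?: () => void; // optional undo handler
}

// Runs steps in order; on failure, compensates completed steps in reverse.
function runSaga(steps: SagaStep[], log: string[]): boolean {
  const completed: SagaStep[] = [];
  for (const current of steps) {
    try {
      current.run();
      log.push(`do ${current.name}`);
      completed.push(current);
    } catch {
      log.push(`fail ${current.name}`);
      // Last completed step is compensated first (LIFO).
      for (let i = completed.length - 1; i >= 0; i--) {
        const done = completed[i];
        if (done.compensate) {
          done.compensate();
          log.push(`⟲ ${done.name}`);
        }
      }
      return false;
    }
  }
  return true;
}

const sagaLog: string[] = [];
const sagaOk = runSaga(
  [
    { name: "reserve", run: () => {}, compensate: () => {} },
    // Two children of one parallel group, each with its own handler.
    { name: "email", run: () => {}, compensate: () => {} },
    { name: "sms", run: () => {}, compensate: () => {} },
    { name: "finalize", run: () => { throw new Error("boom"); } },
  ],
  sagaLog,
);

console.log(sagaOk, sagaLog);
```

In this model the log ends with `⟲ sms`, `⟲ email`, `⟲ reserve`, so the most recent side effect is undone first.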

Production Verification

The parallel-fan-out example is deployed and verified on Cloudflare Workers.

Result:

{
  "status": "complete",
  "output": {
    "notified": true,
    "channelCount": 3,
    "results": [
      { "channel": "email", "sent": true, "to": "user@example.com" },
      { "channel": "sms",   "sent": true, "to": "+1234567890" },
      { "channel": "crm",   "updated": true, "userId": "u1" }
    ]
  }
}

Runtime Metrics (`wrangler tail`)

| Metric | HTTP Trigger | Workflow DO (parent) |
| --- | --- | --- |
| `executionModel` | `stateless` | `stateless` |
| `wallTime` | 517ms | ~3s (complete) |
| `cpuTime` | 0 | 0 |
| `outcome` | `ok` | `ok` |

Key observations:

Parent hibernation confirmed — cpuTime: 0 while wallTime: 300168ms (5 min) on a stale instance proves the parent DO genuinely hibernates at $0 cost during waitForEvent
Each step is a separate DO invocation — executionModel: "stateless" on every log entry confirms CF treats each step as an independent call
Results arrive in declaration order — [email, sms, crm] tuple matches the parallel() child order, regardless of which child finishes first
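
One way to get declaration-order results regardless of completion order is to give each child a fixed slot and fill slots as callbacks arrive. The sketch below models that idea with a hypothetical `OrderedCollector` class; it stands in for the parent waiting on one callback event per child and is not taken from the workflais source.

```typescript
// Hypothetical collector: one slot per child, indexed by declaration order.
class OrderedCollector<T> {
  private names: string[];
  private slots: (T | undefined)[];
  private pending: Set<string>;

  constructor(names: string[]) {
    this.names = names;
    this.slots = new Array<T | undefined>(names.length).fill(undefined);
    this.pending = new Set(names);
  }

  // Called when a child reports back; arrival order does not matter.
  report(name: string, result: T): void {
    if (!this.pending.has(name)) {
      throw new Error(`unexpected or duplicate report: ${name}`);
    }
    this.slots[this.names.indexOf(name)] = result;
    this.pending.delete(name);
  }

  isComplete(): boolean {
    return this.pending.size === 0;
  }

  // Returns results in declaration order once every child has reported.
  results(): T[] {
    if (!this.isComplete()) throw new Error("still waiting for children");
    return this.slots as T[];
  }
}

const collector = new OrderedCollector<string>(["email", "sms", "crm"]);
collector.report("crm", "crm-done"); // finishes first
collector.report("email", "email-done");
collector.report("sms", "sms-done");

console.log(collector.results()); // [ 'email-done', 'sms-done', 'crm-done' ]
```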

How It Works

step("a", fn).retry(3)  →  compile([...])  →  execute(plan, cfStep, event, env)
     DSL                     Validation          CF step.do() / waitForEvent()

DSL — Declarative step definitions with chainable config
Compiler — Validates names, limits, timeouts; builds execution plan
Runtime — Translates plan into CF Workflows API calls with saga compensation
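
As a rough illustration of the compiler stage, the sketch below validates a list of step specs before anything runs. It is a model under assumptions, not the workflais compiler: `validatePlan` and `StepSpec` are hypothetical, timeouts are expressed in milliseconds to avoid guessing at the duration-string format, and the two error classes are local stand-ins that only borrow the names `DuplicateStepNameError` and `TimeoutTooLongError`. The step-count check is omitted because this README does not state the limit.

```typescript
// Local stand-ins; they share names with the library's errors but are
// defined here so the example is self-contained.
class DuplicateStepNameError extends Error {
  constructor(stepName: string) {
    super(`duplicate step name: ${stepName}`);
    this.name = "DuplicateStepNameError";
  }
}

class TimeoutTooLongError extends Error {
  constructor(stepName: string, timeoutMs: number) {
    super(`timeout of ${timeoutMs} ms on "${stepName}" exceeds 30 minutes`);
    this.name = "TimeoutTooLongError";
  }
}

// The DSL reference gives 30 minutes as the maximum step timeout.
const MAX_TIMEOUT_MS = 30 * 60 * 1000;

interface StepSpec {
  name: string;
  timeoutMs?: number;
}

// Throws on the first violation; returns the specs unchanged when valid.
function validatePlan(steps: StepSpec[]): StepSpec[] {
  const seen = new Set<string>();
  for (const spec of steps) {
    if (seen.has(spec.name)) throw new DuplicateStepNameError(spec.name);
    seen.add(spec.name);
    if (spec.timeoutMs !== undefined && spec.timeoutMs > MAX_TIMEOUT_MS) {
      throw new TimeoutTooLongError(spec.name, spec.timeoutMs);
    }
  }
  return steps;
}

const validPlan = validatePlan([
  { name: "fetch-data" },
  { name: "process", timeoutMs: 30_000 },
  { name: "save" },
]);
console.log(validPlan.length); // 3
```

Failing at this stage means a misconfigured workflow is rejected before any durable step has produced side effects.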
