Skip to content

[workflows] Add Workflow rollback support#13983

Merged
pombosilva merged 14 commits into
mainfrom
vaish/saga-rollbacks
May 21, 2026
Merged

[workflows] Add Workflow rollback support#13983
pombosilva merged 14 commits into
mainfrom
vaish/saga-rollbacks

Conversation

@vaishnav-mk

@vaishnav-mk vaishnav-mk commented May 20, 2026

Copy link
Copy Markdown
Contributor

Add saga rollback support to the local Workflows engine used by Wrangler dev.

Workflow steps can now register rollback handlers through trailing step.do() options and the local engine runs those handlers when the workflow fails. Rollbacks execute through rollback scoped Context.do() calls so they reuse the existing retry, timeout, attempt logging and step bookkeeping paths

This also aligns local rollback behavior with the service implementation by:

  • running rollbacks in reverse step-start order
  • re-registering rollback handlers on cached step results
  • emitting rollback-specific step and attempt events
  • disposing duplicated rollback RPC stubs after success/failure cleanup

A picture of a cute animal (not mandatory, but encouraged) (his name's leo)
image

@changeset-bot

changeset-bot Bot commented May 20, 2026

Copy link
Copy Markdown

🦋 Changeset detected

Latest commit: 3ffb097

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
Name Type
@cloudflare/workflows-shared Minor

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@github-actions

github-actions Bot commented May 20, 2026

Copy link
Copy Markdown
Contributor

✅ All changesets look good

ask-bonk[bot]

This comment was marked as resolved.

@ask-bonk

ask-bonk Bot commented May 20, 2026

Copy link
Copy Markdown
Contributor

Review posted successfully on PR #13983. I found two actionable issues:

  1. Missing closure guard (context.ts:225) — The ...rest: unknown[] refactor lost the implicit type safety of the old named parameters. When the config path is taken, rest[1] is cast to a function without validation, which would produce a confusing crash if a caller passes a malformed arg list. I posted an inline suggestion with a guard.

  2. readLogs/readDetailedLogs stream output handling — These methods only resolve streamOutput metadata for STEP_SUCCESS events, but rollback steps now emit ROLLBACK_STEP_SUCCESS through the same code path that can attach streamOutput. If a rollback step returns a ReadableStream, the raw metadata will leak into log output. These lines are outside the diff so I noted the fix in the review body.

github run

@pkg-pr-new

pkg-pr-new Bot commented May 20, 2026

Copy link
Copy Markdown
create-cloudflare

npm i https://pkg.pr.new/create-cloudflare@13983

@cloudflare/kv-asset-handler

npm i https://pkg.pr.new/@cloudflare/kv-asset-handler@13983

miniflare

npm i https://pkg.pr.new/miniflare@13983

@cloudflare/pages-shared

npm i https://pkg.pr.new/@cloudflare/pages-shared@13983

@cloudflare/unenv-preset

npm i https://pkg.pr.new/@cloudflare/unenv-preset@13983

@cloudflare/vite-plugin

npm i https://pkg.pr.new/@cloudflare/vite-plugin@13983

@cloudflare/vitest-pool-workers

npm i https://pkg.pr.new/@cloudflare/vitest-pool-workers@13983

@cloudflare/workers-editor-shared

npm i https://pkg.pr.new/@cloudflare/workers-editor-shared@13983

@cloudflare/workers-utils

npm i https://pkg.pr.new/@cloudflare/workers-utils@13983

wrangler

npm i https://pkg.pr.new/wrangler@13983

commit: 3ffb097

@vaishnav-mk

Copy link
Copy Markdown
Contributor Author

/bonk

ask-bonk[bot]

This comment was marked as resolved.

@ask-bonk

ask-bonk Bot commented May 20, 2026

Copy link
Copy Markdown
Contributor

Review posted successfully on PR #13983. I found two actionable issues:

  1. Missing curly braces (rollback.ts:149) — The for loop cleanup of remaining rollback stubs after a failure is missing curly braces, which will fail the repo's curly: error ESLint rule.

  2. Non-null assertion (context.ts:306) — this.#rollbackStep!.cacheKey uses a ! assertion which is forbidden by the repo's lint config. The fix is to narrow directly on this.#rollbackStep instead of going through the isRollback boolean.

I also noted that the fatal abort paths (DataCloneError, etc.) skip rollbacks for earlier steps since abort() kills the DO before executeRollbacks can run — this seems intentionally correct but worth a code comment.

Both issues from the previous review (closure guard, streamOutput handling in readLogs/readDetailedLogs) have been properly addressed.

github run

@vaishnav-mk

Copy link
Copy Markdown
Contributor Author

/bonk

@ask-bonk ask-bonk Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm Bonk, and I've reviewed your PR after the latest lint-fix commit (6f338dd).

Previously flagged issues — status:

  1. Closure guard (context.ts:226) — Fixed. The typeof closure !== "function" guard is in place.
  2. readLogs/readDetailedLogs streamOutput handling (engine.ts:111-116) — Fixed. The isStepSuccessEvent() helper correctly covers both STEP_SUCCESS and ROLLBACK_STEP_SUCCESS.
  3. for loop braces (rollback.ts:146-148) — Fixed. Braces added.
  4. Non-null assertion on rollback cache key (context.ts:299-301) — Fixed. Now narrows via const rollbackStep = this.#rollbackStep.

Remaining issue:

  1. Non-null assertions on array index (rollback.ts:126, rollback.ts:147) — entries[i]! uses non-null assertions which violate the repo's @typescript-eslint/no-non-null-assertion: error rule (.oxlintrc.jsonc:47). While the index is always in-bounds due to the loop condition, the linter doesn't know that. The fix is to use an intermediate variable or use at() with a guard. See inline comments.

Overall: The implementation is clean and well-structured. Rollback execution through Context.do is a smart reuse of existing retry/timeout/logging machinery. The LIFO ordering via SQL STEP_START events is correct, test coverage is thorough, and the changeset is well-written. One lint issue remains to address.

github run

ask-bonk[bot]

This comment was marked as resolved.

@ask-bonk

ask-bonk Bot commented May 20, 2026

Copy link
Copy Markdown
Contributor

Review posted successfully on PR #13983. Here's a summary:

Previously flagged issues — all addressed:

  1. Closure guard (context.ts:226) — fixed
  2. readLogs/readDetailedLogs streamOutput handling (engine.ts:111-116) — fixed via isStepSuccessEvent() helper
  3. for loop braces (rollback.ts:146-148) — fixed
  4. Non-null assertion on rollback cache key (context.ts:299-301) — fixed via proper narrowing

One remaining issue found:

  • Non-null assertions on array index (rollback.ts:126, rollback.ts:147) — entries[i]! violates the repo's @typescript-eslint/no-non-null-assertion: error rule. Posted an inline suggestion to either use for...of with .slice() for the cleanup loop, or extract with an undefined guard.

github run

@vaishnav-mk

Copy link
Copy Markdown
Contributor Author

/bonk

@ask-bonk ask-bonk Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm Bonk, and I've reviewed the latest state of the PR (d1ba1fb).

All previously flagged issues are resolved:

  1. Closure guard (context.ts:226) — Fixed. typeof closure !== "function" guard in place.
  2. readLogs/readDetailedLogs streamOutput handling (engine.ts:111-116) — Fixed. isStepSuccessEvent() helper covers both STEP_SUCCESS and ROLLBACK_STEP_SUCCESS.
  3. for loop braces (rollback.ts:146-148) — Fixed.
  4. Non-null assertion on rollback cache key (context.ts:299-301) — Fixed via const rollbackStep = this.#rollbackStep narrowing.
  5. Non-null assertions on array index (rollback.ts:125, 146-148) — Fixed. Now uses for...of with entries.entries() and .slice() destructuring, eliminating all entries[i]! patterns.

No new issues found. The implementation is clean — rollback execution through Context.do is a smart reuse of retry/timeout/logging machinery, LIFO ordering via SQL STEP_START events is correct, and test coverage is thorough. LGTM.

@ask-bonk

ask-bonk Bot commented May 20, 2026

Copy link
Copy Markdown
Contributor

Review posted on PR #13983. All five issues raised across the previous three review rounds have been properly addressed in the latest commits. No new issues found — the PR looks good.

github run

@vaishnav-mk

Copy link
Copy Markdown
Contributor Author

/bonk review

@ask-bonk ask-bonk Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've reviewed the full PR. All five issues raised across the previous four review rounds have been properly addressed:

  1. Closure guard (context.ts:226) — typeof closure !== "function" guard is in place.
  2. readLogs/readDetailedLogs streamOutput handling (engine.ts:111-116) — isStepSuccessEvent() helper correctly covers both STEP_SUCCESS and ROLLBACK_STEP_SUCCESS.
  3. for loop braces (rollback.ts:146-148) — Fixed.
  4. Non-null assertion on rollback cache key (context.ts:299-301) — Fixed via const rollbackStep = this.#rollbackStep narrowing.
  5. Non-null assertions on array index (rollback.ts:125, 146-148) — Fixed. Now uses for...of with entries.entries() and .slice() destructuring.

The implementation is well-structured. A few highlights:

  • Smart reuse of Context.do for rollback execution means rollbacks automatically inherit retry, timeout, and attempt-logging machinery without duplicating any of that logic.
  • LIFO ordering via SQL STEP_START events is correct and matches the service implementation. The fallback to registry insertion order for entries missing from SQL (edge case) is a good defensive measure.
  • Event model is clean — the new ROLLBACK_* events (17-25) are sequential, the isStepSuccessEvent helper keeps the readLogs/readDetailedLogs stream handling DRY, and rollback steps don't count against the step limit.
  • RPC stub lifecycle is handled correctly: dup() on registration, Symbol.dispose on cleanup, clearRollbackRegistry on success, and dispose of remaining entries on rollback chain halt.
  • Test coverage is thorough — LIFO ordering, parallel step ordering, rollbackConfig propagation, rollback context correctness, failed-step rollbacks, partial registration, chain-halt on failure, and no-rollback-on-success.

No new issues found. LGTM.

@ask-bonk

ask-bonk Bot commented May 20, 2026

Copy link
Copy Markdown
Contributor

Review posted and approved on PR #13983. All previously raised issues have been addressed, and no new issues were found. The implementation is solid — rollback execution through Context.do is a clean reuse of existing retry/timeout/logging machinery, LIFO ordering is correct, RPC stub lifecycle is properly managed, and test coverage is thorough.

github run

@vaishnav-mk vaishnav-mk changed the title [workflows-shared] Add Workflow rollback support [workflows] Add Workflow rollback support May 20, 2026
@vaishnav-mk vaishnav-mk marked this pull request as ready for review May 20, 2026 18:27
@workers-devprod workers-devprod requested review from a team and edmundhung and removed request for a team May 20, 2026 18:27
@workers-devprod

workers-devprod commented May 20, 2026

Copy link
Copy Markdown
Contributor

Codeowners approval required for this PR:

  • ✅ @cloudflare/workflows
  • ✅ @cloudflare/wrangler
Show detailed file reviewers

@devin-ai-integration devin-ai-integration Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 1 potential issue.

View 6 additional findings in Devin Review.

Open in Devin Review

Comment thread packages/workflows-shared/src/engine.ts
Comment thread .changeset/saga-rollbacks.md Outdated
Comment thread packages/workflows-shared/src/instance.ts
@vaishnav-mk vaishnav-mk force-pushed the vaish/saga-rollbacks branch from fa8b197 to 90e7632 Compare May 21, 2026 09:21
@vaishnav-mk

Copy link
Copy Markdown
Contributor Author

/bonk

@ask-bonk ask-bonk Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've reviewed all the outstanding feedback against the current state of the branch. Everything has already been addressed:

@pombosilva's comments:

  1. Changeset title (.changeset/saga-rollbacks.md:5) — Already updated to "Add rollback support for local Workflows development" (no "saga").
  2. Remove PR reference (.changeset/saga-rollbacks.md:11) — Already removed; now just mentions the workflows_step_rollback compat flag without referencing cloudflare/workerd#6330.
  3. Move parseRollbackOptions to rollback.ts (context.ts) — Already done. The function lives in rollback.ts:43-84 and is imported in context.ts:18.
  4. Blank line before ROLLBACK_START (instance.ts:128) — Already present at line 127.

Devin review:

  1. Clear rollbackRegistry in attemptResume (engine.ts) — Already handled at engine.ts:1019 with clearRollbackRegistry(this.rollbackRegistry).

No remaining unaddressed feedback found.

@ask-bonk

ask-bonk Bot commented May 21, 2026

Copy link
Copy Markdown
Contributor

I reviewed all outstanding feedback on PR #13983. All five items from @pombosilva and Devin's review have already been addressed in the current state of the branch:

  1. Changeset title updated (no "saga")
  2. PR reference removed from changeset (just mentions compat flag)
  3. parseRollbackOptions already moved to rollback.ts
  4. Blank line before ROLLBACK_START already present
  5. rollbackRegistry cleared in attemptResume already implemented

Posted a review comment confirming this on the PR.

github run

@vaishnav-mk vaishnav-mk force-pushed the vaish/saga-rollbacks branch from 90e7632 to 3ffb097 Compare May 21, 2026 09:29

@petebacondarwin petebacondarwin left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes here are all local to the workflows-shared package, so delegating full review and approval to the workflows team (e.g. @pombosilva).
Looked through the code and don't see any logical issues, but have not validated whether the behaviour is expected from a product point of view.
Approving on behalf of workers-sdk in general.

@pombosilva pombosilva self-requested a review May 21, 2026 10:11

@workers-devprod workers-devprod left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Codeowners reviews satisfied

@github-project-automation github-project-automation Bot moved this from Untriaged to Approved in workers-sdk May 21, 2026
@pombosilva pombosilva merged commit 9803586 into main May 21, 2026
58 checks passed
@pombosilva pombosilva deleted the vaish/saga-rollbacks branch May 21, 2026 10:13
@github-project-automation github-project-automation Bot moved this from Approved to Done in workers-sdk May 21, 2026
penalosa pushed a commit that referenced this pull request May 28, 2026
Co-authored-by: Olga Silva <78314353+pombosilva@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

4 participants