feat(workflows): workflow-level loop_until + max_iterations to re-run full DAG #1765

@Wirasm

Spun out from closed PR #1282 (3 features bundled — landing them one at a time). Depends on #1764 (workflow: node type) and #1763 (condition-evaluator improvements) landing first.

What

Add two top-level workflow fields:

name: my-workflow
loop_until: "$qa.output.passed == true"
max_iterations: 5
nodes: [...]

When loop_until is set, the executor re-runs the full DAG until the condition evaluates to true, capped at max_iterations. This is distinct from node-level loop:, which iterates a single prompt.
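As a rough illustration of how the two fields interact, the sketch below resolves the number of DAG passes the executor would allow. The interface and helper names are hypothetical, the validation rule is an assumption, and the default of 5 comes from the Design section of this issue; the real definition would live on workflowBaseSchema.

```typescript
// Hypothetical shape of the two new top-level fields (not the real schema).
interface WorkflowLoopFields {
  loop_until?: string;
  max_iterations?: number;
}

// Default taken from the Design section of this issue.
const DEFAULT_MAX_ITERATIONS = 5;

// Returns the maximum number of full DAG passes.
// Assumption: without loop_until the workflow runs exactly once,
// and max_iterations must be a positive integer.
function resolveIterationCap(fields: WorkflowLoopFields): number {
  if (fields.loop_until === undefined) {
    return 1;
  }
  const cap = fields.max_iterations ?? DEFAULT_MAX_ITERATIONS;
  if (!Number.isInteger(cap) || cap < 1) {
    throw new Error(`max_iterations must be a positive integer, got ${cap}`);
  }
  return cap;
}
```

With the example above (loop_until set, max_iterations: 5), this helper would allow at most five passes.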

Why

The motivating case is "plan → implement → test → fix → retest until tests pass", expressed as a workflow-level loop rather than as multi-level nested loop: nodes. Today this requires duplicating nodes, so the retry count is fixed when the workflow is written.

Scope boundary

  • Workflow-level only; does NOT change node-level loop: semantics (PR #785 stays orthogonal)
  • Whole-DAG re-run only — no partial-DAG retry, no per-node retry policy (those are different features)
  • Fresh artifact subdirectory per iteration so iterations don't pollute each other's $ARTIFACTS_DIR
  • No exponential backoff or delay between iterations for v1
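The per-iteration artifact isolation above could be as simple as deriving a subdirectory from the iteration number. This is a minimal sketch: the function name and the iteration-N naming scheme are assumptions, not something this issue or PR #1282 specifies.

```typescript
// Hypothetical helper: derives a fresh artifacts directory for one iteration.
// The "iteration-<n>" naming is an assumption made for illustration.
function iterationArtifactsDir(baseArtifactsDir: string, iteration: number): string {
  if (!Number.isInteger(iteration) || iteration < 1) {
    throw new RangeError(`iteration must be a positive integer, got ${iteration}`);
  }
  // Strip trailing slashes so the result never contains "//".
  const base = baseArtifactsDir.replace(/\/+$/, "");
  return `${base}/iteration-${iteration}`;
}
```

The executor would then point $ARTIFACTS_DIR at the returned path before each DAG pass, so files written in one iteration are never overwritten by the next.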

Why this depends on the other two

Design

  • New fields on workflowBaseSchema: loop_until?: string, max_iterations?: number (default 5)
  • Outer iteration loop in executeDagWorkflow: after each DAG pass, evaluate loop_until against final node outputs. If false and iterations < max → emit workflow.iteration_started event, re-run the full DAG with a fresh scope (cleared node outputs). If condition met → exit. If max exhausted → emit warning, exit normally.
  • Event types: workflow.iteration_started, workflow.iteration_completed, workflow.loop_max_iterations_reached
  • Cancel-token check at top of each iteration so a paused/cancelled workflow stops
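The design bullets above can be sketched as a small outer loop. This is not the code from PR #1282: runWithLoopUntil, LoopDeps, and the result shape are hypothetical, the DAG pass and condition evaluator are injected as callbacks, and a throwing evaluator is treated as "condition not met" (one reading of fail-closed). The sketch also emits workflow.iteration_started for the first pass, which the issue does not specify.

```typescript
type NodeOutputs = Record<string, unknown>;

// Event names are taken from the Design list; payload fields are assumptions.
type WorkflowEvent =
  | { type: "workflow.iteration_started"; iteration: number }
  | { type: "workflow.iteration_completed"; iteration: number; conditionMet: boolean }
  | { type: "workflow.loop_max_iterations_reached"; iterations: number };

// Hypothetical dependencies, injected so the loop stays testable.
interface LoopDeps {
  runDag: (iteration: number) => Promise<NodeOutputs>; // one full pass, fresh scope
  evaluate: (expr: string, outputs: NodeOutputs) => boolean; // may throw on a bad expression
  isCancelled: () => boolean;
  emit: (event: WorkflowEvent) => void;
}

interface LoopResult {
  iterations: number;
  status: "condition_met" | "max_iterations_reached" | "cancelled";
  outputs: NodeOutputs;
}

async function runWithLoopUntil(
  loopUntil: string,
  maxIterations: number,
  deps: LoopDeps,
): Promise<LoopResult> {
  let outputs: NodeOutputs = {};
  for (let iteration = 1; iteration <= maxIterations; iteration++) {
    // Cancel-token check at the top of each iteration.
    if (deps.isCancelled()) {
      return { iterations: iteration - 1, status: "cancelled", outputs };
    }
    deps.emit({ type: "workflow.iteration_started", iteration });
    // Outputs are replaced, not merged, so every pass starts from a cleared scope.
    outputs = await deps.runDag(iteration);

    let conditionMet = false;
    try {
      conditionMet = deps.evaluate(loopUntil, outputs);
    } catch {
      conditionMet = false; // fail closed: a bad expression never counts as success
    }
    deps.emit({ type: "workflow.iteration_completed", iteration, conditionMet });
    if (conditionMet) {
      return { iterations: iteration, status: "condition_met", outputs };
    }
  }
  // Cap exhausted: emit the warning event and exit normally.
  deps.emit({ type: "workflow.loop_max_iterations_reached", iterations: maxIterations });
  return { iterations: maxIterations, status: "max_iterations_reached", outputs };
}
```

Injecting runDag and evaluate keeps the loop independent of the executor internals, which also makes the acceptance cases (condition met on pass N, cap exhausted, bad expression, cancellation) straightforward to drive from a unit test.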

Reference implementation

@Dev-Force already implemented this in PR #1282 — the outer iteration loop in executeDagWorkflow is the main reference. The archon-compose-plan-implement-qa.yaml example shows the API works but uses a node-level loop: for the implement-retry phase (intentional choice — only that phase retries, not the planner).

Acceptance

  • Tests: condition met on iteration 1 (single pass), condition met on iteration N, max_iterations exhausted, condition fail-closed on bad expression, cancel mid-iteration honored
  • Demo workflow bundled that uses loop_until at workflow level (different from #1282's loop: example, so both patterns are demonstrated)
  • Docs page describing when to use workflow-level vs node-level loops

Suggested PR title

feat(workflows): add workflow-level loop_until + max_iterations

Metadata

Labels: area: workflows, feature-request
