Migrated from dynamous-community/remote-coding-agent#792 — Archon active development has moved to coleam00/Archon. Original issue retained as historical reference.
Summary
Enable test→fix→retest and other "re-run until condition" patterns in workflows. Rescoped from a sub-DAG node to a composition-based design after a primitives review (see History below).
This issue now tracks two pieces of work:
- Prerequisite: a real expression evaluator for `when:` / `until:` conditions
- Feature: workflow invocation node + workflow-level `loop_until`
Problem
The DAG topology is computed once at load time (dag-executor.ts) and back-edges are rejected by cycle detection (loader.ts). There's no way to express "run these nodes until tests pass." The only existing iteration primitive is PR #785's loop: node, which iterates a single prompt — not a multi-step graph.
The naive workaround is duplicating nodes (run-tests, fix, run-tests-2, fix-2), which is fragile and caps at whatever count is hardcoded.
Why not a sub-DAG node (the original proposal)
The original proposal added a loop_node containing an inner nodes: list with its own edges. From a primitives standpoint this introduces a composite-with-nested-scope concept that brings real design debt:
- Nested topology (a node that has internal edges)
- Scoped variable resolution: what does `$test.output` mean across iterations? Not specified.
- Recursive validation (cycle detection, ref checks, model compat all need to recurse)
- Recursive execution (nested cancellation, timeouts, events, JSONL logging)
- A second meaning for `nodes:` (outer DAG nodes vs. inner loop-body nodes)
- Once nesting is allowed one level, two-level nesting becomes the next ask
The motivating examples don't require any of that.
Proposed Approach: composition over nesting
Reuse a primitive that already exists — a workflow is a unit of execution with its own scope, validation, cancellation, events, and logs. Add two small things:
1. Workflow-level loop_until
A workflow can declare itself as a loop body:
# .archon/workflows/test-fix.yaml
loop_until: "$test.exit_code == 0"
max_iterations: 5
nodes:
  - id: test
    bash: |
      bun test
      echo "exit_code=$?"
  - id: fix
    prompt: "Tests failed. Fix: $test.output"
    depends_on: [test]
    when: "$test.exit_code != 0"
The entire workflow re-runs until the condition is met or max_iterations is exhausted.
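The re-run semantics above can be sketched as a small driver loop. This is a minimal illustration, not the Archon executor: `runUntil`, `LoopOptions`, and `NodeOutputs` are hypothetical names, and the sketch is kept synchronous for brevity where a real executor would await each pass.

```typescript
// Hypothetical shapes for illustration; none of these names come from the Archon codebase.
type NodeOutputs = Record<string, Record<string, string>>;

interface LoopOptions {
  maxIterations: number;
  // Runs one full pass over the DAG and returns a fresh output scope.
  runOnce: (iteration: number) => NodeOutputs;
  // Stand-in for evaluating the loop_until condition against that scope.
  until: (outputs: NodeOutputs) => boolean;
}

interface LoopResult {
  iterations: number;
  satisfied: boolean;
  outputs: NodeOutputs;
}

function runUntil(opts: LoopOptions): LoopResult {
  if (opts.maxIterations < 1) throw new Error("maxIterations must be >= 1");
  let outputs: NodeOutputs = {};
  for (let i = 1; i <= opts.maxIterations; i++) {
    // Each pass replaces the scope, so nothing leaks between iterations.
    outputs = opts.runOnce(i);
    if (opts.until(outputs)) return { iterations: i, satisfied: true, outputs };
  }
  // Budget exhausted: report the last pass and let the caller decide how to fail.
  return { iterations: opts.maxIterations, satisfied: false, outputs };
}
```

Whether an exhausted loop should fail the workflow or complete with a flag is left open here; the sketch only reports `satisfied: false`.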
2. Workflow invocation node
A parent workflow can invoke a child workflow as a single node:
# .archon/workflows/ship.yaml
nodes:
  - id: build
    bash: "bun run build"
  - id: test-fix-loop
    workflow: test-fix # invokes the workflow above
    depends_on: [build]
  - id: pr
    command: archon-create-pr
    depends_on: [test-fix-loop]
The parent sees `test-fix-loop` as a single node with a single output (the child's final state). Downstream nodes use `$test-fix-loop.output` as usual.
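As a rough sketch of the function-call semantics described here, the parent can treat the child workflow as an opaque call whose only visible result is its final output. `runWorkflowNode`, `WorkflowFn`, and the registry shape are hypothetical, and the child is modelled as a plain function rather than a real workflow run.

```typescript
// Hypothetical model: a parent scope maps node id -> captured output.
type Scope = Record<string, { output: string }>;

// Hypothetical: a child workflow reduced to "run it, get its final output".
// It takes no parent scope, so it cannot read or write the parent's variables.
type WorkflowFn = () => string;

interface WorkflowNode {
  id: string;
  workflow: string; // name of the child workflow to invoke
}

function runWorkflowNode(
  node: WorkflowNode,
  registry: Record<string, WorkflowFn>,
  parentScope: Scope,
): void {
  const child = registry[node.workflow];
  if (!child) throw new Error(`unknown workflow: ${node.workflow}`);
  // Only the final output crosses the boundary, stored under the parent's node id.
  parentScope[node.id] = { output: child() };
}
```

With this shape, a downstream node resolves `$test-fix-loop.output` exactly as it would for a `bash:` or `prompt:` node.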
Why this is better
| Concern | Sub-DAG node | Workflow composition |
| --- | --- | --- |
| New container primitive | Yes (composite node + inner scope) | No (workflow already exists) |
| Variable scoping across iterations | Undefined / needs design | Solved by function-call semantics: each invocation is its own scope |
| Recursive validation | Required | Each workflow validates itself (already does) |
| Cancellation / timeouts / events | Need nested versions | Reuses existing per-workflow infra |
| Reuses #785 iteration machinery | At node layer | At workflow layer (cleaner) |
| Naming collision with #785's `loop:` | Yes | No, different layers (node vs workflow) |
| "Loop a fragment of a workflow inline" | Supported | Requires extraction into named workflow |
The last row is the only tradeoff, and it's a feature: forcing extraction mirrors how functions discipline iteration scope in code.
Prerequisite: expression evaluator
Both #785 and this issue hand-wave the condition evaluator. Today `when:` only does basic substring matching — `$test.exit_code == '0'` doesn't actually work. No loop design ships without this.
This should be done as its own self-contained piece of work:
- Numeric and string comparisons (`==`, `!=`, `<`, `>`, `<=`, `>=`)
- Boolean operators (`&&`, `||`, `!`)
- Path access on captured outputs (`$node.field`)
- Used by both `when:` (existing) and `loop_until:` / `until:` (new)
It benefits existing `when:` users immediately and unblocks both #785 follow-ups and this issue.
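To make the scope concrete, here is a minimal recursive-descent sketch covering the operators listed above. It is not the proposed `condition-evaluator.ts`; `evaluate`, `Outputs`, and every rule marked as an assumption (missing fields resolve to the empty string, numeric comparison when both sides look numeric, single-level `$node.field` paths) are illustrative choices this issue does not specify.

```typescript
type Value = string | number | boolean;
type Outputs = Record<string, Record<string, string>>;

// One token per match: $node.field, number, quoted string, operator, or paren.
const TOKEN = /^\s*(\$[A-Za-z_][\w-]*\.\w+|\d+(?:\.\d+)?|'[^']*'|"[^"]*"|==|!=|<=|>=|&&|\|\||[<>!()])/;

// Each comparison operator is a predicate over the sign returned by order().
const COMPARE: Record<string, (c: number) => boolean> = {
  "==": (c) => c === 0,
  "!=": (c) => c !== 0,
  "<": (c) => c < 0,
  ">": (c) => c > 0,
  "<=": (c) => c <= 0,
  ">=": (c) => c >= 0,
};

function tokenize(src: string): string[] {
  const tokens: string[] = [];
  let rest = src.trim();
  while (rest.length > 0) {
    const m = TOKEN.exec(rest);
    if (!m) throw new Error(`unexpected input: ${rest}`);
    tokens.push(m[1]);
    rest = rest.slice(m[0].length);
  }
  return tokens;
}

// Assumption: false, 0, and the empty string are falsy; everything else is truthy.
function truthy(v: Value): boolean {
  return v !== false && v !== 0 && v !== "";
}

// Assumption: compare numerically when both sides look numeric, else as strings.
function order(a: Value, b: Value): number {
  const s = String(a), t = String(b);
  const x = Number(a), y = Number(b);
  const numeric = s.trim() !== "" && t.trim() !== "" && !isNaN(x) && !isNaN(y);
  if (numeric) return x < y ? -1 : x > y ? 1 : 0;
  return s < t ? -1 : s > t ? 1 : 0;
}

function evaluate(expr: string, outputs: Outputs): boolean {
  const tokens = tokenize(expr);
  let i = 0;
  const peek = (): string | undefined => tokens[i];

  function primary(): Value {
    const t = tokens[i++];
    if (t === undefined) throw new Error("unexpected end of expression");
    if (t === "!") return !truthy(primary());
    if (t === "(") {
      const v = or();
      if (tokens[i++] !== ")") throw new Error("expected ')'");
      return v;
    }
    if (t[0] === "$") {
      const [node, field] = t.slice(1).split(".");
      // Assumption: a missing node or field resolves to the empty string.
      return outputs[node]?.[field] ?? "";
    }
    if (t[0] === "'" || t[0] === '"') return t.slice(1, -1);
    if (/^\d/.test(t)) return Number(t);
    throw new Error(`unexpected token: ${t}`);
  }

  function comparison(): Value {
    const left = primary();
    const op = peek();
    const test = op === undefined ? undefined : COMPARE[op];
    if (!test) return left;
    i++;
    return test(order(left, primary()));
  }

  function and(): Value {
    let v = comparison();
    while (peek() === "&&") {
      i++;
      const r = comparison();
      v = truthy(v) && truthy(r);
    }
    return v;
  }

  function or(): Value {
    let v = and();
    while (peek() === "||") {
      i++;
      const r = and();
      v = truthy(v) || truthy(r);
    }
    return v;
  }

  const result = or();
  if (i < tokens.length) throw new Error(`unexpected token: ${tokens[i]}`);
  return truthy(result);
}
```

Under these assumptions, `$test.exit_code == 0` and `$test.exit_code == '0'` both hold when the captured value is the string `"0"`, which is the case that substring matching cannot handle today. The actual coercion and missing-field rules would need to be settled in Phase 1.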
Implementation Sketch
Phase 1 — Expression evaluator (prerequisite)
| File | Change |
| --- | --- |
| `packages/workflows/src/condition-evaluator.ts` | New: parser + evaluator for the expression grammar above |
| `packages/workflows/src/dag-executor.ts` | Replace current substring-based `when:` check with evaluator |
| Tests | Cover comparisons, booleans, missing-field handling, type coercion rules |
Phase 2 — Workflow-level loop
| File | Change |
| --- | --- |
| `packages/workflows/src/schemas/workflow.ts` | Add optional `loop_until` + `max_iterations` to workflow root schema |
| `packages/workflows/src/executor.ts` | After a run completes, evaluate `loop_until`; if false and under `max_iterations`, re-execute the DAG with a fresh `nodeOutputs` scope. Reuse PR #785's iteration events at workflow level. |
| `packages/workflows/src/event-emitter.ts` | `workflow.iteration_started` / `workflow.iteration_completed` events |
Phase 3 — Workflow invocation node
| File | Change |
| --- | --- |
| `packages/workflows/src/schemas/dag-node.ts` | New `workflow:` node variant (mutually exclusive with `command:`/`prompt:`/`bash:`/`loop:`) |
| `packages/workflows/src/loader.ts` | Parse + validate `workflow:` node; reject self-reference cycles across workflows |
| `packages/workflows/src/dag-executor.ts` | Dispatch `workflow:` nodes by spawning a child workflow run, awaiting completion, mapping its final output back to the parent's `nodeOutputs` |
| `packages/core/src/orchestrator/` | Child runs share parent's conversation/isolation environment (no second worktree) |
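The loader's cross-workflow cycle check could look roughly like the depth-first search below. It assumes the loader can first collect, for each workflow, the names its `workflow:` nodes invoke; `WorkflowRefs` and `findWorkflowCycle` are hypothetical names, not existing loader APIs.

```typescript
// Hypothetical input: workflow name -> names of workflows its `workflow:` nodes invoke.
type WorkflowRefs = Record<string, string[]>;

// Returns the first reference cycle found as a path (e.g. ["a", "b", "a"]), or null.
function findWorkflowCycle(refs: WorkflowRefs): string[] | null {
  const state: Record<string, "visiting" | "done"> = {};
  const stack: string[] = [];

  function visit(name: string): string[] | null {
    if (state[name] === "done") return null;
    if (state[name] === "visiting") {
      // Reached a workflow that is still on the current path: that path is a cycle.
      return stack.slice(stack.indexOf(name)).concat(name);
    }
    state[name] = "visiting";
    stack.push(name);
    for (const child of refs[name] ?? []) {
      const cycle = visit(child);
      if (cycle) return cycle;
    }
    stack.pop();
    state[name] = "done";
    return null;
  }

  for (const name of Object.keys(refs)) {
    const cycle = visit(name);
    if (cycle) return cycle;
  }
  return null;
}
```

Returning the offending path rather than a boolean lets the loader report which chain of invocations is circular, including the one-element case of a workflow that invokes itself.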
Coordination with #785
PR #785 ships a node-level `loop:` for iterating a single prompt (Ralph-style). Under this rescope there is no naming collision — `loop:` stays at the node layer, `loop_until:` lives at the workflow layer. Different concepts, different scopes, both first-class.
Out of Scope
Related
History
Originally proposed a `loop_node` containing an inner sub-DAG. Rescoped after a first-principles review concluded that workflow composition gives the same capability without introducing a nested-scope composite primitive. Original proposal preserved in earlier comments.