Problem
The WorkflowRunOptions doc-comment (`packages/cli/src/commands/workflow.ts`) says:
--resume: reuse worktree from last failed run.
…implying --resume is opt-in. The CLI usage referenced in references/cli-commands.md and the README also presents --resume as a separate, explicit subcommand-style flag.
Observed behavior contradicts this: archon workflow run <name> \"<message>\" (no --resume) silently auto-resumes the latest failed run with the same workflow name, replaying cached node outputs.
Repro
# Run 1 — produces a failure
archon workflow run my-workflow \"...\" # fails at node-3
# Run 2 — no --resume flag
archon workflow run my-workflow \"...\" # observe:
# workflow.dag_resuming
# workflowRunId is the SAME as run 1
# priorCompletedCount: 2 (parse-args, node-2)
# node-1 Skipped (prior_success)
# node-2 Skipped (prior_success)
# node-3 re-runs
Concretely: the run id stays constant across archon workflow run invocations as long as the prior run is in a non-terminal-not-completed state. Both runs are observably the same DB row; the second invocation isn't a fresh run.
Why this is confusing
For agents (and humans) authoring workflows:
- They run, hit a bug, fix the bug, re-run. They expect a clean slate. They get a resume.
- They expect their
parse-args change to take effect. It doesn't (cached prior_success).
- They expect to see the dynamic-args path execute. They see the cached output from the prior args.
- They
cd into a different state and re-run. They get the prior run's worktree path, not their current one.
- Combined with issue E (no way to invalidate cached node output), debugging cycles compound: a fix doesn't surface because of cached upstream state, and there's no obvious flag to disable resume.
The auto-resume is a great feature. The problem is it's not labelled, not documented as default, and there's no --no-resume to opt out.
What the docs say
packages/cli/src/commands/workflow.ts:
* Default: creates worktree with auto-generated branch name (isolation by default).
* --branch: explicit branch name for the worktree.
* --no-worktree: opt out of isolation, run in live checkout.
* --resume: reuse worktree from last failed run. ← reads as opt-in
* --from: override base branch (start-point for worktree).
references/cli-commands.md (Archon skill):
# Resume a failed workflow (re-runs, skipping completed nodes)
archon workflow resume <run-id>
…suggesting resume is an explicit resume subcommand or --resume flag.
Proposed direction
Pick one of:
- Behavior matches docs —
archon workflow run defaults to a fresh run. Auto-resume only when --resume is passed (or when the user runs archon workflow resume <id>).
- Docs match behavior — document that
archon workflow run auto-resumes any non-completed prior run for this workflow on this codebase, and add a --no-resume (or --fresh) flag to opt out.
Either is fine; the current state — silent auto-resume contradicting the docs — is the worst combination. (1) is probably the safer default since fresh-by-default matches CLI conventions almost everywhere; (2) requires less code change.
Found while
Building a workshop archon-release workflow. Spent ~2 iterations confused about why my workflow-YAML edits weren't taking effect — the answer was "they were, but the cached prior-success nodes were unchanged." See #1389 and the related cache-invalidation issue for adjacent papercuts.
Problem
The
WorkflowRunOptionsdoc-comment (`packages/cli/src/commands/workflow.ts`) says:…implying
--resumeis opt-in. The CLI usage referenced inreferences/cli-commands.mdand the README also presents--resumeas a separate, explicit subcommand-style flag.Observed behavior contradicts this:
archon workflow run <name> \"<message>\"(no--resume) silently auto-resumes the latest failed run with the same workflow name, replaying cached node outputs.Repro
Concretely: the run id stays constant across
archon workflow runinvocations as long as the prior run is in a non-terminal-not-completed state. Both runs are observably the same DB row; the second invocation isn't a fresh run.Why this is confusing
For agents (and humans) authoring workflows:
parse-argschange to take effect. It doesn't (cachedprior_success).cdinto a different state and re-run. They get the prior run's worktree path, not their current one.The auto-resume is a great feature. The problem is it's not labelled, not documented as default, and there's no
--no-resumeto opt out.What the docs say
packages/cli/src/commands/workflow.ts:references/cli-commands.md(Archon skill):…suggesting resume is an explicit
resumesubcommand or--resumeflag.Proposed direction
Pick one of:
archon workflow rundefaults to a fresh run. Auto-resume only when--resumeis passed (or when the user runsarchon workflow resume <id>).archon workflow runauto-resumes any non-completed prior run for this workflow on this codebase, and add a--no-resume(or--fresh) flag to opt out.Either is fine; the current state — silent auto-resume contradicting the docs — is the worst combination. (1) is probably the safer default since fresh-by-default matches CLI conventions almost everywhere; (2) requires less code change.
Found while
Building a workshop
archon-releaseworkflow. Spent ~2 iterations confused about why my workflow-YAML edits weren't taking effect — the answer was "they were, but the cached prior-success nodes were unchanged." See #1389 and the related cache-invalidation issue for adjacent papercuts.