Bug
Interactive loop nodes (e.g., archon-piv-loop) crash with error_during_execution every time a user approves a paused gate (iteration 2+). Iteration 1 works fine.
Root Cause
In packages/workflows/src/dag-executor.ts around line 1795:
const needsFreshSession = loop.fresh_context || i === 1;
const resumeSessionId = needsFreshSession ? undefined : currentSessionId;
When resuming from a paused interactive loop gate:
- startIteration = 2 (from loopGateMeta.iteration + 1)
- i === 2, not 1, so needsFreshSession = false
- currentSessionId is set from loopGateMeta.sessionId (the session from iteration 1)
- The Claude SDK tries to resume a session that has been idle for minutes/hours while waiting for the human
- That session is expired/invalid → error_during_execution
Reproduction
- Start the archon-piv-loop workflow via Slack or Web UI
- Explore node runs iteration 1 successfully, asks questions, pauses at gate
- User approves with /workflow approve <run-id> <feedback>
- Iteration 2 starts, tries to resume stale session, crashes in ~5 seconds
Logs
{"level":20,"module":"provider.claude","sessionId":"e9688e1e-...","msg":"resuming_session"}
{"level":50,"module":"provider.claude","sessionId":"5864b49f-...","errorSubtype":"error_during_execution","msg":"claude.result_is_error"}
The workflow log shows iteration 1 takes ~6.7 minutes (normal), but iterations 2 and 3 crash in ~5-7 seconds each.
Proposed Fix
For interactive loop resume, always start a fresh Claude SDK session since the previous session may have expired during the human wait:
const needsFreshSession = loop.fresh_context || i === 1 || (isLoopResume && i === startIteration);
The user's input is already passed via $LOOP_USER_INPUT in the prompt, so session continuity isn't needed.
A test update is needed in dag-executor.test.ts: the test "interactive loop resumes from stored iteration with user input" currently expects a session resume; it should instead expect undefined (fresh session) for the first iteration after the gate.
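To make the intended behavior concrete, here is a minimal standalone sketch of the session-selection rule with the proposed condition. It is not the actual dag-executor.ts code: the function name pickResumeSessionId and the LoopConfig shape are hypothetical, and isLoopResume is assumed to be a boolean available at that point in the executor.

```typescript
// Hypothetical standalone model of the session-selection logic.
// Only fresh_context, startIteration, isLoopResume, and currentSessionId
// come from the issue text; everything else is illustrative.
interface LoopConfig {
  fresh_context: boolean;
}

function pickResumeSessionId(
  loop: LoopConfig,
  i: number, // current iteration (1-based)
  startIteration: number, // first iteration of this run (2+ when resuming from a gate)
  isLoopResume: boolean, // assumed flag: true when continuing after an approved gate
  currentSessionId: string | undefined,
): string | undefined {
  // Start fresh on the first iteration, when the loop asks for it, or on the
  // first iteration after a gate resume, where the stored session may be stale.
  const needsFreshSession =
    loop.fresh_context || i === 1 || (isLoopResume && i === startIteration);
  return needsFreshSession ? undefined : currentSessionId;
}

// Resuming at iteration 2 after a gate: the stored session is not reused.
console.log(pickResumeSessionId({ fresh_context: false }, 2, 2, true, "e9688e1e-..."));
// A later iteration in the same run still reuses the live session.
console.log(pickResumeSessionId({ fresh_context: false }, 3, 2, true, "new-session"));
```

Under this sketch, only the first iteration after the gate drops the stored session ID; later iterations in the same run keep resuming whatever session that iteration created, so non-interactive loops are unaffected.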
Environment
- Archon running in Docker on VPS
- Auth via CLAUDE_CODE_OAUTH_TOKEN
- Slack adapter (batch streaming mode)