Summary
On Windows, Archon bash nodes fail silently when bash resolves to C:\Windows\System32\bash.exe (the WSL launcher shipped with Windows). Any ${VAR} reference in the bash script expands to empty string, causing downstream logic to fail. The failure appears as "dir does not exist" or empty path errors, which is misleading — the actual root cause is that variable expansion doesn't work in bash.exe -c 'script' when bash is WSL's launcher.
Impact: every bash node in every workflow fails silently on affected Windows machines. Gather-context succeeds (LLM node), but the first downstream bash node fails. Looks like a workflow YAML bug or user error; actually a shell-resolution bug.
Repro
Minimal workflow YAML:
name: repro
nodes:
  - id: gather
    prompt: "Return JSON {\"name\": \"hello\"}"
    model: haiku
    output_format:
      type: object
      properties:
        name: { type: string }
      required: [name]
  - id: bash-test
    bash: |
      set -e
      X=$gather.output.name
      echo "X=[$X]"
      [ -n "$X" ] && echo "OK" || echo "FAIL"
    depends_on: [gather]
Fire on Windows via archon workflow run repro. On affected machines, bash-test fails with X=[] and exits non-zero. Same YAML runs fine on macOS/Linux and on Windows machines where bash in PATH resolves to Git Bash.
Root cause
Windows CreateProcess search order for bare command names (no absolute path):
1. Application's directory
2. Current directory
3. System directory (C:\Windows\System32) ← bash.exe found here (WSL launcher)
4. 16-bit system directory
5. Windows directory
6. PATH environment variable ← Git Bash's bash.exe is here (if installed)
Because System32 is searched BEFORE PATH, Bun's child_process.spawn('bash', [...]) always resolves bash to C:\Windows\System32\bash.exe on any Windows install that has WSL enabled, even when Git Bash's C:\Program Files\Git\bin is prepended to PATH. Get-Command bash in PowerShell uses a different (PATH-respecting) lookup, so users verify their PATH shows Git Bash and assume their shell resolution is correct — but Bun doesn't see that resolution.
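To make the ordering concrete, here is a minimal sketch that emulates the search order listed above for a bare bash.exe. Everything in it is illustrative: SearchContext, winJoin and resolveLikeCreateProcess are hypothetical names, not Archon or Bun APIs, and the file-existence check is injected so the sketch runs without touching the disk (a real caller would pass something like fs.existsSync).

```typescript
// Emulates the documented CreateProcess search order for a bare executable name.
// All names are hypothetical; this is not how Bun or Archon is implemented.

interface SearchContext {
  appDir: string;      // directory the calling executable was loaded from
  cwd: string;         // current directory of the parent process
  systemDir: string;   // e.g. C:\Windows\System32
  system16Dir: string; // e.g. C:\Windows\System
  windowsDir: string;  // e.g. C:\Windows
  pathVar: string;     // raw PATH value, ';'-separated
}

type Exists = (file: string) => boolean;

// Join a directory and a file name with exactly one backslash.
function winJoin(dir: string, file: string): string {
  return dir.replace(/[\\\/]+$/, "") + "\\" + file;
}

// Walk the fixed locations first and the PATH entries last;
// return the first candidate that exists.
function resolveLikeCreateProcess(
  exe: string,
  ctx: SearchContext,
  exists: Exists,
): string | undefined {
  const dirs = [
    ctx.appDir,
    ctx.cwd,
    ctx.systemDir,
    ctx.system16Dir,
    ctx.windowsDir,
    ...ctx.pathVar.split(";").filter((d) => d.length > 0),
  ];
  for (const dir of dirs) {
    const candidate = winJoin(dir, exe);
    if (exists(candidate)) return candidate;
  }
  return undefined;
}

// Example: Git Bash is first in PATH, yet System32 still wins.
const present = new Set([
  "C:\\Windows\\System32\\bash.exe",
  "C:\\Program Files\\Git\\bin\\bash.exe",
]);
const ctx: SearchContext = {
  appDir: "C:\\Tools\\bun",
  cwd: "C:\\Dev\\project",
  systemDir: "C:\\Windows\\System32",
  system16Dir: "C:\\Windows\\System",
  windowsDir: "C:\\Windows",
  pathVar: "C:\\Program Files\\Git\\bin;C:\\Windows\\System32",
};
console.log(resolveLikeCreateProcess("bash.exe", ctx, (f) => present.has(f)));
// prints C:\Windows\System32\bash.exe
```

The appDir and cwd values in the example are made up; the point is only that prepending Git Bash to PATH cannot change the result while a bash.exe exists in System32.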
Once bash.exe resolves to the WSL launcher:
C:\Windows\System32\bash.exe -c 'VAR=hello; echo "$VAR"'
→ (empty output, $VAR not expanded)
The WSL launcher's -c argument handling strips $VAR references somewhere in the PowerShell→bash.exe→WSL argument-passing chain. This is a known Windows/WSL argument-passing quirk. Result: every ${VAR} in Archon bash node scripts evaluates to empty.
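A probe along these lines could turn the symptom into a yes/no answer. This is a sketch only: PROBE_ARGS, classifyProbe, probeExpansion and the RunBash runner type are hypothetical names, and the runner is injected so the logic can be exercised without spawning anything. In practice the runner would wrap whatever spawn call the daemon already uses and return the captured stdout.

```typescript
// The probe assigns a variable and echoes it inside brackets, so an empty
// expansion ("[]") is distinguishable from no output at all.
const PROBE_ARGS: string[] = ["-c", 'V=probe; echo "[$V]"'];

type ProbeResult = "ok" | "empty-expansion" | "unexpected";

// Hypothetical runner: receives the bash arguments, returns captured stdout.
type RunBash = (args: string[]) => string;

// Interpret the probe's stdout. Trailing CR/LF is ignored.
function classifyProbe(stdout: string): ProbeResult {
  const line = stdout.trim();
  if (line === "[probe]") return "ok";
  if (line === "[]") return "empty-expansion";
  return "unexpected";
}

// Run the probe through the supplied runner and classify the result.
function probeExpansion(run: RunBash): ProbeResult {
  return classifyProbe(run(PROBE_ARGS));
}
```

A result of "empty-expansion" corresponds to the failure described here; "unexpected" covers everything else (launcher errors, no distro installed, and so on) and should not be read as success.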
Additionally, even if variable expansion worked, WSL bash mounts C: at /mnt/c/ by default, not /c/ — so path conventions like /c/Dev/hcr/hcr-els (Git Bash / MSYS2 convention) don't resolve in WSL regardless of expansion.
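The two path conventions differ only in their prefix, which a small sketch can show. The helper names (splitDrive, toMsysPath, toWslPath) are made up for illustration, and the /mnt/ prefix is the WSL default mount root, which a user can change in their WSL configuration.

```typescript
// Split a drive-letter Windows path into a lowercase drive and a
// forward-slash remainder. Throws on anything that is not a drive path.
function splitDrive(winPath: string): { drive: string; rest: string } {
  const m = /^([A-Za-z]):[\\\/]*(.*)$/.exec(winPath);
  if (!m) throw new Error(`not a drive-letter path: ${winPath}`);
  return { drive: m[1].toLowerCase(), rest: m[2].replace(/\\/g, "/") };
}

// Git Bash / MSYS2 convention: C:\Dev\x -> /c/Dev/x
function toMsysPath(winPath: string): string {
  const { drive, rest } = splitDrive(winPath);
  return `/${drive}/${rest}`;
}

// WSL default convention: C:\Dev\x -> /mnt/c/Dev/x
function toWslPath(winPath: string): string {
  const { drive, rest } = splitDrive(winPath);
  return `/mnt/${drive}/${rest}`;
}

console.log(toMsysPath("C:\\Dev\\hcr\\hcr-els")); // prints /c/Dev/hcr/hcr-els
console.log(toWslPath("C:\\Dev\\hcr\\hcr-els"));  // prints /mnt/c/Dev/hcr/hcr-els
```

A workflow script that hard-codes the first form will not find the directory under the second, independent of the expansion problem.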
Why it's intermittent / hard to diagnose
Users who once ran their daemon from a context that had Git Bash early in PATH (VS Code integrated terminal with Git Bash default shell, a pre-configured PowerShell profile, an admin session with modified PATH, etc.) end up with a long-lived daemon that inherited that PATH. Bun's child_process.spawn for those daemons DOES find Git Bash (for reasons I don't fully understand — maybe bun on Windows uses different resolution than plain CreateProcess). Those daemons keep working fine indefinitely. When that daemon is eventually killed and restarted from a default PowerShell session, the new daemon hits this bug.
In our case: fires #1-20 of a long-running workflow all succeeded. Daemon restart at session end → fire #21 broke, reproduces every time afterward regardless of PATH modifications.
Expected behavior
Bash nodes execute their script with working ${VAR} expansion. The _DirExistenceCheck, wc -l < $TARGET/file, and similar simple bash idioms should behave identically to a macOS/Linux run.
Actual behavior
Every ${VAR} reference evaluates to empty. Paths built from variables are empty. Downstream ls, wc, etc. fail with "No such file or directory" errors. if [ ! -d "$EMPTY" ] is TRUE, causing early exit from defensive checks.
Suggested fixes (ordered by complexity)
Option A — resolve bash through PATH lookup explicitly, not via CreateProcess default search. Before spawning bash for a bash node, walk PATH in code and use the first bash.exe found, passed as an absolute path. Would bypass the System32-first quirk.
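A minimal sketch of what Option A could look like, assuming the resolver is given the PATH string and an existence check (for example process.env.PATH and fs.existsSync in a real caller). findOnPath and its skip parameter are hypothetical, not existing Archon code.

```typescript
type Exists = (file: string) => boolean;

// Walk PATH left to right and return the first existing candidate as an
// absolute path. `skip` lets the caller reject specific candidates.
function findOnPath(
  exe: string,
  pathVar: string,
  exists: Exists,
  skip: (candidate: string) => boolean = () => false,
): string | undefined {
  for (const raw of pathVar.split(";")) {
    // PATH entries may be quoted and may carry a trailing separator.
    const dir = raw.trim().replace(/^"|"$/g, "").replace(/[\\\/]+$/, "");
    if (dir === "") continue;
    const candidate = `${dir}\\${exe}`;
    if (skip(candidate)) continue;
    if (exists(candidate)) return candidate;
  }
  return undefined;
}
```

One caveat we have not verified across machines: if System32 appears in PATH ahead of Git Bash, a plain PATH walk still returns the WSL launcher. The skip parameter is there so a caller can exclude that candidate explicitly. The returned absolute path would then be passed to spawn in place of the bare 'bash'.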
Option B — prefer C:\Program Files\Git\bin\bash.exe on Windows when it exists. Git Bash is the de-facto standard for Windows dev shells and is what Archon workflow scripts target by convention (Unix paths, /c/ style). Hard-coding a check for Git Bash first on Windows would make the intended behavior the actual behavior.
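Option B could be as small as the sketch below. pickBash and GIT_BASH_CANDIDATES are hypothetical names; the first candidate path is the one from this report, and the second (the x86 install location) is an assumption added for illustration. The platform argument is expected to be a value like process.platform.

```typescript
type Exists = (file: string) => boolean;

// Known Git Bash locations to try, in order.
const GIT_BASH_CANDIDATES: string[] = [
  "C:\\Program Files\\Git\\bin\\bash.exe",       // path from this report
  "C:\\Program Files (x86)\\Git\\bin\\bash.exe", // assumption: 32-bit install
];

// On Windows, prefer an existing Git Bash; everywhere else (and as a
// fallback) keep the bare command name and let the OS resolve it.
function pickBash(platform: string, exists: Exists): string {
  if (platform !== "win32") return "bash";
  for (const candidate of GIT_BASH_CANDIDATES) {
    if (exists(candidate)) return candidate;
  }
  return "bash";
}
```

Falling back to the bare name keeps today's behavior on machines without Git for Windows, so this changes nothing for users who are not affected.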
Option C — document the requirement and provide a setup check. Least invasive: at daemon startup, detect the bash.exe that will be spawned (via CreateProcess search order emulation), verify it's not System32's WSL launcher, emit a loud warning + doc link to install Git Bash if the check fails. Users can then install Git Bash and be directed to fix their setup.
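The startup check in Option C might be sketched as follows. isWslLauncher and startupWarning are hypothetical names, the resolved path is assumed to come from some emulation of the search order, and systemRoot stands for the value of the SystemRoot environment variable (typically C:\Windows). The warning text is a placeholder.

```typescript
// Normalize a Windows path for comparison: backslashes, no trailing
// separator, lowercase (Windows paths are case-insensitive).
function normalizeWin(p: string): string {
  return p.replace(/\//g, "\\").replace(/\\+$/, "").toLowerCase();
}

// True when the resolved bash.exe is the one in <SystemRoot>\System32.
function isWslLauncher(resolved: string, systemRoot: string): boolean {
  return normalizeWin(resolved) === normalizeWin(systemRoot) + "\\system32\\bash.exe";
}

// Return a warning to log at daemon startup, or undefined when the
// resolved bash looks fine.
function startupWarning(
  resolved: string | undefined,
  systemRoot: string,
): string | undefined {
  if (resolved === undefined) {
    return "No bash.exe found; bash nodes will fail. Install Git for Windows.";
  }
  if (isWslLauncher(resolved, systemRoot)) {
    return (
      `bash resolves to the WSL launcher (${resolved}); ` +
      "variable expansion in bash nodes will not work. Install Git for Windows."
    );
  }
  return undefined;
}
```

Because the check only compares paths, it is cheap enough to run on every daemon start.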
Option A is the cleanest. Option B is a shortcut that works for the vast majority of real Windows setups. Option C changes no spawn behavior: it only detects the problem at startup and tells the user how to fix it.
Workaround we adopted (in our fork-equivalent use)
None of the user-facing workarounds work reliably:
- $env:Path modification in PS parent session → Bun still resolves via CreateProcess, which searches System32 before PATH
- Git Bash terminal running daemon → same, Bun uses CreateProcess
- bun install / bun link refresh → unrelated, doesn't touch bash resolution
Planned workaround until fix lands: place a symlink bash.exe → Git Bash inside Archon's own directory so CreateProcess (#2 current directory) finds it before System32. Hacky but works locally.
Environment
- Windows 11 Pro 10.0.26200
- PowerShell 7.6.0 Core
- Bun 1.3.12
- Archon main (83c119a, tested also on d89bc76 — identical failure, commit-agnostic)
- WSL present (Ubuntu distro), required for other tools (Codex CLI)
- Git for Windows installed with C:\Program Files\Git\bin\bash.exe, but NOT on system PATH by default
Repro timeline
Full debug log is in our fleet journal ([SQI#348-R3, L155 when banked], 14-step scientific-method isolation that ruled out Archon code regression, YAML changes, my shell-tool choice, and Git Bash vs WSL daemon context — eventually narrowing to CreateProcess System32 priority). Happy to provide the full debug thread if useful for regression-test creation.
Co-Authored-By: Claude Opus 4.7 (1M context) noreply@anthropic.com