fix(workflows): resolve bash via absolute path on Windows (#1326)#1470
fix(workflows): resolve bash via absolute path on Windows (#1326)#1470atlas-architect wants to merge 8 commits into
Conversation
) Fixes coleam00#1326. On Windows, `child_process.execFile('bash', ...)` fails to use Git Bash even when Git Bash is first on PATH. Windows CreateProcess searches the System32 directory BEFORE consulting the PATH env var, so bare `'bash'` resolves to `C:\Windows\System32\bash.exe` — the WSL launcher. The WSL bash has two pathologies that break workflow bash nodes: 1. `${VAR}` expansion is broken when bash is invoked in `-c` mode via CreateProcess arg-passing. Variables emit as empty strings regardless of environment. 2. Path convention is `/mnt/c/...` rather than Git Bash's `/c/...`, so any absolute path substitutions (worktree paths, cwd, repo paths) mismatch expectations of downstream bash nodes. Symptom seen in A00 workflow: `precheck-worktree` node fails fast with TARGET empty, even though the token substitution correctly wrote TARGET=<path> into the script. Add `resolveBashPath()` helper in `@archon/git/exec` that returns: - `process.env.ARCHON_BASH_PATH` if set (escape hatch for non-standard Git Bash installs, e.g. user-scope installer at `%LOCALAPPDATA%\Programs\Git\bin\bash.exe`) - `C:\Program Files\Git\bin\bash.exe` on Windows (Git Bash default) - `bash` on Linux/macOS (unchanged PATH behavior) Update the two `execFileAsync('bash', ...)` call sites in `dag-executor.ts` (bash node + loop node until-bash check) to call through `resolveBashPath()`. Local smoke test on Windows 11 + Git Bash 2.47: ``` Resolved bash: C:\Program Files\Git\bin\bash.exe TARGET_FROM_PARENT=test-value-123 ← ${VAR} expansion works PATH_CONVENTION=/c/Dev/platform/Archon/... ← /c/ convention preserved ``` Before the patch (bare `'bash'`), same test returned: ``` Resolved bash: bash TARGET_FROM_PARENT= ← WSL bash dropped the variable PATH_CONVENTION=/mnt/c/Dev/... ← WSL mount convention ``` Type-check green across all 9 packages. - Linux/macOS: no behavioral change (still resolves `bash` via PATH). - Windows with standard Git for Windows install: works out of the box. - Windows with user-scope Git install: override via ARCHON_BASH_PATH. - Docker containers (Linux): no change needed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Caution Review failedFailed to post review comments 📝 WalkthroughWalkthroughAdds a platform-aware exported utility ChangesPrimary changes
Sequence Diagram(s)sequenceDiagram
participant Executor as Dag Executor
participant GitPkg as git.resolveBashPath()
participant Spawn as execFileAsync
participant OS as System (bash)
Executor->>GitPkg: request bash path
GitPkg-->>Executor: return resolved path (ARCHON_BASH_PATH or platform default)
Executor->>Spawn: execFileAsync(resolvedPath, args, env)
Spawn->>OS: spawn process at resolvedPath
OS-->>Spawn: exit code / errors
Spawn-->>Executor: result / ENOENT|EACCES -> error thrown or logged
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related issues
Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
packages/workflows/src/dag-executor.ts (2)
1323-1327:⚠️ Potential issue | 🟡 MinorUpdate ENOENT message to reflect resolved absolute bash path.
After Line 1323, the ENOENT branch (Line 1376) can be inaccurate because resolution may use an absolute path, not PATH search. This makes troubleshooting harder.
💡 Suggested patch
- const { stdout, stderr } = await execFileAsync(resolveBashPath(), ['-c', finalScript], { + const bashPath = resolveBashPath(); + const { stdout, stderr } = await execFileAsync(bashPath, ['-c', finalScript], { cwd, timeout, env: subprocessEnv, }); @@ - } else if (err.message?.includes('ENOENT')) { - errorMsg = `Bash node '${node.id}' failed: bash executable not found in PATH`; + } else if (err.message?.includes('ENOENT')) { + errorMsg = `Bash node '${node.id}' failed: bash executable not found at '${bashPath}'. Set ARCHON_BASH_PATH if Git Bash is installed elsewhere.`;As per coding guidelines, "handle expected failures gracefully ... surface ... errors to users for actionable issues."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/workflows/src/dag-executor.ts` around lines 1323 - 1327, The catch/ENOENT branch that handles failures from the execFileAsync call using resolveBashPath() should include the resolved absolute bash path in its error message to avoid misleading PATH-based suggestions; update the error handling around the execFileAsync(resolveBashPath(), ['-c', finalScript], {cwd, timeout, env: subprocessEnv}) invocation to capture the value returned by resolveBashPath() (store it in a local variable) and, when errno === 'ENOENT', log/throw an error that mentions that exact resolved path (and the cwd) so users see the absolute path that was attempted rather than implying a PATH search.
2111-2123:⚠️ Potential issue | 🟠 MajorDon’t swallow
until_bashinfrastructure errors as “condition not met.”At Line 2111, system-level execution failures (e.g.,
ENOENT,EACCES) are treated like a normal false condition. That can hide environment breakage and waste iterations instead of failing immediately.💡 Suggested patch
- await execFileAsync(resolveBashPath(), ['-c', substitutedBash], { cwd }); + const bashPath = resolveBashPath(); + await execFileAsync(bashPath, ['-c', substitutedBash], { cwd }); bashComplete = true; // exit 0 = complete } catch (e) { const bashErr = e as NodeJS.ErrnoException; - // ENOENT or other system errors are unexpected — log them - if (bashErr.code === 'ENOENT') { - getLog().warn( - { err: bashErr, nodeId: node.id, iteration: i }, - 'loop_node.until_bash_exec_error' - ); - } - bashComplete = false; // non-zero exit = not complete + if (bashErr.code === 'ENOENT' || bashErr.code === 'EACCES') { + throw new Error( + `Loop node '${node.id}' until_bash failed: cannot execute '${bashPath}'. Set ARCHON_BASH_PATH if needed.` + ); + } + // Non-zero exit from bash script means completion condition is false. + bashComplete = false; }As per coding guidelines, "handle expected failures gracefully ... surface ... errors to users for actionable issues."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/workflows/src/dag-executor.ts` around lines 2111 - 2123, The current catch in the until_bash execution swallows system-level errors (caught around execFileAsync/resolveBashPath) and treats them as a simple "condition not met" by setting bashComplete=false; instead detect infrastructure errors (e.g., bashErr.code === 'ENOENT' or 'EACCES'), log them as errors via getLog() including node.id and iteration i, and rethrow or propagate the error so the DAG run fails immediately rather than looping; update the catch handling around execFileAsync/resolveBashPath to only treat non-system (command exit) failures as condition=false (bashComplete=false) while surfacing system-level errors upward.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@packages/workflows/src/dag-executor.ts`:
- Around line 1323-1327: The catch/ENOENT branch that handles failures from the
execFileAsync call using resolveBashPath() should include the resolved absolute
bash path in its error message to avoid misleading PATH-based suggestions;
update the error handling around the execFileAsync(resolveBashPath(), ['-c',
finalScript], {cwd, timeout, env: subprocessEnv}) invocation to capture the
value returned by resolveBashPath() (store it in a local variable) and, when
errno === 'ENOENT', log/throw an error that mentions that exact resolved path
(and the cwd) so users see the absolute path that was attempted rather than
implying a PATH search.
- Around line 2111-2123: The current catch in the until_bash execution swallows
system-level errors (caught around execFileAsync/resolveBashPath) and treats
them as a simple "condition not met" by setting bashComplete=false; instead
detect infrastructure errors (e.g., bashErr.code === 'ENOENT' or 'EACCES'), log
them as errors via getLog() including node.id and iteration i, and rethrow or
propagate the error so the DAG run fails immediately rather than looping; update
the catch handling around execFileAsync/resolveBashPath to only treat non-system
(command exit) failures as condition=false (bashComplete=false) while surfacing
system-level errors upward.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: dec77c3a-d03c-4163-a0f4-8eec0cc9671f
📒 Files selected for processing (3)
packages/git/src/exec.tspackages/git/src/index.tspackages/workflows/src/dag-executor.ts
… cause Three fixes on top of resolveBashPath introduction: 1. **Test fix (the Windows-CI cascade root cause):** dag-executor.test.ts line 1236 asserted literal `'bash'` for the bash subprocess call. Our patch returns `resolveBashPath()` which is `'bash'` on Linux/macOS but `'C:\Program Files\Git\bin\bash.exe'` on Windows. The assertion failed on Windows, but the spy's `mockRestore()` was called on the line AFTER the assertion — so when expect() threw, the spy was never restored and bled into subsequent tests, causing 7 unrelated script-node tests to fail with mysterious "Use: ok" / "ok" outputs from the leaked spy. Wrapped expect + mockRestore in try/finally and updated the expected first arg to `git.resolveBashPath()` for platform-awareness. 2. **CodeRabbit minor (line 1322-1377):** Capture `bashPath` once at the top of executeBashNode + update the ENOENT error message to include the resolved absolute path + suggest ARCHON_BASH_PATH for non-standard Git installs. Resolves the misleading "not found in PATH" message that no longer reflects how bash is actually being resolved. 3. **CodeRabbit major (line 2112-2128):** Loop node `until_bash` catch was treating ENOENT/EACCES as "condition not met" — silently looping against a broken bash binary. Now distinguishes system-level errors (ENOENT/EACCES = throw immediately, surface env breakage) from non-zero script exit (= condition false, keep looping). Local Windows verification: `bun test src/dag-executor.test.ts` = 199 pass, 0 fail. Type-check green across all 10 packages. Refs: coleam00#1326 (the Windows bash bug this PR fixes), this PR's CodeRabbit review (coleam00#1470). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Pushed 1. Test fix (the Windows-CI cascade root cause): the 2. CodeRabbit minor (line 1322-1377): captured 3. CodeRabbit major (line 2112-2128): loop node Local Windows verification: Thanks @coderabbitai for the review — the ENOENT message + loop-error-handling catches were both real bugs. |
|
Triggering a re-review of the updated commit. ✏️ Learnings added
🧠 Learnings used✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
packages/workflows/src/dag-executor.ts (1)
2122-2125: Align the new log event key with the repo’s structured naming convention.
loop_node.until_bash_exec_errordoes not follow{domain}.{action}_{state}with standard states (e.g.,_failed). Consider renaming to something likeloop.until_bash_failed.As per coding guidelines: "Structured logging with Pino: use event naming format
{domain}.{action}_{state}(standard states: _started, _completed, _failed, _validated, _rejected)".🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/workflows/src/dag-executor.ts` around lines 2122 - 2125, The log event key 'loop_node.until_bash_exec_error' doesn't follow the repo's structured naming ({domain}.{action}_{state}); update the call to getLog().error(...) to use a compliant event key such as 'loop.until_bash_failed' (preserving the same metadata: err: bashErr, nodeId: node.id, iteration: i) so callers still receive bashErr, node.id and i but the event name matches the {domain}.{action}_{state} convention.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/workflows/src/dag-executor.ts`:
- Around line 1376-1378: The ENOENT branch currently emits a Windows-specific
hint and recomputes the bash path; update it to reuse the already-resolved
loopBashPath variable (instead of recomputing bashPath) and emit a
platform-aware message: on Windows mention setting ARCHON_BASH_PATH and an
example Git Bash path, on POSIX mention installing bash or ensuring it’s on
PATH; include the node id (node.id) and the loopBashPath value in the errorMsg
for clear diagnostics.
---
Nitpick comments:
In `@packages/workflows/src/dag-executor.ts`:
- Around line 2122-2125: The log event key 'loop_node.until_bash_exec_error'
doesn't follow the repo's structured naming ({domain}.{action}_{state}); update
the call to getLog().error(...) to use a compliant event key such as
'loop.until_bash_failed' (preserving the same metadata: err: bashErr, nodeId:
node.id, iteration: i) so callers still receive bashErr, node.id and i but the
event name matches the {domain}.{action}_{state} convention.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 8e9c5e69-7996-4e80-9a85-e2afdaf88992
📒 Files selected for processing (2)
packages/workflows/src/dag-executor.test.tspackages/workflows/src/dag-executor.ts
| } else if (err.message?.includes('ENOENT')) { | ||
| errorMsg = `Bash node '${node.id}' failed: bash executable not found in PATH`; | ||
| errorMsg = `Bash node '${node.id}' failed: bash executable not found at '${bashPath}'. Set ARCHON_BASH_PATH if Git Bash is installed elsewhere (e.g. user-scope installer at %LOCALAPPDATA%\\Programs\\Git\\bin\\bash.exe).`; | ||
| } else if (err.message?.includes('EACCES')) { |
There was a problem hiding this comment.
Use platform-aware ENOENT hints and reuse the already-resolved bash path.
Current ENOENT text is Windows-specific on all platforms, and Line 2127 recomputes the path instead of using loopBashPath (can desync diagnostics if env changes).
🔧 Suggested patch
+function getBashResolutionHint(): string {
+ return process.platform === 'win32'
+ ? 'Set ARCHON_BASH_PATH if Git Bash is installed elsewhere (e.g. user-scope installer at %LOCALAPPDATA%\\Programs\\Git\\bin\\bash.exe).'
+ : "Ensure 'bash' is installed or set ARCHON_BASH_PATH to a valid bash executable.";
+}
+
...
- } else if (err.message?.includes('ENOENT')) {
- errorMsg = `Bash node '${node.id}' failed: bash executable not found at '${bashPath}'. Set ARCHON_BASH_PATH if Git Bash is installed elsewhere (e.g. user-scope installer at %LOCALAPPDATA%\\Programs\\Git\\bin\\bash.exe).`;
+ } else if (err.message?.includes('ENOENT')) {
+ errorMsg = `Bash node '${node.id}' failed: bash executable not found at '${bashPath}'. ${getBashResolutionHint()}`;
...
- throw new Error(
- `Loop node '${node.id}' until_bash failed: cannot execute bash at '${resolveBashPath()}' (${bashErr.code}). Set ARCHON_BASH_PATH if Git Bash is installed elsewhere.`
- );
+ throw new Error(
+ `Loop node '${node.id}' until_bash failed: cannot execute bash at '${loopBashPath}' (${bashErr.code}). ${getBashResolutionHint()}`
+ );Also applies to: 2126-2128
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/workflows/src/dag-executor.ts` around lines 1376 - 1378, The ENOENT
branch currently emits a Windows-specific hint and recomputes the bash path;
update it to reuse the already-resolved loopBashPath variable (instead of
recomputing bashPath) and emit a platform-aware message: on Windows mention
setting ARCHON_BASH_PATH and an example Git Bash path, on POSIX mention
installing bash or ensuring it’s on PATH; include the node id (node.id) and the
loopBashPath value in the errorMsg for clear diagnostics.
Review SummaryVerdict: minor-fixes-needed Good fix — the Windows bash path resolution is solid and the WSL mis-resolution is a real pain point. Three correctness issues in error-handling paths and one missing docs entry need to be addressed before merge. None are large, but the Blocking issues
Suggested fixes
Minor / nice-to-have
Compliments
Reviewed via maintainer-review-pr workflow (Pi/Minimax). Aspects run: code-review, error-handling, test-coverage, comment-quality, docs-impact. |
|
@atlas-architect related to #1326 — Windows bash path resolution. |
|
@atlas-architect this PR looks similar to #1342, which was closed on 2026-04-28 (closed without merging). You may want to read the discussion there before pushing further. |
Review SummaryVerdict: minor-fixes-needed This PR lands six focused changes — most importantly, the Windows Git Bash path resolution fix ( Blocking issues(none) Suggested fixes
Minor / nice-to-have
Compliments
Reviewed via maintainer-review-pr workflow (Pi/Minimax). Aspects run: code-review, error-handling, test-coverage, comment-quality, docs-impact. |
…dows-bash-resolve-path-dev
…lve-path-dev # Conflicts: # CHANGELOG.md
Wirasm review 2026-05-13 + CodeRabbit auto-review feedback close-out.
Suggested fixes (3 substantive):
- test/loader: add node-level provider validation test asserting error
identifies both the node id and the unknown provider id
- git/exec: eagerly validate ARCHON_BASH_PATH override with existsSync;
throws clear error if path doesn't exist (catches the typo class at
resolveBashPath() call time instead of as opaque ENOENT inside the
first bash-node fire). Best-effort check — a path that exists at
validate-time can still race-disappear before exec, but the common
case (env-var typo) surfaces immediately.
- docs/authoring-workflows: add mutates_checkout: false entry after the
tags: block; calls out distinction from worktree.enabled: false
Minor cleanup (3 polish + 1 CodeRabbit):
- executor-shared: drop direct fresh_context field-name reference from
$LOOP_PREV_OUTPUT comment per Wirasm
- loader: update tags parenthetical "(suppresses keyword inference in
the UI)" → "(opt-out of any auto-inference)" — UI-agnostic phrasing
- dag-executor: rename log event 'loop_node.until_bash_exec_error' →
'loop.until_bash_failed' to match repo's {domain}.{action}_{state}
structured-logging convention (CodeRabbit Nitpick)
Verified passing on individual run: new node-level provider test +
existing 'reject unknown provider' test + existing mutates_checkout
path-lock tests. (Full suite has 5 pre-existing failures in
interactive-loop / cancel-node mock-isolation tests — they pass when
run individually with `-t`; unrelated to this PR.)
Items already addressed in prior commits:
- Test for mutates_checkout: false DB-skip (executor.test.ts:302 has
'skips path-lock check when mutates_checkout is false')
- Platform-aware ENOENT message (dag-executor.ts:1414 includes
%LOCALAPPDATA% Git-Bash example path)
- Docs error example syncs to actual loader.ts emit format
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
@Wirasm thanks for the thorough review — apologies for the 13-day gap (our fault entirely; we've upgraded our daily-ritual to surface stale-review-pending PRs after this). Pushed Suggested fixes (substantive):
Minor / nice-to-have:
CodeRabbit Nitpick:
Already addressed in earlier commits (verified during this pass):
Verification: new node-level test passes individually + workflow-level Ready for re-review. |
|
This PR has a lot of unrelated changes, eg archon.rb, version bump etc, please strip out anything that doesnt have with the solution to do and reopen please |
|
Got it, @Wirasm — apologies for the scope creep. Filed slim replacement at #1779 with ONLY the bash-path fix + your 2026-04-29 review feedback applied. Stripped from #1470 (everything unrelated): Kept in #1779 (the actual solution): 6 files / +116 / -18 lines. All your 2026-04-29 review fixes addressed — see the PR body for the full checklist. Closing this; review continues on #1779. Thanks for the patient feedback throughout — happy to keep iterating. |
Fixes #1326.
Summary
child_process.execFile('bash', ...)resolves toC:\Windows\System32\bash.exe(WSL launcher) instead of Git Bash, because Windows CreateProcess searches System32 BEFORE PATH.${VAR}expansion in-cmode and uses/mnt/c/paths instead of Git Bash's/c/, breaking every workflow bash node whose token substitution depends on env vars or path strings. Archon CLI on Windows fails with confusing "empty variable" symptoms.resolveBashPath()helper in@archon/git/execreturnsARCHON_BASH_PATHenv override ->C:\Program Files\Git\bin\bash.exeWindows default ->bash(Linux/macOS PATH unchanged). BothexecFileAsync('bash', ...)sites indag-executor.ts(bash node + loop node until-bash check) call through the helper.Label Snapshot
risk: lowsize: Scoregit:exec(helper) +workflows:executor(call sites)Change Metadata
bugcoreLinked Issue
bashresolves to WSL launcher (System32\bash.exe) — $VAR expansion broken in-cmode #1326maintodev)Validation Evidence (required)
bun run lint,bun run format:check,bun run testnot run by submitter - fix is surgical (1 helper function + 2 call-site updates). Happy to run on request, or rely on CI.Security Impact (required)
process.env.ARCHON_BASH_PATHif set; no file system writes - only spawns the existing bash binary at a more deterministic path)Compatibility / Migration
bashvia PATH)ARCHON_BASH_PATHis a NEW optional env var; defaults work for standard Git for Windows installs.Human Verification (required)
What was personally validated beyond CI:
Verified scenarios:
${VAR}expansion works,/c/path convention preserved'bash') -> confirmed empty${VAR}+/mnt/c/pathsEdge cases checked:
%LOCALAPPDATA%\Programs\Git\...) -> flagged asARCHON_BASH_PATHuse case'bash') -> unchanged PATH behaviorWhat was NOT verified:
'bash'on non-Windows)Smoke test output (after patch):
Before the patch (bare
'bash'):Side Effects / Blast Radius (required)
bash:nodes when CLI runs on Windows. Loop nodes withuntil-bashconditions.bash.exe(higher on PATH) will now resolve to Git Bash specifically. Was non-deterministic before.ARCHON_BASH_PATHis the override.Rollback Plan (required)
git revert <commit-sha>- single commit, no migration. Reverting reinstates bare'bash'resolution behavior.ARCHON_BASH_PATH=bashenv var would partially undo the fix at runtime without code change.${VAR},/mnt/c/paths). Linux/macOS unaffected.Risks and Mitigations
C:\Program Files\Git\bin\bash.exemay not exist on user-scope installer or Windows Server with no Git for Windows.ARCHON_BASH_PATHenv var provides clean override. Helper falls through to PATH search if hardcoded path doesn't exist (preserves current behavior in that edge case). Open to walking PATH manually for the first non-System32bash.exeif reviewer prefers a fancier heuristic - env-var-plus-default kept the fix surgical.execFileAsync('bash', ...)site without going throughresolveBashPath().@archon/git/execfor discoverability. Could add ESLint rule banning bare'bash'literal in future hardening pass if recurrence is observed.Notes for reviewers
The Windows default (
C:\Program Files\Git\bin\bash.exe) is the standard install location for Git for Windows. If you run into edge cases on Windows Server or custom Git installs,ARCHON_BASH_PATHgives a clean override without requiring a new patch.Happy to iterate on the detection heuristic if you'd prefer something fancier (e.g. walking PATH manually to find the first non-System32
bash.exe), but the env-var-plus-sensible-default approach keeps the fix surgical and overridable.Summary by CodeRabbit
New Features
Bug Fixes
Documentation
Chores