Skip to content

shell: fall back to sh when SHELL is /usr/bin/false or nologin#69308

Merged
steipete merged 2 commits intoopenclaw:mainfrom
sk7n4k3d:fix/shell-fallback-non-interactive
Apr 21, 2026
Merged

shell: fall back to sh when SHELL is /usr/bin/false or nologin#69308
steipete merged 2 commits intoopenclaw:mainfrom
sk7n4k3d:fix/shell-fallback-non-interactive

Conversation

@sk7n4k3d
Copy link
Copy Markdown
Contributor

Summary

  • Fix getShellConfig() so exec no longer exits with code 1 when process.env.SHELL is a non-interactive placeholder such as /usr/bin/false, /bin/false, /usr/sbin/nologin, or /sbin/nologin.
  • Apply the same filter to detectRuntimeShell() so runtime classification does not leak the placeholder name.
  • Prefer a resolved sh (then bash) from PATH; fall back to bare sh only when nothing is on PATH.

Problem — fixes #69077

On macOS 15, OpenClaw Gateway deployed as a custom LaunchDaemon under a dedicated service user whose login shell is /usr/bin/false has every exec return exitCode=1 after 3-10ms, with the target script never actually running. The workaround was to set SHELL=/bin/sh explicitly in the LaunchDaemon EnvironmentVariables plist.

Root cause: src/agents/shell-utils.ts#getShellConfig reads process.env.SHELL directly and, when non-empty, uses it verbatim with -c. Spawning /usr/bin/false -c '<script>' exits 1 immediately because false is not a shell.

Fix

  • Treat shells whose basename is false or nologin as non-interactive placeholders and ignore them.
  • When SHELL is unset or a placeholder, resolve sh from PATH (then bash), and fall back to bare "sh" only if nothing is found — matches the previous "unset" branch.
  • Same filter is applied to detectRuntimeShell() so /usr/bin/false does not bubble up as a runtime shell name.

Not changed

  • SHELL=/usr/bin/fish path is untouched (still prefers bash then sh).
  • The OPENCLAW_SHELL override in detectRuntimeShell() is unchanged and still wins; users who deliberately configure a non-standard binary can keep using it.
  • resolveShellFromPath helper is unchanged.

Testing

  • pnpm test src/agents/shell-utils.test.ts — 14/14 green (3 new cases: SHELL=/usr/bin/false with sh on PATH; SHELL=/sbin/nologin with sh on PATH; placeholder + empty PATH).
  • pnpm test src/agents/shell-utils.test.ts src/agents/bash-tools.exec.path.test.ts — 40/40 green.
  • pnpm lint src/agents/shell-utils.ts src/agents/shell-utils.test.ts — 0 warnings / 0 errors.
  • pnpm tsgo:core and pnpm tsgo:core:test — clean.
  • pnpm format — applied.

Notes

AI-assisted: Claude Code (Opus 4.7), human reviewed and tested. Scope stays within src/agents/shell-utils.{ts,test.ts}; no behavior change for users whose SHELL is a real shell.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Apr 20, 2026

Greptile Summary

This PR fixes a real production issue where processes running under a service account with SHELL=/usr/bin/false caused every exec to exit immediately with code 1. The fix introduces isNonInteractiveShell() to detect placeholder shells by basename (false, nologin) and applies it in both getShellConfig() and detectRuntimeShell(), falling back to a PATH-resolved sh/bash (or bare "sh") instead of invoking the placeholder. The implementation is clean and the three new tests cover the key cases.

Confidence Score: 5/5

Safe to merge — the fix is minimal, well-scoped, and all existing plus new tests pass.

The logic change is correct: placeholder basenames are filtered before any shell invocation, the fallback path reuses the existing resolveShellFromPath helper, and the fish-shell path is untouched. The only finding is a P2 suggestion to add a test for the detectRuntimeShell code path, which does not block merge.

No files require special attention.

Comments Outside Diff (1)

  1. src/agents/shell-utils.test.ts, line 1-6 (link)

    P2 No test coverage for detectRuntimeShell change

    The PR adds a placeholder-shell guard to detectRuntimeShell() (line 136 of shell-utils.ts), but the test file only imports and exercises getShellConfig, resolvePowerShellPath, and resolveShellFromPath. A test asserting that detectRuntimeShell() returns undefined (or falls through to env-var detection) when SHELL=/usr/bin/false would give the same confidence for that branch as the three new getShellConfig tests do for theirs.

    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: src/agents/shell-utils.test.ts
    Line: 1-6
    
    Comment:
    **No test coverage for `detectRuntimeShell` change**
    
    The PR adds a placeholder-shell guard to `detectRuntimeShell()` (line 136 of `shell-utils.ts`), but the test file only imports and exercises `getShellConfig`, `resolvePowerShellPath`, and `resolveShellFromPath`. A test asserting that `detectRuntimeShell()` returns `undefined` (or falls through to env-var detection) when `SHELL=/usr/bin/false` would give the same confidence for that branch as the three new `getShellConfig` tests do for theirs.
    
    How can I resolve this? If you propose a fix, please make it concise.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/agents/shell-utils.test.ts
Line: 1-6

Comment:
**No test coverage for `detectRuntimeShell` change**

The PR adds a placeholder-shell guard to `detectRuntimeShell()` (line 136 of `shell-utils.ts`), but the test file only imports and exercises `getShellConfig`, `resolvePowerShellPath`, and `resolveShellFromPath`. A test asserting that `detectRuntimeShell()` returns `undefined` (or falls through to env-var detection) when `SHELL=/usr/bin/false` would give the same confidence for that branch as the three new `getShellConfig` tests do for theirs.

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "shell: fall back to sh when SHELL is /us..." | Re-trigger Greptile

@steipete steipete force-pushed the fix/shell-fallback-non-interactive branch from e15950b to 39dde04 Compare April 21, 2026 03:52
@steipete steipete force-pushed the fix/shell-fallback-non-interactive branch from 39dde04 to a221f51 Compare April 21, 2026 04:26
@steipete steipete force-pushed the fix/shell-fallback-non-interactive branch from a221f51 to 10197ce Compare April 21, 2026 04:40
@steipete steipete force-pushed the fix/shell-fallback-non-interactive branch from 10197ce to aadb99a Compare April 21, 2026 04:59
@steipete steipete force-pushed the fix/shell-fallback-non-interactive branch from aadb99a to 0dc6f04 Compare April 21, 2026 05:12
@steipete steipete force-pushed the fix/shell-fallback-non-interactive branch from 0dc6f04 to a3defb0 Compare April 21, 2026 07:12
@steipete steipete merged commit 2ad7bd0 into openclaw:main Apr 21, 2026
92 checks passed
@steipete
Copy link
Copy Markdown
Contributor

Thanks @sk7n4k3d, landed in a3defb0.

Validation before merge:

  • pnpm test src/agents/shell-utils.test.ts
  • pnpm check:changed
  • GitHub checks green, including GPT-5.4 / Opus 4.6 parity gate

I rebased over current main and kept the changelog entry with the other landed fixes.

zhongmairen pushed a commit to agencyos-cn/openclaw-bad that referenced this pull request Apr 21, 2026
* 'main' of https://github.com/openclaw/openclaw: (653 commits)
  docs(changelog): deduplicate openclaw#67800 entries in Unreleased (openclaw#69670)
  fix(agents): honor explicit long Anthropic cache TTL on custom hosts (openclaw#67800)
  fix: fix Telegram media file delivery (openclaw#69641)
  fix(media): preserve outbound attachment filenames
  fix(media): parse lowercase media directives
  fix(bluebubbles): add opt-in coalesceSameSenderDms for split-send DMs (openclaw#69258)
  fix: centralize provider thinking profiles
  docs: prepare 2026.4.20 changelog
  fix: stage ACP and Codex runtime deps
  fix(gateway): drop stale service env on reinstall
  test: add bundled channel dependency Docker smoke
  test: relax detached task recovery timing assertion
  fix: ignore placeholder shells in runtime detection (openclaw#69308)
  shell: fall back to sh when SHELL is /usr/bin/false or nologin
  fix(browser): clarify DevToolsActivePort attach failures
  fix: sanitize mcp transport warning fields
  fix: launch Windows startup gateway directly
  fix(openai-codex): normalize legacy copilot transport
  fix: narrow MCP stdio env safety filter (openclaw#69540)
  fix(mcp): block dangerous stdio env overrides
  ...
gdibble pushed a commit to gdibble/openclaw that referenced this pull request Apr 21, 2026
ogt-redknie pushed a commit to ogt-redknie/OPENX that referenced this pull request May 2, 2026
zhonghe0615 pushed a commit to zhonghe0615/openclaw that referenced this pull request May 7, 2026
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 9, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling size: S

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: macOS LaunchDaemon exec exits immediately with code 1 when service user shell is /usr/bin/false

2 participants