Skip to content

fix(approval): detect shell-expanded command names#40663

Open
xy200303 wants to merge 5 commits into
NousResearch:mainfrom
xy200303:fix/approval-shell-expansion-bypass
Open

fix(approval): detect shell-expanded command names#40663
xy200303 wants to merge 5 commits into
NousResearch:mainfrom
xy200303:fix/approval-shell-expansion-bypass

Conversation

@xy200303

@xy200303 xy200303 commented Jun 6, 2026

Copy link
Copy Markdown
Contributor

Summary

Plain-language model

The fix is intentionally conservative: it does not try to parse all of shell, and it never executes shell code. It only asks one question that matters for approval safety: what word is the shell about to treat as the command name?

A shell command is like a sentence where the executable name is the verb. The dangerous part of this repro is not that the text contains rm; it is that the command-position word will become rm after shell spelling rules are applied.

Examples:

$(echo rm) -rf /

Here the command-position word is $(echo rm), so the detector creates a safe detection-only variant:

rm -rf /

That variant hits the existing hardline/dangerous rules.

sudo $(echo rm) -rf /

Here sudo is a wrapper, so the scanner skips it and inspects the next command-position word, which also deobfuscates to rm.

echo $(echo rm) -rf /

Here the command name is echo; $(echo rm) is just an argument. The patch deliberately does not rewrite that argument into a command name, so this stays unflagged by the new path.

The important part is not just deobfuscating rm; it is deobfuscating only at places where the shell would actually look for an executable.

Why this is larger than a simple normalizer change

The tempting fix is to globally strip \, '', "", or replace $(echo rm) across the whole command string. That catches the repro, but it also changes the meaning of data arguments. For example, echo $(echo rm) -rf / should not be treated as if the top-level command is rm -rf /, and quoted/logged examples should not accidentally get promoted into the hardline path.

This patch therefore does a small, non-executing shell-word scan and only creates extra detection variants for words that appear where a shell would parse an executable name:

  • start of a command
  • after command separators such as ;, &&, ||, |, and newlines
  • inside command substitutions where a nested command starts
  • after common wrappers such as sudo, env, exec, nohup, setsid, time, command, and builtin

That is why most of the diff is helper code: it is there to keep the fix scoped to command positions instead of rewriting arbitrary user data.

Main implementation points

  • _command_detection_variants() yields the original normalized command plus any command-name-deobfuscated variants.
  • _iter_shell_command_starts() finds shell command start positions while respecting quotes and simple substitutions.
  • _read_shell_word() reads one shell word without executing or resolving it.
  • _deobfuscate_shell_word_for_detection() collapses only syntax-level spelling tricks in that word:
    • r\m -> rm
    • r''m / r""m -> rm
    • $(echo rm) / `echo rm` -> rm for literal echo payloads
    • ${0/x/r}m -> rm for the issue repro's simple replacement shape
  • detect_hardline_command() and detect_dangerous_command() now run the existing compiled patterns over those variants.

Safety boundaries

  • No shell is executed and no environment values are expanded.
  • Command substitutions are only folded for simple literal echo payloads, not arbitrary nested commands.
  • Deobfuscation is applied to command-position words, not every token in the command string.
  • Regression tests include echo $(echo rm) -rf /, echo r\m -rf /, and echo r''m -rf / to document that argument-only text is not promoted into a hardline/dangerous command.

Verification

  • python -m pytest -o addopts="" tests\tools\test_approval.py tests\tools\test_hardline_blocklist.py -q
  • python -m pytest -o addopts="" tests\tools\test_command_guards.py -q
  • git diff --check

Notes

@alt-glitch alt-glitch added type/security Security vulnerability or hardening P1 High — major feature broken, no workaround comp/tools Tool registry, model_tools, toolsets tool/terminal Terminal execution and process management labels Jun 6, 2026
@liuhao1024

Copy link
Copy Markdown
Contributor

Verified: shell-expanded command name deobfuscation is correct and well-tested.

Reviewed the full diff — this is a solid security hardening. Key observations:

  1. _deobfuscate_shell_word_for_detection() correctly iterates (max 2 passes) to collapse nested shell quoting/escaping + simple $(echo ...) and ${var/pat/repl} expansions into their literal form. The convergence guard (if deobfuscated == previous: break) prevents infinite loops.

  2. _read_shell_word() properly skips $(...), ${...}, backtick substitutions, and quote-delimited spans without executing them — this is detection-only, not execution. The _scan_dollar_paren_end() handles balanced nesting with correct quote-awareness.

  3. _iter_shell_command_starts() correctly identifies command-position starts after ;, &&, ||, |, \n, and nested $(...). The seen set deduplicates overlapping starts (e.g., $(echo rm) yields both position 0 and the inner command start).

  4. _command_detection_variants() only yields deobfuscated variants that differ from the original — no unnecessary pattern re-evaluation.

  5. Test coverage is thorough: backslash escapes (r\m), empty-quote literals (r''m, r""m), $(echo rm) substitution, ${0/x/r}m parameter replacement, and the negative case (echo $(echo rm) stays as echo, not promoted to command).

  6. The 12-word prefix cap in _iter_shell_command_word_spans() prevents quadratic blowup on pathological inputs while still handling sudo -u user env VAR=val rm -rf /.

One minor observation: _iter_shell_command_starts() could yield a position inside a double-quoted $(...) context (the quote == '"' branch does starts.append(i + 2) for $(...) inside double quotes), but since the deobfuscation is detection-only and the outer quoting would prevent execution, this is a false-positive-safe over-approximation — not a bug.

LGTM from a security perspective.

@liuhao1024

Copy link
Copy Markdown
Contributor

Verified: shell-expanded command name deobfuscation is correct and well-tested.

Reviewed the full diff — this is a solid security hardening for the approval system's command detection layer. Key observations:

  1. _command_detection_variants() correctly generates deobfuscated variants of shell commands by collapsing quoting, backslash escapes, and simple $(echo ...) / ${var/pat/repl} expansions. The two-pass iteration in _deobfuscate_shell_word_for_detection() handles nested obfuscation (e.g., r''mrm).

  2. _iter_shell_command_word_spans() properly skips wrapper commands (sudo, env, exec, etc.) and their options to find the actual command name at any position. The _SUDO_OPTIONS_WITH_ARG set correctly handles -u, -g, etc. that consume the next token.

  3. _iter_shell_command_starts() handles compound commands (;, &&, ||, |, newlines) and nested $(...) substitutions, so $(echo rm) -rf / correctly identifies rm as the command name inside the substitution.

  4. Non-overreach: The test test_echo_argument_substitution_not_promoted_to_command verifies that echo $(echo rm) -rf / is NOT flagged — command-name deobfuscation only applies to command-position words, not ordinary arguments. This prevents false positives.

  5. Hardline blocklist integration: Both detect_dangerous_command() and detect_hardline_command() now iterate over variants, so all existing pattern checks automatically benefit from deobfuscation.

  6. Test coverage is thorough: backslash escapes, empty quote pairs, $(echo ...) substitution, ${0/x/r}m parameter replacement, and negative cases for echo arguments. The hardline blocklist tests add corresponding obfuscated dangerous commands and non-dangerous echo cases.

LGTM.

@xy200303

xy200303 commented Jun 7, 2026

Copy link
Copy Markdown
Contributor Author

Thanks for the detailed review. Yes, the double-quoted substitution case is intentionally conservative in the detection layer: it never executes shell code, and the regression tests keep argument-only cases like echo $(echo rm) -rf / from being promoted into command execution. Appreciate the careful check.

@egilewski egilewski left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Requesting changes.

The patch closes the direct #36846 repro for command-position shell expansion. On current main 9c9d9113a8ae8f85aaf1ebff74ddd1fdf0b5b9a8, string-only detector probes approved $(echo rm) -rf /, sudo $(echo rm) -rf /, and ${0/x/r}m -rf /; with this PR staged on top, those same probes are hardline-blocked.

However, the PR currently fails its own focused regression suite:

/home/mac/hermes-agent/.venv/bin/python -m pytest -o addopts='' -p no:cacheprovider tests/tools/test_hardline_blocklist.py -q
=> 2 failed, 106 passed

Both failures are the new allowlist cases in tests/tools/test_hardline_blocklist.py: echo r\m -rf / and echo r''m -rf / are hardline-blocked even though they are argument text. The root cause appears to be that _normalize_command_for_detection() still globally strips backslashes and empty quotes before the command-position scanner runs, so those inputs become echo rm -rf / and match the hardline regex.

Checked against PR head b2406bbe606c5d287f5748eb964b5a0fb0cddf16; git diff --cached --check HEAD passed.

Signed: GPT-5.5-xhigh in Codex

xy200303 and others added 2 commits June 9, 2026 11:16
Co-authored-by: egilewski <1078345+egilewski@users.noreply.github.com>
@xy200303

xy200303 commented Jun 9, 2026

Copy link
Copy Markdown
Contributor Author

Thanks for catching this. I reproduced the failure on the GitHub merge result rather than the standalone PR head, and updated the branch accordingly.

What changed:

  • Removed the global shell backslash / empty-quote stripping from _normalize_command_for_detection(), so argument text like echo r\m -rf / and echo r''m -rf / is no longer rewritten into echo rm -rf / before scanning.
  • Kept shell deobfuscation scoped to command-position words in _command_detection_variants(), so command-position forms like r\m -rf / and r''m -rf / are still blocked.
  • Preserved the current-main absolute HERMES_HOME config/env protection and made that rewrite understand Windows path separators too, so the cross-platform focused suite stays green.

Verification:

python -m pytest -o addopts="" -p no:cacheprovider tests\tools\test_hardline_blocklist.py -q
# 108 passed

python -m pytest -o addopts="" -p no:cacheprovider tests\tools\test_approval.py tests\tools\test_hardline_blocklist.py -q
# 321 passed

git diff --check
# passed

I also added Co-authored-by: egilewski <1078345+egilewski@users.noreply.github.com> to the follow-up commit for the fix direction from this review.

Co-authored-by: egilewski <1078345+egilewski@users.noreply.github.com>
@xy200303 xy200303 force-pushed the fix/approval-shell-expansion-bypass branch from 59902f0 to 79522af Compare June 9, 2026 03:40
@xy200303 xy200303 requested a review from egilewski June 9, 2026 03:41

@egilewski egilewski left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Requesting changes.

The branch update fixes the false-positive regression from the previous review: with the patch replayed on current GitHub main 3dcfbbfc4918e7a2138fa7107669d4f09c32bf0f, direct probes now hardline-block the original command-position forms $(echo rm) -rf /, backtick echo rm, ${0/x/r}m -rf /, and wrapper variants, while argument-only cases like echo r\m -rf / and echo r''m -rf / stay allowed. The focused tests now pass:

  • /home/mac/hermes-agent/.venv/bin/python -B -m pytest -o addopts='' -p no:cacheprovider tests/tools/test_hardline_blocklist.py tests/tools/test_approval.py tests/tools/test_command_guards.py -q -> 339 passed
  • git diff --check HEAD -> passed
  • coderabbit review --plain --base 3dcfbbfc4918e7a2138fa7107669d4f09c32bf0f --type uncommitted -> No findings

The remaining blocker is that the same command-position shell-expansion class is still bypassable with other literal substitutions/parameter forms. On the patched replay at PR head 79522afd8f998bb9d84008a3a898992c3f05da86, direct detect_dangerous_command, detect_hardline_command, and check_all_command_guards(..., "local") probes all returned non-dangerous/non-hardline/approved for:

  • $(printf rm) -rf /
  • $(printf %s rm) -rf /
  • $(printf r)m -rf /
  • $(echo -n rm) -rf /
  • ${unset:-rm} -rf /

Those are still command-position words that a shell can resolve to rm before the existing hardline rm -rf / floor sees them, so the PR currently narrows the repro to an echo-specific subset rather than closing the #36846 command-substitution/parameter-expansion bypass class.

Signed: GPT-5.5-xhigh in Codex

Co-authored-by: egilewski <1078345+egilewski@users.noreply.github.com>
@xy200303

xy200303 commented Jun 9, 2026

Copy link
Copy Markdown
Contributor Author

Thanks again, this was a good catch. I pushed a follow-up that keeps the command-position-only scanner but broadens the literal, non-executing shell-substitution folding.

What changed:

  • Added command-word folding for literal printf substitutions: $(printf rm), $(printf %s rm), and split forms like $(printf r)m.
  • Added support for echo -n rm inside command substitution.
  • Added support for simple default parameter expansion in command position, e.g. ${unset:-rm}.
  • Kept the folding scoped to command-position words only, with allowlist coverage for echo $(printf rm) -rf /, echo $(printf %s rm) -rf /, echo $(echo -n rm) -rf /, and echo ${unset:-rm} -rf /.

Verification:

python -B -m pytest -o addopts="" -p no:cacheprovider tests\tools\test_hardline_blocklist.py tests\tools\test_approval.py tests\tools\test_command_guards.py -q
# 353 passed

python -m pytest -o addopts="" -p no:cacheprovider tests\tools\test_approval.py tests\tools\test_hardline_blocklist.py -q
# 335 passed

git diff --check
# passed

The follow-up commit also includes the same Co-authored-by trailer for your review-driven fix direction.

@xy200303 xy200303 requested a review from egilewski June 9, 2026 04:24

@egilewski egilewski left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Requesting changes.

I rechecked PR head 6244d2780fa8cb7c82f04c5b2af1262727af3ba1 against current GitHub main dbbd1d4d050146c8e2d0cd01eaa8543993dd98f9. The branch now merges cleanly and fixes the previous false-positive/test issues, but the hardline/yolo boundary still has simple command-position substitution bypasses.

The improved cases do work: in the merged PR worktree, $(printf rm) -rf /, $(printf %s rm) -rf /, $(printf r)m -rf /, $(echo -n rm) -rf /, and ${unset:-rm} -rf / are hardline-blocked, while argument-only examples such as echo $(printf rm) -rf / and echo r\m -rf / are no longer promoted.

The remaining issue is that bash still resolves these command-position words to argv0=rm, while the PR returns detect_dangerous_command == False, detect_hardline_command == False, and check_all_command_guards(..., "local") -> approved=True with HERMES_YOLO_MODE=1:

$(printf "\162m") -rf /
$(printf %b "\162m") -rf /
$(printf "\x72m") -rf /
$(printf %b "\x72m") -rf /
$(printf rm | cat) -rf /
$(printf rm;) -rf /

Validation:

  • git merge --no-commit --no-ff refs/remotes/upstream/pr/40663 from dbbd1d4d050146c8e2d0cd01eaa8543993dd98f9: clean, staged diff limited to scripts/release.py, tools/approval.py, and the two approval tests.
  • git merge-tree --write-tree dbbd1d4d050146c8e2d0cd01eaa8543993dd98f9 refs/remotes/upstream/pr/40663: exited 0.
  • /home/mac/hermes-agent/.venv/bin/python -B -m pytest -o addopts='' -p no:cacheprovider tests/tools/test_hardline_blocklist.py tests/tools/test_approval.py tests/tools/test_command_guards.py -q: 353 passed.
  • git diff --cached --check HEAD: passed.
  • coderabbit review --plain --base dbbd1d4d050146c8e2d0cd01eaa8543993dd98f9 --type uncommitted: No findings.

Recommendation: keep the command-position scanner, but either broaden the safe folding for these simple printf escape/pipeline/terminator forms or fail closed when a command-position substitution cannot be normalized soundly enough for the hardline floor.

Signed: GPT-5.5-xhigh in Codex

@egilewski

Copy link
Copy Markdown
Contributor

Hi, human here: very nice of you to add me to co-authored-by, stealing that for my AGENTS.md.

Copy link
Copy Markdown
Contributor

I opened #43658 as a quickfix for the original issue, aimed at a faster merge with a smaller and narrower patch.

It does not cover as much as this PR: #40663 still has the broader command-position deobfuscation direction and more complete shell-word modeling. My recommendation is to keep #40663 open rather than treating #43658 as a replacement, and also open a new broader issue to track the remaining/generalized command-position shell-expansion coverage after the quickfix lands.

egilewski added a commit to egilewski/hermes-agent that referenced this pull request Jun 10, 2026
Dynamic shell command names were still opaque to the dangerous-command detector, so payloads that resolve a command name through shell substitution could bypass shell.exec approval and the hardline yolo floor for catastrophic command shapes.

This keeps the fix compact by using command-position detection rather than a broad shell parser. Static shell spelling tricks are deobfuscated only for command-position words, active dynamic command names require dangerous approval, and dynamic names with catastrophic tails or literal shutdown-style names remain hardline-blocked. Argument-only text such as echo examples is left alone.

Validation:

- pytest: tests/tools/test_approval.py tests/tools/test_hardline_blocklist.py tests/tools/test_command_guards.py tests/tools/test_yolo_mode.py tests/tools/test_cron_approval_mode.py tests/tools/test_managed_browserbase_and_modal.py tests/test_tui_gateway_server.py selected shell.exec regression tests

- git diff --check

- py_compile: tools/approval.py and touched tests

This is a smaller and less versatile alternative to NousResearch#40663 aimed for a quick resolution of a P0 security issue.

Fixes NousResearch#36847

Related NousResearch#36846

Co-authored-by: xy200303 <3483421977@qq.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/tools Tool registry, model_tools, toolsets P1 High — major feature broken, no workaround tool/terminal Terminal execution and process management type/security Security vulnerability or hardening

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants