fix(approval): detect shell-expanded command names#40663
Conversation
|
Verified: shell-expanded command name deobfuscation is correct and well-tested. Reviewed the full diff — this is a solid security hardening. Key observations:
One minor observation: LGTM from a security perspective. |
|
Verified: shell-expanded command name deobfuscation is correct and well-tested. Reviewed the full diff — this is a solid security hardening for the approval system's command detection layer. Key observations:
LGTM. |
|
Thanks for the detailed review. Yes, the double-quoted substitution case is intentionally conservative in the detection layer: it never executes shell code, and the regression tests keep argument-only cases like |
egilewski
left a comment
There was a problem hiding this comment.
Requesting changes.
The patch closes the direct #36846 repro for command-position shell expansion. On current main 9c9d9113a8ae8f85aaf1ebff74ddd1fdf0b5b9a8, string-only detector probes approved $(echo rm) -rf /, sudo $(echo rm) -rf /, and ${0/x/r}m -rf /; with this PR staged on top, those same probes are hardline-blocked.
However, the PR currently fails its own focused regression suite:
/home/mac/hermes-agent/.venv/bin/python -m pytest -o addopts='' -p no:cacheprovider tests/tools/test_hardline_blocklist.py -q
=> 2 failed, 106 passed
Both failures are the new allowlist cases in tests/tools/test_hardline_blocklist.py: echo r\m -rf / and echo r''m -rf / are hardline-blocked even though they are argument text. The root cause appears to be that _normalize_command_for_detection() still globally strips backslashes and empty quotes before the command-position scanner runs, so those inputs become echo rm -rf / and match the hardline regex.
Checked against PR head b2406bbe606c5d287f5748eb964b5a0fb0cddf16; git diff --cached --check HEAD passed.
Signed: GPT-5.5-xhigh in Codex
…proval-shell-expansion-bypass
Co-authored-by: egilewski <1078345+egilewski@users.noreply.github.com>
|
Thanks for catching this. I reproduced the failure on the GitHub merge result rather than the standalone PR head, and updated the branch accordingly. What changed:
Verification: I also added |
Co-authored-by: egilewski <1078345+egilewski@users.noreply.github.com>
59902f0 to
79522af
Compare
egilewski
left a comment
There was a problem hiding this comment.
Requesting changes.
The branch update fixes the false-positive regression from the previous review: with the patch replayed on current GitHub main 3dcfbbfc4918e7a2138fa7107669d4f09c32bf0f, direct probes now hardline-block the original command-position forms $(echo rm) -rf /, backtick echo rm, ${0/x/r}m -rf /, and wrapper variants, while argument-only cases like echo r\m -rf / and echo r''m -rf / stay allowed. The focused tests now pass:
/home/mac/hermes-agent/.venv/bin/python -B -m pytest -o addopts='' -p no:cacheprovider tests/tools/test_hardline_blocklist.py tests/tools/test_approval.py tests/tools/test_command_guards.py -q->339 passedgit diff --check HEAD-> passedcoderabbit review --plain --base 3dcfbbfc4918e7a2138fa7107669d4f09c32bf0f --type uncommitted->No findings
The remaining blocker is that the same command-position shell-expansion class is still bypassable with other literal substitutions/parameter forms. On the patched replay at PR head 79522afd8f998bb9d84008a3a898992c3f05da86, direct detect_dangerous_command, detect_hardline_command, and check_all_command_guards(..., "local") probes all returned non-dangerous/non-hardline/approved for:
$(printf rm) -rf /$(printf %s rm) -rf /$(printf r)m -rf /$(echo -n rm) -rf /${unset:-rm} -rf /
Those are still command-position words that a shell can resolve to rm before the existing hardline rm -rf / floor sees them, so the PR currently narrows the repro to an echo-specific subset rather than closing the #36846 command-substitution/parameter-expansion bypass class.
Signed: GPT-5.5-xhigh in Codex
Co-authored-by: egilewski <1078345+egilewski@users.noreply.github.com>
|
Thanks again, this was a good catch. I pushed a follow-up that keeps the command-position-only scanner but broadens the literal, non-executing shell-substitution folding. What changed:
Verification: The follow-up commit also includes the same |
egilewski
left a comment
There was a problem hiding this comment.
Requesting changes.
I rechecked PR head 6244d2780fa8cb7c82f04c5b2af1262727af3ba1 against current GitHub main dbbd1d4d050146c8e2d0cd01eaa8543993dd98f9. The branch now merges cleanly and fixes the previous false-positive/test issues, but the hardline/yolo boundary still has simple command-position substitution bypasses.
The improved cases do work: in the merged PR worktree, $(printf rm) -rf /, $(printf %s rm) -rf /, $(printf r)m -rf /, $(echo -n rm) -rf /, and ${unset:-rm} -rf / are hardline-blocked, while argument-only examples such as echo $(printf rm) -rf / and echo r\m -rf / are no longer promoted.
The remaining issue is that bash still resolves these command-position words to argv0=rm, while the PR returns detect_dangerous_command == False, detect_hardline_command == False, and check_all_command_guards(..., "local") -> approved=True with HERMES_YOLO_MODE=1:
$(printf "\162m") -rf /
$(printf %b "\162m") -rf /
$(printf "\x72m") -rf /
$(printf %b "\x72m") -rf /
$(printf rm | cat) -rf /
$(printf rm;) -rf /
Validation:
git merge --no-commit --no-ff refs/remotes/upstream/pr/40663fromdbbd1d4d050146c8e2d0cd01eaa8543993dd98f9: clean, staged diff limited toscripts/release.py,tools/approval.py, and the two approval tests.git merge-tree --write-tree dbbd1d4d050146c8e2d0cd01eaa8543993dd98f9 refs/remotes/upstream/pr/40663: exited 0./home/mac/hermes-agent/.venv/bin/python -B -m pytest -o addopts='' -p no:cacheprovider tests/tools/test_hardline_blocklist.py tests/tools/test_approval.py tests/tools/test_command_guards.py -q:353 passed.git diff --cached --check HEAD: passed.coderabbit review --plain --base dbbd1d4d050146c8e2d0cd01eaa8543993dd98f9 --type uncommitted:No findings.
Recommendation: keep the command-position scanner, but either broaden the safe folding for these simple printf escape/pipeline/terminator forms or fail closed when a command-position substitution cannot be normalized soundly enough for the hardline floor.
Signed: GPT-5.5-xhigh in Codex
|
Hi, human here: very nice of you to add me to co-authored-by, stealing that for my AGENTS.md. |
|
I opened #43658 as a quickfix for the original issue, aimed at a faster merge with a smaller and narrower patch. It does not cover as much as this PR: #40663 still has the broader command-position deobfuscation direction and more complete shell-word modeling. My recommendation is to keep #40663 open rather than treating #43658 as a replacement, and also open a new broader issue to track the remaining/generalized command-position shell-expansion coverage after the quickfix lands. |
Dynamic shell command names were still opaque to the dangerous-command detector, so payloads that resolve a command name through shell substitution could bypass shell.exec approval and the hardline yolo floor for catastrophic command shapes. This keeps the fix compact by using command-position detection rather than a broad shell parser. Static shell spelling tricks are deobfuscated only for command-position words, active dynamic command names require dangerous approval, and dynamic names with catastrophic tails or literal shutdown-style names remain hardline-blocked. Argument-only text such as echo examples is left alone. Validation: - pytest: tests/tools/test_approval.py tests/tools/test_hardline_blocklist.py tests/tools/test_command_guards.py tests/tools/test_yolo_mode.py tests/tools/test_cron_approval_mode.py tests/tools/test_managed_browserbase_and_modal.py tests/test_tui_gateway_server.py selected shell.exec regression tests - git diff --check - py_compile: tools/approval.py and touched tests This is a smaller and less versatile alternative to NousResearch#40663 aimed for a quick resolution of a P0 security issue. Fixes NousResearch#36847 Related NousResearch#36846 Co-authored-by: xy200303 <3483421977@qq.com>
Summary
$(echo rm)/ backtick echo substitutions, and${.../.../replacement}parameter replacement.HARDLINE_PATTERNSandDANGEROUS_PATTERNSinstead of adding a second blocklist, so approval behavior and descriptions stay consistent.Plain-language model
The fix is intentionally conservative: it does not try to parse all of shell, and it never executes shell code. It only asks one question that matters for approval safety: what word is the shell about to treat as the command name?
A shell command is like a sentence where the executable name is the verb. The dangerous part of this repro is not that the text contains
rm; it is that the command-position word will becomermafter shell spelling rules are applied.Examples:
$(echo rm) -rf /Here the command-position word is
$(echo rm), so the detector creates a safe detection-only variant:That variant hits the existing hardline/dangerous rules.
sudo $(echo rm) -rf /Here
sudois a wrapper, so the scanner skips it and inspects the next command-position word, which also deobfuscates torm.Here the command name is
echo;$(echo rm)is just an argument. The patch deliberately does not rewrite that argument into a command name, so this stays unflagged by the new path.The important part is not just deobfuscating
rm; it is deobfuscating only at places where the shell would actually look for an executable.Why this is larger than a simple normalizer change
The tempting fix is to globally strip
\,'',"", or replace$(echo rm)across the whole command string. That catches the repro, but it also changes the meaning of data arguments. For example,echo $(echo rm) -rf /should not be treated as if the top-level command isrm -rf /, and quoted/logged examples should not accidentally get promoted into the hardline path.This patch therefore does a small, non-executing shell-word scan and only creates extra detection variants for words that appear where a shell would parse an executable name:
;,&&,||,|, and newlinessudo,env,exec,nohup,setsid,time,command, andbuiltinThat is why most of the diff is helper code: it is there to keep the fix scoped to command positions instead of rewriting arbitrary user data.
Main implementation points
_command_detection_variants()yields the original normalized command plus any command-name-deobfuscated variants._iter_shell_command_starts()finds shell command start positions while respecting quotes and simple substitutions._read_shell_word()reads one shell word without executing or resolving it._deobfuscate_shell_word_for_detection()collapses only syntax-level spelling tricks in that word:r\m->rmr''m/r""m->rm$(echo rm)/`echo rm`->rmfor literal echo payloads${0/x/r}m->rmfor the issue repro's simple replacement shapedetect_hardline_command()anddetect_dangerous_command()now run the existing compiled patterns over those variants.Safety boundaries
echopayloads, not arbitrary nested commands.echo $(echo rm) -rf /,echo r\m -rf /, andecho r''m -rf /to document that argument-only text is not promoted into a hardline/dangerous command.Verification
python -m pytest -o addopts="" tests\tools\test_approval.py tests\tools\test_hardline_blocklist.py -qpython -m pytest -o addopts="" tests\tools\test_command_guards.py -qgit diff --checkNotes