[codex] Guard execute_code behind gateway approvals by egilewski · Pull Request #30893 · NousResearch/hermes-agent

egilewski · 2026-05-23T10:33:02Z

This is a fully-automated contribution trying to expand on the success of #30432.

Summary

This closes the remaining execute_code bypass in gateway manual approval mode. Current main already has a blocking gateway approval queue for terminal commands, but execute_code can spawn a local Python child before any terminal guard sees the subprocess/os calls inside that script.

Root Cause

execute_code scripts run arbitrary Python. A script can call subprocess.run(...), os.system(...), ctypes, or other process/file APIs directly, so dangerous shell behavior can happen without going through terminal() and without matching DANGEROUS_PATTERNS as a shell command.
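
A minimal sketch of why command-string inspection does not cover this case. The pattern list and the looks_dangerous helper below are hypothetical stand-ins, not the project's actual DANGEROUS_PATTERNS or terminal guard. The script text is only inspected here, never executed.

```python
import re

# Hypothetical denylist for shell command strings (illustrative only).
DANGEROUS_PATTERNS = [re.compile(r"\brm\s+-rf\b"), re.compile(r"\bmkfs\b")]


def looks_dangerous(shell_command: str) -> bool:
    """Return True if any denylist pattern matches the command string."""
    return any(p.search(shell_command) for p in DANGEROUS_PATTERNS)


# A generated script can assemble the same command at runtime, so the
# dangerous string never appears verbatim anywhere a matcher could see it.
SCRIPT = '''
import subprocess
args = ["r" + "m", "-" + "rf", "/tmp/example"]
subprocess.run(args)
'''

caught_as_shell_command = looks_dangerous("rm -rf /tmp/example")
caught_inside_script = looks_dangerous(SCRIPT)
```

Under these assumptions the plain shell command is flagged while the equivalent script text is not, which is why the guard has to sit at the execute_code entry point rather than rely on string matching.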

What Changed

Added check_execute_code_guard() in tools/approval.py.
Before local/SSH execute_code spawns the child process, gateway/ask contexts now submit a one-shot approval request through the existing blocking gateway approval queue.
User denial, timeout, missing notify callback, or guard failure all fail closed before the script runs.
approvals.mode: off and session/process YOLO still bypass this guard intentionally.
Container/cloud backends keep the existing approval behavior, matching terminal command approval's existing skip for isolated backends.
Cron sessions with approvals.cron_mode: deny now block local/SSH execute_code, because no user is present to approve arbitrary script execution.
Shared the gateway approval wait helper with the existing terminal approval path so approval waits continue feeding the inactivity watchdog.
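
The decision order described above can be sketched roughly as follows. This is not the code in tools/approval.py: GuardContext, guard_execute_code, the backend identifiers, and the notify callback signature are all assumptions made for illustration, and only the gateway approval path is modelled.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# Assumed identifiers for the isolated backends named in this PR.
ISOLATED_BACKENDS = {"docker", "singularity", "modal", "daytona", "vercel_sandbox"}


@dataclass
class GuardContext:
    """Hypothetical snapshot of the settings the guard consults."""
    backend: str = "local"
    approvals_mode: str = "manual"
    yolo: bool = False
    is_cron: bool = False
    cron_mode: str = "deny"
    # Assumed callback: True = approved, False = denied, None = timed out.
    notify: Optional[Callable[[str], Optional[bool]]] = None


def guard_execute_code(ctx: GuardContext, script: str) -> Tuple[bool, str]:
    """Return (allowed, reason); every uncertain outcome fails closed."""
    if ctx.approvals_mode == "off" or ctx.yolo:
        return True, "approvals bypassed"
    if ctx.backend in ISOLATED_BACKENDS:
        return True, "isolated backend"
    if ctx.is_cron and ctx.cron_mode == "deny":
        return False, "cron deny"
    if ctx.notify is None:
        return False, "missing notify callback"
    try:
        decision = ctx.notify(script)
    except Exception:
        return False, "guard failure"
    if decision is True:
        return True, "approved once"
    if decision is None:
        return False, "timeout"
    return False, "denied"
```

The point of the sketch is that only an explicit True from the approval callback lets the script run; a missing callback, an exception, a timeout, or a denial all return before any child process would be created.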

Regression Coverage

Added tests/tools/test_code_execution.py coverage for:

gateway denial blocks execute_code before the child process is spawned, verified with a marker file that never appears;
one-shot gateway approval allows the script to continue and return normal output.
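
The marker-file technique can be illustrated with a standalone sketch. run_guarded, marker_script, and the two check functions are hypothetical helpers written for this example; they do not reproduce the actual tests in tests/tools/test_code_execution.py.

```python
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Callable, Optional


def run_guarded(script: str, approve: Callable[[str], bool]) -> Optional[str]:
    """Run script in a child interpreter only if approve(script) is True."""
    if not approve(script):
        return None  # denied: no child process is created
    done = subprocess.run(
        [sys.executable, "-c", script],
        capture_output=True, text=True, check=True,
    )
    return done.stdout


def marker_script(marker: Path) -> str:
    """Script whose only side effect is writing the marker file."""
    return (
        "from pathlib import Path; "
        f"Path({str(marker)!r}).write_text('ran'); print('ok')"
    )


def check_denial_blocks_child() -> bool:
    with tempfile.TemporaryDirectory() as tmp:
        marker = Path(tmp) / "marker.txt"
        out = run_guarded(marker_script(marker), approve=lambda _s: False)
        # If the child had run, the marker file would exist.
        return out is None and not marker.exists()


def check_approval_runs_child() -> bool:
    with tempfile.TemporaryDirectory() as tmp:
        marker = Path(tmp) / "marker.txt"
        out = run_guarded(marker_script(marker), approve=lambda _s: True)
        return out is not None and out.strip() == "ok" and marker.exists()
```

Checking for the absence of a side effect, rather than only the returned error, is what shows the denial happened before the spawn and not after the script had already started.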

Validation

.venv/bin/python -m ruff check tools/approval.py tools/code_execution_tool.py tests/tools/test_code_execution.py
.venv/bin/python -m py_compile tools/approval.py tools/code_execution_tool.py tests/tools/test_code_execution.py
HOME=/tmp/hermes-test-home scripts/run_tests.sh tests/gateway/test_approve_deny_commands.py
HOME=/tmp/hermes-test-home scripts/run_tests.sh tests/tools/test_cron_approval_mode.py
HOME=/tmp/hermes-test-home .venv/bin/python -m pytest tests/tools/test_code_execution.py::TestExecuteCodeEdgeCases::test_gateway_execute_code_denial_blocks_child_process -q
HOME=/tmp/hermes-test-home .venv/bin/python -m pytest tests/tools/test_code_execution.py::TestExecuteCodeEdgeCases::test_gateway_execute_code_runs_after_one_shot_approval -q
HOME=/tmp/hermes-test-home scripts/run_tests.sh tests/tools/test_code_execution.py
The full-file run above passed outside the local filesystem sandbox. Inside the sandbox, the pre-existing execute_code tests in this file hit the existing UDS bind restriction (PermissionError: [Errno 1] Operation not permitted).

Assumptions

The intended security contract for gateway manual approvals is fail-closed for local/SSH arbitrary code execution, because the generated script can bypass command-string inspection.
Docker, Singularity, Modal, Daytona, and Vercel Sandbox should retain the existing container/cloud approval behavior used by terminal commands.
A one-shot approval is the right scope for execute_code: /approve session and /approve always still resolve the current wait, but this guard does not persist a broad allowlist for future scripts.

egilewski · 2026-05-23T12:42:53Z

Follow-up CI check: current PR head 72ff96b16 is green.

The signed follow-up commit stabilized the two unrelated failures from the previous run:

tests/tools/test_browser_supervisor.py: Chrome startup failures now force-kill the process before skipping when CDP never becomes available.
tests/acp/test_server.py: ACP model-switch handoff tests now patch the ACP resolver directly, so unrelated provider registry state cannot shadow the requested provider.

Verification observed on GitHub Actions:

test (1) through test (6): success
ruff enforcement (blocking) and ruff + ty diff: success
Nix on Ubuntu and macOS: success
Docker build amd64 and arm64: success
Supply-chain, attribution, history, and e2e checks: success

Non-blocking note: the Docker workflow still emits GitHub's Node.js 20 action deprecation warning for the pinned docker/setup-buildx-action; it did not fail the run.

teknium1 · 2026-05-29T10:45:31Z

Superseded by #34497 (merged). The whole-script entry guard (check_execute_code_guard) is adapted from your approach — one-shot gateway approval before the child spawns, fail-closed on deny/timeout/missing-notify, container/cloud backends skipped, cron-deny blocks. Thanks; credited in the salvage.

alt-glitch added the type/security, comp/gateway, tool/code-exec, and P1 labels May 23, 2026

Guard execute_code in gateway approvals

b3941fd

egilewski force-pushed the codex/fix-gateway-execute-code-approval branch from bf39ea6 to b3941fd Compare May 23, 2026 11:46

egilewski marked this pull request as ready for review May 23, 2026 11:46

Stabilize unrelated CI tests

72ff96b

lsaether mentioned this pull request May 24, 2026

[Bug]: ACP denied edits can be silently reattempted through alternate write-capable tools #31682

Open

This was referenced May 26, 2026

feat(tools): add pre-dispatch enforcement hook to ToolRegistry tenuo-ai/hermes-agent#2

Open

feat(tools): add pre-dispatch enforcement hook to ToolRegistry #32719

Open

egilewski mentioned this pull request May 27, 2026

[codex] Preserve context in execute_code RPC threads #33246

Closed

banditburai mentioned this pull request May 28, 2026

fix(security): restore approval/sudo context in execute_code RPC threads + guard entry points #34131

Closed

teknium1 mentioned this pull request May 29, 2026

fix(security): restore approval/sudo context in execute_code + guard entry points (salvage #34131) #34497

Merged

teknium1 closed this May 29, 2026

egilewski wants to merge 2 commits into NousResearch:main from egilewski:codex/fix-gateway-execute-code-approval

3 participants
