Skip to content

fix(security): restore approval/sudo context in execute_code + guard entry points (salvage #34131)#34497

Merged
teknium1 merged 5 commits into
mainfrom
hermes/hermes-9ad10701
May 29, 2026
Merged

fix(security): restore approval/sudo context in execute_code + guard entry points (salvage #34131)#34497
teknium1 merged 5 commits into
mainfrom
hermes/hermes-9ad10701

Conversation

@teknium1

@teknium1 teknium1 commented May 29, 2026

Copy link
Copy Markdown
Contributor

Closes #4146 · Closes #27303 · Closes #33057 · Closes #30882 · supersedes #27304, #30893, #33246

Salvage of #34131 (@firefly / banditburai) onto current main, plus a small diagnosability follow-up. Restores the dangerous-command approval guard across every execute_code entry point and stops the sandbox child env from leaking config.

Summary

execute_code's RPC worker threads were raw threading.Threads — they started with an empty context, dropping the approval/session ContextVars (routing gateway sessions into the non-interactive auto-approve branch) and the thread-local approval/sudo callbacks (so the CLI prompt couldn't reach the user). Dangerous commands a sandbox script issued therefore ran without approval. The approval gate is a documented heuristic in SECURITY.md, not a security boundary; this restores its intended behavior — it doesn't add one.

Fixes

Issue Vector Fix
#4146 sandbox terminal() skips the approval guard guard fires once context is restored; terminal stays allowlisted
#27303 broad HERMES_* env passthrough leaks config scrub → operational allowlist + DSN/WEBHOOK secret block
#30882 gateway manual-approval silently bypassed via execute_code one-shot entry guard, fail-closed
#33057 approval/session context lost in execute_code RPC threads shared context + callback propagation into both RPC threads

How

  • tools/thread_context.py — one audited propagate_context_to_thread() (ContextVar + thread-local approval/sudo capture→install→clear). Applied to both execute_code RPC threads; agent/tool_executor.py refactored onto it (one copy, not three).
  • check_execute_code_guard — one-shot whole-script approval in gateway/ask/cron-deny before the child spawns; fail-closed on deny/timeout/missing-notify; skips isolated container backends. Gateway wait loop deduped into _await_gateway_decision.
  • Env scrub — broad HERMES_ prefix → explicit {HOME,PROFILE,CONFIG,ENV} allowlist + DSN/WEBHOOK secret substrings. Skill/config env_passthrough remains the explicit opt-in (checked first).
  • Gateway — startup warning when approvals.mode=manual with no risk assessor.

Follow-up mitigation (this salvage)

Tightening the env scrub is correct, but a sandbox script importing a repo module that reads a non-allowlisted HERMES_* var at import time would otherwise see it silently unset. _scrub_child_env now emits a one-shot debug log naming the dropped non-secret HERMES_* vars and pointing at the env_passthrough opt-in. Secret-shaped vars are never named in the log.

Behavior changes (no regression to common workflows)

  • Plain local interactive CLI and plain local non-interactive (no gateway, no cron-deny) still auto-run execute_code — unchanged. Does not flip headless-local to fail-closed.
  • Gateway approvals.mode=manual now prompts before a sandbox script runs (consistent with how terminal() already behaves). off/yolo bypasses, same as terminal.
  • Cron with cron_mode=deny (the default) now blocks execute_code, closing the hole in an existing policy that already blocks terminal commands. cron_mode: approve restores it.

Validation

  • Cluster suites: 48 passed / 0 failed (incl. 2 new mitigation tests). ruff + py_compile clean.
  • E2E (real imports, real file I/O): cron-deny blocks before child spawn (no marker file); headless-local still auto-runs; docker backend skips; gateway deny path blocks via the real queue; env_passthrough opt-in restores a dropped HERMES_* var post-scrub.

Authorship: @firefly's 4 commits preserved via rebase-merge; the diagnosability follow-up is a separate maintainer commit.

Infographic

execute_code approval hardening

banditburai and others added 5 commits May 29, 2026 01:26
Worker threads that dispatch Hermes tools started with an empty contextvars.Context and no thread-local approval/sudo callbacks. Add tools/thread_context.propagate_context_to_thread factoring that capture/install/clear lifecycle (mirrors the GHSA-qg5c-hvr5-hjgr pattern), and refactor agent/tool_executor onto it so the security-critical logic lives in one audited place. Update the contextvar-propagation source guard for the new call shape.

Refs #33057
…+ guard entry

Wrap both execute_code RPC threads (local UDS + remote file-RPC) with propagate_context_to_thread so gateway sessions no longer fall into check_dangerous_command's non-interactive auto-approve branch and the CLI approval prompt stays reachable. Add check_execute_code_guard: one-shot fail-closed approval of the whole script in gateway/ask/cron-deny before the child spawns (skips isolated backends; command-string built only past the early returns). Drop the broad HERMES_ env passthrough for an explicit operational allowlist plus DSN/WEBHOOK secret substrings, and update the POSIX-equivalence oracle.

Refs #4146, #27303, #30882, #33057
When approvals.mode=manual with security.tirith_enabled off and no auxiliary.approval model, dangerous commands and execute_code scripts can only be gated by live in-chat approval; with routing fixed they now fail closed (block) rather than silently auto-run. Surface that at startup so operators knowingly enable tirith or auxiliary.approval for unattended gateways.

Refs #30882
Cover context+callback propagation and teardown-clears, a source guard that both RPC threads stay wrapped, the check_execute_code_guard decision matrix (isolated backend, headless-local, cron-deny, gateway approve/deny/timeout/missing-notify, smart mode, session-yolo), the env-scrub allowlist/secret rules, and a behavioral test that execute_code() blocks before spawning on denial.

Refs #4146, #27303, #30882, #33057
… scrub

Follow-up mitigation for the #27303 env-scrub tightening. Dropping the
broad HERMES_ prefix in favor of a 4-var operational allowlist is correct
hardening, but a sandbox script that imports a repo module reading a
non-allowlisted HERMES_* var at import time would otherwise see it
silently unset. _scrub_child_env now emits a one-shot debug log naming the
dropped non-secret HERMES_* vars and pointing at the env_passthrough
opt-in escape hatch. Secret-shaped vars are never named in the log.

Tests: dropped vars are logged + env_passthrough named; no log when
nothing is dropped; secret vars excluded from the diagnostic.
@github-actions

Copy link
Copy Markdown
Contributor

🔎 Lint report: hermes/hermes-9ad10701 vs origin/main

ruff

Total: 0 on HEAD, 0 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

ty (type checker)

Total: 9439 on HEAD, 9439 on base (➖ 0)

🆕 New issues (1):

Rule Count
unresolved-import 1
First entries
tests/tools/test_execute_code_approval_cluster.py:22: [unresolved-import] unresolved-import: Cannot resolve imported module `pytest`

✅ Fixed issues (1):

Rule Count
call-non-callable 1
First entries
tools/approval.py:1237: [call-non-callable] call-non-callable: Object of type `~None` is not callable

Unchanged: 4897 pre-existing issues carried over.

Diagnostics are surfaced as warnings — this check never fails the build.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/agent Core agent loop, run_agent.py, prompt builder comp/gateway Gateway runner, session dispatch, delivery P1 High — major feature broken, no workaround tool/code-exec execute_code sandbox type/security Security vulnerability or hardening

Projects

None yet

3 participants