fix(security): remove terminal from sandbox allowed tools in execute_code#4143
pablontiv wants to merge 2 commits into
…code

The terminal tool in SANDBOX_ALLOWED_TOOLS allowed sandboxed Python code to execute arbitrary shell commands via the RPC stub, bypassing the check_dangerous_command() approval system that protects the direct terminal tool call. This is not a real sandbox (no container/namespace isolation): rm -rf would execute as the user on the host.

Changes:
- Remove "terminal" from SANDBOX_ALLOWED_TOOLS frozenset (7 tools → 6)
- Remove _TOOL_STUBS["terminal"] entry (never generated post-fix)
- Remove _TERMINAL_BLOCKED_PARAMS stripping block (unreachable post-fix)
- Remove terminal from _TOOL_DOC_LINES and schema description
- Change fallback to frozenset() to prevent terminal re-introduction when enabled_tools is None/empty
- Update doc comment (7 tools → 6 tools)
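The allowlist and fallback change described in this commit can be pictured roughly as follows. This is a minimal sketch, not the code in tools/code_execution_tool.py: `resolve_sandbox_tools` is a hypothetical helper name, and the frozenset below lists only the two tool names mentioned elsewhere in this PR (`web_search`, `read_file`) rather than the real six.

```python
from typing import Iterable, Optional

# Illustrative subset only. The PR says the real frozenset holds six tools
# after "terminal" is removed; only these two names appear in the PR text.
SANDBOX_ALLOWED_TOOLS = frozenset({"web_search", "read_file"})


def resolve_sandbox_tools(enabled_tools: Optional[Iterable[str]]) -> frozenset:
    """Hypothetical helper: tools that are both enabled and sandbox-allowed.

    Assumed post-fix behaviour: when enabled_tools is None or empty, fall
    back to an empty frozenset instead of the full allowlist.
    """
    if not enabled_tools:
        return frozenset()
    return SANDBOX_ALLOWED_TOOLS & frozenset(enabled_tools)


if __name__ == "__main__":
    # "terminal" is enabled for the session but is not in the allowlist,
    # so it never reaches the sandbox.
    print(sorted(resolve_sandbox_tools(["terminal", "web_search"])))
    # No enabled tools: the sandbox gets no tools at all.
    print(sorted(resolve_sandbox_tools(None)))
```

Under these assumptions, an empty or missing `enabled_tools` yields no sandbox tools at all, so a fallback cannot widen the sandbox's tool surface.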
…om sandbox

Update tests that used terminal as a sandbox tool to use web_search instead, since terminal is no longer available in SANDBOX_ALLOWED_TOOLS.

Changes:
- test_generates_subset, test_non_allowed_tools_ignored, test_rpc_infrastructure_present, test_convenience_helpers_present: use web_search instead of terminal as the example tool
- test_single_tool_call: use web_search instead of terminal
- test_multi_tool_chain: use web_search + read_file instead of terminal + read_file
- test_stubs_cover_all_schema_params: remove terminal special-case (_BLOCKED_TERMINAL_PARAMS no longer relevant)
- test_subset_only_lists_enabled_tools, test_single_tool, test_import_examples_prefer_web_search: update to reflect terminal is no longer a sandbox tool
- test_none/empty/nonoverlapping fallback tests: update to use tools that remain in the sandbox (web_search, read_file)
- test_real_scenario_all_sandbox_tools_disabled: update comment (terminal no longer in SANDBOX_ALLOWED_TOOLS)
- website docs: remove terminal from sandbox tools list
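To illustrate the kind of assertion these test updates imply, here is a hedged sketch. The test names are taken from the list above, but their bodies, the `generate_stub_module` helper, the stub strings, and the parameter names `query` and `path` are all assumptions made for illustration; the real layout of `_TOOL_STUBS` may differ.

```python
# Hypothetical stand-ins for the sandbox allowlist and stub table.
SANDBOX_ALLOWED_TOOLS = frozenset({"web_search", "read_file"})

_TOOL_STUBS = {
    "web_search": "def web_search(query):\n    return _call('web_search', {'query': query})\n",
    "read_file": "def read_file(path):\n    return _call('read_file', {'path': path})\n",
}


def generate_stub_module(enabled_tools):
    """Emit stub source only for tools that are both enabled and allowed."""
    names = sorted(SANDBOX_ALLOWED_TOOLS & set(enabled_tools))
    return "\n".join(_TOOL_STUBS[name] for name in names)


def test_generates_subset():
    # Only the requested allowed tool gets a stub.
    src = generate_stub_module(["web_search"])
    assert "def web_search(" in src
    assert "def read_file(" not in src


def test_non_allowed_tools_ignored():
    # "terminal" is requested but not allowlisted, so no stub is emitted.
    src = generate_stub_module(["web_search", "terminal"])
    assert "def web_search(" in src
    assert "def terminal(" not in src
```

Both functions follow pytest's naming convention, so a file containing this sketch could be collected by pytest directly.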
Closing — the security premise doesn't hold up against the actual code. Terminal calls from the execute_code sandbox go through the full approval pipeline. The call chain:
The existing Removing terminal from the sandbox would significantly reduce Thanks for looking into the security surface though — always good to have people auditing these paths.
Summary
The code execution sandbox (`execute_code`) in `tools/code_execution_tool.py` included `terminal` in `SANDBOX_ALLOWED_TOOLS`, allowing LLM-generated Python code running inside the sandbox to execute arbitrary shell commands via the RPC stub, bypassing the `check_dangerous_command()` approval system that protects direct `terminal` tool calls.

This is not a real sandbox (no container/namespace isolation). `rm -rf /` would execute as the user on the host, with no approval prompt.

Fixes #4146
What changed
`tools/code_execution_tool.py`
- Remove `"terminal"` from `SANDBOX_ALLOWED_TOOLS` frozenset (7 → 6 tools)
- Remove `_TOOL_STUBS["terminal"]` entry (dead code, never generated post-fix)
- Remove `_TERMINAL_BLOCKED_PARAMS` variable and its stripping block (unreachable post-fix)
- Remove terminal from `_TOOL_DOC_LINES` and schema description
- Remove `"terminal() is foreground-only..."` from description
- Remove `"terminal"` from `import_examples` tuple
- Change fallback to `frozenset()` (was `SANDBOX_ALLOWED_TOOLS`) to prevent terminal re-introduction when `enabled_tools` is None/empty

`tests/tools/test_code_execution.py`
- Update tests that used `terminal` as a sandbox tool to use `web_search` instead

`website/docs/user-guide/features/code-execution.md`
- Remove `terminal` from sandbox tools list

How to test
- `hermes chat -q "Write a Python script that runs in the sandbox and calls terminal('echo hello')"`
  `terminal` is not available in sandbox
- `hermes chat -q "Use execute_code to search the web"`
  `web_search`
- `hermes chat -q "Use terminal to run echo hello"`

Platforms tested
Linux (primary), macOS (UDS compatible). This fix only affects Unix-like systems (Windows already had sandbox disabled).
Security note
This is a critical security fix: shell injection from sandbox code. The `terminal` tool remains available as a direct LLM tool call, protected by `check_dangerous_command()` with the 78-pattern dangerous command detection and user approval flow.
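For readers unfamiliar with pattern-based command gating, the idea of dangerous command detection plus an approval step can be sketched as below. This is not the project's `check_dangerous_command()`: the names `looks_dangerous` and `may_run`, the two regexes, and the callback-style approval are all hypothetical and stand in for the 78 patterns and the user approval flow mentioned above.

```python
import re
from typing import Callable

# Two illustrative patterns only (hypothetical, far from a complete list):
# recursive forced deletion via rm, and filesystem creation via mkfs.
_EXAMPLE_PATTERNS = [
    re.compile(r"\brm\s+-[a-zA-Z]*r[a-zA-Z]*f|\brm\s+-[a-zA-Z]*f[a-zA-Z]*r"),
    re.compile(r"\bmkfs(\.\w+)?\b"),
]


def looks_dangerous(command: str) -> bool:
    """Return True if any example pattern matches the command string."""
    return any(p.search(command) for p in _EXAMPLE_PATTERNS)


def may_run(command: str, approve: Callable[[str], bool]) -> bool:
    """Allow safe-looking commands; defer to the approve callback otherwise."""
    if looks_dangerous(command):
        return bool(approve(command))
    return True


if __name__ == "__main__":
    def deny_all(cmd: str) -> bool:
        # Stand-in for a user who rejects every approval prompt.
        return False

    print(may_run("echo hello", deny_all))  # safe-looking, no prompt needed
    print(may_run("rm -rf /", deny_all))    # flagged, then denied
```

A gate like this only protects call paths that actually route through it, which is the question the PR and the closing comment disagree on.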