Skip to content

[codex] Preserve context in execute_code RPC threads#33246

Closed
egilewski wants to merge 1 commit into
NousResearch:mainfrom
egilewski:codex/fix-execute-code-rpc-context
Closed

[codex] Preserve context in execute_code RPC threads#33246
egilewski wants to merge 1 commit into
NousResearch:mainfrom
egilewski:codex/fix-execute-code-rpc-context

Conversation

@egilewski

@egilewski egilewski commented May 27, 2026

Copy link
Copy Markdown
Contributor

Disclaimer: this is a fully-automated contribution trying to expand on the success of #30432.

Fixes #33057.

Summary

This preserves Hermes session/approval context when execute_code sandbox scripts call back into Hermes tools through the parent-process RPC bridge.

Root Cause

execute_code starts raw threading.Thread workers for its RPC paths:

  • the local Unix-domain-socket RPC server thread;
  • the remote file-polling RPC thread.

Raw Python threads start with an empty contextvars.Context. That means any ContextVar bound around the agent turn, including approval/session routing state, is not visible when the sandbox calls back into model_tools.handle_function_call(...).

For approval-sensitive flows, that is the wrong boundary: a sandboxed script can call hermes_tools.terminal(...), and the resulting terminal approval checks need the same session context as the agent turn that launched execute_code.

What Changed

  • Added a small _context_thread_target(...) helper in tools/code_execution_tool.py.
  • Wrapped both execute-code RPC thread targets in contextvars.copy_context().run(...).
  • Added regression coverage proving a sandbox tool call sees:
    • an arbitrary parent ContextVar;
    • the approval session key from tools.approval;
    • the original execute-code task_id.

Relationship To #30893

#30893 guards the execute_code entry point in gateway approval contexts: the script itself must be approved before it starts.

This PR is complementary. It keeps the already-running sandbox's RPC tool calls attached to the same turn context, so approval/session-sensitive tool dispatch inside execute_code does not fall back to missing or stale context.

Validation

  • .venv/bin/ruff check tools/code_execution_tool.py tests/tools/test_code_execution.py
  • .venv/bin/python -m py_compile tools/code_execution_tool.py tests/tools/test_code_execution.py
  • .venv/bin/python -m pytest tests/tools/test_code_execution.py::TestExecuteCodeEdgeCases::test_rpc_thread_preserves_contextvars -q --tb=short
  • .venv/bin/python -m pytest tests/tools/test_code_execution.py -q --tb=short

Note: the focused/full execute_code pytest runs were executed outside my local Codex filesystem sandbox because the sandbox blocks Unix-domain-socket bind with PermissionError: [Errno 1] Operation not permitted. The same tests pass when run in the normal local environment.

@alt-glitch alt-glitch added type/security Security vulnerability or hardening comp/agent Core agent loop, run_agent.py, prompt builder tool/code-exec execute_code sandbox P2 Medium — degraded but workaround exists labels May 27, 2026
@liuhao1024

Copy link
Copy Markdown
Contributor

Good catch — the ContextVar leakage through RPC threads is a real correctness issue.

I verified the two threading.Thread call sites at lines 908 and 1155 in code_execution_tool.py — both are RPC-handling threads (poll loop and server loop) that invoke handle_function_call, which depends on session context. The two other threads at lines 1307/1313 are stdout/stderr drain threads that don't call tool functions, so they correctly don't need the wrapper. The fix is well-scoped.

One note: _context_thread_target returns a closure that captures the context at call time. If execute_code is ever called from multiple threads concurrently (unlikely today but worth noting), each call site will snapshot its own context independently — which is the correct behavior.

The test properly validates both ContextVar propagation and session_key inheritance. LGTM.

@egilewski

Copy link
Copy Markdown
Contributor Author

Pretty sure the failed build is just an Actions' fluke, but I don't have access to restart that job, and don't want to force-push to work around that.

@teknium1

Copy link
Copy Markdown
Contributor

Superseded by #34497 (merged). Same root cause you identified — raw RPC threads start with an empty contextvars.Context. The merged fix also restores the thread-local approval/sudo callbacks (not just the ContextVar) via a shared propagate_context_to_thread helper, so the CLI approval prompt reaches the user too. Thanks; credited in the salvage.

#34497

@teknium1 teknium1 closed this May 29, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/agent Core agent loop, run_agent.py, prompt builder P2 Medium — degraded but workaround exists tool/code-exec execute_code sandbox type/security Security vulnerability or hardening

Projects

None yet

Development

Successfully merging this pull request may close these issues.

fix(cli): dangerous command approval bypass in execute_code RPC threads

4 participants