Skip to content

fix(agent): propagate ContextVars to concurrent tool worker threads#16660

Closed
banditburai wants to merge 2 commits into
NousResearch:mainfrom
banditburai:fix/contextvar-propagation-tool-executor
Closed

fix(agent): propagate ContextVars to concurrent tool worker threads#16660
banditburai wants to merge 2 commits into
NousResearch:mainfrom
banditburai:fix/contextvar-propagation-tool-executor

Conversation

@banditburai

Copy link
Copy Markdown
Contributor

Companion to 0046d17 (threading.local callback propagation) and 603a68b (per-session ContextVar isolation).

_execute_tool_calls_concurrent submits tools via executor.submit(_run_tool, ...) without copy_context().run, so worker threads do not inherit the parent's ContextVar values — including _approval_session_key set by the gateway before agent.run. Worker tools fall through tools/approval.py:get_current_session_key's resolution order to the os.environ fallback ("default" session key), silently collapsing per-session dispatch.

Fix: wrap _run_tool in copy_context().run at submit time, mirroring asyncio.to_thread. The existing threading.local callback propagation at run_agent.py:8796-8806 (from 0046d17 / GHSA-qg5c-hvr5-hjgr) is preserved — it handles a different propagation surface that ContextVars cannot carry.

`_execute_tool_calls_concurrent` submits tools via `executor.submit(_run_tool, ...)`
without `copy_context().run`, so worker threads do not inherit the parent's
ContextVar values — including `_approval_session_key` set by the gateway before
`agent.run`. Worker tools fall through `tools/approval.py:get_current_session_key`'s
resolution order to the `os.environ` fallback ("default" session key), silently
collapsing per-session dispatch for any tool that runs on a worker thread.

Fix: wrap the submitted callable in `contextvars.copy_context().run`, mirroring
`asyncio.to_thread`'s implementation. The existing threading.local callback
propagation (0046d17 / GHSA-qg5c-hvr5-hjgr) is preserved unchanged — it
handles a different propagation surface that ContextVars cannot carry.
@alt-glitch alt-glitch added type/bug Something isn't working P1 High — major feature broken, no workaround comp/agent Core agent loop, run_agent.py, prompt builder comp/gateway Gateway runner, session dispatch, delivery labels Apr 27, 2026
@syahidfrd

Copy link
Copy Markdown

Confirming a real-world reproduction on v2026.4.23 (latest public release, bf196a3f).

Setup: Two concurrent gateway sessions on Slack, separate channels A and B, with different active users in each channel. Both had registered approval notify callbacks via register_gateway_notify.

Trigger: Session A's agent decided to run a recursive delete, which fired the dangerous-command approval flow.

Observed: the approval card (with Allow Once / Allow Session / Deny buttons) was delivered to channel B, not channel A. The user in channel B saw an approval prompt for a command they had no context for, while session A's agent thread was blocked waiting for a response that would never come from there.

Matches this PR's analysis exactly. The tool ran in a worker thread spawned without copy_context().run, so _approval_session_key.get() returned empty. Resolution fell through to os.environ["HERMES_SESSION_KEY"] (set at gateway/run.py:10106), which had been overwritten by session B's most recent agent step. _gateway_notify_cbs.get(session_key) therefore returned session B's callback, routing the prompt to the wrong channel.

P1 feels right — the misroute happens at the worst possible moment (dangerous-command confirmation), and the wrong user could plausibly click "Allow Once" thinking it's their own session.

Happy to retest once this is merged.

Five tests under tests/eval_018/ that go from FAIL → PASS when an
agent successfully completes one of the five prompts authored in
talaria-evals/prompts/edit_*.{md,yaml}:

  test_file_tools_typed       tools/file_tools.py — public-fn annotations
  test_cli_errors             cli.py — CliError subclass + ≥3 raise sites
  test_rename_compactor       agent/* — ContextCompressor → ContextCompactor
  test_usage_pricing_docstrings  agent/usage_pricing.py — class docstrings
  test_title_timeout          agent/title_generator.py — timeout default 60.0

Plus conftest.py with an autouse AnchorStateManager.reset() (defensive
even though each eval run is a subprocess) and an empty __init__.py.

All five tests fail at this commit by design — they are the assertion
side of the eval; agents earn the pass by editing the named files.
teknium1 added a commit that referenced this pull request Apr 30, 2026
…18123)

Propagates ContextVars (notably `tools.approval._approval_session_key`) into concurrent tool worker threads via `copy_context().run` — mirrors `asyncio.to_thread` semantics.

Fixes approval-card cross-session misrouting in concurrent gateway traffic. Repro'd on Slack: session A's dangerous-command approval was delivered to channel B (@syahidfrd).

Salvages #16660 — core 4-LOC fix preserved, unrelated `tests/eval_018/` scope contamination dropped. Adds 5 regression guards including an AST-level source check on the real call site.

Closes #16660.

Co-authored-by: firefly <promptsiren@gmail.com>
Co-authored-by: banditburai <banditburai@users.noreply.github.com>
@teknium1

Copy link
Copy Markdown
Contributor

Salvaged as #18123, merged as e5dad4ac5.

The core 4-LOC fix was correct — diagnosis of the executor.submit → fresh-context trap was spot on, and @syahidfrd's Slack repro (dangerous-command approval card delivered to the wrong channel) confirmed the severity. Credited you and @firefly via Co-authored-by: on the merge.

Dropped the tests/eval_018/ directory — those files were talaria eval oracles (cli.CliError, ContextCompressor→ContextCompactor rename, talaria.tools._anchor_state, etc.) that fail 5/5 on hermes main. Looks like a cross-project commit that landed here by accident.

Added a regression suite: tests/run_agent/test_tool_executor_contextvar_propagation.py — 5 guards including an AST-level check on the real run_agent.py call site that fires with a concrete diagnostic if the copy_context().run wrapper is ever removed.

Thanks for finding this — the cross-session approval misroute is a real P1 and your diagnosis pointed straight at the fix.

donald131 pushed a commit to donald131/hermes-agent that referenced this pull request May 2, 2026
…ousResearch#18123)

Propagates ContextVars (notably `tools.approval._approval_session_key`) into concurrent tool worker threads via `copy_context().run` — mirrors `asyncio.to_thread` semantics.

Fixes approval-card cross-session misrouting in concurrent gateway traffic. Repro'd on Slack: session A's dangerous-command approval was delivered to channel B (@syahidfrd).

Salvages NousResearch#16660 — core 4-LOC fix preserved, unrelated `tests/eval_018/` scope contamination dropped. Adds 5 regression guards including an AST-level source check on the real call site.

Closes NousResearch#16660.

Co-authored-by: firefly <promptsiren@gmail.com>
Co-authored-by: banditburai <banditburai@users.noreply.github.com>
nickdlkk pushed a commit to nickdlkk/hermes-agent that referenced this pull request May 11, 2026
…ousResearch#18123)

Propagates ContextVars (notably `tools.approval._approval_session_key`) into concurrent tool worker threads via `copy_context().run` — mirrors `asyncio.to_thread` semantics.

Fixes approval-card cross-session misrouting in concurrent gateway traffic. Repro'd on Slack: session A's dangerous-command approval was delivered to channel B (@syahidfrd).

Salvages NousResearch#16660 — core 4-LOC fix preserved, unrelated `tests/eval_018/` scope contamination dropped. Adds 5 regression guards including an AST-level source check on the real call site.

Closes NousResearch#16660.

Co-authored-by: firefly <promptsiren@gmail.com>
Co-authored-by: banditburai <banditburai@users.noreply.github.com>
jsboige pushed a commit to jsboige/hermes-agent that referenced this pull request May 14, 2026
…ousResearch#18123)

Propagates ContextVars (notably `tools.approval._approval_session_key`) into concurrent tool worker threads via `copy_context().run` — mirrors `asyncio.to_thread` semantics.

Fixes approval-card cross-session misrouting in concurrent gateway traffic. Repro'd on Slack: session A's dangerous-command approval was delivered to channel B (@syahidfrd).

Salvages NousResearch#16660 — core 4-LOC fix preserved, unrelated `tests/eval_018/` scope contamination dropped. Adds 5 regression guards including an AST-level source check on the real call site.

Closes NousResearch#16660.

Co-authored-by: firefly <promptsiren@gmail.com>
Co-authored-by: banditburai <banditburai@users.noreply.github.com>
dannyJ848 pushed a commit to dannyJ848/hermes-agent that referenced this pull request May 17, 2026
…ousResearch#18123)

Propagates ContextVars (notably `tools.approval._approval_session_key`) into concurrent tool worker threads via `copy_context().run` — mirrors `asyncio.to_thread` semantics.

Fixes approval-card cross-session misrouting in concurrent gateway traffic. Repro'd on Slack: session A's dangerous-command approval was delivered to channel B (@syahidfrd).

Salvages NousResearch#16660 — core 4-LOC fix preserved, unrelated `tests/eval_018/` scope contamination dropped. Adds 5 regression guards including an AST-level source check on the real call site.

Closes NousResearch#16660.

Co-authored-by: firefly <promptsiren@gmail.com>
Co-authored-by: banditburai <banditburai@users.noreply.github.com>
@banditburai banditburai deleted the fix/contextvar-propagation-tool-executor branch May 28, 2026 18:06
gweeteve pushed a commit to gweeteve/hermes-agent that referenced this pull request Jun 2, 2026
…ousResearch#18123)

Propagates ContextVars (notably `tools.approval._approval_session_key`) into concurrent tool worker threads via `copy_context().run` — mirrors `asyncio.to_thread` semantics.

Fixes approval-card cross-session misrouting in concurrent gateway traffic. Repro'd on Slack: session A's dangerous-command approval was delivered to channel B (@syahidfrd).

Salvages NousResearch#16660 — core 4-LOC fix preserved, unrelated `tests/eval_018/` scope contamination dropped. Adds 5 regression guards including an AST-level source check on the real call site.

Closes NousResearch#16660.

Co-authored-by: firefly <promptsiren@gmail.com>
Co-authored-by: banditburai <banditburai@users.noreply.github.com>
Egavasyug pushed a commit to Egavasyug/hermes-agent that referenced this pull request Jun 10, 2026
…ousResearch#18123)

Propagates ContextVars (notably `tools.approval._approval_session_key`) into concurrent tool worker threads via `copy_context().run` — mirrors `asyncio.to_thread` semantics.

Fixes approval-card cross-session misrouting in concurrent gateway traffic. Repro'd on Slack: session A's dangerous-command approval was delivered to channel B (@syahidfrd).

Salvages NousResearch#16660 — core 4-LOC fix preserved, unrelated `tests/eval_018/` scope contamination dropped. Adds 5 regression guards including an AST-level source check on the real call site.

Closes NousResearch#16660.

Co-authored-by: firefly <promptsiren@gmail.com>
Co-authored-by: banditburai <banditburai@users.noreply.github.com>
Seven74AI pushed a commit to Seven74AI/hermes-agent that referenced this pull request Jun 13, 2026
…ousResearch#18123)

Propagates ContextVars (notably `tools.approval._approval_session_key`) into concurrent tool worker threads via `copy_context().run` — mirrors `asyncio.to_thread` semantics.

Fixes approval-card cross-session misrouting in concurrent gateway traffic. Repro'd on Slack: session A's dangerous-command approval was delivered to channel B (@syahidfrd).

Salvages NousResearch#16660 — core 4-LOC fix preserved, unrelated `tests/eval_018/` scope contamination dropped. Adds 5 regression guards including an AST-level source check on the real call site.

Closes NousResearch#16660.

Co-authored-by: firefly <promptsiren@gmail.com>
Co-authored-by: banditburai <banditburai@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/agent Core agent loop, run_agent.py, prompt builder comp/gateway Gateway runner, session dispatch, delivery P1 High — major feature broken, no workaround type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants