Skip to content

fix(agent): propagate approval callbacks to concurrent tool worker threads#13734

Closed
andrewhosf wants to merge 1 commit into
NousResearch:mainfrom
andrewhosf:fix/concurrent-approval-callbacks
Closed

fix(agent): propagate approval callbacks to concurrent tool worker threads#13734
andrewhosf wants to merge 1 commit into
NousResearch:mainfrom
andrewhosf:fix/concurrent-approval-callbacks

Conversation

@andrewhosf

Copy link
Copy Markdown
Contributor

Bug #13617 Regression: Concurrent tool execution deadlocks on dangerous-command approval

Summary

When the agent executes a batch of tools concurrently (e.g. read_file + terminal in the same turn), any dangerous-command approval prompt deadlocks. The user can type a selection but pressing Return has no effect; the prompt times out after 60 seconds and denies the command.

Root Cause

_execute_tool_calls_concurrent() in run_agent.py dispatches tool calls to a ThreadPoolExecutor. The dangerous-command approval callback (and sudo password callback) are stored in threading.local() inside tools/terminal_tool.py. Worker threads spawned by the executor cannot see callbacks registered in the parent agent thread, so _get_approval_callback() returns None.

When the callback is None, tools/approval.py falls back to plain input() in a background daemon thread. However, the Hermes CLI uses prompt_toolkit, which puts the terminal in raw mode. The background thread's input() competes with prompt_toolkit for stdin. The user can type characters (they may echo), but the Enter key is captured by prompt_toolkit and never reaches input(), causing a deadlock until the timeout fires.

Affected Code Paths

  • run_agent.pyAIAgent._execute_tool_calls_concurrent()_run_tool() worker function
  • Any concurrent batch containing terminal, execute_code, or other tools that trigger dangerous-command detection

Fix

Capture the parent thread's approval/sudo callbacks before launching the thread pool, then register them locally inside each worker thread (and clear them on exit). This mirrors the existing fix pattern used in cli.py for the main agent worker thread.

Files Changed

  • run_agent.py

Key Diff Points

  1. Import callback getters/setters from tools.terminal_tool
  2. Before defining _run_tool, snapshot _get_approval_callback() and _get_sudo_password_callback()
  3. Inside _run_tool, after set_activity_callback(), register the callbacks:
    if _parent_approval_cb is not None:
        _set_approval_callback(_parent_approval_cb)
    if _parent_sudo_cb is not None:
        _set_sudo_password_callback(_parent_sudo_cb)
  4. At the end of _run_tool, clear them:
    _set_approval_callback(None)
    _set_sudo_password_callback(None)

Regression Tests

  • tests/cli/test_cli_approval_ui.py::TestApprovalCallbackThreadLocalWiring
    • test_main_thread_registration_is_invisible_to_child_thread
    • test_child_thread_registration_is_visible_and_cleared_in_finally

All 10 approval UI tests pass after the fix.

Related

…reads

When tools execute concurrently via ThreadPoolExecutor, worker threads
could not see the thread-local approval/sudo callbacks registered by
the CLI. This caused dangerous-command prompts to fall back to plain
input(), which deadlocks against prompt_toolkit's raw terminal mode.

Capture parent-thread callbacks before launching workers, register
them locally in each _run_tool thread, and clear them on exit.

Mirrors the existing fix pattern from cli.py run_agent() for the
main agent worker thread (GHSA-qg5c-hvr5-hjgr / NousResearch#13617).
@alt-glitch alt-glitch added type/bug Something isn't working comp/agent Core agent loop, run_agent.py, prompt builder tool/terminal Terminal execution and process management labels Apr 22, 2026
@alt-glitch

Copy link
Copy Markdown
Collaborator

Likely duplicate of #13697 (merged) — both fix thread-local approval callback propagation to concurrent tool workers for bug #13617.

1 similar comment
@alt-glitch

Copy link
Copy Markdown
Collaborator

Likely duplicate of #13697 (merged) — both fix thread-local approval callback propagation to concurrent tool workers for bug #13617.

@andrewhosf andrewhosf closed this Apr 22, 2026
@andrewhosf andrewhosf reopened this Apr 22, 2026
@andrewhosf

Copy link
Copy Markdown
Contributor Author

@alt-glitch thanks for the pointer. I checked #13697 — it fixes the main agent worker thread in cli.py (the daemon thread spawned by chat()), but it does not touch run_agent.py or _execute_tool_calls_concurrent().

The ThreadPoolExecutor worker threads used for concurrent tool execution are a separate spawn path that still hit the same threading.local() gap. This PR covers that missed case.

teknium1 added a commit that referenced this pull request Apr 27, 2026
PR #13734 fixed the concurrent-tool-executor vector (ThreadPoolExecutor
workers didn't inherit the CLI's TLS approval callback). Two vectors
remained that could still land in the deadlocking input() fallback:

1. _spawn_background_review spawns a raw threading.Thread with no
   approval callback installed, so any dangerous-command guard the
   review agent trips falls back to input() -> deadlock against the
   parent's prompt_toolkit TUI (same class as delegate_task subagents,
   fixed in 023b1bf / #15491). Install a _bg_review_auto_deny
   callback at thread start, clear on finally.

2. prompt_dangerous_approval's fallback unconditionally spawned a
   daemon thread calling input() when approval_callback was None.
   That fallback can never succeed under prompt_toolkit because the
   user's Enter goes to pt's raw-mode stdin capture. Detect an active
   pt Application via get_app_or_none() and fail closed (deny + log)
   instead, so future threads that forget to install a callback
   degrade gracefully instead of hanging 60s invisibly.

Regression guards:
- tests/run_agent/test_background_review.py verifies the review
  worker thread sees a callable auto-deny callback mid-run and that
  the slot is cleared in the finally block.
- tests/tools/test_approval.py TestFailClosedUnderPromptToolkit
  verifies prompt_dangerous_approval returns 'deny' fast under a
  mocked pt Application, and that a real callback still wins over
  the guard.
teknium1 added a commit that referenced this pull request Apr 27, 2026
Release-notes contributor attribution for the salvaged PR #13734 fix.
@teknium1

Copy link
Copy Markdown
Contributor

Merged via #16574 — your commit was cherry-picked onto current main with authorship preserved (0046d17). Added follow-on fixes for the background-review vector and a fail-closed guard in prompt_dangerous_approval. Thanks for the diagnosis and the fix!

@teknium1 teknium1 closed this Apr 27, 2026
cluricaun28 referenced this pull request in cluricaun28/Logos Apr 28, 2026
PR #13734 fixed the concurrent-tool-executor vector (ThreadPoolExecutor
workers didn't inherit the CLI's TLS approval callback). Two vectors
remained that could still land in the deadlocking input() fallback:

1. _spawn_background_review spawns a raw threading.Thread with no
   approval callback installed, so any dangerous-command guard the
   review agent trips falls back to input() -> deadlock against the
   parent's prompt_toolkit TUI (same class as delegate_task subagents,
   fixed in 023b1bf / #15491). Install a _bg_review_auto_deny
   callback at thread start, clear on finally.

2. prompt_dangerous_approval's fallback unconditionally spawned a
   daemon thread calling input() when approval_callback was None.
   That fallback can never succeed under prompt_toolkit because the
   user's Enter goes to pt's raw-mode stdin capture. Detect an active
   pt Application via get_app_or_none() and fail closed (deny + log)
   instead, so future threads that forget to install a callback
   degrade gracefully instead of hanging 60s invisibly.

Regression guards:
- tests/run_agent/test_background_review.py verifies the review
  worker thread sees a callable auto-deny callback mid-run and that
  the slot is cleared in the finally block.
- tests/tools/test_approval.py TestFailClosedUnderPromptToolkit
  verifies prompt_dangerous_approval returns 'deny' fast under a
  mocked pt Application, and that a real callback still wins over
  the guard.
ulasbilgen pushed a commit to ulasbilgen/hermes-adhd-agent that referenced this pull request May 1, 2026
…esearch#15216)

PR NousResearch#13734 fixed the concurrent-tool-executor vector (ThreadPoolExecutor
workers didn't inherit the CLI's TLS approval callback). Two vectors
remained that could still land in the deadlocking input() fallback:

1. _spawn_background_review spawns a raw threading.Thread with no
   approval callback installed, so any dangerous-command guard the
   review agent trips falls back to input() -> deadlock against the
   parent's prompt_toolkit TUI (same class as delegate_task subagents,
   fixed in 26c542e / NousResearch#15491). Install a _bg_review_auto_deny
   callback at thread start, clear on finally.

2. prompt_dangerous_approval's fallback unconditionally spawned a
   daemon thread calling input() when approval_callback was None.
   That fallback can never succeed under prompt_toolkit because the
   user's Enter goes to pt's raw-mode stdin capture. Detect an active
   pt Application via get_app_or_none() and fail closed (deny + log)
   instead, so future threads that forget to install a callback
   degrade gracefully instead of hanging 60s invisibly.

Regression guards:
- tests/run_agent/test_background_review.py verifies the review
  worker thread sees a callable auto-deny callback mid-run and that
  the slot is cleared in the finally block.
- tests/tools/test_approval.py TestFailClosedUnderPromptToolkit
  verifies prompt_dangerous_approval returns 'deny' fast under a
  mocked pt Application, and that a real callback still wins over
  the guard.
ulasbilgen pushed a commit to ulasbilgen/hermes-adhd-agent that referenced this pull request May 1, 2026
Release-notes contributor attribution for the salvaged PR NousResearch#13734 fix.
donald131 pushed a commit to donald131/hermes-agent that referenced this pull request May 2, 2026
…esearch#15216)

PR NousResearch#13734 fixed the concurrent-tool-executor vector (ThreadPoolExecutor
workers didn't inherit the CLI's TLS approval callback). Two vectors
remained that could still land in the deadlocking input() fallback:

1. _spawn_background_review spawns a raw threading.Thread with no
   approval callback installed, so any dangerous-command guard the
   review agent trips falls back to input() -> deadlock against the
   parent's prompt_toolkit TUI (same class as delegate_task subagents,
   fixed in 023b1bf / NousResearch#15491). Install a _bg_review_auto_deny
   callback at thread start, clear on finally.

2. prompt_dangerous_approval's fallback unconditionally spawned a
   daemon thread calling input() when approval_callback was None.
   That fallback can never succeed under prompt_toolkit because the
   user's Enter goes to pt's raw-mode stdin capture. Detect an active
   pt Application via get_app_or_none() and fail closed (deny + log)
   instead, so future threads that forget to install a callback
   degrade gracefully instead of hanging 60s invisibly.

Regression guards:
- tests/run_agent/test_background_review.py verifies the review
  worker thread sees a callable auto-deny callback mid-run and that
  the slot is cleared in the finally block.
- tests/tools/test_approval.py TestFailClosedUnderPromptToolkit
  verifies prompt_dangerous_approval returns 'deny' fast under a
  mocked pt Application, and that a real callback still wins over
  the guard.
02356abc pushed a commit to 02356abc/hermes-agent that referenced this pull request May 14, 2026
…esearch#15216)

PR NousResearch#13734 fixed the concurrent-tool-executor vector (ThreadPoolExecutor
workers didn't inherit the CLI's TLS approval callback). Two vectors
remained that could still land in the deadlocking input() fallback:

1. _spawn_background_review spawns a raw threading.Thread with no
   approval callback installed, so any dangerous-command guard the
   review agent trips falls back to input() -> deadlock against the
   parent's prompt_toolkit TUI (same class as delegate_task subagents,
   fixed in 023b1bf / NousResearch#15491). Install a _bg_review_auto_deny
   callback at thread start, clear on finally.

2. prompt_dangerous_approval's fallback unconditionally spawned a
   daemon thread calling input() when approval_callback was None.
   That fallback can never succeed under prompt_toolkit because the
   user's Enter goes to pt's raw-mode stdin capture. Detect an active
   pt Application via get_app_or_none() and fail closed (deny + log)
   instead, so future threads that forget to install a callback
   degrade gracefully instead of hanging 60s invisibly.

Regression guards:
- tests/run_agent/test_background_review.py verifies the review
  worker thread sees a callable auto-deny callback mid-run and that
  the slot is cleared in the finally block.
- tests/tools/test_approval.py TestFailClosedUnderPromptToolkit
  verifies prompt_dangerous_approval returns 'deny' fast under a
  mocked pt Application, and that a real callback still wins over
  the guard.
02356abc pushed a commit to 02356abc/hermes-agent that referenced this pull request May 14, 2026
Release-notes contributor attribution for the salvaged PR NousResearch#13734 fix.
dannyJ848 pushed a commit to dannyJ848/hermes-agent that referenced this pull request May 17, 2026
…esearch#15216)

PR NousResearch#13734 fixed the concurrent-tool-executor vector (ThreadPoolExecutor
workers didn't inherit the CLI's TLS approval callback). Two vectors
remained that could still land in the deadlocking input() fallback:

1. _spawn_background_review spawns a raw threading.Thread with no
   approval callback installed, so any dangerous-command guard the
   review agent trips falls back to input() -> deadlock against the
   parent's prompt_toolkit TUI (same class as delegate_task subagents,
   fixed in 03fc4d6 / NousResearch#15491). Install a _bg_review_auto_deny
   callback at thread start, clear on finally.

2. prompt_dangerous_approval's fallback unconditionally spawned a
   daemon thread calling input() when approval_callback was None.
   That fallback can never succeed under prompt_toolkit because the
   user's Enter goes to pt's raw-mode stdin capture. Detect an active
   pt Application via get_app_or_none() and fail closed (deny + log)
   instead, so future threads that forget to install a callback
   degrade gracefully instead of hanging 60s invisibly.

Regression guards:
- tests/run_agent/test_background_review.py verifies the review
  worker thread sees a callable auto-deny callback mid-run and that
  the slot is cleared in the finally block.
- tests/tools/test_approval.py TestFailClosedUnderPromptToolkit
  verifies prompt_dangerous_approval returns 'deny' fast under a
  mocked pt Application, and that a real callback still wins over
  the guard.
dannyJ848 pushed a commit to dannyJ848/hermes-agent that referenced this pull request May 17, 2026
Release-notes contributor attribution for the salvaged PR NousResearch#13734 fix.
gweeteve pushed a commit to gweeteve/hermes-agent that referenced this pull request Jun 2, 2026
…esearch#15216)

PR NousResearch#13734 fixed the concurrent-tool-executor vector (ThreadPoolExecutor
workers didn't inherit the CLI's TLS approval callback). Two vectors
remained that could still land in the deadlocking input() fallback:

1. _spawn_background_review spawns a raw threading.Thread with no
   approval callback installed, so any dangerous-command guard the
   review agent trips falls back to input() -> deadlock against the
   parent's prompt_toolkit TUI (same class as delegate_task subagents,
   fixed in 023b1bf / NousResearch#15491). Install a _bg_review_auto_deny
   callback at thread start, clear on finally.

2. prompt_dangerous_approval's fallback unconditionally spawned a
   daemon thread calling input() when approval_callback was None.
   That fallback can never succeed under prompt_toolkit because the
   user's Enter goes to pt's raw-mode stdin capture. Detect an active
   pt Application via get_app_or_none() and fail closed (deny + log)
   instead, so future threads that forget to install a callback
   degrade gracefully instead of hanging 60s invisibly.

Regression guards:
- tests/run_agent/test_background_review.py verifies the review
  worker thread sees a callable auto-deny callback mid-run and that
  the slot is cleared in the finally block.
- tests/tools/test_approval.py TestFailClosedUnderPromptToolkit
  verifies prompt_dangerous_approval returns 'deny' fast under a
  mocked pt Application, and that a real callback still wins over
  the guard.
gweeteve pushed a commit to gweeteve/hermes-agent that referenced this pull request Jun 2, 2026
Release-notes contributor attribution for the salvaged PR NousResearch#13734 fix.
Egavasyug pushed a commit to Egavasyug/hermes-agent that referenced this pull request Jun 10, 2026
…esearch#15216)

PR NousResearch#13734 fixed the concurrent-tool-executor vector (ThreadPoolExecutor
workers didn't inherit the CLI's TLS approval callback). Two vectors
remained that could still land in the deadlocking input() fallback:

1. _spawn_background_review spawns a raw threading.Thread with no
   approval callback installed, so any dangerous-command guard the
   review agent trips falls back to input() -> deadlock against the
   parent's prompt_toolkit TUI (same class as delegate_task subagents,
   fixed in 5d62d78 / NousResearch#15491). Install a _bg_review_auto_deny
   callback at thread start, clear on finally.

2. prompt_dangerous_approval's fallback unconditionally spawned a
   daemon thread calling input() when approval_callback was None.
   That fallback can never succeed under prompt_toolkit because the
   user's Enter goes to pt's raw-mode stdin capture. Detect an active
   pt Application via get_app_or_none() and fail closed (deny + log)
   instead, so future threads that forget to install a callback
   degrade gracefully instead of hanging 60s invisibly.

Regression guards:
- tests/run_agent/test_background_review.py verifies the review
  worker thread sees a callable auto-deny callback mid-run and that
  the slot is cleared in the finally block.
- tests/tools/test_approval.py TestFailClosedUnderPromptToolkit
  verifies prompt_dangerous_approval returns 'deny' fast under a
  mocked pt Application, and that a real callback still wins over
  the guard.
Egavasyug pushed a commit to Egavasyug/hermes-agent that referenced this pull request Jun 10, 2026
Release-notes contributor attribution for the salvaged PR NousResearch#13734 fix.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/agent Core agent loop, run_agent.py, prompt builder tool/terminal Terminal execution and process management type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants