Skip to content

fix: agent loop stability — stale reasoning, vision API key, TUI prompt#25404

Open
LifeJiggy wants to merge 5 commits into
NousResearch:mainfrom
LifeJiggy:fix/agent-loop-stability
Open

fix: agent loop stability — stale reasoning, vision API key, TUI prompt#25404
LifeJiggy wants to merge 5 commits into
NousResearch:mainfrom
LifeJiggy:fix/agent-loop-stability

Conversation

@LifeJiggy

Copy link
Copy Markdown
Contributor

What does this PR do?

Three independent bug fixes improving agent loop reliability and CLI UX:

  1. Stale reasoning no longer leaks across turns (Bug: stale reasoning reused when current turn has no reasoning_content #17052). When the current assistant response has no reasoning field (e.g. a simple "hello" reply after a complex tool-calling turn), the display now correctly shows no reasoning. Previously the code searched backwards for the most recent non-None reasoning, which picked up the tool-call step's reasoning from earlier in the same turn — causing confusing stale display.

  2. Vision backend respects runtime API keys ([Bug]: resolve_provider_client ignores explicit_api_key when calling _try_openrouter() #18338). _resolve_strict_vision_backend() called _try_openrouter() without forwarding explicit_api_key, so the vision pipeline always fell back to OPENROUTER_API_KEY env var even when resolve_provider_client was called with a credential-pool key. Added the parameter and pass-through.

  3. /clear//new//reset confirmation no longer drops keystrokes ([Bug]: /clear, /new, /reset, /undo confirmation prompt cannot be answered — keystrokes leak into chat composer #22958). _prompt_text_input() used run_in_terminal(input()) which fought prompt_toolkit's stdin ownership — keystrokes leaked into the agent composer instead of the confirmation prompt. Replaced with prompt_toolkit.prompt() which properly cooperates with the TUI event loop.

Related Issue
Fixes #17052, #18338, #22958

Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)

Changes Made

  • run_agent.py — Changed last_reasoning loop from if msg.get("role") == "assistant" and msg.get("reasoning") to if msg.get("role") == "assistant": last_reasoning = msg.get("reasoning"). Stops at the last assistant message; uses None when the last turn has no reasoning.

  • agent/auxiliary_client.py — Added explicit_api_key: str = None parameter to _resolve_strict_vision_backend() and forwarded it to _try_openrouter(explicit_api_key=...).

  • cli.py — Replaced run_in_terminal(_ask) with prompt_toolkit.shortcuts.prompt() in _prompt_text_input() when TUI is active. Raw input() fallback preserved for non-TUI modes (unit tests, non-interactive calls).

How to Test

  1. Stale reasoning: Send a complex message that triggers tool calls with reasoning, then send "hello". The previous turn's reasoning should NOT appear in the response box.
  2. Vision API key: Configure OpenRouter via credential pool with explicit_api_key. Send an image analysis request. Verify the vision backend uses the pool key, not OPENROUTER_API_KEY env var.
  3. Confirmation prompt: Run /clear in the CLI TUI. Type a choice (1/2/3). The keystroke should register in the prompt, not appear as a chat message to the agent.

- Fix stale reasoning reuse when current turn has no reasoning
  (NousResearch#17052). Stop at the LAST assistant message and use its
  reasoning field even when None — previously we searched
  backwards for non-None reasoning, leaking tool-call
  reasoning from earlier in the same turn.

- Add explicit_api_key parameter to _resolve_strict_vision_backend
  and forward to _try_openrouter (NousResearch#18338). Vision backend now
  respects runtime-provided API keys instead of falling back to
  env var.

- Replace input() with prompt_toolkit.prompt() in
  _prompt_text_input for TUI mode (NousResearch#22958). Built-in input()
  fought prompt_toolkit's stdin ownership; keystrokes leaked
  into the agent composer instead of the confirmation prompt.
@alt-glitch alt-glitch added type/bug Something isn't working P2 Medium — degraded but workaround exists comp/agent Core agent loop, run_agent.py, prompt builder comp/cli CLI entry point, hermes_cli/, setup wizard comp/tui Terminal UI (ui-tui/ + tui_gateway/) tool/vision Vision analysis and image generation labels May 14, 2026
@wesleysimplicio

Copy link
Copy Markdown
Contributor

I think the strict-vision OpenRouter path still drops the explicit API key. This PR teaches _resolve_strict_vision_backend() about explicit_api_key, but the strict vision call sites still appear to call it without the resolved key for OpenRouter/auto fallback. That means AUXILIARY_VISION_API_KEY or credential-pool resolved keys may never reach _try_openrouter() in those paths. Could we thread explicit_api_key=resolved_api_key through the strict vision OpenRouter call sites and add a focused regression test for that path?

…usResearch#18338)

_resolve_strict_vision_backend() accepted explicit_api_key but
none of the 3 call sites in resolve_vision_provider_client()
passed the resolved key. The key from _resolve_task_provider_model
is now forwarded to all strict vision backend calls so OpenRouter
and other vision providers receive the credential-pool or
config-resolved API key instead of falling back to the env var.
@LifeJiggy

Copy link
Copy Markdown
Contributor Author

The reviewer's concern about the vision API key not reaching _try_openrouter() is addressed — all 3 call sites in resolve_vision_provider_client() now pass explicit_api_key=resolved_api_key from the task-level resolved credentials.

@LifeJiggy

Copy link
Copy Markdown
Contributor Author

@teknium1 PTAL

LifeJiggy added 2 commits May 27, 2026 21:25
…auxiliary_client.py (use model kwarg), fix tests
Re-insert _execute_tool_calls_concurrent, _execute_tool_calls_sequential,
_handle_max_iterations, run_conversation, _run_codex_app_server_turn
that were mistakenly deleted when removing branch forwarder stubs.
Fix test mock lambdas to accept explicit_api_key kwarg.
@LifeJiggy

Copy link
Copy Markdown
Contributor Author

@egilewski pTAL

@egilewski

Copy link
Copy Markdown
Contributor

@egilewski pTAL

Tests are failing.

And, I'm sorry, but my tokens are limited, so for now I'm focusing on the most important for me security issues.

- test_prompt_text_input_thread_safety: mock ptk_prompt instead of
  run_in_terminal (cli.py switched from run_in_terminal to ptk_prompt)
- test_auxiliary_main_first: add explicit_api_key=None to
  mock_strict.assert_called_once_with (auxiliary_client.py now forwards
  explicit_api_key to _resolve_strict_vision_backend)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/agent Core agent loop, run_agent.py, prompt builder comp/cli CLI entry point, hermes_cli/, setup wizard comp/tui Terminal UI (ui-tui/ + tui_gateway/) P2 Medium — degraded but workaround exists tool/vision Vision analysis and image generation type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Bug: stale reasoning reused when current turn has no reasoning_content

4 participants