
fix(cli): flush stdout during agent loop to prevent macOS display freeze (#1624) #1654

Merged
teknium1 merged 9 commits into main from hermes/hermes-6bb9911e on Mar 17, 2026

Conversation

@teknium1


Summary

Fixes #1624 — CLI blocks during tool usage on macOS and only resumes when the user types.

Root cause

The interrupt polling loop in chat() (line 5320) waits on the interrupt queue with timeout=0.1, but on queue.Empty it just passes without invalidating the prompt_toolkit renderer. On macOS, the StdoutProxy buffer (from patch_stdout) only flushes when the renderer runs a pass — which is triggered by input events. No input → no render → buffered output never appears.

Fix

One line: call self._invalidate(min_interval=0.15) on each queue timeout. This forces prompt_toolkit to flush pending stdout output from the agent thread every ~150ms, keeping the display responsive during tool execution.
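
The root cause and fix described above can be sketched in isolation. This is a minimal sketch rather than the actual HermesCLI code: `ThrottledInvalidator` and `poll_for_interrupt` are hypothetical names, and `redraw` stands in for whatever `self._invalidate` ultimately triggers (presumably a prompt_toolkit redraw request, which the PR text does not show).

```python
import queue
import threading
import time


class ThrottledInvalidator:
    """Calls `redraw` at most once per `min_interval` seconds.

    Hypothetical stand-in for the CLI's `_invalidate` helper.
    """

    def __init__(self, redraw, clock=time.monotonic):
        self._redraw = redraw
        self._clock = clock
        self._last = float("-inf")  # so the very first call always redraws

    def __call__(self, min_interval=0.15):
        now = self._clock()
        if now - self._last >= min_interval:
            self._last = now
            self._redraw()
            return True
        return False


def poll_for_interrupt(interrupts, agent_done, invalidate, timeout=0.1):
    """Wait for an interrupt or for the agent to finish.

    On every idle tick, ask the renderer to run so that output buffered
    by the agent thread gets flushed to the terminal.
    """
    while not agent_done.is_set():
        try:
            return interrupts.get(timeout=timeout)
        except queue.Empty:
            # Before the fix this branch was a bare `pass`: nothing asked
            # the renderer to run, so buffered output stayed invisible.
            invalidate(min_interval=0.15)
    return None
```

With a 0.1 s queue timeout and a 0.15 s throttle, roughly every other idle tick results in a redraw, which is where the "every ~150ms" figure comes from.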

Why this affects macOS specifically

Linux's terminal I/O tends to flush more aggressively, and some terminal emulators (GNOME Terminal, etc.) poll for updates. macOS Terminal.app and iTerm2 rely more heavily on the application's own flush cycle. When prompt_toolkit's renderer isn't actively running, output stalls.

Test plan

  • CLI import clean (from cli import HermesCLI succeeds)
  • Manual verification: agent output should appear progressively during tool calls without needing to type

teknium1 and others added 9 commits March 17, 2026 01:45
When a gateway session exceeds the model's context window, Anthropic may
return a generic 400 invalid_request_error with just 'Error' as the
message.  This bypassed the phrase-based context-length detection,
causing the agent to treat it as a non-retryable client error.  Worse,
the failed user message was still persisted to the transcript, making
the session even larger on each attempt — creating an infinite loop.

Three-layer fix:

1. run_agent.py — Fallback heuristic: when a 400 error has a very short
   generic message AND the session is large (>40% of context or >80
   messages), treat it as a probable context overflow and trigger
   compression instead of aborting.

2. run_agent.py + gateway/run.py — Don't persist failed messages:
   when the agent returns failed=True before generating any response,
   skip writing the user's message to the transcript/DB. This prevents
   the session from growing on each failure.

3. gateway/run.py — Smarter error messages: detect context-overflow
   failures and suggest /compact or /reset specifically, instead of a
   generic 'try again' that will fail identically.
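
The first two layers above can be sketched as small predicates. This is a hedged illustration, not the code in run_agent.py: every name here is hypothetical, the 40% and 80-message thresholds come from the commit message, and the 20-character cutoff for a "very short" error message is an assumption.

```python
from dataclasses import dataclass


@dataclass
class SessionStats:
    approx_tokens: int    # estimated tokens currently in the session
    context_window: int   # model context limit, in tokens
    message_count: int    # number of messages in the session


# Thresholds from the commit message, except the message-length cutoff,
# which is an assumed value for "very short".
MAX_GENERIC_MESSAGE_CHARS = 20
CONTEXT_FRACTION_THRESHOLD = 0.40
MESSAGE_COUNT_THRESHOLD = 80


def looks_like_context_overflow(status_code, error_message, stats):
    """Layer 1: treat a terse 400 on a large session as a probable overflow."""
    if status_code != 400:
        return False
    if len(error_message.strip()) > MAX_GENERIC_MESSAGE_CHARS:
        return False  # a descriptive message is handled by phrase matching
    return (
        stats.approx_tokens > CONTEXT_FRACTION_THRESHOLD * stats.context_window
        or stats.message_count > MESSAGE_COUNT_THRESHOLD
    )


def should_persist_user_message(failed, produced_response):
    """Layer 2: skip the transcript write if the turn failed with no response."""
    return not (failed and not produced_response)
```

Requiring both a terse message and a large session keeps ordinary malformed-request 400s on small sessions from triggering an unnecessary compression pass.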

Adds two security layers to prevent prompt injection via skills hub
cache files (#1558):

1. read_file: blocks direct reads of ~/.hermes/skills/.hub/ directory
   (index-cache, catalog files). The 3.5MB clawhub_catalog_v1.json
   was the original injection vector — untrusted skill descriptions
   in the catalog contained adversarial text that the model executed.

2. skill_view: warns when skills are loaded from outside the trusted
   ~/.hermes/skills/ directory, and detects common injection patterns
   in skill content ("ignore previous instructions", "<system>", etc.).

Cherry-picked from PR #1562 by ygd58.
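
The two layers above amount to a path check and a content scan. The sketch below is illustrative only: the function names are hypothetical, and the pattern list contains just the two examples quoted in the commit message, not the full set the real skill_view uses.

```python
import re
from pathlib import Path

TRUSTED_SKILLS_DIR = Path.home() / ".hermes" / "skills"
HUB_CACHE_DIR = TRUSTED_SKILLS_DIR / ".hub"

# Only the two patterns named in the commit message; the real list is
# presumably longer.
INJECTION_PATTERNS = [
    re.compile(r"ignore\s+(all\s+)?previous\s+instructions", re.IGNORECASE),
    re.compile(r"<\s*system\s*>", re.IGNORECASE),
]


def _is_within(path, root):
    """True if `path` resolves to `root` or to something beneath it."""
    resolved = Path(path).expanduser().resolve()
    root = root.resolve()
    return resolved == root or root in resolved.parents


def is_blocked_read(path):
    """Layer 1: refuse direct reads of hub cache files."""
    return _is_within(path, HUB_CACHE_DIR)


def skill_warnings(path, content):
    """Layer 2: collect warnings for untrusted locations and suspicious text."""
    warnings = []
    if not _is_within(path, TRUSTED_SKILLS_DIR):
        warnings.append("skill loaded from outside the trusted directory")
    for pattern in INJECTION_PATTERNS:
        if pattern.search(content):
            warnings.append(f"possible injection pattern: {pattern.pattern}")
    return warnings
```

Resolving the path before comparing means a request such as `~/.hermes/skills/x/../.hub/catalog.json` is still caught by the block.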

…1552)

Long messages sent via send_message tool or cron delivery silently
failed when exceeding platform limits. Gateway adapters handle this
via truncate_message(), but the standalone senders in send_message_tool
bypassed that entirely.

- Apply truncate_message() chunking in _send_to_platform() before
  dispatching to individual platform senders
- Remove naive message[i:i+2000] character split in _send_discord()
  in favor of centralized smart splitting
- Attach media files to last chunk only for Telegram
- Add regression tests for chunking and media placement

Cherry-picked from PR #1557 by llbn.
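
The difference between a naive `message[i:i+2000]` slice and boundary-aware splitting, plus the "media on the last chunk" rule, can be sketched as follows. This is not the project's `truncate_message()`; `split_message` and `plan_sends` are hypothetical helpers written only to illustrate the idea.

```python
def split_message(text, limit):
    """Split `text` into chunks of at most `limit` characters.

    Prefers to cut at a newline, then at a space, and falls back to a
    hard cut only when the window contains no usable boundary.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    chunks = []
    remaining = text
    while len(remaining) > limit:
        window = remaining[:limit]
        cut = window.rfind("\n")
        if cut <= 0:
            cut = window.rfind(" ")
        if cut <= 0:
            cut = limit  # no boundary found: hard cut mid-word
        chunks.append(remaining[:cut])
        # Drop the boundary whitespace so the next chunk starts cleanly.
        remaining = remaining[cut:].lstrip("\n ")
    if remaining:
        chunks.append(remaining)
    return chunks


def plan_sends(text, media_paths, limit):
    """Pair each chunk with its attachments; media goes on the last chunk only."""
    chunks = split_message(text, limit) or [""]
    last = len(chunks) - 1
    return [
        (chunk, list(media_paths) if i == last else [])
        for i, chunk in enumerate(chunks)
    ]
```

For example, `plan_sends("aaaa bbbb cccc", ["a.png"], 9)` yields two sends, with `a.png` attached only to the second.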

Previously the command was truncated to 80 chars in CLI (with a
[v]iew full option), 500 chars in Discord embeds, and missing entirely
in Telegram/Slack approval messages. Now the full command is always
displayed everywhere:

- CLI: removed 80-char truncation and [v]iew full menu option
- Gateway (TG/Slack): approval_required message includes full command
  in a code block
- Discord: embed shows full command up to 4096-char limit
- Windows: skip SIGALRM-based test timeout (Unix-only)
- Updated tests: replaced view-flow tests with direct approval tests

Cherry-picked from PR #1566 by crazywriter1.

…eze (#1624)

The interrupt polling loop in chat() waited on the queue without
invalidating the prompt_toolkit renderer. On macOS, the StdoutProxy
buffer only flushed on input events, causing the CLI to appear frozen
during tool execution until the user typed a key.

Fix: call _invalidate() on each queue timeout to force the renderer to
flush buffered agent output. The queue times out every ~100ms, and
redraws are throttled to at most one per 150ms.

teknium1 merged commit 8992bab into main on Mar 17, 2026
1 check failed
Development

Successfully merging this pull request may close these issues.

[Bug]: CLI blocks during tool usage and only resumes when typing

4 participants