feat(tui): voice mode CLI parity (VAD loop + TTS + crash forensics)#14810
Merged
Conversation
When the user runs /voice and then presses Ctrl+B in the TUI, three
handlers collaborate to consume the chord and none of them dispatch
voice.record:
- isAction() is platform-aware — on macOS it requires Cmd (meta/super),
so Ctrl+B fails the match in useInputHandlers and never triggers
voiceStart/voiceStop.
- TextInput's Ctrl+B pass-through list doesn't include 'b', so the
keystroke falls through to the wordMod backward-word branch on Linux
and to the printable-char insertion branch on macOS — the latter is
exactly what timmie reported ("enters a b into the tui").
- /voice emits "voice: on" with no hint, so the user has no way to
know Ctrl+B is the recording toggle.
Introduces isVoiceToggleKey(key, ch) in lib/platform.ts that matches
raw Ctrl+B on every platform (mirrors tips.py and config.yaml's
voice.record_key default) and additionally accepts Cmd+B on macOS so
existing muscle memory keeps working. Wires it into useInputHandlers,
adds Ctrl+B to TextInput's pass-through list so the global handler
actually receives the chord, and appends "press Ctrl+B to record" to
the /voice on message.
Empirically verified with hermes --tui: Ctrl+B no longer leaks 'b'
into the composer and now dispatches the voice.record RPC (the
downstream ImportError for hermes_cli.voice is a separate upstream
bug — follow-up patch).
tui_gateway/server.py:3486/3491/3509 imports start_recording, stop_and_transcribe, and speak_text from hermes_cli.voice, but the module never existed (not in git history — never shipped, never deleted). Every voice.record / voice.tts RPC call hit the ImportError branch and the TUI surfaced it as "voice module not available — install audio dependencies" even on boxes with sounddevice / faster-whisper / numpy installed. Adds a thin wrapper on top of tools.voice_mode (recording + transcription) and tools.tts_tool (text-to-speech): - start_recording() — idempotent; stores the active AudioRecorder in a module-global guarded by a Lock so repeat Ctrl+B presses don't fight over the mic. - stop_and_transcribe() — returns None for no-op / no-speech / Whisper-hallucination cases so the TUI's existing "no speech detected" path keeps working unchanged. - speak_text(text) — lazily imports tts_tool (optional provider SDKs stay unloaded until the first /voice tts call), parses the tool's JSON result, and plays the audio via play_audio_file. Paired with the Ctrl+B keybinding fix in the prior commit, the TUI voice pipeline now works end-to-end for the first time.
The TUI had drifted from the CLI's voice model in two ways:
- /voice on was lighting up the microphone immediately and Ctrl+B was
interpreted as a mode toggle. The CLI separates the two: /voice on
just flips the umbrella bit, recording only starts once the user
presses Ctrl+B, which also sets _voice_continuous so the VAD loop
auto-restarts until the user presses Ctrl+B again or three silent
cycles pass.
- /voice tts was missing entirely, so users couldn't turn agent reply
speech on/off from inside the TUI.
This commit brings the TUI to parity.
Python
- hermes_cli/voice.py: continuous-mode API (start_continuous,
stop_continuous, is_continuous_active) layered on the existing PTT
wrappers. The silence callback transcribes, fires on_transcript,
tracks consecutive no-speech cycles, and auto-restarts — mirroring
cli.py:_voice_stop_and_transcribe + _restart_recording.
- tui_gateway/server.py:
- voice.toggle now supports on / off / tts / status. The umbrella
bit lives in HERMES_VOICE + display.voice_enabled; tts lives in
HERMES_VOICE_TTS + display.voice_tts. /voice off also tears down
any active continuous loop so a toggle-off really releases the
microphone.
- voice.record start/stop now drives start_continuous/stop_continuous.
start is refused with a clear error when the mode is off, matching
cli.py:handle_voice_record's early return on `not _voice_mode`.
- New voice.transcript / voice.status events emit through
_voice_emit (remembers the sid that last enabled the mode so
events land in the right session).
TypeScript
- gatewayTypes.ts: voice.status + voice.transcript event
discriminants; VoiceToggleResponse gains tts; VoiceRecordResponse
gains status for the new "started/stopped" responses.
- interfaces.ts: GatewayEventHandlerContext gains composer.setInput +
submission.submitRef + voice.{setRecording, setProcessing,
setVoiceEnabled}; InputHandlerContext.voice gains enabled +
setVoiceEnabled for the mode-aware Ctrl+B handler.
- createGatewayEventHandler.ts: voice.status drives REC/STT badges;
voice.transcript auto-submits when the composer is empty (CLI
_pending_input.put parity) and appends when a draft is in flight.
no_speech_limit flips voice off + sys line.
- useInputHandlers.ts: Ctrl+B now calls voice.record (start/stop),
not voice.toggle, and nudges the user with a sys line when the
mode is off instead of silently flipping it on.
- useMainApp.ts: wires the new event-handler context fields.
- slash/commands/session.ts: /voice handles on / off / tts / status
with CLI-matching output ("voice: mode on · tts off").
Backward compat preserved for voice.record (was always PTT shape;
gateway still honours start/stop with mode-gating added).
Three issues surfaced during end-to-end testing of the CLI-parity voice
loop and are fixed together because they all blocked "speak → agent
responds → TTS reads it back" from working at all:
1. Wrong result key (hermes_cli/voice.py)
transcribe_recording() returns {"success": bool, "transcript": str},
matching cli.py:_voice_stop_and_transcribe. The wrapper was reading
result.get("text"), which is None, so every successful Groq / local
STT response was thrown away and the 3-strikes halt fired after
three silent-looking cycles. Fixed by reading "transcript" and also
honouring "success" like the CLI does. Updated the loop simulation
tests to return the correct shape.
2. TTS speak-back was missing (tui_gateway/server.py + hermes_cli/voice.py)
The TUI had a voice.toggle "tts" subcommand but nothing downstream
actually read the flag — agent replies never spoke. Mirrored
cli.py:8747-8754's dispatch: on message.complete with status ==
"complete", if _voice_tts_enabled() is true, spawn a daemon thread
running speak_text(response). Rewrote speak_text as a full port of
cli.py:_voice_speak_response — same markdown-strip regex pipeline
(code blocks, links, bold/italic, inline code, headers, list bullets,
horizontal rules, excessive newlines), same 4000-char cap, same
explicit mp3 output path, same MP3-over-OGG playback choice (afplay
misbehaves on OGG), same cleanup of both extensions. Keeps TUI TTS
audible output byte-for-byte identical to the classic CLI.
3. Auto-submit swallowed on non-empty composer (createGatewayEventHandler.ts)
The voice.transcript handler branched on prev input via a setInput
updater and fired submitRef.current inside the updater when prev was
empty. React strict mode double-invokes state updaters, which would
queue the submit twice; and when the composer had any content the
transcript was merely appended — the agent never saw it. CLI
_pending_input.put(transcript) unconditionally feeds the transcript
as the next turn, so match that: always clear the composer and
setTimeout(() => submitRef.current(text), 0) outside any updater.
Side effect can't run twice this way, and a half-typed draft on the
rare occasion is a fair trade vs. silently dropping the turn.
Also added peak_rms to the rec.stop debug line so "recording too quiet"
is diagnosable at a glance when HERMES_VOICE_DEBUG=1.
TTS feedback loop (hermes_cli/voice.py)
The VAD loop kept the microphone live while speak_text played the
agent's reply over the speakers, so the reply itself was picked up,
transcribed, and submitted — the agent then replied to its own echo
("Ha, looks like we're in a loop").
Ported cli.py:_voice_tts_done synchronisation:
- _tts_playing: threading.Event (initially set = "not playing").
- speak_text cancels the active recorder before opening the speakers,
clears _tts_playing, and on exit waits 300 ms before re-starting the
recorder — long enough for the OS audio device to settle so afplay
and sounddevice don't race for it.
- _continuous_on_silence now waits on _tts_playing (up to 60 s) before
re-arming the mic with another 300 ms gap, mirroring
cli.py:10619-10621. If the user flips voice off during the wait the
loop exits cleanly instead of fighting for the device.
Without both halves the loop races: if the silence callback fires
before TTS starts it re-arms immediately; if TTS is already playing
the pause-and-resume path catches it.
Red REC badge (ui-tui appChrome + useMainApp)
Classic CLI (cli.py:_get_voice_status_fragments) renders "● REC" in
red and "◉ STT" in amber. TUI was showing a dim "REC" with no dot,
making it hard to spot at a glance. voiceLabel now emits the same
glyphs and appChrome colours them via t.color.error / t.color.warn,
falling back to dim for the idle label.
…rash.log When the gateway subprocess raises an unhandled exception during a voice-mode turn, nothing survives: stdout is the JSON-RPC pipe, stderr flushes but the process is already exiting, and no log file catches Python's default traceback print. The user is left with an undiagnosable "gateway exited" banner. Install: - sys.excepthook → write full traceback to tui_gateway_crash.log + echo the first line to stderr (which the TUI pumps into Activity as a gateway.stderr event). Chains to the default hook so the process still terminates. - threading.excepthook → same, tagged with the thread name so it's clear when the crash came from a daemon thread (beep playback, TTS, silence callback, etc.). - Turn-dispatcher except block now also appends a traceback to the crash log before emitting the user-visible error event — str(e) alone was too terse to identify where in the voice pipeline the failure happened. Zero behavioural change on the happy path; purely forensics.
Gateway exits weren't reaching the panic hook because entry.py calls sys.exit(0) on broken stdout — clean termination, no exception. That left "gateway exited" in the TUI with zero forensic trail when pipe breaks happened mid-turn. Entry.py now tags each exit path — startup-write failure, parse-error- response write failure, per-method response write failure, stdin EOF — with a one-line entry in ~/.hermes/logs/tui_gateway_crash.log and a gateway.stderr breadcrumb. Includes the JSON-RPC method name on the dispatch path, which is the only way to tell "died right after handling voice.toggle on" from "died emitting the second message.complete".
SIG_DFL for SIGPIPE means the kernel reaps the gateway subprocess the instant a background thread (TTS playback, silence callback, voice status emitter) writes to a stdout the TUI stopped reading — before the Python interpreter can run excepthook, threading.excepthook, atexit, or the entry.py post-loop _log_exit. Replace the three SIG_DFL / SIG_IGN bindings with a _log_signal handler that: - records which signal (SIGPIPE / SIGTERM / SIGHUP) fired and when; - dumps the main-thread stack at signal delivery AND every live thread's stack via sys._current_frames — the background-thread write that provoked SIGPIPE is almost always visible here; - writes everything to ~/.hermes/logs/tui_gateway_crash.log and prints a [gateway-signal] breadcrumb to stderr so the TUI Activity surfaces it as well. SIGINT stays ignored (TUI handles Ctrl+C for the user).
Crash-log stack trace (tui_gateway_crash.log) from the user's session pinned the regression: SIGPIPE arrived while main thread was blocked on for-raw-in-sys.stdin — i.e., a background thread (debug print to stderr, most likely from HERMES_VOICE_DEBUG=1) wrote to a pipe whose buffer the TUI hadn't drained yet, and SIG_DFL promptly killed the process. Two fixes that together restore CLI parity: - entry.py: SIGPIPE → SIG_IGN instead of the _log_signal handler that then exited. With SIG_IGN, Python raises BrokenPipeError on the offending write, which write_json already handles with a clean exit via _log_exit. SIGTERM / SIGHUP still route through _log_signal so real termination signals remain diagnosable. - hermes_cli/voice.py:_debug: wrap the stderr print in a BrokenPipeError / OSError try/except. This runs from daemon threads (silence callback, TTS playback, beep), so a broken stderr must not escape and ride up into the main event loop. Verified by spawning the gateway subprocess locally: voice.toggle status → 200 OK, process stays alive, clean exit on stdin close logs "reason=stdin EOF" instead of a silent reap.
The voice.toggle handler was persisting display.voice_enabled / display.voice_tts to config.yaml, so a TUI session that ever turned voice on would re-open with it already on (and the mic badge lit) on every subsequent launch. cli.py treats voice strictly as runtime state: _voice_mode = False at __init__, only /voice on flips it, and nothing writes it back to disk. Drop the _write_config_key calls in voice.toggle on/off/tts and the config.yaml fallback in _voice_mode_enabled / _voice_tts_enabled. State is now env-var-only (HERMES_VOICE / HERMES_VOICE_TTS), scoped to the live gateway subprocess — the next launch starts clean.
13 tasks
teknium1
added a commit
that referenced
this pull request
Apr 30, 2026
…ers (#17727) Covers ~60 merged PRs from Apr 15–29 that shipped user-visible behavior without docs coverage. No functional code changes; docs + static manifest regeneration only. Highlights: Stale / incorrect: - configuration.md: auxiliary auto-routing line was wrong since #11900; now correctly states auto routes to the main model, with a note on the cost trade-off and per-task override pattern. - integrations/providers.md + configuration.md compression intro: removed stale 'Gemini Flash via OpenRouter' claim. - website/static/api/model-catalog.json: rebuilt from hermes_cli/models.py so the live manifest picks up tencent/hy3-preview (and remains in sync for future model-catalog PRs). Platform messaging (#17417 #16997 #16193 #14315 #13151 #11794 #10610 #10283 #10246 #11564 #13178): - Signal: native formatting (bodyRanges), reply quotes, reactions. - Telegram: table rendering (bullets + code-block fallback), disable_link_previews, group_allowed_chats. - Slack: strict_mention config. - Discord: slash_commands disable, send_animation GIF, send_message native media attachments. - DingTalk: require_mention + allowed_users. CLI (#16052 #16539 #16566 #15841 #14798 #10043): - New 'hermes fallback' interactive manager. - New 'hermes update --check', '--backup' flag, and pre-update pairing snapshot behavior. - 'hermes gateway start/restart --all' multi-profile flag. - cron.md: 'hermes tools' as a platform, per-job enabled_toolsets, wakeAgent gate, context_from chaining. Config keys / env vars (#17305 #17026 #17000 #15077 #14557 #14227 #14166 #14730 #17008): - terminal.docker_run_as_host_user, display.runtime_metadata_footer, compression.hygiene_hard_message_limit, HINDSIGHT_TIMEOUT, skills.guard_agent_created, TAVILY_BASE_URL, security.allow_private_urls, agent.api_max_retries, gateway hot-reload of compression/context_length config edits. TUI / CLI UX (#17130 #17113 #17175 #17150 #16707 #12312 #12305 #12934 #14810 #14045 #17286 #17126): - HERMES_TUI_RESUME, HERMES_TUI_THEME, LaTeX rendering, busy-indicator styles, ctrl-x queued-message delete, git branch in status bar, per- prompt elapsed stopwatch, external-editor keybind, markdown stripping, TUI voice-mode parity, /agents overlay, /reload + /mouse. Gateway features (#16506 #15027 #13428 #12116): - Native multimodal image routing based on vision capability. - /usage account-limits section. - /steer slash command (added to reference + explanation in CLI). Plugins / hooks (#12929 #12972 #10763 #16364): - transform_tool_result, transform_terminal_output plugin hooks. - PluginContext.dispatch_tool() documented with slash-command example. - google_meet bundled plugin entry under built-in-plugins.md. Other (#16576 #16572 #16383 #15878 #15608 #15606 #14809 #14767 #14231 #14232 #14307 #13683 #12373 #11891 #11291 #10066): - hermes backup exclusions (WAL/SHM/journal + checkpoints/). - security.md hardline blocklist (floor below --yolo). - FHS install layout for root installs. - openssh-client + docker-cli baked into the Docker image. - MEDIA: tag supported extensions table (docs/office/archives/pdf). - Remote-to-host file sync on SSH/Modal/Daytona teardown. - 'hermes model' -> Configure Auxiliary Models interactive picker. - Podman support via HERMES_DOCKER_BINARY. Providers / STT / one-shot (#15045 #14473 #15704): - alibaba-coding-plan first-class provider entry. - xAI Grok STT as a 6th transcription option. - 'hermes -z' scripted one-shot mode + HERMES_INFERENCE_MODEL. Build: 'docusaurus build' succeeds. No new broken links/anchors; pre-existing warnings unchanged.
nickdlkk
pushed a commit
to nickdlkk/hermes-agent
that referenced
this pull request
May 11, 2026
…ers (NousResearch#17727) Covers ~60 merged PRs from Apr 15–29 that shipped user-visible behavior without docs coverage. No functional code changes; docs + static manifest regeneration only. Highlights: Stale / incorrect: - configuration.md: auxiliary auto-routing line was wrong since NousResearch#11900; now correctly states auto routes to the main model, with a note on the cost trade-off and per-task override pattern. - integrations/providers.md + configuration.md compression intro: removed stale 'Gemini Flash via OpenRouter' claim. - website/static/api/model-catalog.json: rebuilt from hermes_cli/models.py so the live manifest picks up tencent/hy3-preview (and remains in sync for future model-catalog PRs). Platform messaging (NousResearch#17417 NousResearch#16997 NousResearch#16193 NousResearch#14315 NousResearch#13151 NousResearch#11794 NousResearch#10610 NousResearch#10283 NousResearch#10246 NousResearch#11564 NousResearch#13178): - Signal: native formatting (bodyRanges), reply quotes, reactions. - Telegram: table rendering (bullets + code-block fallback), disable_link_previews, group_allowed_chats. - Slack: strict_mention config. - Discord: slash_commands disable, send_animation GIF, send_message native media attachments. - DingTalk: require_mention + allowed_users. CLI (NousResearch#16052 NousResearch#16539 NousResearch#16566 NousResearch#15841 NousResearch#14798 NousResearch#10043): - New 'hermes fallback' interactive manager. - New 'hermes update --check', '--backup' flag, and pre-update pairing snapshot behavior. - 'hermes gateway start/restart --all' multi-profile flag. - cron.md: 'hermes tools' as a platform, per-job enabled_toolsets, wakeAgent gate, context_from chaining. Config keys / env vars (NousResearch#17305 NousResearch#17026 NousResearch#17000 NousResearch#15077 NousResearch#14557 NousResearch#14227 NousResearch#14166 NousResearch#14730 NousResearch#17008): - terminal.docker_run_as_host_user, display.runtime_metadata_footer, compression.hygiene_hard_message_limit, HINDSIGHT_TIMEOUT, skills.guard_agent_created, TAVILY_BASE_URL, security.allow_private_urls, agent.api_max_retries, gateway hot-reload of compression/context_length config edits. TUI / CLI UX (NousResearch#17130 NousResearch#17113 NousResearch#17175 NousResearch#17150 NousResearch#16707 NousResearch#12312 NousResearch#12305 NousResearch#12934 NousResearch#14810 NousResearch#14045 NousResearch#17286 NousResearch#17126): - HERMES_TUI_RESUME, HERMES_TUI_THEME, LaTeX rendering, busy-indicator styles, ctrl-x queued-message delete, git branch in status bar, per- prompt elapsed stopwatch, external-editor keybind, markdown stripping, TUI voice-mode parity, /agents overlay, /reload + /mouse. Gateway features (NousResearch#16506 NousResearch#15027 NousResearch#13428 NousResearch#12116): - Native multimodal image routing based on vision capability. - /usage account-limits section. - /steer slash command (added to reference + explanation in CLI). Plugins / hooks (NousResearch#12929 NousResearch#12972 NousResearch#10763 NousResearch#16364): - transform_tool_result, transform_terminal_output plugin hooks. - PluginContext.dispatch_tool() documented with slash-command example. - google_meet bundled plugin entry under built-in-plugins.md. Other (NousResearch#16576 NousResearch#16572 NousResearch#16383 NousResearch#15878 NousResearch#15608 NousResearch#15606 NousResearch#14809 NousResearch#14767 NousResearch#14231 NousResearch#14232 NousResearch#14307 NousResearch#13683 NousResearch#12373 NousResearch#11891 NousResearch#11291 NousResearch#10066): - hermes backup exclusions (WAL/SHM/journal + checkpoints/). - security.md hardline blocklist (floor below --yolo). - FHS install layout for root installs. - openssh-client + docker-cli baked into the Docker image. - MEDIA: tag supported extensions table (docs/office/archives/pdf). - Remote-to-host file sync on SSH/Modal/Daytona teardown. - 'hermes model' -> Configure Auxiliary Models interactive picker. - Podman support via HERMES_DOCKER_BINARY. Providers / STT / one-shot (NousResearch#15045 NousResearch#14473 NousResearch#15704): - alibaba-coding-plan first-class provider entry. - xAI Grok STT as a 6th transcription option. - 'hermes -z' scripted one-shot mode + HERMES_INFERENCE_MODEL. Build: 'docusaurus build' succeeds. No new broken links/anchors; pre-existing warnings unchanged.
02356abc
pushed a commit
to 02356abc/hermes-agent
that referenced
this pull request
May 14, 2026
…ers (NousResearch#17727) Covers ~60 merged PRs from Apr 15–29 that shipped user-visible behavior without docs coverage. No functional code changes; docs + static manifest regeneration only. Highlights: Stale / incorrect: - configuration.md: auxiliary auto-routing line was wrong since NousResearch#11900; now correctly states auto routes to the main model, with a note on the cost trade-off and per-task override pattern. - integrations/providers.md + configuration.md compression intro: removed stale 'Gemini Flash via OpenRouter' claim. - website/static/api/model-catalog.json: rebuilt from hermes_cli/models.py so the live manifest picks up tencent/hy3-preview (and remains in sync for future model-catalog PRs). Platform messaging (NousResearch#17417 NousResearch#16997 NousResearch#16193 NousResearch#14315 NousResearch#13151 NousResearch#11794 NousResearch#10610 NousResearch#10283 NousResearch#10246 NousResearch#11564 NousResearch#13178): - Signal: native formatting (bodyRanges), reply quotes, reactions. - Telegram: table rendering (bullets + code-block fallback), disable_link_previews, group_allowed_chats. - Slack: strict_mention config. - Discord: slash_commands disable, send_animation GIF, send_message native media attachments. - DingTalk: require_mention + allowed_users. CLI (NousResearch#16052 NousResearch#16539 NousResearch#16566 NousResearch#15841 NousResearch#14798 NousResearch#10043): - New 'hermes fallback' interactive manager. - New 'hermes update --check', '--backup' flag, and pre-update pairing snapshot behavior. - 'hermes gateway start/restart --all' multi-profile flag. - cron.md: 'hermes tools' as a platform, per-job enabled_toolsets, wakeAgent gate, context_from chaining. Config keys / env vars (NousResearch#17305 NousResearch#17026 NousResearch#17000 NousResearch#15077 NousResearch#14557 NousResearch#14227 NousResearch#14166 NousResearch#14730 NousResearch#17008): - terminal.docker_run_as_host_user, display.runtime_metadata_footer, compression.hygiene_hard_message_limit, HINDSIGHT_TIMEOUT, skills.guard_agent_created, TAVILY_BASE_URL, security.allow_private_urls, agent.api_max_retries, gateway hot-reload of compression/context_length config edits. TUI / CLI UX (NousResearch#17130 NousResearch#17113 NousResearch#17175 NousResearch#17150 NousResearch#16707 NousResearch#12312 NousResearch#12305 NousResearch#12934 NousResearch#14810 NousResearch#14045 NousResearch#17286 NousResearch#17126): - HERMES_TUI_RESUME, HERMES_TUI_THEME, LaTeX rendering, busy-indicator styles, ctrl-x queued-message delete, git branch in status bar, per- prompt elapsed stopwatch, external-editor keybind, markdown stripping, TUI voice-mode parity, /agents overlay, /reload + /mouse. Gateway features (NousResearch#16506 NousResearch#15027 NousResearch#13428 NousResearch#12116): - Native multimodal image routing based on vision capability. - /usage account-limits section. - /steer slash command (added to reference + explanation in CLI). Plugins / hooks (NousResearch#12929 NousResearch#12972 NousResearch#10763 NousResearch#16364): - transform_tool_result, transform_terminal_output plugin hooks. - PluginContext.dispatch_tool() documented with slash-command example. - google_meet bundled plugin entry under built-in-plugins.md. Other (NousResearch#16576 NousResearch#16572 NousResearch#16383 NousResearch#15878 NousResearch#15608 NousResearch#15606 NousResearch#14809 NousResearch#14767 NousResearch#14231 NousResearch#14232 NousResearch#14307 NousResearch#13683 NousResearch#12373 NousResearch#11891 NousResearch#11291 NousResearch#10066): - hermes backup exclusions (WAL/SHM/journal + checkpoints/). - security.md hardline blocklist (floor below --yolo). - FHS install layout for root installs. - openssh-client + docker-cli baked into the Docker image. - MEDIA: tag supported extensions table (docs/office/archives/pdf). - Remote-to-host file sync on SSH/Modal/Daytona teardown. - 'hermes model' -> Configure Auxiliary Models interactive picker. - Podman support via HERMES_DOCKER_BINARY. Providers / STT / one-shot (NousResearch#15045 NousResearch#14473 NousResearch#15704): - alibaba-coding-plan first-class provider entry. - xAI Grok STT as a 6th transcription option. - 'hermes -z' scripted one-shot mode + HERMES_INFERENCE_MODEL. Build: 'docusaurus build' succeeds. No new broken links/anchors; pre-existing warnings unchanged.
jsboige
pushed a commit
to jsboige/hermes-agent
that referenced
this pull request
May 14, 2026
…ers (NousResearch#17727) Covers ~60 merged PRs from Apr 15–29 that shipped user-visible behavior without docs coverage. No functional code changes; docs + static manifest regeneration only. Highlights: Stale / incorrect: - configuration.md: auxiliary auto-routing line was wrong since NousResearch#11900; now correctly states auto routes to the main model, with a note on the cost trade-off and per-task override pattern. - integrations/providers.md + configuration.md compression intro: removed stale 'Gemini Flash via OpenRouter' claim. - website/static/api/model-catalog.json: rebuilt from hermes_cli/models.py so the live manifest picks up tencent/hy3-preview (and remains in sync for future model-catalog PRs). Platform messaging (NousResearch#17417 NousResearch#16997 NousResearch#16193 NousResearch#14315 NousResearch#13151 NousResearch#11794 NousResearch#10610 NousResearch#10283 NousResearch#10246 NousResearch#11564 NousResearch#13178): - Signal: native formatting (bodyRanges), reply quotes, reactions. - Telegram: table rendering (bullets + code-block fallback), disable_link_previews, group_allowed_chats. - Slack: strict_mention config. - Discord: slash_commands disable, send_animation GIF, send_message native media attachments. - DingTalk: require_mention + allowed_users. CLI (NousResearch#16052 NousResearch#16539 NousResearch#16566 NousResearch#15841 NousResearch#14798 NousResearch#10043): - New 'hermes fallback' interactive manager. - New 'hermes update --check', '--backup' flag, and pre-update pairing snapshot behavior. - 'hermes gateway start/restart --all' multi-profile flag. - cron.md: 'hermes tools' as a platform, per-job enabled_toolsets, wakeAgent gate, context_from chaining. Config keys / env vars (NousResearch#17305 NousResearch#17026 NousResearch#17000 NousResearch#15077 NousResearch#14557 NousResearch#14227 NousResearch#14166 NousResearch#14730 NousResearch#17008): - terminal.docker_run_as_host_user, display.runtime_metadata_footer, compression.hygiene_hard_message_limit, HINDSIGHT_TIMEOUT, skills.guard_agent_created, TAVILY_BASE_URL, security.allow_private_urls, agent.api_max_retries, gateway hot-reload of compression/context_length config edits. TUI / CLI UX (NousResearch#17130 NousResearch#17113 NousResearch#17175 NousResearch#17150 NousResearch#16707 NousResearch#12312 NousResearch#12305 NousResearch#12934 NousResearch#14810 NousResearch#14045 NousResearch#17286 NousResearch#17126): - HERMES_TUI_RESUME, HERMES_TUI_THEME, LaTeX rendering, busy-indicator styles, ctrl-x queued-message delete, git branch in status bar, per- prompt elapsed stopwatch, external-editor keybind, markdown stripping, TUI voice-mode parity, /agents overlay, /reload + /mouse. Gateway features (NousResearch#16506 NousResearch#15027 NousResearch#13428 NousResearch#12116): - Native multimodal image routing based on vision capability. - /usage account-limits section. - /steer slash command (added to reference + explanation in CLI). Plugins / hooks (NousResearch#12929 NousResearch#12972 NousResearch#10763 NousResearch#16364): - transform_tool_result, transform_terminal_output plugin hooks. - PluginContext.dispatch_tool() documented with slash-command example. - google_meet bundled plugin entry under built-in-plugins.md. Other (NousResearch#16576 NousResearch#16572 NousResearch#16383 NousResearch#15878 NousResearch#15608 NousResearch#15606 NousResearch#14809 NousResearch#14767 NousResearch#14231 NousResearch#14232 NousResearch#14307 NousResearch#13683 NousResearch#12373 NousResearch#11891 NousResearch#11291 NousResearch#10066): - hermes backup exclusions (WAL/SHM/journal + checkpoints/). - security.md hardline blocklist (floor below --yolo). - FHS install layout for root installs. - openssh-client + docker-cli baked into the Docker image. - MEDIA: tag supported extensions table (docs/office/archives/pdf). - Remote-to-host file sync on SSH/Modal/Daytona teardown. - 'hermes model' -> Configure Auxiliary Models interactive picker. - Podman support via HERMES_DOCKER_BINARY. Providers / STT / one-shot (NousResearch#15045 NousResearch#14473 NousResearch#15704): - alibaba-coding-plan first-class provider entry. - xAI Grok STT as a 6th transcription option. - 'hermes -z' scripted one-shot mode + HERMES_INFERENCE_MODEL. Build: 'docusaurus build' succeeds. No new broken links/anchors; pre-existing warnings unchanged.
dannyJ848
pushed a commit
to dannyJ848/hermes-agent
that referenced
this pull request
May 17, 2026
…ers (NousResearch#17727) Covers ~60 merged PRs from Apr 15–29 that shipped user-visible behavior without docs coverage. No functional code changes; docs + static manifest regeneration only. Highlights: Stale / incorrect: - configuration.md: auxiliary auto-routing line was wrong since NousResearch#11900; now correctly states auto routes to the main model, with a note on the cost trade-off and per-task override pattern. - integrations/providers.md + configuration.md compression intro: removed stale 'Gemini Flash via OpenRouter' claim. - website/static/api/model-catalog.json: rebuilt from hermes_cli/models.py so the live manifest picks up tencent/hy3-preview (and remains in sync for future model-catalog PRs). Platform messaging (NousResearch#17417 NousResearch#16997 NousResearch#16193 NousResearch#14315 NousResearch#13151 NousResearch#11794 NousResearch#10610 NousResearch#10283 NousResearch#10246 NousResearch#11564 NousResearch#13178): - Signal: native formatting (bodyRanges), reply quotes, reactions. - Telegram: table rendering (bullets + code-block fallback), disable_link_previews, group_allowed_chats. - Slack: strict_mention config. - Discord: slash_commands disable, send_animation GIF, send_message native media attachments. - DingTalk: require_mention + allowed_users. CLI (NousResearch#16052 NousResearch#16539 NousResearch#16566 NousResearch#15841 NousResearch#14798 NousResearch#10043): - New 'hermes fallback' interactive manager. - New 'hermes update --check', '--backup' flag, and pre-update pairing snapshot behavior. - 'hermes gateway start/restart --all' multi-profile flag. - cron.md: 'hermes tools' as a platform, per-job enabled_toolsets, wakeAgent gate, context_from chaining. Config keys / env vars (NousResearch#17305 NousResearch#17026 NousResearch#17000 NousResearch#15077 NousResearch#14557 NousResearch#14227 NousResearch#14166 NousResearch#14730 NousResearch#17008): - terminal.docker_run_as_host_user, display.runtime_metadata_footer, compression.hygiene_hard_message_limit, HINDSIGHT_TIMEOUT, skills.guard_agent_created, TAVILY_BASE_URL, security.allow_private_urls, agent.api_max_retries, gateway hot-reload of compression/context_length config edits. TUI / CLI UX (NousResearch#17130 NousResearch#17113 NousResearch#17175 NousResearch#17150 NousResearch#16707 NousResearch#12312 NousResearch#12305 NousResearch#12934 NousResearch#14810 NousResearch#14045 NousResearch#17286 NousResearch#17126): - HERMES_TUI_RESUME, HERMES_TUI_THEME, LaTeX rendering, busy-indicator styles, ctrl-x queued-message delete, git branch in status bar, per- prompt elapsed stopwatch, external-editor keybind, markdown stripping, TUI voice-mode parity, /agents overlay, /reload + /mouse. Gateway features (NousResearch#16506 NousResearch#15027 NousResearch#13428 NousResearch#12116): - Native multimodal image routing based on vision capability. - /usage account-limits section. - /steer slash command (added to reference + explanation in CLI). Plugins / hooks (NousResearch#12929 NousResearch#12972 NousResearch#10763 NousResearch#16364): - transform_tool_result, transform_terminal_output plugin hooks. - PluginContext.dispatch_tool() documented with slash-command example. - google_meet bundled plugin entry under built-in-plugins.md. Other (NousResearch#16576 NousResearch#16572 NousResearch#16383 NousResearch#15878 NousResearch#15608 NousResearch#15606 NousResearch#14809 NousResearch#14767 NousResearch#14231 NousResearch#14232 NousResearch#14307 NousResearch#13683 NousResearch#12373 NousResearch#11891 NousResearch#11291 NousResearch#10066): - hermes backup exclusions (WAL/SHM/journal + checkpoints/). - security.md hardline blocklist (floor below --yolo). - FHS install layout for root installs. - openssh-client + docker-cli baked into the Docker image. - MEDIA: tag supported extensions table (docs/office/archives/pdf). - Remote-to-host file sync on SSH/Modal/Daytona teardown. - 'hermes model' -> Configure Auxiliary Models interactive picker. - Podman support via HERMES_DOCKER_BINARY. Providers / STT / one-shot (NousResearch#15045 NousResearch#14473 NousResearch#15704): - alibaba-coding-plan first-class provider entry. - xAI Grok STT as a 6th transcription option. - 'hermes -z' scripted one-shot mode + HERMES_INFERENCE_MODEL. Build: 'docusaurus build' succeeds. No new broken links/anchors; pre-existing warnings unchanged.
gweeteve
pushed a commit
to gweeteve/hermes-agent
that referenced
this pull request
Jun 2, 2026
…ers (NousResearch#17727) Covers ~60 merged PRs from Apr 15–29 that shipped user-visible behavior without docs coverage. No functional code changes; docs + static manifest regeneration only. Highlights: Stale / incorrect: - configuration.md: auxiliary auto-routing line was wrong since NousResearch#11900; now correctly states auto routes to the main model, with a note on the cost trade-off and per-task override pattern. - integrations/providers.md + configuration.md compression intro: removed stale 'Gemini Flash via OpenRouter' claim. - website/static/api/model-catalog.json: rebuilt from hermes_cli/models.py so the live manifest picks up tencent/hy3-preview (and remains in sync for future model-catalog PRs). Platform messaging (NousResearch#17417 NousResearch#16997 NousResearch#16193 NousResearch#14315 NousResearch#13151 NousResearch#11794 NousResearch#10610 NousResearch#10283 NousResearch#10246 NousResearch#11564 NousResearch#13178): - Signal: native formatting (bodyRanges), reply quotes, reactions. - Telegram: table rendering (bullets + code-block fallback), disable_link_previews, group_allowed_chats. - Slack: strict_mention config. - Discord: slash_commands disable, send_animation GIF, send_message native media attachments. - DingTalk: require_mention + allowed_users. CLI (NousResearch#16052 NousResearch#16539 NousResearch#16566 NousResearch#15841 NousResearch#14798 NousResearch#10043): - New 'hermes fallback' interactive manager. - New 'hermes update --check', '--backup' flag, and pre-update pairing snapshot behavior. - 'hermes gateway start/restart --all' multi-profile flag. - cron.md: 'hermes tools' as a platform, per-job enabled_toolsets, wakeAgent gate, context_from chaining. Config keys / env vars (NousResearch#17305 NousResearch#17026 NousResearch#17000 NousResearch#15077 NousResearch#14557 NousResearch#14227 NousResearch#14166 NousResearch#14730 NousResearch#17008): - terminal.docker_run_as_host_user, display.runtime_metadata_footer, compression.hygiene_hard_message_limit, HINDSIGHT_TIMEOUT, skills.guard_agent_created, TAVILY_BASE_URL, security.allow_private_urls, agent.api_max_retries, gateway hot-reload of compression/context_length config edits. TUI / CLI UX (NousResearch#17130 NousResearch#17113 NousResearch#17175 NousResearch#17150 NousResearch#16707 NousResearch#12312 NousResearch#12305 NousResearch#12934 NousResearch#14810 NousResearch#14045 NousResearch#17286 NousResearch#17126): - HERMES_TUI_RESUME, HERMES_TUI_THEME, LaTeX rendering, busy-indicator styles, ctrl-x queued-message delete, git branch in status bar, per- prompt elapsed stopwatch, external-editor keybind, markdown stripping, TUI voice-mode parity, /agents overlay, /reload + /mouse. Gateway features (NousResearch#16506 NousResearch#15027 NousResearch#13428 NousResearch#12116): - Native multimodal image routing based on vision capability. - /usage account-limits section. - /steer slash command (added to reference + explanation in CLI). Plugins / hooks (NousResearch#12929 NousResearch#12972 NousResearch#10763 NousResearch#16364): - transform_tool_result, transform_terminal_output plugin hooks. - PluginContext.dispatch_tool() documented with slash-command example. - google_meet bundled plugin entry under built-in-plugins.md. Other (NousResearch#16576 NousResearch#16572 NousResearch#16383 NousResearch#15878 NousResearch#15608 NousResearch#15606 NousResearch#14809 NousResearch#14767 NousResearch#14231 NousResearch#14232 NousResearch#14307 NousResearch#13683 NousResearch#12373 NousResearch#11891 NousResearch#11291 NousResearch#10066): - hermes backup exclusions (WAL/SHM/journal + checkpoints/). - security.md hardline blocklist (floor below --yolo). - FHS install layout for root installs. - openssh-client + docker-cli baked into the Docker image. - MEDIA: tag supported extensions table (docs/office/archives/pdf). - Remote-to-host file sync on SSH/Modal/Daytona teardown. - 'hermes model' -> Configure Auxiliary Models interactive picker. - Podman support via HERMES_DOCKER_BINARY. Providers / STT / one-shot (NousResearch#15045 NousResearch#14473 NousResearch#15704): - alibaba-coding-plan first-class provider entry. - xAI Grok STT as a 6th transcription option. - 'hermes -z' scripted one-shot mode + HERMES_INFERENCE_MODEL. Build: 'docusaurus build' succeeds. No new broken links/anchors; pre-existing warnings unchanged.
Egavasyug
pushed a commit
to Egavasyug/hermes-agent
that referenced
this pull request
Jun 10, 2026
…ers (NousResearch#17727) Covers ~60 merged PRs from Apr 15–29 that shipped user-visible behavior without docs coverage. No functional code changes; docs + static manifest regeneration only. Highlights: Stale / incorrect: - configuration.md: auxiliary auto-routing line was wrong since NousResearch#11900; now correctly states auto routes to the main model, with a note on the cost trade-off and per-task override pattern. - integrations/providers.md + configuration.md compression intro: removed stale 'Gemini Flash via OpenRouter' claim. - website/static/api/model-catalog.json: rebuilt from hermes_cli/models.py so the live manifest picks up tencent/hy3-preview (and remains in sync for future model-catalog PRs). Platform messaging (NousResearch#17417 NousResearch#16997 NousResearch#16193 NousResearch#14315 NousResearch#13151 NousResearch#11794 NousResearch#10610 NousResearch#10283 NousResearch#10246 NousResearch#11564 NousResearch#13178): - Signal: native formatting (bodyRanges), reply quotes, reactions. - Telegram: table rendering (bullets + code-block fallback), disable_link_previews, group_allowed_chats. - Slack: strict_mention config. - Discord: slash_commands disable, send_animation GIF, send_message native media attachments. - DingTalk: require_mention + allowed_users. CLI (NousResearch#16052 NousResearch#16539 NousResearch#16566 NousResearch#15841 NousResearch#14798 NousResearch#10043): - New 'hermes fallback' interactive manager. - New 'hermes update --check', '--backup' flag, and pre-update pairing snapshot behavior. - 'hermes gateway start/restart --all' multi-profile flag. - cron.md: 'hermes tools' as a platform, per-job enabled_toolsets, wakeAgent gate, context_from chaining. Config keys / env vars (NousResearch#17305 NousResearch#17026 NousResearch#17000 NousResearch#15077 NousResearch#14557 NousResearch#14227 NousResearch#14166 NousResearch#14730 NousResearch#17008): - terminal.docker_run_as_host_user, display.runtime_metadata_footer, compression.hygiene_hard_message_limit, HINDSIGHT_TIMEOUT, skills.guard_agent_created, TAVILY_BASE_URL, security.allow_private_urls, agent.api_max_retries, gateway hot-reload of compression/context_length config edits. TUI / CLI UX (NousResearch#17130 NousResearch#17113 NousResearch#17175 NousResearch#17150 NousResearch#16707 NousResearch#12312 NousResearch#12305 NousResearch#12934 NousResearch#14810 NousResearch#14045 NousResearch#17286 NousResearch#17126): - HERMES_TUI_RESUME, HERMES_TUI_THEME, LaTeX rendering, busy-indicator styles, ctrl-x queued-message delete, git branch in status bar, per- prompt elapsed stopwatch, external-editor keybind, markdown stripping, TUI voice-mode parity, /agents overlay, /reload + /mouse. Gateway features (NousResearch#16506 NousResearch#15027 NousResearch#13428 NousResearch#12116): - Native multimodal image routing based on vision capability. - /usage account-limits section. - /steer slash command (added to reference + explanation in CLI). Plugins / hooks (NousResearch#12929 NousResearch#12972 NousResearch#10763 NousResearch#16364): - transform_tool_result, transform_terminal_output plugin hooks. - PluginContext.dispatch_tool() documented with slash-command example. - google_meet bundled plugin entry under built-in-plugins.md. Other (NousResearch#16576 NousResearch#16572 NousResearch#16383 NousResearch#15878 NousResearch#15608 NousResearch#15606 NousResearch#14809 NousResearch#14767 NousResearch#14231 NousResearch#14232 NousResearch#14307 NousResearch#13683 NousResearch#12373 NousResearch#11891 NousResearch#11291 NousResearch#10066): - hermes backup exclusions (WAL/SHM/journal + checkpoints/). - security.md hardline blocklist (floor below --yolo). - FHS install layout for root installs. - openssh-client + docker-cli baked into the Docker image. - MEDIA: tag supported extensions table (docs/office/archives/pdf). - Remote-to-host file sync on SSH/Modal/Daytona teardown. - 'hermes model' -> Configure Auxiliary Models interactive picker. - Podman support via HERMES_DOCKER_BINARY. Providers / STT / one-shot (NousResearch#15045 NousResearch#14473 NousResearch#15704): - alibaba-coding-plan first-class provider entry. - xAI Grok STT as a 6th transcription option. - 'hermes -z' scripted one-shot mode + HERMES_INFERENCE_MODEL. Build: 'docusaurus build' succeeds. No new broken links/anchors; pre-existing warnings unchanged.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Salvage of #14808 by @0xbyt4, cherry-picked onto current main (10 commits, authorship preserved).
Summary
TUI voice mode reaches parity with the classic CLI. Before this PR the TUI voice pipeline was broken three ways:
isAction()inui-tui/src/lib/platform.tsonly accepts Cmd (key.meta || key.super), so raw Ctrl+B fell through to TextInput and landed a literalbin the composer. Reported by timmie in Discord.tui_gateway/server.pyimportedhermes_cli.voice.{start_recording,stop_and_transcribe,speak_text}buthermes_cli/voice.pynever existed. Everyvoice.record/voice.ttscall returned 'voice module not available' even with all audio deps installed./voice onthen Ctrl+B starts auto-stop → transcribe → submit → auto-restart, 3 silent cycles halt).Changes
hermes_cli/voice.py(548 lines): PTT API + VAD-continuous loop wrappingtools.voice_mode+tools.tts_tool. Mirrorscli.py:_voice_*— 880 Hz start beep, 660 Hz×2 stop beep, 3-strikes silence halt,_tts_playingEvent to break TTS→STT feedback loop.tui_gateway/server.py:voice.togglewith on/off/tts/status subcommands (status surfaces STT provider availability);voice.recorddrives the continuous loop; auto-TTS onmessage.completewhen enabled; runtime-only state (noconfig.yamlpersistence so TUI doesn't auto-REC next launch).tui_gateway/entry.py: SIGPIPESIG_DFL→SIG_IGN(background threads writing to broken pipes no longer silently kill the gateway —write_jsoncatchesBrokenPipeErrorcleanly); signal/exit crash logging to~/.hermes/logs/tui_gateway_crash.log.isVoiceToggleKeyaccepting raw Ctrl+B everywhere + Cmd+B on macOS, TextInput passes Ctrl+B through, event handler wiresvoice.status/voice.transcriptwith auto-submit, status bar gets red ● REC / amber ◉ STT badges.Validation
bvoice.recordRPCvoice.ttsRPCtests/hermes_cli/test_voice_wrapper.pypass under hermetic CI runnertests/tui_gateway/pass (no regressions from panic hook / SIGPIPE change)ui-tui/)hermes_cli.voiceimports clean against currenttools.voice_modeexportsCloses #14808.