feat(plugins): google_meet — join, transcribe, speak, follow up by teknium1 · Pull Request #16364 · NousResearch/hermes-agent

teknium1 · 2026-04-27T04:39:26Z

google_meet — bundled plugin: join, transcribe, speak, follow up

Adds a new bundled plugin, plugins/google_meet/, that lets the agent join a Google Meet call, transcribe live captions, optionally speak in realtime, and handle the follow-up work afterwards. Zero changes to core code: everything plugs into the existing register_tool / register_cli_command / register_hook surfaces. Ships as three layered capabilities the user opts into.

What it does

| Mode | Agent capability |
| --- | --- |
| `transcribe` (default) | Joins, enables live captions, scrapes into a transcript file. Listen-only. |
| `realtime` | Same as transcribe, plus speaks into the meeting via OpenAI Realtime. `meet_say(text)` → agent's voice comes out of the call, with barge-in when a human interrupts. |

| Location | Where the bot runs |
| --- | --- |
| Local (default) | Gateway machine runs the Playwright bot directly. |
| Remote node (`node="<name>"`) | Bot runs on a different machine (e.g. user's Mac with a signed-in Chrome profile). Every `meet_*` tool accepts `node=...`. |

Tools the agent sees

| Tool | Use |
| --- | --- |
| `meet_join(url, mode?, guest_name?, duration?, headed?, node?)` | Start bot. |
| `meet_status(node?)` | Liveness, admission state, transcript progress, realtime telemetry. |
| `meet_transcript(last?, node?)` | Read scraped captions. |
| `meet_leave(node?)` | Close bot, finalize transcript. |
| `meet_say(text, node?)` | Speak in an active realtime meeting. |

CLI the user sees

hermes plugins enable google_meet
hermes meet install [--realtime] [--yes]
hermes meet setup
hermes meet auth
hermes meet join <url> [--mode realtime] [--node <name>] [--headed] [--duration 30m]
hermes meet status | transcript [--last N] | say <text> | stop
hermes meet node run | list | approve <name> <url> <token> | remove | status | ping

hermes meet install does the heavy lifting: pip install playwright websockets, python -m playwright install chromium, and (with --realtime) sudo apt install pulseaudio-utils on Linux or brew install blackhole-2ch ffmpeg on macOS. Prompts before package-manager invocations unless --yes. Won't auto-switch the macOS default-input device — user has to do that manually (deliberate; surprising audio rerouting is worse than one manual step).

How the realtime audio path actually works

meet_say(text)  →  say_queue.jsonl  →  RealtimeSpeaker thread
                                            ↓
                                      OpenAI Realtime WS
                                            ↓
                                      response.audio.delta (PCM16 24kHz)
                                            ↓
                                      speaker.pcm  (appended)
                                            ↓
   Linux:  paplay → null-sink ← Chrome's PULSE_SOURCE=hermes_meet_src
   macOS:  ffmpeg -f audiotoolbox -audio_device_index <BlackHole> ← Chrome

Barge-in path: caption scraper sees a real participant's name → caption drain loop calls RealtimeSession.cancel_response() → sends type: "response.cancel" on the same WS → OpenAI stops generating audio. _looks_like_human_speaker guard prevents self-echo (our own name, Unknown, blank, You never trigger cancel).
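
The self-echo guard can be sketched as a small predicate. This is a hypothetical reconstruction, not the plugin's actual `_looks_like_human_speaker`: the function name, its signature, and the case-insensitive comparison are all assumptions made for illustration.

```python
from typing import Optional


def looks_like_human_speaker(speaker: Optional[str], bot_name: str) -> bool:
    """Return True only when a caption is attributed to a real participant.

    Hypothetical sketch: the real guard may compare names differently.
    """
    name = (speaker or "").strip()
    if not name:
        # Blank or missing attribution never triggers a cancel.
        return False
    # Labels that do not indicate another human: Meet's "You",
    # unattributed captions, and the bot's own display name.
    ignored = {"you", "unknown", bot_name.strip().lower()}
    return name.lower() not in ignored
```

With a guard of this shape, the caption drain loop would call `cancel_response()` only when the predicate returns True for the newest caption's speaker.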

Remote node architecture

┌─ gateway (any hermes install) ───────────────────────────────────┐
│  agent → meet_join(url, node='my-mac', mode='realtime')          │
│       └─ NodeClient ── wss (bearer token) ──┐                    │
└─────────────────────────────────────────────┼────────────────────┘
                                              ▼
┌─ node host (user's Mac) ─────────────────────────────────────────┐
│  `hermes meet node run` → NodeServer                              │
│    ├─ start_bot → process_manager.start() → spawns meet_bot      │
│    ├─ Chromium (signed-in profile) → meet.google.com              │
│    ├─ caption scraper → transcript.txt                           │
│    └─ (realtime mode) RealtimeSpeaker + BlackHole/ffmpeg         │
└──────────────────────────────────────────────────────────────────┘

Token auto-generated on first hermes meet node run, persisted at $HERMES_HOME/workspace/meetings/node_token.json. Gateway registers with hermes meet node approve <name> <url> <token>. All 6 RPC types (start_bot, stop, status, transcript, say, ping) validate token + id on every message.
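
A minimal sketch of what per-message validation could look like, assuming the envelope carries `type`, `id`, `token`, and `payload` fields. The function name `validate_envelope`, the error messages, and the constant-time token comparison are assumptions, not the contents of `node/protocol.py`.

```python
import hmac
import json

# The six RPC types named in the PR description.
RPC_TYPES = {"start_bot", "stop", "status", "transcript", "say", "ping"}


def validate_envelope(raw: str, expected_token: str) -> dict:
    """Parse one message and reject it unless type, id and token check out."""
    try:
        msg = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError("envelope is not valid JSON") from exc
    if not isinstance(msg, dict):
        raise ValueError("envelope must be a JSON object")
    rpc_type = msg.get("type")
    if not isinstance(rpc_type, str) or rpc_type not in RPC_TYPES:
        raise ValueError(f"unknown RPC type: {rpc_type!r}")
    if not isinstance(msg.get("id"), str) or not msg["id"]:
        raise ValueError("missing request id")
    token = msg.get("token")
    # Compare as bytes in constant time (an assumption about the real check).
    if not isinstance(token, str) or not hmac.compare_digest(
        token.encode("utf-8"), expected_token.encode("utf-8")
    ):
        raise ValueError("bad token")
    if not isinstance(msg.get("payload", {}), dict):
        raise ValueError("payload must be an object")
    return msg
```

Checking the token on every message rather than once per connection fits the short-lived, one-RPC-per-connection client described for `NodeClient`.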

Safety posture

URL gate: only https://meet.google.com/abc-defg-hij, /new, or /lookup/<id> URLs pass. Subdomain hijack + scheme downgrade rejected.
No calendar scanning, no auto-dial. The URL must be explicitly provided each time.
No auto-consent announcement. SKILL.md tells the agent to announce itself.
One active meeting per (gateway, node) pair. A second meet_join leaves the first.
Lobby timeout (HERMES_MEET_LOBBY_TIMEOUT, default 300s) so the bot doesn't sit forever.
Admission detection: inCall stays False until Leave-button / caption region / participants list are actually visible in the DOM. Denied page gets recognized and surfaced as leaveReason: "denied".
Platforms: Linux + macOS only. Windows register() no-ops.
Node host: bearer-token auth, one gateway per node at a time. No TLS built-in — run over LAN or behind a proxy you trust (documented in SKILL.md / README.md).
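
The URL gate in the first item can be illustrated as follows. This is a sketch under stated assumptions, not the plugin's regex: the helper name `is_safe_meet_url`, the 3-4-3 lowercase meeting-code shape, and the allowed characters in a lookup id are guesses based on the examples above.

```python
import re
from urllib.parse import urlsplit

# Assumed path shapes: /abc-defg-hij, /new, /lookup/<id> (optional trailing slash).
_PATH_RE = re.compile(r"/(?:[a-z]{3}-[a-z]{4}-[a-z]{3}|new|lookup/[A-Za-z0-9_-]+)/?")


def is_safe_meet_url(url: str) -> bool:
    """Accept only explicit https://meet.google.com/... meeting URLs."""
    try:
        parts = urlsplit(url.strip())
    except ValueError:
        return False
    if parts.scheme != "https":
        # Rejects scheme downgrades such as http://.
        return False
    if parts.netloc.lower() != "meet.google.com":
        # Exact host match: rejects lookalike hosts, userinfo tricks, and ports.
        return False
    return _PATH_RE.fullmatch(parts.path) is not None
```

Matching the host exactly, instead of searching the URL string for `meet.google.com`, is what rules out hosts like `meet.google.com.evil.com`. Query strings are ignored in this sketch.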

Status dict reference

Every meet_status() call returns a JSON dict including:

| Key | Meaning |
| --- | --- |
| `inCall` | Past the lobby. |
| `lobbyWaiting` | Clicked "Ask to join", waiting on host. |
| `joinAttemptedAt` / `joinedAt` | Lobby-click + actual-admission timestamps. |
| `captioning` | Caption observer installed. |
| `transcriptLines` / `lastCaptionAt` | Transcript progress. |
| `realtime` / `realtimeReady` | Realtime provisioned / WS open. |
| `realtimeDevice` | Audio device the bot is feeding. |
| `audioBytesOut` / `lastAudioOutAt` | Voice activity counters. |
| `lastBargeInAt` | Most recent `response.cancel` timestamp. |
| `leaveReason` | `duration_expired` / `lobby_timeout` / `denied` / `page_closed` / null. |
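
As an illustration of how a caller might read this dict, here is a small summarizer. The helper `describe_status` and its precedence order are hypothetical. It also assumes `audioBytesOut` counts raw PCM16 mono bytes at 24 kHz, which gives 48,000 bytes per second of speech.

```python
# 24 kHz sample rate x 2 bytes per 16-bit sample, mono (assumed counter unit).
PCM_BYTES_PER_SECOND = 24_000 * 2


def describe_status(status: dict) -> str:
    """Turn a meet_status() dict into a one-line summary (hypothetical helper)."""
    reason = status.get("leaveReason")
    if reason:
        return f"left ({reason})"
    if status.get("inCall"):
        parts = [f"in call, {status.get('transcriptLines', 0)} transcript lines"]
        if status.get("realtimeReady"):
            seconds = status.get("audioBytesOut", 0) / PCM_BYTES_PER_SECOND
            parts.append(f"{seconds:.1f}s of audio sent")
        return "; ".join(parts)
    if status.get("lobbyWaiting"):
        return "waiting in lobby"
    return "not joined"
```

For example, an `audioBytesOut` of 96,000 would correspond to two seconds of agent speech under that assumption.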

Files (all new, zero core changes)

plugins/google_meet/
├── plugin.yaml                         # manifest
├── __init__.py                         # register(ctx) — 5 tools + hook + CLI
├── meet_bot.py                         # Playwright bot subprocess
├── process_manager.py                  # local lifecycle + enqueue_say
├── tools.py                            # agent tools + node-routing helper
├── cli.py                              # hermes meet {setup,install,auth,join,...,node}
├── audio_bridge.py                     # Linux pactl null-sink + mac BlackHole probe
├── realtime/
│   ├── __init__.py
│   └── openai_client.py                # RealtimeSession + RealtimeSpeaker
├── node/
│   ├── __init__.py
│   ├── protocol.py                     # JSON envelope + validation
│   ├── registry.py                     # nodes.json CRUD
│   ├── server.py                       # NodeServer (ws + token auth)
│   ├── client.py                       # NodeClient (short-lived sync WS)
│   └── cli.py                          # hermes meet node subtree
├── SKILL.md                            # agent usage guide
└── README.md                           # dev reference

tests/plugins/
├── test_google_meet_plugin.py          # top-level: URL safety, state, tool routing, CLI wiring, v2/v3 integration
├── test_google_meet_audio.py           # audio_bridge: pactl modules, BlackHole probe, platform gating
├── test_google_meet_realtime.py        # OpenAI client: WS frames, cancel, speaker queue
└── test_google_meet_node.py            # protocol, registry, server dispatch, client RPC, CLI

Validation


Core files touched	0
New files	21 (13 plugin modules, 4 test files, 2 docs, 2 empty `__init__`s)
Lines added	5751
Lines removed	0
Unit tests	171 passing across all google_meet test files + plugin-system regression
E2E on Linux	plugin loads in fresh `HERMES_HOME`, 5 tools register, `on_session_end` hook wires, `hermes meet` CLI tree wires including `node` subtree, `meet_join` with `node="my-mac" mode="realtime"` routes correctly through `NodeClient`, `enqueue_say` rejects transcribe/accepts realtime, every new argparse flag parses, `RealtimeSession.cancel_response()` behaves on both connected and disconnected sessions.
`py_compile`	clean on every file
Rebased cleanly	on top of `origin/main` as of this push

What live testing still needs to validate

The one thing unit + E2E tests can't cover from CI: whether Chrome's WebRTC inside Meet actually hears the PCM we pump into the null-sink / BlackHole, and whether the Realtime voice sounds right at the other end of the call. Everything up through PCM landing in the correct OS audio device is covered by tests. The final 10 feet (virtual device → Meet's mic → remote participants' speakers) wants a live Meet + OpenAI Realtime key + ~15 minutes.

If something glitches in the live test it'll be localized to one of:

audio_bridge.py pactl / BlackHole probe flags
meet_bot._start_realtime_speaker paplay / ffmpeg args
realtime/openai_client.py session.update format config

Each is small and self-contained to iterate on as a follow-up.

Explicitly out of scope (future PRs, not blocking)

Mic → Realtime duplex (bot listens to meeting WebRTC audio, not just captions). Biggest lift, highest ceiling.
Node-host TLS termination + pairing-UX ceremony (today it's bearer-token over plain WS).
Meet create (Google Meet API) and Twilio dial-in.
Windows audio bridge.

teknium1 · 2026-04-27T04:54:59Z

v2 + v3 pushed on top of v1

v2 — agent speaks in-meeting via OpenAI Realtime + virtual audio device.

`audio_bridge.py`: PulseAudio null-sink on Linux (loads + tears down pactl modules cleanly), BlackHole probe on macOS (refuses to switch system default input — user does that).
`realtime/openai_client.py`: sync WebSocket client + file-queue speaker. `RealtimeSession.speak(text)` → response.audio.delta bytes appended to PCM file; `paplay` pumps that into the null-sink; Chrome reads from the virtual source via `PULSE_SOURCE` env.
Opt in with `meet_join(url, mode="realtime")`. Graceful fallback to transcribe mode if any audio-bridge step fails; flagged in status.
`meet_say(text)` is no longer a stub. Refuses against transcribe-mode meetings with a clear error.
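
The say-queue handoff and the transcribe-mode refusal can be sketched like this. The signature, the record fields (`text`, `enqueuedAt`), and the error strings are assumptions; the real `process_manager.enqueue_say` may differ.

```python
import json
import time
from pathlib import Path
from typing import Optional


def enqueue_say(text: str, active: Optional[dict], queue_path: Path) -> dict:
    """Append one utterance to the JSONL queue, or refuse with a clear error.

    `active` stands in for the active-meeting record (hypothetical shape).
    """
    if active is None:
        return {"ok": False, "error": "no active meeting"}
    if active.get("mode") != "realtime":
        return {"ok": False, "error": "active meeting is transcribe-only"}
    if not text.strip():
        return {"ok": False, "error": "text is empty"}
    record = {"text": text, "enqueuedAt": time.time()}
    # One JSON object per line; a speaker thread can tail and consume the file.
    with queue_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return {"ok": True}
```

A file-backed queue keeps the tool handler and the bot subprocess decoupled: the handler only appends a line, and the speaker loop picks it up in its own time.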

v3 — remote node host so the bot can run on a different machine than the gateway.

`node/protocol.py`, `node/registry.py`, `node/server.py`, `node/client.py`, `node/cli.py`.
`hermes meet node run` on the node machine (e.g. user's Mac with a signed-in Chrome); auto-generates and persists a bearer token.
`hermes meet node approve <name> <url> <token>` on the gateway registers it.
Every `meet_*` tool gets a `node="<name>"` (or `"auto"`) param. Unknown node → clear error.
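
Node resolution, including the `"auto"` case, might look roughly like the following. The function name `resolve_node`, the in-memory `nodes` mapping, and the exception type are assumptions. Only the "sole registered node" auto-selection and the unknown-node error come from the PR text.

```python
from typing import Optional


def resolve_node(nodes: dict, name: Optional[str]) -> dict:
    """Pick a registered node by name, or the only one when name is None/'auto'."""
    if name in (None, "auto"):
        if len(nodes) == 1:
            # Exactly one node registered: auto-select it.
            return next(iter(nodes.values()))
        if not nodes:
            raise LookupError("no registered meet node matches 'auto'")
        raise LookupError("multiple meet nodes registered; pass node=<name>")
    try:
        return nodes[name]
    except KeyError:
        raise LookupError(f"no registered meet node matches {name!r}") from None
```

Refusing to guess when several nodes are registered keeps a `meet_join` from landing on the wrong machine.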

Numbers

3457 insertions, 151 deletions, 20 files.
100 plugin tests + 58 plugin-system tests = 158 passing.
Zero core files touched.
E2E verified on this Linux box: fresh HERMES_HOME → plugin loads, 5 tools register, node subtree wires into `hermes meet`, `meet_join` with `node='my-mac' mode='realtime'` correctly routes through NodeClient, `enqueue_say` behaves per-mode, every new argparse flag parses.

What's still uncertain

The only thing I could not validate from this session: does Chrome on a real Meet actually hear the PCM we pump into the null-sink, and does the Realtime voice sound right at the other end of the call. That needs a live Meet + an OpenAI Realtime key + 15 minutes. Everything up to and including the PCM landing in the correct OS audio device is verified by unit + E2E tests. The final 10 feet (virtual source → Meet's WebRTC) is the bit that has to be poked in reality.

If the live test reveals a glitch, it'll be in one of: `audio_bridge.py` pactl flags, `paplay` args in `meet_bot._start_realtime_speaker`, or the RealtimeSession format config. All three are small and isolated.

Ready for your call on whether to merge now and iterate after a live test, or hold until you've run it end-to-end on Meet.

…mp, barge-in, richer status

Ready-for-live-test follow-up on PR #16364. Five additions that matter for the first live run on a real Meet, in priority order:

1. hermes meet install [--realtime] [--yes]
   pip install playwright websockets + python -m playwright install chromium.
   --realtime: installs platform audio deps (pulseaudio-utils on Linux via sudo apt, blackhole-2ch + ffmpeg on macOS via brew). Prompts before sudo/brew unless --yes. Refuses on Windows. Refuses to auto-flip the macOS default input; user still selects BlackHole in System Settings (deliberate; surprise audio rerouting is worse than a manual step).

2. Admission detection
   _detect_admission(page): Leave-button visible OR caption region attached OR participants list present → we're in-call.
   _detect_denied(page): "You can't join this video call" / "You were removed" / "No one responded to your request" → bail out.
   HERMES_MEET_LOBBY_TIMEOUT (default 300s) caps how long we sit in the lobby before giving up. in_call stays False until admitted. Status surfaces leaveReason: duration_expired | lobby_timeout | denied | page_closed.

3. macOS PCM pump
   ffmpeg reads speaker.pcm (24kHz s16le mono) and writes to the BlackHole AVFoundation output via -f audiotoolbox -audio_device_index <N>. _mac_audio_device_index() probes ffmpeg -f avfoundation -list_devices true to resolve 'BlackHole 2ch' → numeric index. Falls back to index 0 on probe failure. Linux paplay pump unchanged.

4. Richer status dict
   _BotState now tracks realtime, realtimeReady, realtimeDevice, audioBytesOut, lastAudioOutAt, lastBargeInAt, joinAttemptedAt, leaveReason. RealtimeSession.audio_bytes_out / last_audio_out_at counters fold into the status file once a second so meet_status() can show the agent's voice activity in near-real-time.

5. Barge-in
   RealtimeSession.cancel_response() sends type='response.cancel' over the same WS (lock-guarded so it's safe to call from the caption thread while speak() is reading frames). Handles response.cancelled as a terminal frame type. _looks_like_human_speaker() gates triggers so the bot's own name, 'You', 'Unknown', and blanks don't self-cancel. Called from the caption drain loop: when a new caption arrives attributed to a real participant while rt.session exists, we fire cancel_response() and stamp lastBargeInAt.

Tests: 20 new unit tests across _BotState telemetry, barge-in gating, admission/denied probe error handling, cancel_response with and without a connected WS, and `hermes meet install` CLI wiring (flag parsing + end-to-end subprocess.run verification + Linux-already-installed fast path). Total 171 passing across all google_meet test files + the plugin-system regression suite.

E2E verified on Linux: plugin loads, all 5 tools register, `hermes meet install --realtime --yes` parses, fresh-bot status.json has every new telemetry key, cancel_response on a disconnected session returns False without raising, barge-in helper gates the bot's own name correctly.

Still out of scope (for a future PR, not blocking live test): mic → Realtime duplex (the agent listening to meeting audio via WebRTC), node-host TLS/pairing UX, Windows audio, Meet create+Twilio.

Docs updated: SKILL.md now lists the installer subcommand, lobby timeout, barge-in caveat, and the full status-dict reference table. README.md quick-start uses hermes meet install.

teknium1 · 2026-04-27T07:23:27Z

Ready-for-live-test pass pushed (commit `0185f90`)

Five additions that matter for the first live Meet run:

`hermes meet install [--realtime] [--yes]` — pip + Chromium + platform audio deps. Prompts before `sudo apt` / `brew`. macOS still asks you to select BlackHole as default input manually (deliberate).
Admission detection — `in_call` stays False until Leave-button/captions/participants are actually visible. Lobby timeout default 5 min (`HERMES_MEET_LOBBY_TIMEOUT` env). Detects host-denied page. `leaveReason` field in status: `duration_expired` | `lobby_timeout` | `denied` | `page_closed`.
macOS PCM pump — ffmpeg tails `speaker.pcm` into the BlackHole AVFoundation output via `-f audiotoolbox -audio_device_index`. Device index probed automatically via `ffmpeg -list_devices`.
Richer status — `realtime`, `realtimeReady`, `realtimeDevice`, `audioBytesOut`, `lastAudioOutAt`, `lastBargeInAt`, `joinAttemptedAt`, `leaveReason` on every `meet_status()`.
Barge-in — `RealtimeSession.cancel_response()` sends `type: response.cancel`. Caption drain loop fires it when a caption attributed to a real participant arrives while the bot is generating audio. Self-echo guarded via `_looks_like_human_speaker` (our name / "Unknown" / blank / "You" all don't trigger).
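
The admission logic in the second item amounts to a bounded polling loop. The sketch below is an assumption-laden illustration, not the code in `meet_bot.py`: the probes are passed in as callables, and the clock and sleep functions are injectable so the loop can be exercised without a browser.

```python
import time
from typing import Callable, Optional


def wait_for_admission(
    is_admitted: Callable[[], bool],
    is_denied: Callable[[], bool],
    timeout_s: float = 300.0,
    poll_s: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Return None once admitted, otherwise a leaveReason string."""
    deadline = clock() + timeout_s
    while True:
        if is_admitted():
            return None
        if is_denied():
            return "denied"
        if clock() >= deadline:
            return "lobby_timeout"
        sleep(poll_s)
```

In a real run, `timeout_s` would presumably come from `HERMES_MEET_LOBBY_TIMEOUT`, and the two callables would wrap the DOM probes for the Leave button, caption region, participants list, and the denied page.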

Numbers

669 insertions, 29 deletions, 6 files.
20 new unit tests. 171/171 passing across all google_meet + plugin-system files.
E2E clean on this Linux box.

Live-test kit (suggested run order on your Mac)

```bash
git fetch && git checkout hermes/hermes-d705211f

hermes plugins enable google_meet
hermes meet install --realtime   # will brew-install BlackHole + ffmpeg

# System Settings → Sound → Input → BlackHole 2ch

echo "OPENAI_API_KEY=sk-..." >> ~/.hermes/.env
hermes meet setup                # should be all green

# pick a test Meet you host
hermes meet join https://meet.google.com/xxx-xxxx-xxx --mode realtime --headed

# (in another terminal)
hermes meet status
hermes meet say "Hey team, testing the note-taker bot."
hermes meet say "If someone interrupts me, I should stop talking."
```

Things to watch for:

`meet_status` flipping `inCall` true only after you admit the bot
`realtimeReady: true` means the WS opened to OpenAI
`audioBytesOut` climbing when you say something via `meet_say`
Interrupting the bot mid-sentence → `lastBargeInAt` stamp appears

If something's broken the failure mode will be localized to one of: BlackHole device routing (fix is in `_mac_audio_device_index` or your Sound prefs), ffmpeg flags (`meet_bot.py` around the AVFoundation Popen), or the Realtime format config (`realtime/openai_client.py` `connect()`).
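
For debugging the device-routing case, the index lookup can be reproduced in isolation. The parser below is a hypothetical stand-in for `_mac_audio_device_index`: it assumes the usual `ffmpeg -f avfoundation -list_devices true -i ""` listing, where each device appears as `[N] Name` under an "AVFoundation audio devices" heading, and it assumes that listed index is the one the pump needs.

```python
import re

# Matches the trailing "[N] Device name" part of a listing line.
_DEVICE_LINE = re.compile(r"\[(\d+)\]\s+(.+?)\s*$")


def find_audio_device_index(listing: str, wanted: str = "BlackHole 2ch") -> int:
    """Return the audio-device index for `wanted`, or 0 if it is not listed."""
    in_audio = False
    for line in listing.splitlines():
        if "AVFoundation audio devices" in line:
            in_audio = True
            continue
        if "AVFoundation video devices" in line:
            in_audio = False
            continue
        if not in_audio:
            # Skip video devices so a same-named entry there cannot match.
            continue
        match = _DEVICE_LINE.search(line)
        if match and match.group(2) == wanted:
            return int(match.group(1))
    # Mirrors the fallback described in the PR: index 0 on probe failure.
    return 0
```

Feeding it the stderr captured from the ffmpeg probe on the Mac shows quickly whether BlackHole is visible at all and which index it resolves to.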

Not included, explicitly deferred: mic → Realtime duplex, node-host TLS, Windows support, Meet `create`/Twilio. Want any of those next?

…calls

v1 shipping transcribe-only. Spawns headless Chromium via Playwright, joins an explicit https://meet.google.com/ URL, enables live captions, and scrapes them into a transcript file the agent can read across turns. The agent then has the meeting content in context and can do followup work (send recap, file issues, schedule followups) with its regular tools.

Surface:
- Tools: meet_join, meet_status, meet_transcript, meet_leave, meet_say (meet_say is a v1 stub that returns not-implemented; v2 will wire realtime duplex audio via OpenAI Realtime / Gemini Live + BlackHole / PulseAudio null-sink.)
- CLI: hermes meet setup | auth | join | status | transcript | stop
- Lifecycle: on_session_end auto-leaves any still-running bot.

Safety:
- URL regex rejects anything that isn't https://meet.google.com/...
- No calendar scanning, no auto-dial, no auto-consent announcement.
- Single active meeting per install; a second meet_join leaves the first.
- Platform-gated to Linux + macOS (Windows audio routing for v2 untested).
- Opt-in: standalone plugin, user must add 'google_meet' to plugins.enabled in config.yaml.

Zero core changes. Plugin uses existing register_tool / register_cli_command / register_hook surfaces.

21 new unit tests cover the URL safety gate, transcript dedup + status round-trip, process-manager refusals/start/stop paths, tool-handler JSON shape under each branch, session-end cleanup, and platform-gated register().

v2: agent speaks in-meeting

- audio_bridge.py: PulseAudio null-sink (Linux) + BlackHole probe (macOS). On Linux we load pactl module-null-sink + module-virtual-source, track module ids for teardown; Chrome gets PULSE_SOURCE=<virt src> env so its fake mic reads what we write to the sink. macOS just probes BlackHole 2ch and returns its device name; the plugin refuses to switch the user's default audio input (that would surprise them).
- realtime/openai_client.py: sync WebSocket client for the OpenAI Realtime API. RealtimeSession.speak(text) sends conversation.item.create + response.create, accumulates response.audio.delta PCM bytes, appends them to a file. RealtimeSpeaker runs a JSONL-queue loop consuming meet_say calls. 'websockets' is an optional dep imported lazily.
- meet_bot.py: when HERMES_MEET_MODE=realtime, provisions AudioBridge, starts RealtimeSession + speaker thread, spawns paplay to pump PCM into the null-sink, then cleans everything up on SIGTERM. If any realtime setup step fails, falls back cleanly to transcribe mode with an error flagged in status.json.
- process_manager.enqueue_say(): writes a JSONL line to say_queue.jsonl; refuses when no active meeting or active meeting is transcribe-only.
- tools.meet_say: real implementation; requires active mode='realtime'.
- meet_join: adds mode='transcribe'|'realtime' param.

v3: remote node host

- node/protocol.py: JSON envelope (type, id, token, payload) + validate.
- node/registry.py: $HERMES_HOME/workspace/meetings/nodes.json, with resolve() auto-selecting the sole registered node when name is None.
- node/server.py: NodeServer, built on websockets.serve, with bearer-token auth; dispatches start_bot/stop/status/transcript/say/ping onto the local process_manager. Token auto-generated + persisted on first run.
- node/client.py: NodeClient, a short-lived sync WS per RPC; raises RuntimeError on error envelopes, clean API matching the server.
- node/cli.py: 'hermes meet node {run,list,approve,remove,status,ping}' subtree; wired into the main meet CLI by cli.py so 'hermes meet node' Just Works.
- tools.py: every meet_* tool accepts node='<name>'|'auto'; when set, routes through NodeClient to the remote bot instead of running locally. Unknown node → clear 'no registered meet node matches ...' error.
- cli.py: 'hermes meet join --node my-mac --mode realtime' and 'hermes meet say "..." --node my-mac' route to the node; 'hermes meet node approve <name> <url> <token>' registers one.

Tests

- 21 v1 tests updated (meet_say is no longer a stub; active-record now carries mode).
- 20 new audio_bridge + realtime tests.
- 42 new node tests (protocol/registry/server/client/cli).
- 17 new v1/v2/v3 integration tests at the plugin level covering enqueue_say edge cases, env var passthrough, mode validation, node routing (known/unknown/auto/ambiguous), and argparse wiring for `hermes meet say` + `hermes meet node` + --mode/--node flags.

Total: 100 plugin tests + 58 plugin-system tests = 158 passing.

E2E verified on Linux with fresh HERMES_HOME: plugin loads, 5 tools register, on_session_end hook wires, 'hermes meet' CLI tree wires including the node subtree, NodeRegistry round-trips, meet_join routes correctly to NodeClient under node='my-mac' with mode='realtime', enqueue_say accepts realtime/rejects transcribe, argparse parses every new flag cleanly.

Zero changes to core. All new code lives under plugins/google_meet/.

…ers (#17727) Covers ~60 merged PRs from Apr 15–29 that shipped user-visible behavior without docs coverage. No functional code changes; docs + static manifest regeneration only. Highlights: Stale / incorrect: - configuration.md: auxiliary auto-routing line was wrong since #11900; now correctly states auto routes to the main model, with a note on the cost trade-off and per-task override pattern. - integrations/providers.md + configuration.md compression intro: removed stale 'Gemini Flash via OpenRouter' claim. - website/static/api/model-catalog.json: rebuilt from hermes_cli/models.py so the live manifest picks up tencent/hy3-preview (and remains in sync for future model-catalog PRs). Platform messaging (#17417 #16997 #16193 #14315 #13151 #11794 #10610 #10283 #10246 #11564 #13178): - Signal: native formatting (bodyRanges), reply quotes, reactions. - Telegram: table rendering (bullets + code-block fallback), disable_link_previews, group_allowed_chats. - Slack: strict_mention config. - Discord: slash_commands disable, send_animation GIF, send_message native media attachments. - DingTalk: require_mention + allowed_users. CLI (#16052 #16539 #16566 #15841 #14798 #10043): - New 'hermes fallback' interactive manager. - New 'hermes update --check', '--backup' flag, and pre-update pairing snapshot behavior. - 'hermes gateway start/restart --all' multi-profile flag. - cron.md: 'hermes tools' as a platform, per-job enabled_toolsets, wakeAgent gate, context_from chaining. Config keys / env vars (#17305 #17026 #17000 #15077 #14557 #14227 #14166 #14730 #17008): - terminal.docker_run_as_host_user, display.runtime_metadata_footer, compression.hygiene_hard_message_limit, HINDSIGHT_TIMEOUT, skills.guard_agent_created, TAVILY_BASE_URL, security.allow_private_urls, agent.api_max_retries, gateway hot-reload of compression/context_length config edits. 
TUI / CLI UX (#17130 #17113 #17175 #17150 #16707 #12312 #12305 #12934 #14810 #14045 #17286 #17126): - HERMES_TUI_RESUME, HERMES_TUI_THEME, LaTeX rendering, busy-indicator styles, ctrl-x queued-message delete, git branch in status bar, per- prompt elapsed stopwatch, external-editor keybind, markdown stripping, TUI voice-mode parity, /agents overlay, /reload + /mouse. Gateway features (#16506 #15027 #13428 #12116): - Native multimodal image routing based on vision capability. - /usage account-limits section. - /steer slash command (added to reference + explanation in CLI). Plugins / hooks (#12929 #12972 #10763 #16364): - transform_tool_result, transform_terminal_output plugin hooks. - PluginContext.dispatch_tool() documented with slash-command example. - google_meet bundled plugin entry under built-in-plugins.md. Other (#16576 #16572 #16383 #15878 #15608 #15606 #14809 #14767 #14231 #14232 #14307 #13683 #12373 #11891 #11291 #10066): - hermes backup exclusions (WAL/SHM/journal + checkpoints/). - security.md hardline blocklist (floor below --yolo). - FHS install layout for root installs. - openssh-client + docker-cli baked into the Docker image. - MEDIA: tag supported extensions table (docs/office/archives/pdf). - Remote-to-host file sync on SSH/Modal/Daytona teardown. - 'hermes model' -> Configure Auxiliary Models interactive picker. - Podman support via HERMES_DOCKER_BINARY. Providers / STT / one-shot (#15045 #14473 #15704): - alibaba-coding-plan first-class provider entry. - xAI Grok STT as a 6th transcription option. - 'hermes -z' scripted one-shot mode + HERMES_INFERENCE_MODEL. Build: 'docusaurus build' succeeds. No new broken links/anchors; pre-existing warnings unchanged.

…ousResearch#16364) * feat(plugins): google_meet — bundled plugin for join+transcribe Meet calls v1 shipping transcribe-only. Spawns headless Chromium via Playwright, joins an explicit https://meet.google.com/ URL, enables live captions, and scrapes them into a transcript file the agent can read across turns. The agent then has the meeting content in context and can do followup work (send recap, file issues, schedule followups) with its regular tools. Surface: - Tools: meet_join, meet_status, meet_transcript, meet_leave, meet_say (meet_say is a v1 stub — returns not-implemented; v2 will wire realtime duplex audio via OpenAI Realtime / Gemini Live + BlackHole / PulseAudio null-sink.) - CLI: hermes meet setup | auth | join | status | transcript | stop - Lifecycle: on_session_end auto-leaves any still-running bot. Safety: - URL regex rejects anything that isn't https://meet.google.com/... - No calendar scanning, no auto-dial, no auto-consent announcement. - Single active meeting per install; a second meet_join leaves the first. - Platform-gated to Linux + macOS (Windows audio routing for v2 untested). - Opt-in: standalone plugin, user must add 'google_meet' to plugins.enabled in config.yaml. Zero core changes. Plugin uses existing register_tool / register_cli_command / register_hook surfaces. 21 new unit tests cover the URL safety gate, transcript dedup + status round-trip, process-manager refusals/start/stop paths, tool-handler JSON shape under each branch, session-end cleanup, and platform-gated register(). * feat(plugins/google_meet): v2 realtime audio + v3 remote node host v2 \u2014 agent speaks in-meeting audio_bridge.py: PulseAudio null-sink (Linux) + BlackHole probe (macOS). On Linux we load pactl module-null-sink + module-virtual-source, track module ids for teardown; Chrome gets PULSE_SOURCE=<virt src> env so its fake mic reads what we write to the sink. 
macOS just probes BlackHole 2ch and returns its device name \u2014 the plugin refuses to switch the user's default audio input (that would surprise them). realtime/openai_client.py: sync WebSocket client for the OpenAI Realtime API. RealtimeSession.speak(text) sends conversation.item.create + response.create, accumulates response.audio.delta PCM bytes, appends them to a file. RealtimeSpeaker runs a JSONL-queue loop consuming meet_say calls. 'websockets' is an optional dep imported lazily. meet_bot.py: when HERMES_MEET_MODE=realtime, provisions AudioBridge, starts RealtimeSession + speaker thread, spawns paplay to pump PCM into the null-sink, then cleans everything up on SIGTERM. If any realtime setup step fails, falls back cleanly to transcribe mode with an error flagged in status.json. process_manager.enqueue_say(): writes a JSONL line to say_queue.jsonl; refuses when no active meeting or active meeting is transcribe-only. tools.meet_say: real implementation; requires active mode='realtime'. meet_join: adds mode='transcribe'|'realtime' param. v3 \u2014 remote node host node/protocol.py: JSON envelope (type, id, token, payload) + validate. node/registry.py: $HERMES_HOME/workspace/meetings/nodes.json, with resolve() auto-selecting the sole registered node when name is None. node/server.py: NodeServer \u2014 websockets.serve, bearer-token auth, dispatches start_bot/stop/status/transcript/say/ping onto the local process_manager. Token auto-generated + persisted on first run. node/client.py: NodeClient \u2014 short-lived sync WS per RPC, raises RuntimeError on error envelopes, clean API matching the server. node/cli.py: 'hermes meet node {run,list,approve,remove,status,ping}' subtree; wired into the main meet CLI by cli.py so 'hermes meet node' Just Works. tools.py: every meet_* tool accepts node='<name>'|'auto'; when set, routes through NodeClient to the remote bot instead of running locally. Unknown node \u2192 clear 'no registered meet node matches ...' error. 
cli.py: 'hermes meet join --node my-mac --mode realtime' and 'hermes meet say "..." --node my-mac' route to the node; 'hermes meet node approve <name> <url> <token>' registers one.

Tests

- 21 v1 tests updated (meet_say is no longer a stub; active-record now carries mode).
- 20 new audio_bridge + realtime tests.
- 42 new node tests (protocol/registry/server/client/cli).
- 17 new v1/v2/v3 integration tests at the plugin level covering enqueue_say edge cases, env var passthrough, mode validation, node routing (known/unknown/auto/ambiguous), and argparse wiring for `hermes meet say` + `hermes meet node` + --mode/--node flags.

Total: 100 plugin tests + 58 plugin-system tests = 158 passing.

E2E verified on Linux with fresh HERMES_HOME: plugin loads, 5 tools register, on_session_end hook wires, 'hermes meet' CLI tree wires including the node subtree, NodeRegistry round-trips, meet_join routes correctly to NodeClient under node='my-mac' with mode='realtime', enqueue_say accepts realtime/rejects transcribe, argparse parses every new flag cleanly.

Zero changes to core. All new code lives under plugins/google_meet/.

* feat(plugins/google_meet): auto-install, admission detect, mac PCM pump, barge-in, richer status

Ready-for-live-test follow-up on PR NousResearch#16364. Five additions that matter for the first live run on a real Meet, in priority order:

1. hermes meet install [--realtime] [--yes]
pip install playwright websockets + python -m playwright install chromium
--realtime: installs platform audio deps (pulseaudio-utils on Linux via sudo apt, blackhole-2ch + ffmpeg on macOS via brew). Prompts before sudo/brew unless --yes. Refuses on Windows. Refuses to auto-flip the macOS default input; user still selects BlackHole in System Settings (deliberate; surprise audio rerouting is worse than a manual step).

2. Admission detection
_detect_admission(page): Leave-button visible OR caption region attached OR participants list present → we're in-call.
_detect_denied(page): "You can't join this video call" / "You were removed" / "No one responded to your request" → bail out.
HERMES_MEET_LOBBY_TIMEOUT (default 300s) caps how long we sit in the lobby before giving up.
in_call stays False until admitted. Status surfaces leaveReason: duration_expired | lobby_timeout | denied | page_closed.

3. macOS PCM pump
ffmpeg reads speaker.pcm (24kHz s16le mono) and writes to the BlackHole AVFoundation output via -f audiotoolbox -audio_device_index <N>. _mac_audio_device_index() probes ffmpeg -f avfoundation -list_devices true to resolve 'BlackHole 2ch' → numeric index. Falls back to index 0 on probe failure. Linux paplay pump unchanged.

4. Richer status dict
_BotState now tracks realtime, realtimeReady, realtimeDevice, audioBytesOut, lastAudioOutAt, lastBargeInAt, joinAttemptedAt, leaveReason. RealtimeSession.audio_bytes_out / last_audio_out_at counters fold into the status file once a second so meet_status() can show the agent's voice activity in near-real-time.

5. Barge-in
RealtimeSession.cancel_response() sends type='response.cancel' over the same WS (lock-guarded so it's safe to call from the caption thread while speak() is reading frames). Handles response.cancelled as a terminal frame type. _looks_like_human_speaker() gates triggers so the bot's own name, 'You', 'Unknown', and blanks don't self-cancel. Called from the caption drain loop: when a new caption arrives attributed to a real participant while rt.session exists, we fire cancel_response() and stamp lastBargeInAt.

Tests: 20 new unit tests across _BotState telemetry, barge-in gating, admission/denied probe error handling, cancel_response with and without a connected WS, and `hermes meet install` CLI wiring (flag parsing + end-to-end subprocess.run verification + Linux-already-installed fast path).

Total 171 passing across all google_meet test files + the plugin-system regression suite.
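The barge-in gating in item 5 can be sketched roughly as follows. This is an illustration under stated assumptions, not the plugin's code: the function below mirrors the described behaviour of _looks_like_human_speaker() (bot's own name, 'You', 'Unknown', and blanks never trigger), but the case-insensitive comparison, the `on_caption` helper, and the `FakeSession` stand-in are all hypothetical.

```python
import time

# Speaker labels the commit message says must never trigger a barge-in.
_NON_HUMAN_LABELS = {"you", "unknown"}


def looks_like_human_speaker(speaker, bot_name):
    """Return True when a caption's speaker label should count as a human."""
    name = (speaker or "").strip()
    if not name:
        return False  # blank attribution
    lowered = name.casefold()  # assumption: labels compared case-insensitively
    if lowered in _NON_HUMAN_LABELS:
        return False
    return lowered != (bot_name or "").strip().casefold()


class FakeSession:
    """Hypothetical stand-in for a realtime session with cancel_response()."""

    def __init__(self):
        self.cancel_calls = 0

    def cancel_response(self):
        self.cancel_calls += 1
        return True


def on_caption(speaker, session, bot_name, state):
    """Cancel the in-flight response when a real participant starts talking."""
    if session is None or not looks_like_human_speaker(speaker, bot_name):
        return False
    if session.cancel_response():
        state["lastBargeInAt"] = time.time()  # telemetry key named in item 4
        return True
    return False
```

Gating on the speaker label matters because the bot's own speech also shows up in live captions; without the filter, every sentence the agent spoke would cancel itself.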
E2E verified on Linux: plugin loads, all 5 tools register, `hermes meet install --realtime --yes` parses, fresh-bot status.json has every new telemetry key, cancel_response on a disconnected session returns False without raising, barge-in helper gates the bot's own name correctly.

Still out of scope (for a future PR, not blocking live test): mic → Realtime duplex (the agent listening to meeting audio via WebRTC), node-host TLS/pairing UX, Windows audio, Meet create+Twilio.

Docs updated: SKILL.md now lists the installer subcommand, lobby timeout, barge-in caveat, and the full status-dict reference table. README.md quick-start uses hermes meet install.
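The v1 safety notes in this commit message say a URL regex rejects anything that isn't https://meet.google.com/.... A minimal sketch of such a gate is below. The actual pattern used by the plugin is not shown in this thread, so the three-four-three letter meeting-code shape, the optional query string, and the name `is_safe_meet_url` are assumptions chosen for illustration.

```python
import re

# Assumed shape: https://meet.google.com/abc-defg-hij, optionally with a query.
_MEET_URL = re.compile(
    r"https://meet\.google\.com/[a-z]{3}-[a-z]{4}-[a-z]{3}(?:\?[^\s#]*)?"
)


def is_safe_meet_url(url):
    """Return True only for an explicit https Meet URL (hypothetical helper)."""
    if not isinstance(url, str):
        return False
    # fullmatch anchors both ends, so trailing junk or a lookalike host fails.
    return _MEET_URL.fullmatch(url.strip()) is not None
```

Anchoring the whole string is the important design choice here: a prefix check alone would accept hosts such as meet.google.com.evil.example.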

syahidfrd · 2026-05-01T08:58:32Z

Heads up — looks like the hermes meet ... CLI tree this PR documents in README/SKILL.md isn't actually wired into the top-level argparse, despite landing cleanly:

hermes plugins enable google_meet            # ✓ enabled
hermes meet install
# → hermes: error: argument command: invalid choice: 'meet'

The plugin itself is fine: register() correctly calls ctx.register_cli_command(name="meet", ...) (plugins/google_meet/__init__.py:92), and tools (meet_join, etc.) are reachable from chat sessions since register_tool() works. The dead path is on the hermes side: PluginContext.register_cli_command writes into PluginManager._cli_commands (hermes_cli/plugins.py:314), but hermes_cli/main.py never reads from there to build subparsers. It only iterates plugins.memory.discover_plugin_cli_commands() (main.py:9224-9236), which is memory-plugin-specific.
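For readers following along, the missing step could look roughly like the sketch below: the top-level parser iterates the registered plugin commands and adds one subparser per entry. This is not the code from any of the open PRs. `PluginManager` here is a stripped-down stand-in, and the `help`, `setup_fn`, and `handler_fn` parameters, the `build_parser` function, and the built-in command names are assumptions; only `register_cli_command`, `name="meet"`, and `_cli_commands` appear in the comment above.

```python
import argparse


class PluginManager:
    """Stand-in for the real manager; only the command registry is modeled."""

    def __init__(self):
        self._cli_commands = {}

    def register_cli_command(self, name, help, setup_fn, handler_fn):
        # Parameter names other than `name` are assumptions for this sketch.
        self._cli_commands[name] = {
            "help": help,
            "setup_fn": setup_fn,
            "handler_fn": handler_fn,
        }


def build_parser(manager, builtin=("chat", "plugins")):
    """Build the top-level parser and wire in plugin-registered commands."""
    parser = argparse.ArgumentParser(prog="hermes")
    sub = parser.add_subparsers(dest="command")
    for name in builtin:
        sub.add_parser(name)
    for name, spec in manager._cli_commands.items():
        if name in builtin:
            continue  # skip collisions with built-in commands
        plugin_parser = sub.add_parser(name, help=spec["help"])
        spec["setup_fn"](plugin_parser)  # plugin adds its own subtree
        plugin_parser.set_defaults(func=spec["handler_fn"])
    return parser


def setup_meet(parser):
    """Hypothetical setup_fn mirroring part of the documented meet CLI."""
    meet_sub = parser.add_subparsers(dest="meet_command")
    install = meet_sub.add_parser("install")
    install.add_argument("--realtime", action="store_true")
    install.add_argument("--yes", action="store_true")
```

With a manager that has had `register_cli_command("meet", ...)` called on it, `build_parser(manager).parse_args(["meet", "install", "--realtime"])` parses cleanly; with an empty manager the same arguments fail with argparse's invalid-choice error, which matches the symptom reported above.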

The PR's E2E checklist says the 'hermes meet' CLI tree wiring was verified. However, the test at tests/plugins/test_google_meet_plugin.py:335-364 mocks PluginContext and only asserts that register_cli_command was called, which passes regardless of whether main.py wires anything up. So CI was happy, but the top-level hermes meet command never reached argparse.
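The gap between the two kinds of check can be reproduced in a few lines. The sketch below is illustrative only: `register`, `build_unwired_parser`, and the two check functions are hypothetical stand-ins, not the contents of test_google_meet_plugin.py or main.py.

```python
import argparse
from unittest.mock import MagicMock


def register(ctx):
    """Stand-in for a plugin register(): it only records the CLI command."""
    ctx.register_cli_command(name="meet", help="Google Meet bot", setup_fn=lambda p: None)


def build_unwired_parser():
    """A top-level parser that never consults plugin registrations."""
    parser = argparse.ArgumentParser(prog="hermes")
    sub = parser.add_subparsers(dest="command")
    sub.add_parser("plugins")
    return parser


def mock_only_check():
    """Passes as long as register() made the call, wired or not."""
    ctx = MagicMock()
    register(ctx)
    ctx.register_cli_command.assert_called_once()
    return True


def end_to_end_check():
    """Passes only if the real parser accepts the plugin's command."""
    try:
        build_unwired_parser().parse_args(["meet", "install"])
    except SystemExit:  # argparse exits on an invalid choice
        return False
    return True
```

Here the mock-only check succeeds while the end-to-end check fails, so a regression test for the wiring fix would need to drive the actual top-level parser rather than a mocked context.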

Good news: this isn't a google_meet bug — it's pre-existing and there are already three open PRs targeting it:

fix: wire and harden plugin CLI command registration #13643 (attmous, +181/-27) — most complete: wire + name normalization + collision checks + tests
fix(cli): wire general plugin commands into main parser #9141 (njiangk, +145/-4) — leaner: wire + conflict skip + tests (collaborator marked it as overlap with fix: wire and harden plugin CLI command registration #13643)
fix: wire plugin CLI commands to gateway dispatch and CLI handler lookup #8859 (zureyewz, +115/-13) — broader: also touches gateway dispatch + commands.py

All three sat at P3 before google_meet landed, since plugin-CLI was theoretical. Now with a bundled plugin shipping a broken user-facing CLI, might be worth taking another look at landing one of them (probably #13643 if reconciled with #9141).

Workaround for users on current main: python -m plugins.google_meet.cli <subcommand> — runs the same CLI tree directly via the file's __main__ block.

Tools-via-agent path (e.g. asking the agent to "join this Meet URL") works perfectly. Only the human-facing terminal CLI is affected.

…ers (NousResearch#17727)

Covers ~60 merged PRs from Apr 15–29 that shipped user-visible behavior without docs coverage. No functional code changes; docs + static manifest regeneration only.

Highlights:

Stale / incorrect:
- configuration.md: auxiliary auto-routing line was wrong since NousResearch#11900; now correctly states auto routes to the main model, with a note on the cost trade-off and per-task override pattern.
- integrations/providers.md + configuration.md compression intro: removed stale 'Gemini Flash via OpenRouter' claim.
- website/static/api/model-catalog.json: rebuilt from hermes_cli/models.py so the live manifest picks up tencent/hy3-preview (and remains in sync for future model-catalog PRs).

Platform messaging (NousResearch#17417 NousResearch#16997 NousResearch#16193 NousResearch#14315 NousResearch#13151 NousResearch#11794 NousResearch#10610 NousResearch#10283 NousResearch#10246 NousResearch#11564 NousResearch#13178):
- Signal: native formatting (bodyRanges), reply quotes, reactions.
- Telegram: table rendering (bullets + code-block fallback), disable_link_previews, group_allowed_chats.
- Slack: strict_mention config.
- Discord: slash_commands disable, send_animation GIF, send_message native media attachments.
- DingTalk: require_mention + allowed_users.

CLI (NousResearch#16052 NousResearch#16539 NousResearch#16566 NousResearch#15841 NousResearch#14798 NousResearch#10043):
- New 'hermes fallback' interactive manager.
- New 'hermes update --check', '--backup' flag, and pre-update pairing snapshot behavior.
- 'hermes gateway start/restart --all' multi-profile flag.
- cron.md: 'hermes tools' as a platform, per-job enabled_toolsets, wakeAgent gate, context_from chaining.

Config keys / env vars (NousResearch#17305 NousResearch#17026 NousResearch#17000 NousResearch#15077 NousResearch#14557 NousResearch#14227 NousResearch#14166 NousResearch#14730 NousResearch#17008):
- terminal.docker_run_as_host_user, display.runtime_metadata_footer, compression.hygiene_hard_message_limit, HINDSIGHT_TIMEOUT, skills.guard_agent_created, TAVILY_BASE_URL, security.allow_private_urls, agent.api_max_retries, gateway hot-reload of compression/context_length config edits.

TUI / CLI UX (NousResearch#17130 NousResearch#17113 NousResearch#17175 NousResearch#17150 NousResearch#16707 NousResearch#12312 NousResearch#12305 NousResearch#12934 NousResearch#14810 NousResearch#14045 NousResearch#17286 NousResearch#17126):
- HERMES_TUI_RESUME, HERMES_TUI_THEME, LaTeX rendering, busy-indicator styles, ctrl-x queued-message delete, git branch in status bar, per-prompt elapsed stopwatch, external-editor keybind, markdown stripping, TUI voice-mode parity, /agents overlay, /reload + /mouse.

Gateway features (NousResearch#16506 NousResearch#15027 NousResearch#13428 NousResearch#12116):
- Native multimodal image routing based on vision capability.
- /usage account-limits section.
- /steer slash command (added to reference + explanation in CLI).

Plugins / hooks (NousResearch#12929 NousResearch#12972 NousResearch#10763 NousResearch#16364):
- transform_tool_result, transform_terminal_output plugin hooks.
- PluginContext.dispatch_tool() documented with slash-command example.
- google_meet bundled plugin entry under built-in-plugins.md.

Other (NousResearch#16576 NousResearch#16572 NousResearch#16383 NousResearch#15878 NousResearch#15608 NousResearch#15606 NousResearch#14809 NousResearch#14767 NousResearch#14231 NousResearch#14232 NousResearch#14307 NousResearch#13683 NousResearch#12373 NousResearch#11891 NousResearch#11291 NousResearch#10066):
- hermes backup exclusions (WAL/SHM/journal + checkpoints/).
- security.md hardline blocklist (floor below --yolo).
- FHS install layout for root installs.
- openssh-client + docker-cli baked into the Docker image.
- MEDIA: tag supported extensions table (docs/office/archives/pdf).
- Remote-to-host file sync on SSH/Modal/Daytona teardown.
- 'hermes model' -> Configure Auxiliary Models interactive picker.
- Podman support via HERMES_DOCKER_BINARY.

Providers / STT / one-shot (NousResearch#15045 NousResearch#14473 NousResearch#15704):
- alibaba-coding-plan first-class provider entry.
- xAI Grok STT as a 6th transcription option.
- 'hermes -z' scripted one-shot mode + HERMES_INFERENCE_MODEL.

Build: 'docusaurus build' succeeds. No new broken links/anchors; pre-existing warnings unchanged.