fix(agent:end hook): emit full response after post-processing#3
Conversation
Two problems with the previous agent:end emit: 1. The response payload was hard-capped at 500 chars before reaching hooks. Downstream hooks that want the full reply (TTS narration, cross-session bridges, dashboards) had no way around it — 500 chars clips mid-sentence on almost any substantive answer. Observed today on a 970-char Matrix reply: the narration bridge only got the first 500 chars and Ryan heard a truncated thought. 2. The emit fired BEFORE the compression-exhaustion post-processing that appends an auto-reset notice to `response`. In that code path (rare but real), hooks received stale text while the user saw response+notice. Fix: - Move the emit below the compression-exhaustion block so hooks always see the final user-visible response. - Replace the 500-char hard cap with an env-tunable knob, `HERMES_HOOK_RESPONSE_MAX_CHARS`. Default 0 (no cap) — ship the whole reply. Hooks that want a preview trim on their side; hooks that need the whole thing (TTS) just work. - Strip transport-only markers (`[[audio_as_voice]]` and `MEDIA:<path>` lines) from the response before the hook sees it, so hooks render user-visible text only. Verified: a 970-char Matrix reply now flows to the matrix-narration-bridge hook in full; TTS narrates the whole thing instead of clipping at 500.
|
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request moves the agent:end hook emission to occur after the response is finalized, ensuring hooks receive the final user-visible text. It also introduces a configurable response length cap via the HERMES_HOOK_RESPONSE_MAX_CHARS environment variable and adds logic to strip transport markers. Review feedback suggests using the existing BasePlatformAdapter.extract_media method for more robust marker stripping and updating the session_id in the hook payload to prevent reporting stale IDs after session compression.
| hook_response = re.sub( | ||
| r"(?m)^\[\[audio_as_voice\]\]\s*$|^MEDIA:\S+\s*$", | ||
| "", | ||
| response or "", | ||
| ).strip() |
There was a problem hiding this comment.
The regex used to strip transport markers is fragile and inconsistent with the gateway's core stripping logic. It only matches markers on their own lines and fails if paths contain spaces (since \S+ stops at whitespace). Using the existing BasePlatformAdapter.extract_media method ensures the hook sees exactly what the user sees and handles complex path patterns correctly.
_, hook_response = BasePlatformAdapter.extract_media(response or "")
hook_response = hook_response.strip()| await self.hooks.emit("agent:end", { | ||
| **hook_ctx, | ||
| "response": hook_response, | ||
| }) |
There was a problem hiding this comment.
The hook_ctx dictionary contains a stale session_id if context compression occurred during the agent run, as session_entry.session_id is updated at line 4063 but hook_ctx was created at line 3975. The agent:end hook should report the final session ID.
| await self.hooks.emit("agent:end", { | |
| **hook_ctx, | |
| "response": hook_response, | |
| }) | |
| await self.hooks.emit("agent:end", { | |
| **hook_ctx, | |
| "session_id": session_entry.session_id, | |
| "response": hook_response, | |
| }) |
…on_id Per Gemini review on PR #3. Two improvements: 1. Transport-marker stripping now uses BasePlatformAdapter.extract_media() — the same helper the gateway's downstream send path uses. Handles paths with spaces (the regex broke on \S+) and stays consistent with other extraction sites (gateway/run.py:5639, 5866, 6051). 2. session_id in the hook payload is now read from session_entry.session_id at emit time, not from the potentially-stale hook_ctx built earlier in the turn. When context compression rotates the session id mid-run, hooks now see the final id the user's subsequent messages will land on. Also keeps the HERMES_HOOK_RESPONSE_MAX_CHARS cap unchanged.
Two tandem improvements to the agent:start hook context, same shape as the agent:end fix that landed in #3: 1. Drop the 500-char cap on the message field. The prior truncation silently clipped long user messages before hooks could see them. Hooks that genuinely need the full text (session-replay, cross-platform mirrors, narration, audit logs) had no escape hatch. Now capped only if HERMES_HOOK_MESSAGE_MAX_CHARS is explicitly set; 0 / missing = no cap. Matches the HERMES_HOOK_RESPONSE_MAX_CHARS knob from #3 so operators can size both caps in one place. 2. Add three routing fields to the context: chat_id — source.chat_id (room id on Matrix, channel id on Discord, etc.) thread_id — source.thread_id (optional) message_event_id — event.message_id (Matrix event_id / Telegram message_id / Discord message id) Hooks that want to construct a reply relation (Matrix's m.in_reply_to, Telegram's reply_to_message_id, etc.) now have everything they need without having to rummage through the source/event objects or parse stringified forms. Motivated by the matrix-session-watchdog downstream hook which needs event_id to post tap-to-resume nudges for interrupted Matrix sessions. Without it, the nudge ships as a standalone message and Ryan has to copy-paste the quoted text rather than swipe-reply in Element.
Two tandem improvements to the agent:start hook context, same shape as the agent:end fix that landed in #3: 1. Drop the 500-char cap on the message field. The prior truncation silently clipped long user messages before hooks could see them. Hooks that genuinely need the full text (session-replay, cross-platform mirrors, narration, audit logs) had no escape hatch. Now capped only if HERMES_HOOK_MESSAGE_MAX_CHARS is explicitly set; 0 / missing = no cap. Matches the HERMES_HOOK_RESPONSE_MAX_CHARS knob from #3 so operators can size both caps in one place. 2. Add three routing fields to the context: chat_id — source.chat_id (room id on Matrix, channel id on Discord, etc.) thread_id — source.thread_id (optional) message_event_id — event.message_id (Matrix event_id / Telegram message_id / Discord message id) Hooks that want to construct a reply relation (Matrix's m.in_reply_to, Telegram's reply_to_message_id, etc.) now have everything they need without having to rummage through the source/event objects or parse stringified forms. Motivated by the matrix-session-watchdog downstream hook which needs event_id to post tap-to-resume nudges for interrupted Matrix sessions. Without it, the nudge ships as a standalone message and Ryan has to copy-paste the quoted text rather than swipe-reply in Element.
## Summary - Forwards chat-completions `timeout` into the Codex Responses stream call. - Adds total elapsed-time enforcement while the Responses stream is still yielding events. - Closes the underlying client on timeout to unblock stalled streams, then raises `TimeoutError`. - Adds focused tests for timeout forwarding and total timeout enforcement. ## Why The Codex auxiliary adapter can be used by non-interactive auxiliary work such as context compression. If the stream keeps yielding progress-like events but never completes, SDK socket/read timeouts do not necessarily protect the full operation. This makes the CLI look stuck until the user force-interrupts the whole session. This is a refreshed upstream-ready version of the earlier fork fix around `d3f08e9a0` / PR #3. ## Verification - `python -m py_compile agent/auxiliary_client.py tests/agent/test_auxiliary_client.py` - `python -m pytest -o addopts='' tests/agent/test_auxiliary_client.py::TestCodexAuxiliaryAdapterTimeout -q` - `git diff --check`
## Summary - Forwards chat-completions `timeout` into the Codex Responses stream call. - Adds total elapsed-time enforcement while the Responses stream is still yielding events. - Closes the underlying client on timeout to unblock stalled streams, then raises `TimeoutError`. - Adds focused tests for timeout forwarding and total timeout enforcement. ## Why The Codex auxiliary adapter can be used by non-interactive auxiliary work such as context compression. If the stream keeps yielding progress-like events but never completes, SDK socket/read timeouts do not necessarily protect the full operation. This makes the CLI look stuck until the user force-interrupts the whole session. This is a refreshed upstream-ready version of the earlier fork fix around `d3f08e9a0` / PR #3. ## Verification - `python -m py_compile agent/auxiliary_client.py tests/agent/test_auxiliary_client.py` - `python -m pytest -o addopts='' tests/agent/test_auxiliary_client.py::TestCodexAuxiliaryAdapterTimeout -q` - `git diff --check`
- memory_setup.py: use shlex.split() for plugin dep checks instead of shell=True - transcription_tools.py: avoid shell=True for auto-detected whisper commands (user-provided templates via env var still use shell=True for compatibility) - cli.py: add comment clarifying intentional shell=True for user quick_commands - Add test verifying auto-detected template is shlex-safe Addresses CONTRIBUTING.md Priority #3 (Security hardening — shell injection).
…rch#25071) * tui: make URLs clickable + hover-highlight in any terminal Problem ------- URLs printed by `hermes --tui` were not clickable in basic macOS Terminal.app. Cmd+click did nothing, the cursor didn't change shape — like nothing was detected — even though arrow buttons and other Box onClick handlers worked fine. Root cause ---------- Two layers of dead plumbing: 1. `<Link>` only emitted the underlying `<ink-link>` (which carries the hyperlink metadata into the screen buffer) when `supportsHyperlinks()` said yes. On Apple_Terminal that's false, so the per-cell hyperlink field stayed empty, so `Ink.getHyperlinkAt()` had nothing to return on click. The visible underline was just decorative. 2. `Ink.openHyperlink()` calls `this.onHyperlinkClick?.(url)`, but `onHyperlinkClick` was never assigned anywhere in the codebase. The click pipeline (`App.tsx → onOpenHyperlink → Ink.openHyperlink`) ran but bailed silently on the optional chain. Bonus discovery: even when wired up, there was no hover affordance — terminal apps can't change the system mouse cursor, so users had no visual signal that a cell was clickable. Arrow buttons in the chrome worked because they had explicit `<Box onClick>` styling; inline link URLs didn't. Fix --- - `Link.tsx`: always emit `<ink-link>` regardless of terminal capability. The renderer's `wrapWithOsc8Link` already gates the actual OSC 8 escape on `supportsHyperlinks()` further down — so terminals that don't understand OSC 8 still don't see the escape, but the screen-buffer metadata (which the click dispatcher reads) is now populated everywhere. - `ink.tsx + root.ts`: add `onHyperlinkClick?: (url: string) => void` to `Options` / `RenderOptions`, wire it to the existing `Ink.onHyperlinkClick` field in the constructor. - `src/lib/openExternalUrl.ts`: small platform-aware opener using `child_process.spawn` with arg-array (no shell) — http(s) only, rejects `file:`, `javascript:`, `data:`, etc., so a hostile model can't trigger arbitrary local handlers via `<Link url="file:///...">`. Detached + stdio ignore so closing the TUI doesn't kill the browser and Chrome stderr doesn't leak into the alt screen. - `entry.tsx`: pass `onHyperlinkClick: openExternalUrl` to `ink.render`. - `hyperlinkHover.ts` + Ink hover wiring: track the URL under the pointer in `Ink.hoveredHyperlink`, update it from `dispatchHover`, and inverse- highlight every cell of the matching link in the render-pass overlay (same pattern as `applySearchHighlight`). This is the cursor-hover affordance for clickable links — terminals don't expose cursor shape, so we light up the link itself. - `types/hermes-ink.d.ts`: add `onHyperlinkClick` to the `RenderOptions` shim so consumers (`entry.tsx`) type-check against the new option. Tests ----- - `src/lib/openExternalUrl.test.ts` (15 cases): http(s) accepted; file/js/ data/mailto/ftp/ssh rejected; macOS open(1), Windows cmd.exe start with empty title slot, Linux xdg-open dispatch; shell-metacharacter URLs pass through unmolested as a single argv element; synchronous spawn failure returns false. Verified empirically in Apple Terminal 455.1 (macOS 15.7.3): clicking a URL opens in default browser, hovering inverts the link cells, and moving away clears the highlight. Full TUI suite: 713 passing, 0 type errors. Reverts ------- The earlier attempt that version-gated Apple_Terminal in `supports-hyperlinks.ts` was based on a wrong assumption — Terminal.app silently strips OSC 8 sequences but does not render them as clickable hyperlinks. Reverted to the original allowlist. * tui: address Copilot review — explorer.exe on win32 + comment fixes - openExternalUrl: switch win32 from `cmd.exe /c start` to `explorer.exe`. cmd.exe's `start` builtin reparses the URL through cmd's tokenizer, so `&`, `|`, `^`, `<`, `>` either split the command or get reinterpreted — breaking both the protocol-allowlist safety story AND plain http(s) URLs with `&` in query strings. `explorer.exe <url>` invokes the registered protocol handler directly with no shell. - openExternalUrl.test.ts: rename the win32 test to reflect the new contract and add two regression tests — one with `&|^<>` metachars, one with the common analytics-URL `&` query-param pattern — both pinned to single-argv-element delivery via explorer.exe. - Link.tsx: fix misleading comment. OSC 8 escapes are emitted unconditionally by the renderer (`wrapWithOsc8Link` in render-node-to-output.ts, `oscLink` in log-update.ts). Non-supporting terminals silently strip the sequence, which is why hover/click affordance has to come from the in-process overlay rather than the terminal's own link rendering. Verified: 715/715 tests pass, type-check + build clean. * tui: address Copilot review #2 — async spawn errors + hover scope + docs 1. openExternalUrl: attach a no-op `'error'` listener on the spawned child BEFORE unref(). spawn() returns a ChildProcess synchronously even when the binary is missing (ENOENT on xdg-open / explorer.exe), unreachable, or otherwise unusable; the failure surfaces later as an 'error' event. An unhandled 'error' on an EventEmitter crashes Node, which would tear down the whole TUI. The listener is a deliberate no-op — we already returned `true` synchronously and the user just doesn't see the browser pop. 2. openExternalUrl.test.ts: add a regression test using a real EventEmitter to simulate the async-error path. Pins both the listener-attached contract and the "doesn't throw on emit" behavior. Was 17/17, now 18/18. 3. ink.tsx dispatchHover: bypass `getHyperlinkAt()` and read `cellAt(...).hyperlink` directly. `getHyperlinkAt` falls back to `findPlainTextUrlAt` for cells without an OSC 8 hyperlink, but the render-pass overlay (`applyHyperlinkHoverHighlight`) only matches on `cell.hyperlink === hoveredUrl` — so plain-text URLs would burn re-renders without ever producing the highlight. Hover is now a strictly 1:1 fit for what the overlay can paint. Plain-text URLs still get the click action via the existing dispatch path. 4. root.ts + ink.tsx doc comments: replace the misleading "typically `open` / `xdg-open` / `start` shell" wording with the actual safe recipe — argv-array spawn into `open` / `xdg-open` / `explorer.exe`, with an explicit warning that `cmd.exe /c start` reparses the URL through cmd's tokenizer and is unsafe + breaks `&`-query URLs. Verified: 716/716 tests pass, type-check + build clean. * tui: address Copilot review #3 — hover damage, alt-screen cleanup, opener allowlist 1. ink.tsx onRender: stop folding steady-state hover into hlActive. hlActive forces a full-screen damage diff so previous-frame inverted cells get re-emitted when the highlight set changes. The transition IS the trigger — enter / leave / change-to-other-link. While the pointer just sits on a link the painted cells don't change and the per-cell diff handles the no-op. Folding the steady state in would burn a full-screen diff on every frame. Added a lastRenderedHoveredHyperlink tracker and gate the hlActive bump on `hovered !== lastRendered`. 2. ink.tsx setAltScreenActive: clear hoveredHyperlink (and the tracker) when toggling alt-screen state. Hover dispatch is alt-screen-gated, so once we leave there's no path to clear it. Without this, remounting <AlternateScreen> would paint a phantom hover from the previous session until the next mouse-move arrived. 3. openExternalUrl.ts openCommand: allowlist linux + the BSD family for xdg-open and return null for everything else (aix, sunos, cygwin, haiku, etc.). Previously the default-fallback always returned xdg-open, which made the caller's `if (!command) return false` dead and yielded a misleading `true` on platforms that probably don't have xdg-open. New tests cover the null path AND the openExternalUrl-returns-false-without-spawning behavior. Verified: 718/718 tests pass, type-check + build clean. * tui: address Copilot review #4 — doc comment accuracy 1. openExternalUrl return-value doc: now lists all three false paths (URL rejected / no opener for platform / synchronous spawn throw) plus a note that async 'error' events still return true because the spawn was attempted. 2. ink.tsx onHyperlinkClick field doc: clarifies the callback receives either an OSC 8 hyperlink OR a plain-text URL detected by findPlainTextUrlAt — App.tsx routes both into the same callback. 3. hyperlinkHover applyHyperlinkHoverHighlight doc: drops the misleading 'caller forces full-frame damage' promise. Caller decides; for hover the current caller only forces full damage on transitions. No behavior change. 718/718 tests pass. * tui: address Copilot review #5 — lint fixes 1. ink.tsx: reorder `./hyperlinkHover.js` import before `./screen.js` to satisfy perfectionist/sort-imports. 2. Link.tsx: drop unused `fallback` parameter destructuring + the trailing `void (null as ...)` dead-statement (would trip no-unused-expressions). Kept `fallback?: ReactNode` on the Props interface as a documented compat shim so existing call sites still compile, with a comment explaining why it's no longer wired up. 3. openExternalUrl.test.ts: replace `typeof import('node:child_process').spawn` inline annotations (forbidden by @typescript-eslint/consistent-type-imports) with a `SpawnLike` type alias backed by a real `import type { spawn as SpawnFn }`. No behavior change. 718/718 tests pass, type-check clean, lint clean on all modified files.
Three issues flagged by the Copilot review on this PR: 1. Double JSON emit on stage failure (Copilot #1, #2). When -Stage <name> ran a worker that threw, Invoke-Stage's finally emitted a JSON result frame AND the entry-point catch emitted a second error frame -- producing two concatenated JSON objects on stdout and breaking the one-line-per-invocation contract that drivers parse against. Same issue applied to -Json mode on a full install (every stage's finally plus a final error frame missing duration_ms/skipped). Fix: Invoke-Stage's finally now sets $script:_StageEmittedErrorFrame when it emits a failure frame; the entry-point catch checks the flag and skips its own emit, still exit 1. 2. $prevEAP uninitialized on early try-block throw (Copilot #3). In Install-Uv, Test-Python, Test-Node's winget fallback, _Run-NpmInstall, and the playwright block, '$prevEAP = $ErrorActionPreference' lived as the first statement INSIDE the try. If anything between 'try {' and that line threw (Write-Info on an unusual host, the npx-finding loop, etc.), the catch's 'if ($prevEAP) { ... }' restore was a no-op and EAP could remain relaxed. Fix: hoist '$prevEAP = $ErrorActionPreference' to the line immediately before 'try {' in all five sites. Catch's restore is now always meaningful regardless of where in the try the throw originated. No change to Invoke-Stage's success path or to the four lint-clean EAP sites (Test-Node was the only winget-related catch). All 19 metadata smoke tests still pass.
Removes the 500-char cap on the
agent:endhook'sresponsefield and moves the emit below the compression-exhaustion post-processing so hooks always see the final, user-visible text.Problem
gateway/run.pytruncatedresponseto 500 chars before emittingagent:end:Two consequences:
Hooks that need the full reply can't get it. Narration bridges, cross-session mirrors, dashboards — all received a 500-char preview with no opt-out. Observed today on a 970-char Matrix reply: a new TTS narration hook got the first 500 chars and Ryan heard a clipped thought.
Stale text on compression exhaustion. The emit fires before the compression-exhaustion block that appends an auto-reset notice to
response. Rare but real path where hooks mirror text the user never sees.Fix
response.HERMES_HOOK_RESPONSE_MAX_CHARS. Default 0 = no cap. Hooks that want a preview trim on their side.[[audio_as_voice]],MEDIA:<path>) before the hook sees the response so consumers render user-visible text only.Test plan
python3 -m py_compile gateway/run.pypasses