Skip to content

fix(agent:end hook): emit full response after post-processing#3

Merged
rmulligan merged 2 commits into
mainfrom
fix/agent-end-response-full
Apr 18, 2026
Merged

fix(agent:end hook): emit full response after post-processing#3
rmulligan merged 2 commits into
mainfrom
fix/agent-end-response-full

Conversation

@rmulligan

Copy link
Copy Markdown
Owner

Removes the 500-char cap on the agent:end hook's response field and moves the emit below the compression-exhaustion post-processing so hooks always see the final, user-visible text.

Problem

gateway/run.py truncated response to 500 chars before emitting agent:end:

await self.hooks.emit("agent:end", {
    **hook_ctx,
    "response": (response or "")[:500],
})

Two consequences:

  1. Hooks that need the full reply can't get it. Narration bridges, cross-session mirrors, dashboards — all received a 500-char preview with no opt-out. Observed today on a 970-char Matrix reply: a new TTS narration hook got the first 500 chars and Ryan heard a clipped thought.

  2. Stale text on compression exhaustion. The emit fires before the compression-exhaustion block that appends an auto-reset notice to response. Rare but real path where hooks mirror text the user never sees.

Fix

  • Move the emit below the post-processing so it always fires against the final response.
  • Replace the hard 500-char cap with an env knob: HERMES_HOOK_RESPONSE_MAX_CHARS. Default 0 = no cap. Hooks that want a preview trim on their side.
  • Strip transport-only markers ([[audio_as_voice]], MEDIA:<path>) before the hook sees the response so consumers render user-visible text only.

Test plan

  • python3 -m py_compile gateway/run.py passes
  • CodeRabbit CLI: 2 rounds of findings addressed, final pass "No findings"
  • Runtime verification: trigger agent:end for a >500-char Matrix reply and confirm the narration bridge receives the full text (pending gateway restart on target system)
  • Compression-exhaustion path: confirm the auto-reset notice is included in the hook payload

Two problems with the previous agent:end emit:

1. The response payload was hard-capped at 500 chars before
   reaching hooks. Downstream hooks that want the full reply
   (TTS narration, cross-session bridges, dashboards) had no
   way around it — 500 chars clips mid-sentence on almost any
   substantive answer. Observed today on a 970-char Matrix
   reply: the narration bridge only got the first 500 chars
   and Ryan heard a truncated thought.

2. The emit fired BEFORE the compression-exhaustion
   post-processing that appends an auto-reset notice to
   `response`. In that code path (rare but real), hooks
   received stale text while the user saw response+notice.

Fix:

- Move the emit below the compression-exhaustion block so
  hooks always see the final user-visible response.
- Replace the 500-char hard cap with an env-tunable knob,
  `HERMES_HOOK_RESPONSE_MAX_CHARS`. Default 0 (no cap) — ship
  the whole reply. Hooks that want a preview trim on their
  side; hooks that need the whole thing (TTS) just work.
- Strip transport-only markers (`[[audio_as_voice]]` and
  `MEDIA:<path>` lines) from the response before the hook
  sees it, so hooks render user-visible text only.

Verified: a 970-char Matrix reply now flows to the
matrix-narration-bridge hook in full; TTS narrates the whole
thing instead of clipping at 500.
@coderabbitai

coderabbitai Bot commented Apr 18, 2026

Copy link
Copy Markdown

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 5195c768-9f6c-458c-9878-f1fd4f686c22

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/agent-end-response-full

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request moves the agent:end hook emission to occur after the response is finalized, ensuring hooks receive the final user-visible text. It also introduces a configurable response length cap via the HERMES_HOOK_RESPONSE_MAX_CHARS environment variable and adds logic to strip transport markers. Review feedback suggests using the existing BasePlatformAdapter.extract_media method for more robust marker stripping and updating the session_id in the hook payload to prevent reporting stale IDs after session compression.

Comment thread gateway/run.py Outdated
Comment on lines +4180 to +4184
hook_response = re.sub(
r"(?m)^\[\[audio_as_voice\]\]\s*$|^MEDIA:\S+\s*$",
"",
response or "",
).strip()

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The regex used to strip transport markers is fragile and inconsistent with the gateway's core stripping logic. It only matches markers on their own lines and fails if paths contain spaces (since \S+ stops at whitespace). Using the existing BasePlatformAdapter.extract_media method ensures the hook sees exactly what the user sees and handles complex path patterns correctly.

            _, hook_response = BasePlatformAdapter.extract_media(response or "")
            hook_response = hook_response.strip()

Comment thread gateway/run.py
Comment on lines +4191 to +4194
await self.hooks.emit("agent:end", {
**hook_ctx,
"response": hook_response,
})

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The hook_ctx dictionary contains a stale session_id if context compression occurred during the agent run, as session_entry.session_id is updated at line 4063 but hook_ctx was created at line 3975. The agent:end hook should report the final session ID.

Suggested change
await self.hooks.emit("agent:end", {
**hook_ctx,
"response": hook_response,
})
await self.hooks.emit("agent:end", {
**hook_ctx,
"session_id": session_entry.session_id,
"response": hook_response,
})

…on_id

Per Gemini review on PR #3. Two improvements:

1. Transport-marker stripping now uses
   BasePlatformAdapter.extract_media() — the same helper the
   gateway's downstream send path uses. Handles paths with
   spaces (the regex broke on \S+) and stays consistent with
   other extraction sites (gateway/run.py:5639, 5866, 6051).

2. session_id in the hook payload is now read from
   session_entry.session_id at emit time, not from the
   potentially-stale hook_ctx built earlier in the turn. When
   context compression rotates the session id mid-run, hooks
   now see the final id the user's subsequent messages will
   land on.

Also keeps the HERMES_HOOK_RESPONSE_MAX_CHARS cap unchanged.
@rmulligan rmulligan merged commit 16cf7db into main Apr 18, 2026
1 check passed
@rmulligan rmulligan deleted the fix/agent-end-response-full branch April 18, 2026 20:01
rmulligan added a commit that referenced this pull request Apr 18, 2026
Two tandem improvements to the agent:start hook context, same
shape as the agent:end fix that landed in #3:

1. Drop the 500-char cap on the message field. The prior
   truncation silently clipped long user messages before hooks
   could see them. Hooks that genuinely need the full text
   (session-replay, cross-platform mirrors, narration, audit
   logs) had no escape hatch. Now capped only if
   HERMES_HOOK_MESSAGE_MAX_CHARS is explicitly set; 0 / missing
   = no cap. Matches the HERMES_HOOK_RESPONSE_MAX_CHARS knob
   from #3 so operators can size both caps in one place.

2. Add three routing fields to the context:
     chat_id          — source.chat_id (room id on Matrix,
                        channel id on Discord, etc.)
     thread_id        — source.thread_id (optional)
     message_event_id — event.message_id (Matrix event_id /
                        Telegram message_id / Discord message id)

   Hooks that want to construct a reply relation (Matrix's
   m.in_reply_to, Telegram's reply_to_message_id, etc.) now
   have everything they need without having to rummage through
   the source/event objects or parse stringified forms.

Motivated by the matrix-session-watchdog downstream hook which
needs event_id to post tap-to-resume nudges for interrupted
Matrix sessions. Without it, the nudge ships as a standalone
message and Ryan has to copy-paste the quoted text rather than
swipe-reply in Element.
rmulligan added a commit that referenced this pull request Apr 18, 2026
Two tandem improvements to the agent:start hook context, same
shape as the agent:end fix that landed in #3:

1. Drop the 500-char cap on the message field. The prior
   truncation silently clipped long user messages before hooks
   could see them. Hooks that genuinely need the full text
   (session-replay, cross-platform mirrors, narration, audit
   logs) had no escape hatch. Now capped only if
   HERMES_HOOK_MESSAGE_MAX_CHARS is explicitly set; 0 / missing
   = no cap. Matches the HERMES_HOOK_RESPONSE_MAX_CHARS knob
   from #3 so operators can size both caps in one place.

2. Add three routing fields to the context:
     chat_id          — source.chat_id (room id on Matrix,
                        channel id on Discord, etc.)
     thread_id        — source.thread_id (optional)
     message_event_id — event.message_id (Matrix event_id /
                        Telegram message_id / Discord message id)

   Hooks that want to construct a reply relation (Matrix's
   m.in_reply_to, Telegram's reply_to_message_id, etc.) now
   have everything they need without having to rummage through
   the source/event objects or parse stringified forms.

Motivated by the matrix-session-watchdog downstream hook which
needs event_id to post tap-to-resume nudges for interrupted
Matrix sessions. Without it, the nudge ships as a standalone
message and Ryan has to copy-paste the quoted text rather than
swipe-reply in Element.
rmulligan pushed a commit that referenced this pull request May 11, 2026
## Summary
- Forwards chat-completions `timeout` into the Codex Responses stream call.
- Adds total elapsed-time enforcement while the Responses stream is still yielding events.
- Closes the underlying client on timeout to unblock stalled streams, then raises `TimeoutError`.
- Adds focused tests for timeout forwarding and total timeout enforcement.

## Why
The Codex auxiliary adapter can be used by non-interactive auxiliary work such as context compression. If the stream keeps yielding progress-like events but never completes, SDK socket/read timeouts do not necessarily protect the full operation. This makes the CLI look stuck until the user force-interrupts the whole session.

This is a refreshed upstream-ready version of the earlier fork fix around `d3f08e9a0` / PR #3.

## Verification
- `python -m py_compile agent/auxiliary_client.py tests/agent/test_auxiliary_client.py`
- `python -m pytest -o addopts='' tests/agent/test_auxiliary_client.py::TestCodexAuxiliaryAdapterTimeout -q`
- `git diff --check`
rmulligan pushed a commit that referenced this pull request May 11, 2026
## Summary
- Forwards chat-completions `timeout` into the Codex Responses stream call.
- Adds total elapsed-time enforcement while the Responses stream is still yielding events.
- Closes the underlying client on timeout to unblock stalled streams, then raises `TimeoutError`.
- Adds focused tests for timeout forwarding and total timeout enforcement.

## Why
The Codex auxiliary adapter can be used by non-interactive auxiliary work such as context compression. If the stream keeps yielding progress-like events but never completes, SDK socket/read timeouts do not necessarily protect the full operation. This makes the CLI look stuck until the user force-interrupts the whole session.

This is a refreshed upstream-ready version of the earlier fork fix around `d3f08e9a0` / PR #3.

## Verification
- `python -m py_compile agent/auxiliary_client.py tests/agent/test_auxiliary_client.py`
- `python -m pytest -o addopts='' tests/agent/test_auxiliary_client.py::TestCodexAuxiliaryAdapterTimeout -q`
- `git diff --check`
rmulligan pushed a commit that referenced this pull request May 17, 2026
- memory_setup.py: use shlex.split() for plugin dep checks instead of shell=True
- transcription_tools.py: avoid shell=True for auto-detected whisper commands
  (user-provided templates via env var still use shell=True for compatibility)
- cli.py: add comment clarifying intentional shell=True for user quick_commands
- Add test verifying auto-detected template is shlex-safe

Addresses CONTRIBUTING.md Priority #3 (Security hardening — shell injection).
rmulligan pushed a commit that referenced this pull request May 17, 2026
…rch#25071)

* tui: make URLs clickable + hover-highlight in any terminal

Problem
-------
URLs printed by `hermes --tui` were not clickable in basic macOS Terminal.app.
Cmd+click did nothing, the cursor didn't change shape — like nothing was
detected — even though arrow buttons and other Box onClick handlers worked
fine.

Root cause
----------
Two layers of dead plumbing:

1. `<Link>` only emitted the underlying `<ink-link>` (which carries the
   hyperlink metadata into the screen buffer) when `supportsHyperlinks()`
   said yes. On Apple_Terminal that's false, so the per-cell hyperlink
   field stayed empty, so `Ink.getHyperlinkAt()` had nothing to return on
   click. The visible underline was just decorative.

2. `Ink.openHyperlink()` calls `this.onHyperlinkClick?.(url)`, but
   `onHyperlinkClick` was never assigned anywhere in the codebase. The
   click pipeline (`App.tsx → onOpenHyperlink → Ink.openHyperlink`) ran
   but bailed silently on the optional chain.

Bonus discovery: even when wired up, there was no hover affordance —
terminal apps can't change the system mouse cursor, so users had no
visual signal that a cell was clickable. Arrow buttons in the chrome
worked because they had explicit `<Box onClick>` styling; inline link
URLs didn't.

Fix
---
- `Link.tsx`: always emit `<ink-link>` regardless of terminal capability.
  The renderer's `wrapWithOsc8Link` already gates the actual OSC 8 escape
  on `supportsHyperlinks()` further down — so terminals that don't
  understand OSC 8 still don't see the escape, but the screen-buffer
  metadata (which the click dispatcher reads) is now populated everywhere.

- `ink.tsx + root.ts`: add `onHyperlinkClick?: (url: string) => void` to
  `Options` / `RenderOptions`, wire it to the existing `Ink.onHyperlinkClick`
  field in the constructor.

- `src/lib/openExternalUrl.ts`: small platform-aware opener using
  `child_process.spawn` with arg-array (no shell) — http(s) only, rejects
  `file:`, `javascript:`, `data:`, etc., so a hostile model can't trigger
  arbitrary local handlers via `<Link url="file:///...">`. Detached + stdio
  ignore so closing the TUI doesn't kill the browser and Chrome stderr
  doesn't leak into the alt screen.

- `entry.tsx`: pass `onHyperlinkClick: openExternalUrl` to `ink.render`.

- `hyperlinkHover.ts` + Ink hover wiring: track the URL under the pointer
  in `Ink.hoveredHyperlink`, update it from `dispatchHover`, and inverse-
  highlight every cell of the matching link in the render-pass overlay
  (same pattern as `applySearchHighlight`). This is the cursor-hover
  affordance for clickable links — terminals don't expose cursor shape,
  so we light up the link itself.

- `types/hermes-ink.d.ts`: add `onHyperlinkClick` to the `RenderOptions`
  shim so consumers (`entry.tsx`) type-check against the new option.

Tests
-----
- `src/lib/openExternalUrl.test.ts` (15 cases): http(s) accepted; file/js/
  data/mailto/ftp/ssh rejected; macOS open(1), Windows cmd.exe start with
  empty title slot, Linux xdg-open dispatch; shell-metacharacter URLs
  pass through unmolested as a single argv element; synchronous spawn
  failure returns false.

Verified empirically in Apple Terminal 455.1 (macOS 15.7.3): clicking a
URL opens in default browser, hovering inverts the link cells, and
moving away clears the highlight. Full TUI suite: 713 passing, 0
type errors.

Reverts
-------
The earlier attempt that version-gated Apple_Terminal in
`supports-hyperlinks.ts` was based on a wrong assumption — Terminal.app
silently strips OSC 8 sequences but does not render them as clickable
hyperlinks. Reverted to the original allowlist.

* tui: address Copilot review — explorer.exe on win32 + comment fixes

- openExternalUrl: switch win32 from `cmd.exe /c start` to `explorer.exe`.
  cmd.exe's `start` builtin reparses the URL through cmd's tokenizer, so
  `&`, `|`, `^`, `<`, `>` either split the command or get reinterpreted —
  breaking both the protocol-allowlist safety story AND plain http(s) URLs
  with `&` in query strings. `explorer.exe <url>` invokes the registered
  protocol handler directly with no shell.

- openExternalUrl.test.ts: rename the win32 test to reflect the new
  contract and add two regression tests — one with `&|^<>` metachars,
  one with the common analytics-URL `&` query-param pattern — both pinned
  to single-argv-element delivery via explorer.exe.

- Link.tsx: fix misleading comment. OSC 8 escapes are emitted
  unconditionally by the renderer (`wrapWithOsc8Link` in
  render-node-to-output.ts, `oscLink` in log-update.ts). Non-supporting
  terminals silently strip the sequence, which is why hover/click
  affordance has to come from the in-process overlay rather than the
  terminal's own link rendering.

Verified: 715/715 tests pass, type-check + build clean.

* tui: address Copilot review #2 — async spawn errors + hover scope + docs

1. openExternalUrl: attach a no-op `'error'` listener on the spawned
   child BEFORE unref(). spawn() returns a ChildProcess synchronously
   even when the binary is missing (ENOENT on xdg-open / explorer.exe),
   unreachable, or otherwise unusable; the failure surfaces later as
   an 'error' event. An unhandled 'error' on an EventEmitter crashes
   Node, which would tear down the whole TUI. The listener is a
   deliberate no-op — we already returned `true` synchronously and the
   user just doesn't see the browser pop.

2. openExternalUrl.test.ts: add a regression test using a real
   EventEmitter to simulate the async-error path. Pins both the
   listener-attached contract and the "doesn't throw on emit" behavior.
   Was 17/17, now 18/18.

3. ink.tsx dispatchHover: bypass `getHyperlinkAt()` and read
   `cellAt(...).hyperlink` directly. `getHyperlinkAt` falls back to
   `findPlainTextUrlAt` for cells without an OSC 8 hyperlink, but the
   render-pass overlay (`applyHyperlinkHoverHighlight`) only matches on
   `cell.hyperlink === hoveredUrl` — so plain-text URLs would burn
   re-renders without ever producing the highlight. Hover is now a
   strictly 1:1 fit for what the overlay can paint. Plain-text URLs
   still get the click action via the existing dispatch path.

4. root.ts + ink.tsx doc comments: replace the misleading "typically
   `open` / `xdg-open` / `start` shell" wording with the actual safe
   recipe — argv-array spawn into `open` / `xdg-open` / `explorer.exe`,
   with an explicit warning that `cmd.exe /c start` reparses the URL
   through cmd's tokenizer and is unsafe + breaks `&`-query URLs.

Verified: 716/716 tests pass, type-check + build clean.

* tui: address Copilot review #3 — hover damage, alt-screen cleanup, opener allowlist

1. ink.tsx onRender: stop folding steady-state hover into hlActive.
   hlActive forces a full-screen damage diff so previous-frame inverted
   cells get re-emitted when the highlight set changes. The transition
   IS the trigger — enter / leave / change-to-other-link. While the
   pointer just sits on a link the painted cells don't change and the
   per-cell diff handles the no-op. Folding the steady state in would
   burn a full-screen diff on every frame. Added a
   lastRenderedHoveredHyperlink tracker and gate the hlActive bump on
   `hovered !== lastRendered`.

2. ink.tsx setAltScreenActive: clear hoveredHyperlink (and the tracker)
   when toggling alt-screen state. Hover dispatch is alt-screen-gated,
   so once we leave there's no path to clear it. Without this, remounting
   <AlternateScreen> would paint a phantom hover from the previous
   session until the next mouse-move arrived.

3. openExternalUrl.ts openCommand: allowlist linux + the BSD family for
   xdg-open and return null for everything else (aix, sunos, cygwin,
   haiku, etc.). Previously the default-fallback always returned
   xdg-open, which made the caller's `if (!command) return false` dead
   and yielded a misleading `true` on platforms that probably don't
   have xdg-open. New tests cover the null path AND the
   openExternalUrl-returns-false-without-spawning behavior.

Verified: 718/718 tests pass, type-check + build clean.

* tui: address Copilot review #4 — doc comment accuracy

1. openExternalUrl return-value doc: now lists all three false paths
   (URL rejected / no opener for platform / synchronous spawn throw)
   plus a note that async 'error' events still return true because the
   spawn was attempted.

2. ink.tsx onHyperlinkClick field doc: clarifies the callback receives
   either an OSC 8 hyperlink OR a plain-text URL detected by
   findPlainTextUrlAt — App.tsx routes both into the same callback.

3. hyperlinkHover applyHyperlinkHoverHighlight doc: drops the misleading
   'caller forces full-frame damage' promise. Caller decides; for hover
   the current caller only forces full damage on transitions.

No behavior change. 718/718 tests pass.

* tui: address Copilot review #5 — lint fixes

1. ink.tsx: reorder `./hyperlinkHover.js` import before `./screen.js` to
   satisfy perfectionist/sort-imports.

2. Link.tsx: drop unused `fallback` parameter destructuring + the
   trailing `void (null as ...)` dead-statement (would trip
   no-unused-expressions). Kept `fallback?: ReactNode` on the Props
   interface as a documented compat shim so existing call sites still
   compile, with a comment explaining why it's no longer wired up.

3. openExternalUrl.test.ts: replace `typeof import('node:child_process').spawn`
   inline annotations (forbidden by @typescript-eslint/consistent-type-imports)
   with a `SpawnLike` type alias backed by a real `import type { spawn as SpawnFn }`.

No behavior change. 718/718 tests pass, type-check clean, lint clean on
all modified files.
rmulligan pushed a commit that referenced this pull request May 17, 2026
Three issues flagged by the Copilot review on this PR:

1. Double JSON emit on stage failure (Copilot #1, #2). When -Stage <name>
   ran a worker that threw, Invoke-Stage's finally emitted a JSON result
   frame AND the entry-point catch emitted a second error frame --
   producing two concatenated JSON objects on stdout and breaking the
   one-line-per-invocation contract that drivers parse against. Same
   issue applied to -Json mode on a full install (every stage's finally
   plus a final error frame missing duration_ms/skipped).

   Fix: Invoke-Stage's finally now sets $script:_StageEmittedErrorFrame
   when it emits a failure frame; the entry-point catch checks the flag
   and skips its own emit, still exit 1.

2. $prevEAP uninitialized on early try-block throw (Copilot #3). In
   Install-Uv, Test-Python, Test-Node's winget fallback,
   _Run-NpmInstall, and the playwright block, '$prevEAP =
   $ErrorActionPreference' lived as the first statement INSIDE the
   try. If anything between 'try {' and that line threw (Write-Info on
   an unusual host, the npx-finding loop, etc.), the catch's
   'if ($prevEAP) { ... }' restore was a no-op and EAP could remain
   relaxed.

   Fix: hoist '$prevEAP = $ErrorActionPreference' to the line
   immediately before 'try {' in all five sites. Catch's restore is
   now always meaningful regardless of where in the try the throw
   originated.

No change to Invoke-Stage's success path or to the four lint-clean EAP
sites (Test-Node was the only winget-related catch). All 19 metadata
smoke tests still pass.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant