
feat(hooks): add before_llm_call, after_llm_call, before_response_emit tool/content mutating hooks [claude, human developer oversight]#33916

Closed
zeroaltitude wants to merge 43 commits into openclaw:main from zeroaltitude:feat/hook-llm-call-response-emit

Conversation

@zeroaltitude
Contributor

@zeroaltitude zeroaltitude commented Mar 4, 2026

TL;DR

Adds three sequential / mutating plugin hooks that allow policy plugins to intercept and gate:

  • LLM inputs (before_llm_call)
  • Tool calls produced by the LLM (after_llm_call)
  • The final assistant response before delivery (before_response_emit)

Includes an after_llm_call "gate bridge" so tool-call blocking/filtering works without restructuring the pi-agent-core loop. No behavior changes unless a plugin registers these hooks.

New hooks (contracts)

| Hook | Fires | Can modify | Can block | Primary use |
| --- | --- | --- | --- | --- |
| before_llm_call | Before each LLM API call | messages / systemPrompt / tools | yes | Tool allowlisting, input policies |
| after_llm_call | After the LLM response, before tool execution | toolCalls | yes | Batch tool-call auditing / filtering / blocking |
| before_response_emit | When the final assistant text is selected, before delivery | content, allContent | yes | Output policies / redaction / response gating |
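As a hedged illustration of the contract (the registration API and exact type names are assumptions, not this PR's surface; the real types live in src/plugins/types.ts), a before_llm_call handler enforcing a tool allowlist might look like:

```typescript
// Illustrative shapes only -- names here are assumptions, not the PR's exact API.
type ToolDef = { name: string };
type BeforeLlmCallEvent = { messages: unknown[]; systemPrompt?: string; tools: ToolDef[] };
type BeforeLlmCallResult =
  | { tools?: ToolDef[]; block?: boolean; blockReason?: string }
  | undefined;

const ALLOWED_TOOLS = new Set(["read_file", "search"]);

// A before_llm_call handler enforcing a tool allowlist. Returning undefined
// signals "no change"; returning { tools } narrows the tool set for this call.
export function allowlistTools(event: BeforeLlmCallEvent): BeforeLlmCallResult {
  const tools = event.tools.filter((t) => ALLOWED_TOOLS.has(t.name));
  if (tools.length === event.tools.length) return undefined;
  return { tools };
}
```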

These follow the same sequential "merge results across handlers" pattern as before_tool_call / message_sending.

Architecture: after_llm_call gate bridge

after_llm_call runs from an event subscription callback, but tool execution happens elsewhere. To avoid invasive loop refactors:

  1. after_llm_call writes its decision into a per-session gate (after-llm-call-gate.ts)
  2. before_tool_call checks the gate before each tool executes (block or allowlist)
  3. Gate clears on turn_start and cleanup to prevent stale decisions
  4. Monotonic messageEndSeq counter prevents intra-turn race conditions between slow async handlers
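Under assumed names (the actual module is after-llm-call-gate.ts, which additionally tracks the monotonic messageEndSeq counter), steps 1–3 can be sketched as:

```typescript
// Minimal sketch of the gate bridge: after_llm_call writes a decision,
// before_tool_call reads it, turn_start/cleanup clears it.
type GateDecision = { block: boolean; allowedToolCallIds?: Set<string> };

const gates = new Map<string, GateDecision>(); // keyed by session id

// Written by the after_llm_call subscription callback.
export function setGate(sessionId: string, decision: GateDecision): void {
  gates.set(sessionId, decision);
}

// Checked by before_tool_call before each tool executes.
export function isToolCallAllowed(sessionId: string, toolCallId: string): boolean {
  const gate = gates.get(sessionId);
  if (!gate) return true; // no decision recorded -> allow
  if (gate.block) return false;
  return gate.allowedToolCallIds ? gate.allowedToolCallIds.has(toolCallId) : true;
}

// Cleared on turn_start and cleanup so stale decisions never leak across turns.
export function clearGate(sessionId: string): void {
  gates.delete(sessionId);
}
```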

Review guide (start here)

  1. API surface: src/plugins/types.ts (new hook names + event/result types)
  2. Hook runner semantics: src/plugins/hooks.ts (sequential/FIFO merge + error handling)
  3. before_llm_call integration: src/agents/pi-embedded-runner/run/hook-stream-wrapper.ts (+ tests)
  4. after_llm_call integration + gate:
    • src/agents/pi-embedded-runner/run/attempt.ts (emits hook events)
    • src/agents/pi-embedded-runner/run/after-llm-call-gate.ts (+ tests)
    • src/agents/pi-tools.before-tool-call.ts (enforcement)
  5. before_response_emit integration: src/agents/pi-embedded-runner/run/hook-response-emit.ts (+ tests)
    • Supports both single-message (content) and multi-turn (allContent) modification
    • Run-scoped via preRunMessageCount with compaction-safe fallback
  6. Glue: src/hooks/internal-hooks.ts, src/gateway/server.impl.ts, src/gateway/server-startup.ts

Security / correctness notes

  • before_response_emit blocking clears/redacts content from all current-run assistant messages (not just the last) so it does not persist into session history
  • Multi-part assistant messages are handled so modified content doesn't leave stale fragments
  • Handler execution order is FIFO (registration order), consistent with registry expectations
  • Run-scoped rewrites never corrupt previously-delivered session history, even after compaction
  • allContent enables full multi-turn PII redaction across tool-loop iterations
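For illustration, a hedged sketch of an output-redaction handler using allContent; field names follow the table above, but the exact event/result types in src/plugins/types.ts may differ:

```typescript
// Illustrative before_response_emit handler that redacts email addresses
// across all current-run assistant texts via allContent.
type BeforeResponseEmitEvent = { content: string; allContent?: string[] };
type BeforeResponseEmitResult =
  | { content?: string; allContent?: string[]; block?: boolean }
  | undefined;

const EMAIL = /[\w.+-]+@[\w-]+\.[\w.]+/g;

export function redactEmails(event: BeforeResponseEmitEvent): BeforeResponseEmitResult {
  const texts = event.allContent ?? [event.content];
  const redacted = texts.map((t) => t.replace(EMAIL, "[redacted]"));
  // Return undefined ("no change") unless something was actually redacted.
  const changed = redacted.some((t, i) => t !== texts[i]);
  return changed ? { allContent: redacted } : undefined;
}
```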

Testing

Adds/extends unit & integration tests across:

  • Stream wrapper (before_llm_call)
  • after_llm_call gate bridge (block/filter/clear/isolation)
  • before_response_emit (modification, blocking, session-history scrubbing, allContent, run scoping, compaction fallback)
  • Hook runner merge semantics / no-handler paths
  • getRunScopedMessages (normal, compaction, defensive fallback)

AI-Assisted Development 🤖

This PR was developed with AI assistance (Claude/OpenClaw agent "Tank").

  • Degree of testing: Comprehensive — 75+ unit tests across 5 test files. pnpm check clean.
  • Human oversight: All changes reviewed and directed by Eddie Abrams
  • Understanding confirmed: The contributor understands the embedded agent loop, gate bridge architecture, and hook merge semantics


@chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 68a376e318

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/agents/pi-embedded-runner/run/attempt.ts Outdated
Comment thread src/agents/pi-embedded-runner/run/hook-response-emit.ts Outdated
Comment thread src/agents/pi-embedded-runner/run/attempt.ts Outdated
@greptile-apps
Contributor

greptile-apps Bot commented Mar 4, 2026

Greptile Summary

This PR adds three sequential/modifying plugin hooks — before_llm_call, after_llm_call, and before_response_emit — that allow policy plugins to intercept and gate LLM inputs, tool execution, and assistant responses before delivery. After several rounds of review iteration, the implementation is in good shape.

What changed and why it fits:

  • The hooks follow the same sequential FIFO merge pattern as before_tool_call and message_sending, keeping the system consistent
  • A per-session gate bridge (after-llm-call-gate.ts) decouples the subscription-based after_llm_call event from the synchronous before_tool_call enforcement point, avoiding invasive agent loop restructuring
  • Merge semantics are security-first: intersection latch for tools/toolCalls, one-way latch for block, and first-writer-wins for all content fields
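A reduced sketch of these merge semantics (illustrative only; the real reducer in src/plugins/hooks.ts merges the full result shape across handlers):

```typescript
// Three security-first merge rules, reduced to three fields:
// intersection latch (tools), one-way latch (block), first-writer-wins (content).
type HookResult = { tools?: string[]; block?: boolean; content?: string };

export function mergeResults(results: HookResult[]): HookResult {
  return results.reduce<HookResult>((acc, next) => ({
    // Intersection latch: a later handler can only narrow the tool set.
    tools:
      acc.tools !== undefined && next.tools !== undefined
        ? acc.tools.filter((t) => next.tools!.includes(t))
        : acc.tools ?? next.tools,
    // One-way latch: once any handler blocks, the call stays blocked.
    block: acc.block || next.block,
    // First-writer-wins: the earliest (highest-priority) handler owns content.
    content: acc.content ?? next.content,
  }), {});
}
```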

Previously raised issues that have been addressed:

  • hookCtx and hookIterationRef TDZ (temporal dead zone) issues resolved by early declaration
  • block: true in before_response_emit now returns a tagged { blocked: boolean } result (not "" sentinel)
  • clearAllAssistantContent now clears entire content arrays to [], removing tool_use blocks that could carry sensitive input data
  • allContent splice is bounded to original assistantTexts.length to prevent plugin content injection
  • rewriteAllAssistantContent now filters to text-bearing messages only, matching getRunScopedMessages counting
  • tools-without-toolCallId allowlist bypass in the gate fixed
  • messages/systemPrompt merge uses first-writer-wins semantics
  • content/allContent merge uses first-writer-wins with a cross-lock
  • PluginHookAfterLlmCallResult is exported from hooks.ts
  • allContent field added to PluginHookBeforeResponseEmitEvent type
  • latencyMs/tokenUsage JSDoc updated to mark them as reserved/unpopulated
  • Duplicate/stale test comment in hooks.extended.test.ts corrected

Remaining concerns:

  • rewriteLastAssistantContent uses "content" in m as its filter, while getRunScopedMessages uses a stricter text-bearing (extractAssistantText().length > 0) filter. In edge cases where runMessages ends with a tool-call-only assistant message, the rewrite would silently land on the wrong message and be a no-op, leaving session history out of sync with what was delivered
  • The mock in hook-response-emit.test.ts includes stubs for runContextAssembled, runLoopIterationStart, and runLoopIterationEnd which don't exist in the current HookRunner type — these appear to be pre-emptive stubs for hooks from a separate in-progress branch
  • Plugin authors returning { messages: [] } from before_llm_call will silently wipe all conversation history for that LLM call; a warn-level log when an explicitly-empty array is passed would reduce this footgun's blast radius

Confidence Score: 4/5

  • PR is safe to merge — all critical blocking/filtering and session-history-corruption issues from prior review rounds have been resolved; remaining items are edge cases and style improvements.
  • The implementation is architecturally sound and the security-critical paths (block propagation, history clearing on block, merge semantics preventing allowlist widening) have been carefully addressed across multiple review iterations. The remaining issues are minor: a filter inconsistency in rewriteLastAssistantContent that affects only edge-case tool-call-only message sequences, test stubs coupling to unreleased work, and a footgun for plugin authors who pass messages: []. No regressions to existing hooks are introduced since the new code paths are gated behind hasHooks() checks.
  • Pay close attention to src/agents/pi-embedded-runner/run/hook-response-emit.ts (filter consistency in rewriteLastAssistantContent) and src/agents/pi-embedded-runner/run/hook-response-emit.test.ts (stubs for unreleased hook methods).

Comments Outside Diff (3)

  1. src/agents/pi-embedded-runner/run/hook-response-emit.test.ts, line 605-611 (link)

    Mock stubs reference methods absent from HookRunner

    The mock includes runContextAssembled, runLoopIterationStart, and runLoopIterationEnd (lines 607–609), but none of these methods exist in the HookRunner type returned by createHookRunner in hooks.ts. The as unknown as HookRunner cast on line 611 silences the TypeScript error.

    Based on the comment in hooks.extended.test.ts:

    // NOTE: context_assembled and loop_iteration_start/end tests are in
    // hooks.context-loop.test.ts (feat/hook-context-assembled-loop-iteration).
    

    these hooks belong to an in-progress branch that has not yet merged. Pre-emptively stubbing them here creates an implicit coupling: if those hooks are renamed or restructured before merging, this test mock becomes incorrect without any compiler warning.

    Consider removing the three stubs until the other branch lands, or adding a comment cross-referencing the branch name so the coupling is explicit and auditable.

  2. src/agents/pi-embedded-runner/run/hook-response-emit.ts, line 200-208 (link)

    rewriteLastAssistantContent uses a weaker filter than getRunScopedMessages

    getRunScopedMessages (line 181) only counts assistant messages where extractAssistantText(m).length > 0 — correctly excluding tool-call-only messages. rewriteLastAssistantContent scans for "content" in m (any assistant with a content field), which can match tool-call-only messages whose content is an array of { type: "tool_use" } blocks.

    If runMessages ever ends with a tool-call-only assistant message (e.g., the LLM's last action was a pure tool dispatch that terminated mid-loop), the toReversed().find(...) here would land on that tool-call-only message. rewriteSingleAssistantMessage would then be a silent no-op (no text parts to update), and the actual last text-bearing assistant message would remain unmodified in session history despite assistantTexts and delivery reflecting the plugin's modified content — an inconsistency between what was delivered and what's persisted.

    Applying the same text-bearing filter would close this gap:

    const sessionMsg = [...messages]
      .toReversed()
      .find((m) => m.role === "assistant" && "content" in m && extractAssistantText(m).length > 0);
  3. src/agents/pi-embedded-runner/run/hook-stream-wrapper.ts, line 86-89 (link)

    messages: [] silently wipes all conversation history for the LLM call

    The outer guard correctly uses result?.messages !== undefined (line 73), so an explicit { messages: [] } passes the condition. Then at line 88:

    messages: (result.messages ?? context.messages) as Message[],

    result.messages is [] (truthy, not nullish), so ?? does not fall back — the LLM call receives an empty message array, wiping the entire conversation context for that request.

    Plugin authors who want to return a partial result (e.g., only modify systemPrompt) might write { messages: [], systemPrompt: "…" } intending "no message change", unknowingly clearing history. The code comment says "Return undefined for 'no change'", but the type AgentMessage[] | undefined and the ?? idiom can easily mislead.

    Consider adding a defensive guard or an assertion, or at minimum surfacing a warn-level log when result.messages is explicitly []:

    if (result.messages !== undefined) {
      if (result.messages.length === 0) {
        log.warn("before_llm_call: hook returned messages: [] — all conversation history will be cleared for this LLM call");
      }
      // ...apply modification
    }

Last reviewed commit: a94eed9

Comment thread src/agents/pi-embedded-runner/run/attempt.ts Outdated
Comment thread src/agents/pi-embedded-runner/run/hook-response-emit.ts
Comment thread src/agents/pi-embedded-runner/run/attempt.ts Outdated
@zeroaltitude zeroaltitude changed the title feat(hooks): add before_llm_call, after_llm_call, before_response_emit modifying hooks feat(hooks): add before_llm_call, after_llm_call, before_response_emit modifying hooks [claude, human developer oversight] Mar 4, 2026
zeroaltitude added a commit to zeroaltitude/openclaw that referenced this pull request Mar 4, 2026
…tion tracking

- after_llm_call is now void (parallel, observational) — modifying tool
  calls requires agent loop interception points that don't exist yet.
  Documented as a future upgrade path.
- before_response_emit block now returns empty string instead of
  undefined, so callers can distinguish blocked from unmodified.
- hookIterationRef declared before event subscription (fixes temporal
  dead zone warning). Subscription now activates for before_llm_call
  too, so iteration counter updates even without after_llm_call hooks.

Addresses greptile + codex-connector review on PR openclaw#33916.
@chatgpt-codex-connector Bot left a comment

💡 Codex Review

Reviewed commit: e07ec3d846

Comment thread src/agents/pi-embedded-runner/run/hook-response-emit.ts Outdated
Comment thread src/agents/pi-embedded-runner/run/hook-response-emit.ts Outdated
Comment thread src/agents/pi-embedded-runner/run/hook-stream-wrapper.ts Outdated
@openclaw-barnacle Bot added the agents (Agent runtime and tooling) and size: XL labels Mar 4, 2026
@chatgpt-codex-connector Bot left a comment

💡 Codex Review

Reviewed commit: d71f865cff

Comment thread src/agents/pi-embedded-runner/run/attempt.ts Outdated
Comment thread src/agents/pi-embedded-runner/run/hook-response-emit.ts Outdated
zeroaltitude added a commit to zeroaltitude/openclaw that referenced this pull request Mar 4, 2026
…re_response_emit

- before_response_emit: use explicit undefined check instead of truthiness
  for content, so plugins can return empty string as valid modified content
- before_llm_call: check systemPrompt !== undefined instead of truthiness,
  so plugins can clear the system prompt with empty string

Addresses codex P2 reviews on PR openclaw#33916.
Comment thread src/agents/pi-embedded-runner/run/hook-stream-wrapper.ts Outdated
@openclaw-barnacle Bot added the gateway (Gateway runtime) label Mar 4, 2026
@chatgpt-codex-connector Bot left a comment

💡 Codex Review

Reviewed commit: 70ea643c8e

Comment thread src/plugins/registry.ts
Comment thread src/hooks/bundled/session-memory/handler.ts Outdated
@chatgpt-codex-connector

To use Codex here, create an environment for this repo.

@greptile-apps
Contributor

greptile-apps Bot commented Mar 4, 2026

You're correct on points 1-3. I was wrong about tools: [] — empty arrays are truthy in JavaScript, so the code correctly filters to an empty tool set. The systemPrompt and content checks also correctly use !== undefined and strict equality respectively.

However, the after_llm_call contradiction remains valid. You say it's "intentional design" to be void/observational, but your PR description explicitly claims the opposite:

  • Title: "modifying hooks"
  • Table: after_llm_call marked "Modifiable"
  • Result payload: { toolCalls?, block?, blockReason? }
  • Comparison: "llm_output is void/observation; after_llm_call can filter tool calls"
  • Motivation: "requires: Blocking tool calls when context is tainted"

The code comment says interception "doesn't exist yet" — that's acknowledging a limitation, not describing intentional design. If after_llm_call is void like llm_output, then this PR doesn't deliver the security capabilities you described.

Good catch on the history clearing — I see you added clearAssistantContent() in before_response_emit block handling. That fixes the PII leakage concern.

Action items:

  1. Update PR description to accurately reflect that after_llm_call is void/observational (remove claims about blocking/filtering)
  2. Or implement the blocking/filtering capability the description promises

The mismatch between documentation and implementation will confuse plugin authors who need the blocking capability you described.

@chatgpt-codex-connector Bot left a comment

💡 Codex Review

Reviewed commit: c59c9a7b4b

Comment thread src/agents/pi-embedded-runner/run/attempt.ts Outdated
Comment thread src/plugins/hooks.ts
Comment thread src/plugins/hooks.llm-response.test.ts Outdated
Comment thread src/agents/pi-embedded-runner/run/hook-response-emit.ts Outdated
@chatgpt-codex-connector Bot left a comment

💡 Codex Review

Reviewed commit: 6edb9d8d5f

Comment thread src/agents/pi-embedded-runner/run/attempt.ts Outdated
@chatgpt-codex-connector Bot left a comment

💡 Codex Review

Reviewed commit: e7d6095cca

Comment thread src/agents/pi-embedded-runner/run/attempt.ts Outdated
Comment thread src/agents/pi-embedded-runner/run/attempt.ts Outdated
@chatgpt-codex-connector Bot left a comment

💡 Codex Review

Reviewed commit: 40e86563be

Comment thread src/agents/pi-embedded-runner/run/hook-stream-wrapper.ts
Comment thread src/agents/pi-embedded-runner/run/attempt.ts
Comment thread src/agents/pi-embedded-runner/run/attempt.ts
Comment thread src/hooks/bundled/session-memory/handler.ts Outdated
Comment thread src/plugins/hooks.llm-response.test.ts Outdated
@chatgpt-codex-connector Bot left a comment

💡 Codex Review

Reviewed commit: c01752f27c

Comment thread src/agents/pi-embedded-runner/run/attempt.ts Outdated
Comment thread src/agents/pi-embedded-runner/run/attempt.ts Outdated
@zeroaltitude zeroaltitude marked this pull request as draft March 5, 2026 05:42
@zeroaltitude zeroaltitude marked this pull request as ready for review March 5, 2026 06:08
@chatgpt-codex-connector Bot left a comment

💡 Codex Review

Reviewed commit: ffb3f95226

Comment thread src/agents/pi-embedded-runner/run/attempt.ts
Comment thread src/agents/pi-embedded-runner/run/attempt.ts
Comment thread src/agents/pi-embedded-runner/run/after-llm-call-gate.ts
Comment thread src/agents/pi-embedded-runner/run/hook-stream-wrapper.ts Outdated
@chatgpt-codex-connector Bot left a comment

💡 Codex Review

Reviewed commit: b68cdb19f6

Comment thread src/agents/pi-embedded-runner/run/attempt.ts
@chatgpt-codex-connector Bot left a comment

💡 Codex Review

Reviewed commit: e35a8b1884

Comment thread src/agents/pi-embedded-runner/run/attempt.ts
Comment thread src/agents/pi-embedded-runner/run/attempt.ts Outdated
Accidentally committed .openclaw-tank/memory/2026-03-05.md. Removed from
tracking and added .openclaw-*/ to .gitignore to prevent future leaks.
…ength mismatch

1. before_response_emit merge semantics: content and allContent now use
   first-writer-wins (acc ?? next) instead of last-writer-wins (next ?? acc).
   Once a higher-priority security plugin sets redacted content, later
   utility plugins cannot override it. Matches the one-way latch on block.

2. rewriteAllAssistantContent logs a warning when allContent length differs
   from the assistant message count, making silent data loss visible in logs.
…_call

Consistent with first-writer-wins on content/allContent in
before_response_emit, the intersection latch on tools, and the one-way
latch on block. Once a higher-priority security plugin sanitizes inputs,
later utility plugins cannot override them.
…response_emit merge

Once a higher-priority plugin sets either content or allContent, both
fields are locked for later plugins. Prevents a lower-priority plugin
from bypassing single-message redaction (content) by supplying
allContent (which takes precedence in application), or vice versa.
…afe)

Removes index-based slicing via preRunMessageCount which can become
stale when compaction replaces/merges transcript entries without
changing the array length. Tail scan finds the last N assistant
messages from the end, which is always correct regardless of
compaction state.
…ction warning

- Removed preRunMessageCount from ApplyBeforeResponseEmitParams,
  getRunScopedMessages signature, and all callers. The parameter was
  unused since switching to tail-based scanning.
- Added warn-level log when block fires but getRunScopedMessages
  returns [] (compaction edge case where run boundaries are
  unidentifiable). Response delivery is still suppressed but blocked
  content may persist in session history.
…dMessages

Tool-call-only assistant messages have content (array with tool_use
blocks) but no text. assistantTextCount is derived from streamed text
entries, so the tail scan must match that filter. Using
extractAssistantText().length > 0 instead of 'content' in msg.
before_llm_call block now uses BeforeLlmCallBlockError sentinel class
instead of generic Error. The agent loop catches this specifically and
completes the run gracefully (no error surfaced) — consistent with
before_response_emit block behavior. Previously, blocking threw a
generic error that surfaced as a run failure.
Consistent with block (one-way latch) and content/messages
(first-writer-wins): the higher-priority plugin that originally
blocked also owns the reason. Prevents a lower-priority plugin
from silently overwriting the block reason.
…s only

Tool-call-only assistant messages (content arrays with only tool_use
blocks, no text) are now skipped in rewriteAllAssistantContent,
matching getRunScopedMessages and assistantTexts counting. Prevents
allContent index misalignment when tool-call-only turns are
interspersed with text-bearing turns.
…ges block

- allContent splice now bounded to original assistantTexts.length —
  plugins cannot expand the response beyond what was produced.
- Block with empty runMessages (compaction edge case) now fails closed
  by clearing ALL assistant content in the session, preventing blocked
  text from leaking into future turns or persistence.
clearAllAssistantContent now sets array content to [] instead of
only clearing text parts. Prevents sensitive data in tool_use input
arguments from persisting in session history when a response is
blocked by before_response_emit.
…ors, guard gate clear

1. Test comment: 'h2 overrides' → clarifies h1 never blocked (not a latch scenario)
2. Cross-lock: added NOTE for plugin authors about silent drop behavior
3. Gate clear: guarded by hasHooks('after_llm_call') to avoid no-op on before_llm_call-only
… to PLUGIN_HOOK_NAMES

Required for isPluginHookName validation and registerTypedHook.
Same pattern as the context_assembled/loop_iteration fix.
@zeroaltitude force-pushed the feat/hook-llm-call-response-emit branch from a520ab2 to 68a78a0 on March 7, 2026 19:48
@chatgpt-codex-connector Bot left a comment

💡 Codex Review

Reviewed commit: 68a78a0397

    export function rewriteLastAssistantContent(messages: AgentMessage[], newContent: string): void {
      const sessionMsg = [...messages]
        .toReversed()
        .find((m) => m.role === "assistant" && "content" in m);

[P1] Rewrite last text-bearing assistant message

rewriteLastAssistantContent picks the last assistant message regardless of whether it contains text, so when a run ends with an assistant tool-call-only message the subsequent rewriteSingleAssistantMessage call is a no-op. In that case a before_response_emit handler that returns modified content appears to succeed (and updates assistantTexts), but the original text in activeSession.messages remains unredacted and is still available to later turns/session persistence.


    log.debug(
      `applying allContent modification (${emitResult.allContent.length} entries, original ${assistantTexts.length})`,
    );
    rewriteAllAssistantContent(runMessages, emitResult.allContent);

[P2] Fail closed when run-scoped rewrite target is empty

The allContent modification path rewrites runMessages and returns success without checking whether getRunScopedMessages actually found any current-run messages. Under the compaction fallback case (already acknowledged by the block branch), this means the emitted response is redacted but session history is left unchanged, so sensitive pre-redaction text can still be fed into subsequent prompts.


@dknoodle

dknoodle commented Mar 7, 2026

Very interested in this! We run a multi-agent setup with 24 specialized agents and have been building our own content sanitization pipeline to scan all external data before it reaches any LLM. These hooks would let us enforce that at the gateway level instead of retrofitting every individual tool — exactly the right abstraction layer for security policy enforcement. The before_llm_call hook is especially compelling for input sanitization, and after_llm_call for tool-call gating based on trust/provenance. Looking forward to seeing this land. 🦞

@zeroaltitude
Contributor Author

Split into 3 focused PRs

This PR has been superseded by 3 smaller, stacked PRs:

PR 1: before_llm_call — input interception

#39195 (feat/hook-before-llm-call)

Adds the before_llm_call hook. Plugins can inspect/modify messages, system prompt, filter tools, or block the call. Uses streamFn wrapper pattern.

PR 2: after_llm_call — deterministic tool-call gating

#39200 (feat/hook-after-llm-call)

Adds the after_llm_call hook with a completely redesigned enforcement mechanism. The original design in this PR used a synchronous mutable ref with monotonic sequence counters ("best-effort"). The new design uses a Promise-based gate that's deterministic — first tool call blocks until the hook resolves.

Key deletions:

  • All subscription-based handling from attempt.ts
  • Monotonic sequence counters
  • Staleness detection logic
  • The "best-effort" caveat
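Under assumed names (the actual implementation lands in PR 2), the deterministic gate can be sketched as a promise that tool calls await:

```typescript
// Sketch of the Promise-based gate: the first tool call awaits the
// after_llm_call decision instead of racing a mutable ref, so no monotonic
// counters or staleness detection are needed. Names are illustrative.
type GateDecision = { blockedToolCallIds: Set<string> };

export function createPromiseGate() {
  let resolveDecision!: (d: GateDecision) => void;
  const decision = new Promise<GateDecision>((resolve) => (resolveDecision = resolve));
  return {
    // Called once when all after_llm_call handlers have run.
    resolve: (d: GateDecision) => resolveDecision(d),
    // Each tool call awaits the decision before executing.
    isBlocked: async (toolCallId: string): Promise<boolean> => {
      const d = await decision;
      return d.blockedToolCallIds.has(toolCallId);
    },
  };
}
```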

PR 3: before_response_emit — output policies

#39203 (feat/hook-before-response-emit)

Adds the before_response_emit hook. Plugins can modify, redact (single-message or multi-turn), or block the response.


Merge order: #39195 → #39200 → #39203

Closing this PR in favor of the split PRs.

@zeroaltitude
Contributor Author

Updated split: 2 PRs

Superseded by:

  1. feat(hooks): add before_llm_call + after_llm_call plugin hooks [claude, human developer oversight] (#39206): before_llm_call + after_llm_call (input interception + deterministic tool-call gating)
  2. feat(hooks): add before_response_emit hook for output policies [claude, human developer oversight] (#39207): before_response_emit (output policies)

Merge order: #39206 → #39207

@zeroaltitude
Contributor Author

zeroaltitude commented Mar 7, 2026

@bmendonca3: NB, the split has taken place; more notes now in #36671

@zeroaltitude zeroaltitude deleted the feat/hook-llm-call-response-emit branch March 7, 2026 21:59