
feat(hooks): add after_llm_call hook with deterministic Promise-based gate#39200

Closed
zeroaltitude wants to merge 2 commits into openclaw:main from zeroaltitude:feat/hook-after-llm-call

Conversation

@zeroaltitude
Contributor

TL;DR

Adds the after_llm_call modifying hook — fires after the LLM response is received, before tool execution. Plugins can block all tool calls or filter specific calls by ID.

Split from #33916 (2 of 3). Depends on #39195 (before_llm_call) — merge that first.

Key design: deterministic Promise-based gate

This PR completely redesigns the after_llm_call enforcement mechanism. The previous design (in #33916) used:

  • A synchronous mutable ref populated via .then() from a subscription callback
  • Monotonic sequence counters for staleness detection
  • "Best-effort" enforcement — tools could execute before the gate was set

New design:

  1. The streamFn wrapper intercepts the LLM response completion (done/error event)
  2. Extracts tool calls from the final message
  3. Fires runAfterLlmCall() and stores the Promise in the gate (synchronously, before streamAssistantResponse returns)
  4. The tool wrapper (pi-tools.before-tool-call.ts) awaits the gate Promise

Why it's deterministic:

```
agentLoop async context:
  streamAssistantResponse()
    └── streamFn wrapper fires, stores Promise ← synchronous
    └── returns
  executeToolCalls()
    └── tool.execute()
        └── wrapToolWithBeforeToolCallHook
            └── await checkAfterLlmCallGate() ← first call blocks until hook resolves
```

The Promise is guaranteed to exist before executeToolCalls() starts because both happen in the same agentLoop async context, sequentially.
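A minimal sketch of the gate this describes, assuming a module-level Map keyed by `sessionId`. The function names follow the PR text (`after-llm-call-gate.ts`), but the result shape and fail-open behavior are illustrative assumptions, not the actual source.

```typescript
// Illustrative gate result — the real type lives in the PR's types.ts.
type GateResult = { block: boolean; blockReason?: string; allowedIds?: Set<string> };

const gates = new Map<string, Promise<GateResult>>();

// Called synchronously from the streamFn wrapper, before
// streamAssistantResponse() returns — this is what makes the gate deterministic.
function setAfterLlmCallGate(sessionId: string, gate: Promise<GateResult>): void {
  gates.set(sessionId, gate);
}

// Awaited by the tool wrapper. The first tool call blocks until the hook
// resolves; later calls in the same turn await an already-resolved Promise.
async function checkAfterLlmCallGate(sessionId: string): Promise<GateResult | undefined> {
  const gate = gates.get(sessionId);
  if (!gate) return undefined; // no gate set — fail open
  try {
    return await gate;
  } catch {
    return undefined; // a throwing hook must not wedge tool execution
  }
}

// Called at turn boundaries (and, per review feedback, on run teardown).
function clearAfterLlmCallGate(sessionId: string): void {
  gates.delete(sessionId);
}
```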

What was deleted (vs #33916)

  • All subscription-based after_llm_call handling from attempt.ts
  • hookTurnIteration, hookMessageEndSeq monotonic counters
  • All staleness detection logic (eventIteration !== hookTurnIteration, etc.)
  • The "best-effort" comment/caveat

Hook contract

| Field | Type | Semantics |
|---|---|---|
| `block` | `boolean` | One-way latch: once blocked, stays blocked |
| `blockReason` | `string` | First-writer-wins |
| `toolCalls` | `Array<{ id }>` | Intersection: later handlers can only narrow the allowlist |
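The three merge rules above can be sketched as a reducer over handler results. The type and function names here are hypothetical, not the PR's actual identifiers:

```typescript
interface AfterLlmCallResult {
  block?: boolean;
  blockReason?: string;
  toolCalls?: Array<{ id: string }>;
}

function mergeResults(acc: AfterLlmCallResult, next: AfterLlmCallResult): AfterLlmCallResult {
  return {
    // One-way latch: once any handler blocks, the call stays blocked.
    block: acc.block || next.block,
    // First-writer-wins: keep the earliest reason.
    blockReason: acc.blockReason ?? next.blockReason,
    // Intersection: a later handler can only narrow, never widen, the allowlist.
    toolCalls:
      acc.toolCalls === undefined
        ? next.toolCalls
        : next.toolCalls === undefined
          ? acc.toolCalls
          : acc.toolCalls.filter((c) => next.toolCalls!.some((n) => n.id === c.id)),
  };
}
```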

Testing

  • 9 Promise-based gate tests (block, filter, isolation, async behavior, fail-open)
  • 4 merge semantics tests (block latch, intersection, first-writer-wins)
  • 7 stream wrapper tests (retained from PR 1)
  • pnpm check clean

AI-Assisted Development 🤖

Developed with AI assistance (Claude/OpenClaw agent "Tank"). All changes reviewed and directed by Eddie Abrams.

Adds the before_llm_call modifying hook that fires before every LLM API
call within the agent loop. Plugins can inspect/modify the message
context, filter the tool list, or block the call entirely.

Key design decisions:
- First-writer-wins for messages/systemPrompt (security plugins can't
  be overridden by lower-priority handlers)
- Intersection latch for tools (a later handler can only narrow, never
  widen, the tool allowlist)
- BeforeLlmCallBlockError sentinel class for graceful blocking without
  surfacing errors to the user
- streamFn wrapper pattern (outermost wrapper, same context as agentLoop)

Also moves clearInternalHooks() from server-startup.ts to server.impl.ts
so plugin hooks registered during loadGatewayPlugins are preserved when
bundled hooks load later.

Files:
- src/plugins/types.ts: BeforeLlmCallEvent/Result types
- src/plugins/hooks.ts: runBeforeLlmCall runner
- src/agents/pi-embedded-runner/run/hook-stream-wrapper.ts: streamFn wrapper
- src/agents/pi-embedded-runner/run/attempt.ts: wrapper setup + iteration tracking
- src/gateway/server-startup.ts, server.impl.ts: clearInternalHooks fix

Tests: 7 stream wrapper + 8 merge semantics = 15 tests
… gate

Adds the after_llm_call modifying hook — fires after the LLM response
is received, before tool execution. Plugins can block all tool calls
or filter specific tool calls by ID.

## Key design: deterministic Promise-based gate

Unlike the previous implementation which used a synchronous mutable ref
populated via .then() from a subscription callback (racy, best-effort),
this uses a Promise-based gate that's set synchronously from the streamFn
wrapper. The wrapper runs inside agentLoop's async context, so the Promise
is guaranteed to exist before executeToolCalls() starts.

The tool wrapper (pi-tools.before-tool-call.ts) awaits the gate Promise:
- First tool call blocks until the hook resolves
- Subsequent calls in the same turn get the cached result instantly
- Turn boundaries clear the gate via the turn_start subscription

This eliminates:
- Monotonic sequence counters
- Staleness detection logic
- The "best-effort" caveat

## Architecture

1. streamFn wrapper intercepts response completion (done/error event)
2. Extracts tool calls from the final message
3. Fires runAfterLlmCall, stores the Promise in the gate (synchronously)
4. Tool wrapper awaits the gate before tool.execute()
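The four steps above can be sketched as a tool wrapper that consults the gate before delegating. Identifiers and shapes are illustrative, not the PR's exact code:

```typescript
type GateResult = { block: boolean; blockReason?: string; allowedIds?: Set<string> };
type Tool = { execute: (callId: string, args: unknown) => Promise<unknown> };

function wrapToolWithGate(
  tool: Tool,
  checkGate: () => Promise<GateResult | undefined>,
): Tool {
  return {
    async execute(callId, args) {
      // Awaiting the same Promise twice is cheap: only the first caller in a
      // turn actually waits; later calls resolve against the cached result.
      const gate = await checkGate();
      if (gate?.block) {
        throw new Error(gate.blockReason ?? "Blocked by after_llm_call hook");
      }
      if (gate?.allowedIds && !gate.allowedIds.has(callId)) {
        throw new Error(`Tool call ${callId} filtered by after_llm_call hook`);
      }
      return tool.execute(callId, args);
    },
  };
}
```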

Files:
- src/plugins/types.ts: AfterLlmCallEvent/Result types
- src/plugins/hooks.ts: runAfterLlmCall runner
- src/agents/pi-embedded-runner/run/hook-stream-wrapper.ts: response interception
- src/agents/pi-embedded-runner/run/after-llm-call-gate.ts: Promise-based gate
- src/agents/pi-tools.before-tool-call.ts: async gate check

Tests: 9 gate + 4 merge semantics + 7 stream wrapper = 20 new tests
@greptile-apps
Contributor

greptile-apps Bot commented Mar 7, 2026

Greptile Summary

This PR introduces the after_llm_call modifying hook and a deterministic Promise-based gate mechanism to enforce its decisions before tool execution. The core design — synchronously storing a Promise in the gate when the LLM response stream completes, then awaiting it in the tool wrapper — is sound and a clear improvement over the subscription-based "best-effort" approach it replaces.

Key findings:

  • Memory leak — gate not cleared on run teardown (attempt.ts): clearAfterLlmCallGate(params.sessionId) is never called in the run's cleanup block. Since sessionId is an ephemeral UUID, every terminated session that set a gate leaves a dangling entry in the module-level gates Map indefinitely. The fix is a one-liner alongside clearActiveEmbeddedRun.
  • after_llm_call silently skips text-only responses (hook-stream-wrapper.ts): fireAfterLlmCallGate returns early when there are no tool calls in the response. The hook name and doc comment imply it fires for all LLM responses, but it is effectively a tool-execution gate only. Plugin authors who rely on it for general LLM response observation will silently miss text-only turns — this should be documented on PluginHookAfterLlmCallEvent.
  • Mock drift in tests (hook-stream-wrapper.test.ts): The mock HookRunner includes three methods (runLoopIterationStart, runLoopIterationEnd, runBeforeResponseEmit) that don't exist on the real type; harmless today but should be removed to keep the mock aligned.

Safe to merge after addressing the gate cleanup issue; the documentation gap on text-only responses should also be clarified before release.

Confidence Score: 2/5

  • The Promise-based gate mechanism is deterministic and sound, but the missing clearAfterLlmCallGate in run cleanup is a real memory leak for long-running server processes that must be fixed before merge.
  • The core design is solid (4/5 technically), but the memory leak in attempt.ts is a critical issue that drops this to 2/5. In a long-running gateway, every terminated session leaves a dangling gate entry. The text-only response skip is a documentation/design clarity issue that should also be addressed. Once the cleanup call is added and the doc gap is clarified, this becomes merge-safe.
  • src/agents/pi-embedded-runner/run/attempt.ts (missing cleanup call) and src/agents/pi-embedded-runner/run/hook-stream-wrapper.ts (undocumented text-only skip behavior); consider clarifying the behavior in src/plugins/types.ts for PluginHookAfterLlmCallEvent.

Comments Outside Diff (1)

  1. src/agents/pi-embedded-runner/run/attempt.ts, line 2035-2036 (link)

    Memory leak — gate not cleared on run teardown

    The gates Map in after-llm-call-gate.ts is keyed by sessionId, but clearAfterLlmCallGate is never called in the run's cleanup/finally path. The only call sites are:

    1. The turn_start subscription (line 1542) — only fires while the run is active and receiving new turns.
    2. The start of a new LLM call in hook-stream-wrapper.ts — only useful when the same session makes another LLM call.

    If a run is aborted or times out after the gate Promise is stored, the entry remains in the Map indefinitely. Since sessionId is an ephemeral UUID regenerated on /new and /reset, terminated sessions never reclaim their gate entries.

    The cleanup block already calls clearActiveEmbeddedRun and should also clear the gate:
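A hedged sketch of the proposed one-liner, with the surrounding run lifecycle reduced to stubs. The function names mirror the review comment; the `finally`-block structure is an assumption about attempt.ts, not the actual source:

```typescript
const gates = new Map<string, Promise<unknown>>();
const activeRuns = new Map<string, unknown>();

function clearActiveEmbeddedRun(sessionId: string): void {
  activeRuns.delete(sessionId);
}

function clearAfterLlmCallGate(sessionId: string): void {
  gates.delete(sessionId);
}

async function runAttempt(sessionId: string, body: () => Promise<void>): Promise<void> {
  activeRuns.set(sessionId, {});
  try {
    await body();
  } finally {
    // Existing cleanup...
    clearActiveEmbeddedRun(sessionId);
    // Proposed fix: without this, an aborted or timed-out run leaves its
    // gate entry in the Map forever (sessionId is never reused).
    clearAfterLlmCallGate(sessionId);
  }
}
```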


Last reviewed commit: f4b78cb

Comment on lines +209 to +211

```ts
if (toolCalls.length === 0) {
  return; // No tools to gate — skip hook entirely.
}
```

after_llm_call silently skips text-only responses

fireAfterLlmCallGate returns early when toolCalls.length === 0, meaning the hook is never fired for LLM responses that contain no tool calls (e.g. a final text reply ending the conversation). The inline comment says "No tools to gate — skip hook entirely."

However, the hook is named after_llm_call and the documentation on runAfterLlmCall (hooks.ts line 748) says it "Fires after the LLM response is received, before tool execution" — without any caveat about tool-call-only responses.

This creates a silent behavioral gap: a plugin that registers after_llm_call to audit or log every LLM response will miss all text-only turns. If this is intentional (the hook is purely a gate for tool execution), it should be documented explicitly on PluginHookAfterLlmCallEvent in types.ts so plugin authors know not to rely on it for general observation.


Comment on lines +43 to +46

```ts
    runLoopIterationStart: vi.fn(),
    runLoopIterationEnd: vi.fn(),
    runBeforeResponseEmit: vi.fn(),
    getHookCount: vi.fn().mockReturnValue(0),
```

Mock includes methods not present on HookRunner

The mock helper makeMockHookRunner includes three methods — runLoopIterationStart, runLoopIterationEnd, and runBeforeResponseEmit — that do not exist on the HookRunner type returned by createHookRunner in hooks.ts. The as unknown as HookRunner cast hides this mismatch.

These extra entries are harmless today (they'll never be called by the code under test), but they signal that the mock may have drifted from a different/future version of the interface. Consider removing them to keep the mock aligned with the actual HookRunner shape:

Suggested change:

```suggestion
    runAfterLlmCall: vi.fn(),
    getHookCount: vi.fn().mockReturnValue(0),
    hasHooks: vi.fn().mockReturnValue(false),
    ...overrides,
  } as unknown as HookRunner;
```


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f4b78cbd99



```ts
/** Helper to detect tool call content blocks in an assistant message. */
function isToolCallBlockType(type: unknown): boolean {
  return type === "toolCall" || type === "tool_call" || type === "tool_use";
}
```
P1: Handle toolUse/functionCall blocks in after_llm_call gate

isToolCallBlockType only recognizes toolCall, tool_call, and tool_use, so responses that use canonical toolUse/functionCall blocks are treated as having no tool calls. In that case fireAfterLlmCallGate returns early and never runs runAfterLlmCall, which leaves tool execution unfiltered even when plugins registered after_llm_call policies. This is a functional bypass for providers/models that emit camelCase tool-call blocks.
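A minimal sketch of the broadened predicate this comment implies. Whether `toolUse` and `functionCall` are the exact canonical block-type names in this codebase is an assumption taken from the review, not confirmed from the source:

```typescript
// Recognize both snake_case and camelCase tool-call block types so camelCase
// providers are not silently treated as text-only responses.
function isToolCallBlockType(type: unknown): boolean {
  return (
    type === "toolCall" ||
    type === "tool_call" ||
    type === "tool_use" ||
    type === "toolUse" ||       // assumed camelCase variant per review
    type === "functionCall"     // assumed camelCase variant per review
  );
}
```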



Labels

agents (Agent runtime and tooling), gateway (Gateway runtime), size: XL
