Skip to content

fix(hooks): harden before_tool_call hook runner to fail-closed on error [AI]#59822

Merged
drobison00 merged 4 commits intoopenclaw:mainfrom
pgondhi987:fix/fix-262
Apr 3, 2026
Merged

fix(hooks): harden before_tool_call hook runner to fail-closed on error [AI]#59822
drobison00 merged 4 commits intoopenclaw:mainfrom
pgondhi987:fix/fix-262

Conversation

@pgondhi987
Copy link
Copy Markdown
Contributor

Summary

  • Problem: The global hook runner used catchErrors: true unconditionally, causing any exception thrown by a before_tool_call hook to be silently swallowed. An outer catch in runBeforeToolCallHook compounded this by returning { blocked: false } on any error, guaranteeing fail-open behavior regardless of hook intent.
  • Why it matters: Security-critical hooks (e.g. before_tool_call) must fail closed — if a hook throws, the tool call must be blocked, not permitted. Silent fail-open allows a crashing security hook to be fully bypassed with only a log-level error emitted.
  • What changed: Introduced a per-hook failurePolicyByHook option ("fail-open" | "fail-closed") on HookRunnerOptions; configured the global runner with before_tool_call: "fail-closed"; updated runBeforeToolCallHook's catch block to return { blocked: true } instead of falling through to { blocked: false }; elevated log level from warn to error.
  • What did NOT change: Failure policy for all other hooks defaults to fail-open, preserving existing behavior for non-security hooks (telemetry, logging, etc.).

Change Type (select all)

  • Bug fix
  • Security hardening

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution

Linked Issue/PR

  • This PR fixes a bug or regression

Root Cause / Regression History (if applicable)

  • Root cause: catchErrors: true was hardcoded in initializeGlobalHookRunner and applied uniformly to all hooks including security-critical ones. A second independent catch in runBeforeToolCallHook returned { blocked: false } on any error, creating a two-layer fail-open guarantee.
  • Missing detection / guardrail: No test asserted that a throwing before_tool_call hook resulted in a blocked tool call. The integration test for this scenario was explicitly titled "continues execution when hook throws" — asserting the wrong (fail-open) behavior.
  • Prior context: catchErrors option was designed for resilience of non-critical hooks; its unconditional application to before_tool_call was an unintended side effect.
  • Why this regressed now: The fail-open behavior was present since the hook runner was introduced; no security test locked in fail-closed semantics for this hook.
  • If unknown, what was ruled out: N/A

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Seam / integration test
    • End-to-end test
  • Target test or file: src/plugins/hooks.security.test.ts, src/agents/pi-tools.before-tool-call.integration.e2e.test.ts, src/agents/pi-tools.before-tool-call.e2e.test.ts
  • Scenario the test should lock in: A before_tool_call hook that throws must result in a blocked outcome; the wrapped tool must not be executed.
  • Why this is the smallest reliable guardrail: The integration test exercises the full wrapToolWithBeforeToolCallHook path; the security test exercises the hook runner's fail-closed policy in isolation.
  • Existing test that already covers this (if any): None — the existing test asserted the inverse (fail-open).
  • If no new test is added, why not: N/A — tests are added.

User-visible / Behavior Changes

If a before_tool_call plugin hook throws an unhandled exception, the tool call will now be blocked instead of proceeding silently. Operators will see an error-level log entry. This only affects hooks that crash; correctly-implemented hooks are unaffected.

Diagram (if applicable)

Before:
[before_tool_call hook throws] -> [error caught, logged] -> [tool call proceeds] ← fail-open bypass

After:
[before_tool_call hook throws] -> [error rethrown] -> [runBeforeToolCallHook catches] -> [blocked: true] -> [tool call blocked]

Security Impact (required)

  • New permissions/capabilities? No
  • Secrets/tokens handling changed? No
  • New/changed network calls? No
  • Command/tool execution surface changed? Yes — tool calls that would previously proceed when a security hook crashed are now blocked.
  • Data access scope changed? No
  • If any Yes, explain risk + mitigation: This change reduces the execution surface; a crashing hook now blocks rather than permits. The only risk is a legitimate hook that throws due to a bug causing unexpected tool blocking — operators should ensure their security hooks handle errors internally if partial-block semantics are needed.

Repro + Verification

Environment

  • OS: Linux (CI)
  • Runtime/container: Node.js v22+
  • Model/provider: N/A
  • Integration/channel (if any): N/A
  • Relevant config (redacted): N/A

Steps

  1. Register a before_tool_call hook that throws unconditionally.
  2. Attempt to invoke a tool that the hook is meant to block.
  3. Before fix: tool executes, hook error logged at warn level.
  4. After fix: tool is blocked, hook error logged at error level.

Expected

  • Tool call blocked with reason "Tool call blocked because before_tool_call hook failed".

Actual (after fix)

  • Tool call is blocked as expected; execute is never called.

Evidence

  • Failing test/log before + passing after

Updated integration test (pi-tools.before-tool-call.integration.e2e.test.ts) previously asserted execute was called when the hook threw; it now asserts the opposite. New test in hooks.security.test.ts and pi-tools.before-tool-call.e2e.test.ts lock in fail-closed semantics.

Human Verification (required)

This PR was generated by OpenAI Codex and reviewed by Claude (AI-assisted). No human has manually verified runtime behavior.

  • Verified scenarios: Code review confirmed the two-layer fail-open was closed; test assertions match the new control flow.
  • Edge cases checked: No-hook case (unaffected), non-before_tool_call hooks (policy defaults to fail-open, no behavior change), approval-required path (separate try/catch, unaffected).
  • What you did not verify: Live gateway execution with a real crashing plugin hook.

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? Yes — only fail behavior of crashing hooks changes; correctly-implemented hooks are unaffected.
  • Config/env changes? No
  • Migration needed? No

Risks and Mitigations

  • Risk: A plugin's before_tool_call hook that throws due to a non-security bug will now block tool calls it previously allowed through.
    • Mitigation: Plugin authors should handle internal errors within their hooks. The fail-closed default is the correct security posture; operators with misbehaving plugins should fix the hook, not rely on fail-open infrastructure.

AI-assisted: This fix was generated by OpenAI Codex and reviewed by Claude. No human code modifications were made.

@openclaw-barnacle openclaw-barnacle Bot added agents Agent runtime and tooling size: S labels Apr 2, 2026
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Apr 2, 2026

Greptile Summary

This PR hardens the before_tool_call hook runner to fail-closed on error, closing a two-layer fail-open bypass. The fix introduces a failurePolicyByHook option, sets before_tool_call: "fail-closed" in the global runner, and updates the catch block to return { blocked: true } instead of falling through to the permit path. All previously raised review concerns have been addressed.

Confidence Score: 5/5

Safe to merge — the security fix is correct, non-security hooks are unaffected, and all previous review concerns have been addressed.

All previously raised issues are resolved. The core logic is sound and three layers of tests lock in the fail-closed contract. No P0 or P1 findings remain.

No files require special attention.

Reviews (2): Last reviewed commit: "fix: address PR review feedback" | Re-trigger Greptile

Comment thread src/agents/pi-tools.before-tool-call.e2e.test.ts Outdated
Comment thread src/agents/pi-tools.before-tool-call.ts Outdated
@pgondhi987
Copy link
Copy Markdown
Contributor Author

@greptile review

@drobison00 drobison00 merged commit e19dce0 into openclaw:main Apr 3, 2026
32 checks passed
KimGLee pushed a commit to KimGLee/openclaw that referenced this pull request Apr 4, 2026
…or [AI] (openclaw#59822)

* fix: address issue

* fix: address PR review feedback

* docs: add changelog entry for PR merge

* docs: normalize changelog entry placement

---------

Co-authored-by: Devin Robison <drobison@nvidia.com>
lovewanwan pushed a commit to lovewanwan/openclaw that referenced this pull request Apr 28, 2026
…or [AI] (openclaw#59822)

* fix: address issue

* fix: address PR review feedback

* docs: add changelog entry for PR merge

* docs: normalize changelog entry placement

---------

Co-authored-by: Devin Robison <drobison@nvidia.com>
ogt-redknie pushed a commit to ogt-redknie/OPENX that referenced this pull request May 2, 2026
…or [AI] (openclaw#59822)

* fix: address issue

* fix: address PR review feedback

* docs: add changelog entry for PR merge

* docs: normalize changelog entry placement

---------

Co-authored-by: Devin Robison <drobison@nvidia.com>
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 9, 2026
…or [AI] (openclaw#59822)

* fix: address issue

* fix: address PR review feedback

* docs: add changelog entry for PR merge

* docs: normalize changelog entry placement

---------

Co-authored-by: Devin Robison <drobison@nvidia.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling size: S

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants