Support MCP hooks in the Codex harness by pashpashpash · Pull Request #71707 · openclaw/openclaw

pashpashpash · 2026-04-25T18:34:12Z

OpenClaw already had the generic native hook relay for Codex-owned shell, patch, and permission hooks. The remaining gap was MCP: Codex had recently added MCP hook payloads, but OpenClaw still documented MCP parity as future work and accepted older app-server versions that could not reliably emit those payloads.

This PR moves the tested Codex app-server floor to 0.125.0, which is the current stable line that includes MCP hook payload support. It also tightens version comparison so same-base prereleases or build-suffixed versions such as 0.125.0-alpha.2 and 0.125.0+custom do not satisfy the stable protocol floor.

The native hook relay now treats MCP payloads as first-class Codex-native tool events. PreToolUse can block MCP calls through OpenClaw before_tool_call, PostToolUse can surface MCP results through after_tool_call, and PermissionRequest can route MCP approval decisions through the existing OpenClaw approval path. If OpenClaw has no approval decision, the relay returns no decision and Codex continues through its own guardian or user approval path.

The proof is concrete:

A security plugin story now works for native MCP tools: a before_tool_call policy that scans tool params can block an MCP-shaped call like mcp__shell__run_command before Codex executes it. This is the same class of behavior Knostic Shield relies on for dangerous command detection.
A result-observer story now works: an after_tool_call plugin can observe a native MCP result such as mcp__filesystem__read_file without pretending OpenClaw owns the canonical Codex tool record.
An approval story now works: PermissionRequest payloads for native MCP calls can be allowed, denied, or deferred through OpenClaw's approval requester.
A versioning story now works: app-server handshakes below 0.125.0, unversioned handshakes, and same-base prereleases fail closed instead of silently running on a protocol surface OpenClaw does not test.

The relay also got hardening fixes from review. Pending PermissionRequest approvals are keyed by tool call identity plus a stable content fingerprint, so a reused tool id with different tool input cannot inherit an old pending decision. Retained invocation history stores bounded payload snapshots instead of retaining full native tool output, while the active invocation still receives the full validated payload. The relay now also rejects overly broad JSON payloads before reading child values past its work budget, which keeps malformed or adversarial hook input from making the relay traverse unbounded data.

The Codex harness docs now reflect the updated v1 support contract: MCP block/observe and MCP permission routing are supported on the 0.125.0 floor, native tool argument mutation and native tool-result rewriting remain out of scope, and PermissionRequest no-decision behavior is called out explicitly.

chatgpt-codex-connector · 2026-04-25T18:34:16Z

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Repo admins can enable using credits for code reviews in their settings.

greptile-apps · 2026-04-25T18:36:26Z

Greptile Summary

This PR bumps the minimum required Codex app-server version from 0.118.0 to 0.124.0 and tightens the SemVer comparison so same-version prereleases (e.g. 0.124.0-alpha.2) no longer satisfy the stable protocol floor, while newer-series prereleases (e.g. 0.125.0-alpha.1) are still accepted. It also adds MCP PreToolUse, PostToolUse, and PermissionRequest relay test coverage and removes "MCP hook parity" from the v1 unsupported table.

Confidence Score: 5/5

Safe to merge — the version gate logic is correct, test coverage is thorough, and documentation is consistent.

The parseVersionForComparison change correctly implements standard SemVer prerelease precedence (prerelease < stable when numeric parts are equal). All test fixtures are updated consistently, new edge-case tests are added for both the prerelease gate and MCP relay payloads, and the regex in readCodexVersionFromUserAgent already captured prerelease identifiers so no further changes were needed there. No logic errors or security concerns were identified.

No files require special attention.

_{Reviews (1): Last reviewed commit: "codex harness mcp hook parity" | Re-trigger Greptile}

pashpashpash · 2026-04-25T18:40:01Z

Addressed the build-metadata edge in 8be0a253c4. Same-base build-suffixed versions now fail the stable 0.124.0 floor, with tests for 0.124.0+alpha.2 rejection and 0.125.0+custom acceptance.

pashpashpash · 2026-04-25T18:59:36Z

Follow-up after moving the Codex floor to 0.125.0: the runtime gate, docs, and Codex live-smoke package pin now all point at 0.125.0.

I also addressed the payload-size concern from the security review by adding per-string and aggregate string budgets to native hook relay JSON validation. The other point about forwarding native tool results to plugin hooks is intentional for parity: OpenClaw after_tool_call plugins are privileged hook participants and need the tool result to enforce policy, audit, or maintain state. Redacting that path in the relay would make Codex-native tools weaker than OpenClaw-owned tools. If we want sanitized telemetry later, that should be a separate surface from the plugin hook contract.

pashpashpash · 2026-04-25T19:03:45Z

I also checked a set of real third-party plugins against the MCP hook question.

The useful direct analogue is Knostic Shield: its L3 path registers before_tool_call, scans tool params for dangerous command/secret patterns, and blocks. I added a relay test for that exact class of behavior using a Codex MCP PreToolUse payload, so a native MCP call like mcp__shell__run_command with a destructive command is denied before execution through the same OpenClaw hook surface.

Most of the other sampled plugins are not native MCP hook consumers. WeCom uses before_tool_call, but only to rewrite OpenClaw's message dynamic tool params. MCP Adapter exposes external MCP servers as OpenClaw-owned tools, so it exercises the dynamic-tool parity path rather than the Codex-native MCP relay. A2A Gateway, Codex App Server bridge, Web Search Plus, and Apify are useful broader contract tests, but they do not prove native MCP hook interception.

pashpashpash · 2026-04-25T19:10:35Z

Addressed the remaining Aisle relay DoS finding too. Native hook JSON validation now counts object key lengths against the relay string budget, and the PermissionRequest fallback key no longer joins every tool-input key into an unbounded string. The new test covers a large object-key payload on a native MCP PermissionRequest.

pashpashpash · 2026-04-25T19:13:35Z

On the Aisle PermissionRequest fail-open note: I checked Codex core before changing behavior. Codex PermissionRequest hooks return an optional decision; if hooks return no decision, Codex falls through to its normal guardian/user approval path. In current Codex, tools/orchestrator.rs and mcp_tool_call.rs both treat None from run_permission_request_hooks as "continue to normal approval," not "allow."

So the relay intentionally returns a no-op when OpenClaw chooses to defer or approval is unavailable. That preserves the v1 contract of allow, deny, or defer to Codex's own approval path. I updated the relay comments/log wording to make this explicit. The object-key DoS finding was a real hardening issue and is fixed in the current head.

pashpashpash · 2026-04-25T19:22:23Z

Current head addresses the two Aisle hardening findings that were reported against earlier commits.

Pending PermissionRequest decisions are now keyed by relay/run/tool identity plus a stable content fingerprint, so a reused tool id with different tool input cannot inherit an old pending decision.
Invocation history now retains bounded payload snapshots instead of full native hook payloads. The active invocation still receives the full validated payload, but retained debug history is capped.

I added regression coverage for both cases in src/agents/harness/native-hook-relay.test.ts.

steipete · 2026-04-25T20:01:52Z

Codex maintainer pass pushed a rebased fixup on top of this PR.

What changed:

Rebased onto current main and resolved the obsolete channel-entry-contract.test.ts conflict by keeping the .cjs sidecar fix that already landed in fix(github-copilot): preserve reasoning IDs for Copilot Codex models #71684.
Added one more hardening fix for the Aisle JSON-budget finding: native hook relay validation now checks the pending work budget before enqueueing array/object children, so broad payloads cannot allocate an oversized pending stack before the 20k node cap trips.
Added regression coverage that proves values beyond the JSON node budget are not even read.

Local verification:

pnpm test src/agents/harness/native-hook-relay.test.ts extensions/codex/src/app-server/client.test.ts extensions/codex/src/app-server/native-hook-relay.test.ts src/plugin-sdk/channel-entry-contract.test.ts
pnpm tsgo:core:test && pnpm tsgo:extensions:test
targeted core + extension oxlint on touched files
pnpm check:changed

aisle-research-bot · 2026-04-25T20:24:43Z

🔒 Aisle Security Analysis

We found 1 potential security issue(s) in this PR:

#	Severity	Title
1	🟡 Medium	DoS via unbounded key sorting/hashing of attacker-controlled toolInput in native hook relay permission requests

1. 🟡 DoS via unbounded key sorting/hashing of attacker-controlled toolInput in native hook relay permission requests

Property	Value
Severity	Medium
CWE	CWE-400
Location	`src/agents/harness/native-hook-relay.ts:506-566`

Description

The native hook relay builds a permission-approval deduplication key by fingerprinting request.toolInput. This fingerprinting performs full key enumeration and sorting on attacker-controlled objects.

permissionRequestToolInputKeyFingerprint() calls Object.keys(toolInput).toSorted() before applying the MAX_PERMISSION_FALLBACK_KEYS/char truncation. An attacker can send a tool_input object with ~20k keys (still within the JSON node budget) and force allocation + sorting of a large string array.
permissionRequestContentFingerprint() calls updateJsonHash(), which recursively hashes toolInput and again sorts keys for every object via Object.keys(value).toSorted(). This work occurs before the per-relay approval budget check, so an attacker can repeatedly invoke permission_request events to consume CPU even if approvals are later deferred.

Vulnerable code (hot path for permission_request):

for (const key of Object.keys(toolInput).toSorted()) {
  if (processed >= MAX_PERMISSION_FALLBACK_KEYS) break;
  ...
}

for (const key of Object.keys(value).toSorted()) {
  updateJsonHash(hash, value[key]);
}

Impact: CPU and memory exhaustion (algorithmic complexity) by sending JSON payloads with very large numbers of object keys / nested objects.

Recommendation

Avoid sorting/enumerating all keys of attacker-controlled objects.

Recommended changes:

Key fingerprinting: stop after a bounded number of keys without sorting the full set. For example, iterate with for...in/Object.keys() but cap before sorting, or collect up to MAX_PERMISSION_FALLBACK_KEYS keys and then sort just that subset.

function permissionRequestToolInputKeyFingerprint(toolInput: Record<string, unknown>): string {
  const keys: string[] = [];
  for (const key in toolInput) {
    if (!Object.prototype.hasOwnProperty.call(toolInput, key)) continue;
    keys.push(key);
    if (keys.length >= MAX_PERMISSION_FALLBACK_KEYS) break;
  }
  keys.sort();
  // then apply MAX_PERMISSION_FALLBACK_KEY_CHARS truncation
  ...
}

Content fingerprinting: either

impose a per-object key cap during hashing (and encode truncation into the hash), or
hash keys without sorting (if deterministic ordering is not strictly required), or
reuse the already-validated traversal from isJsonValue while tracking limits.

Also consider moving expensive fingerprinting behind consumeNativeHookRelayPermissionBudget() if it is only needed for deduplication of pending approvals.

Analyzed PR: #71707 at commit c7b4402

_{Last updated on: 2026-04-25T20:27:27Z}

steipete · 2026-04-25T20:36:07Z

Landed. Thanks @pashpashpash!

Source SHA: c7b4402
Merge commit: 34fb966
Gate: PR CI green on source SHA; local pnpm check:changed; focused relay/Codex/plugin-sdk tests; core and extension type/lint checks. The transient parity-gate red was a QA session lock in the Opus mock half and cleared on rerun.

* codex harness mcp hook parity * tighten codex hook parity floor * prove security-style mcp hook blocking * bound native hook relay key handling * clarify permission relay defers to provider * harden native hook relay approvals * fix(agents): bound native hook relay JSON work budget --------- Co-authored-by: Peter Steinberger <steipete@gmail.com>

openclaw-barnacle Bot added docs Improvements or additions to documentation gateway Gateway runtime agents Agent runtime and tooling extensions: codex size: M maintainer Maintainer-authored PR labels Apr 25, 2026

pashpashpash force-pushed the pash/codex-mcp-hook-parity branch from 76cb880 to 8be0a25 Compare April 25, 2026 18:39

pashpashpash force-pushed the pash/codex-mcp-hook-parity branch from 8be0a25 to 830aa83 Compare April 25, 2026 18:51

openclaw-barnacle Bot added the cli CLI command changes label Apr 25, 2026

pashpashpash force-pushed the pash/codex-mcp-hook-parity branch from 830aa83 to 4edf968 Compare April 25, 2026 18:54

openclaw-barnacle Bot removed the cli CLI command changes label Apr 25, 2026

pashpashpash force-pushed the pash/codex-mcp-hook-parity branch from 1e21826 to d1aa22f Compare April 25, 2026 19:07

openclaw-barnacle Bot added size: L and removed size: M labels Apr 25, 2026

pashpashpash force-pushed the pash/codex-mcp-hook-parity branch 2 times, most recently from bf08b5b to 577ec1e Compare April 25, 2026 19:59

steipete force-pushed the pash/codex-mcp-hook-parity branch from 577ec1e to b1eab67 Compare April 25, 2026 20:01

pashpashpash force-pushed the pash/codex-mcp-hook-parity branch from b1eab67 to fd83a7b Compare April 25, 2026 20:11

pashpashpash added 3 commits April 25, 2026 13:24

codex harness mcp hook parity

7f5bb0d

tighten codex hook parity floor

7e62736

prove security-style mcp hook blocking

2867592

pashpashpash and others added 4 commits April 25, 2026 13:24

bound native hook relay key handling

a7a37ef

clarify permission relay defers to provider

fa9ab29

harden native hook relay approvals

491ef9a

fix(agents): bound native hook relay JSON work budget

c7b4402

pashpashpash force-pushed the pash/codex-mcp-hook-parity branch from fd83a7b to c7b4402 Compare April 25, 2026 20:24

steipete merged commit 34fb966 into main Apr 25, 2026
66 of 67 checks passed

steipete deleted the pash/codex-mcp-hook-parity branch April 25, 2026 20:35

pashpashpash mentioned this pull request Apr 25, 2026

Bound native hook permission fingerprints #71758

Merged

Uh oh!

Conversation

pashpashpash commented Apr 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

chatgpt-codex-connector Bot commented Apr 25, 2026

Uh oh!

greptile-apps Bot commented Apr 25, 2026

Greptile Summary

Confidence Score: 5/5

Uh oh!

pashpashpash commented Apr 25, 2026

Uh oh!

pashpashpash commented Apr 25, 2026

Uh oh!

pashpashpash commented Apr 25, 2026

Uh oh!

pashpashpash commented Apr 25, 2026

Uh oh!

pashpashpash commented Apr 25, 2026

Uh oh!

pashpashpash commented Apr 25, 2026

Uh oh!

steipete commented Apr 25, 2026

Uh oh!

aisle-research-bot Bot commented Apr 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🔒 Aisle Security Analysis

Description

Recommendation

Uh oh!

Uh oh!

steipete commented Apr 25, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

pashpashpash commented Apr 25, 2026 •

edited

Loading

aisle-research-bot Bot commented Apr 25, 2026 •

edited

Loading