Skip to content

Support MCP hooks in the Codex harness#71707

Merged
steipete merged 7 commits into
mainfrom
pash/codex-mcp-hook-parity
Apr 25, 2026
Merged

Support MCP hooks in the Codex harness#71707
steipete merged 7 commits into
mainfrom
pash/codex-mcp-hook-parity

Conversation

@pashpashpash

@pashpashpash pashpashpash commented Apr 25, 2026

Copy link
Copy Markdown
Contributor

OpenClaw already had the generic native hook relay for Codex-owned shell, patch, and permission hooks. The remaining gap was MCP: Codex had recently added MCP hook payloads, but OpenClaw still documented MCP parity as future work and accepted older app-server versions that could not reliably emit those payloads.

This PR moves the tested Codex app-server floor to 0.125.0, which is the current stable line that includes MCP hook payload support. It also tightens version comparison so same-base prereleases or build-suffixed versions such as 0.125.0-alpha.2 and 0.125.0+custom do not satisfy the stable protocol floor.

The native hook relay now treats MCP payloads as first-class Codex-native tool events. PreToolUse can block MCP calls through OpenClaw before_tool_call, PostToolUse can surface MCP results through after_tool_call, and PermissionRequest can route MCP approval decisions through the existing OpenClaw approval path. If OpenClaw has no approval decision, the relay returns no decision and Codex continues through its own guardian or user approval path.

The proof is concrete:

  • A security plugin story now works for native MCP tools: a before_tool_call policy that scans tool params can block an MCP-shaped call like mcp__shell__run_command before Codex executes it. This is the same class of behavior Knostic Shield relies on for dangerous command detection.
  • A result-observer story now works: an after_tool_call plugin can observe a native MCP result such as mcp__filesystem__read_file without pretending OpenClaw owns the canonical Codex tool record.
  • An approval story now works: PermissionRequest payloads for native MCP calls can be allowed, denied, or deferred through OpenClaw's approval requester.
  • A versioning story now works: app-server handshakes below 0.125.0, unversioned handshakes, and same-base prereleases fail closed instead of silently running on a protocol surface OpenClaw does not test.

The relay also got hardening fixes from review. Pending PermissionRequest approvals are keyed by tool call identity plus a stable content fingerprint, so a reused tool id with different tool input cannot inherit an old pending decision. Retained invocation history stores bounded payload snapshots instead of retaining full native tool output, while the active invocation still receives the full validated payload. The relay now also rejects overly broad JSON payloads before reading child values past its work budget, which keeps malformed or adversarial hook input from making the relay traverse unbounded data.

The Codex harness docs now reflect the updated v1 support contract: MCP block/observe and MCP permission routing are supported on the 0.125.0 floor, native tool argument mutation and native tool-result rewriting remain out of scope, and PermissionRequest no-decision behavior is called out explicitly.

@chatgpt-codex-connector

Copy link
Copy Markdown

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Repo admins can enable using credits for code reviews in their settings.

@openclaw-barnacle openclaw-barnacle Bot added docs Improvements or additions to documentation gateway Gateway runtime agents Agent runtime and tooling extensions: codex size: M maintainer Maintainer-authored PR labels Apr 25, 2026
@greptile-apps

greptile-apps Bot commented Apr 25, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR bumps the minimum required Codex app-server version from 0.118.0 to 0.124.0 and tightens the SemVer comparison so same-version prereleases (e.g. 0.124.0-alpha.2) no longer satisfy the stable protocol floor, while newer-series prereleases (e.g. 0.125.0-alpha.1) are still accepted. It also adds MCP PreToolUse, PostToolUse, and PermissionRequest relay test coverage and removes "MCP hook parity" from the v1 unsupported table.

Confidence Score: 5/5

Safe to merge — the version gate logic is correct, test coverage is thorough, and documentation is consistent.

The parseVersionForComparison change correctly implements standard SemVer prerelease precedence (prerelease < stable when numeric parts are equal). All test fixtures are updated consistently, new edge-case tests are added for both the prerelease gate and MCP relay payloads, and the regex in readCodexVersionFromUserAgent already captured prerelease identifiers so no further changes were needed there. No logic errors or security concerns were identified.

No files require special attention.

Reviews (1): Last reviewed commit: "codex harness mcp hook parity" | Re-trigger Greptile

@pashpashpash pashpashpash force-pushed the pash/codex-mcp-hook-parity branch from 76cb880 to 8be0a25 Compare April 25, 2026 18:39
@pashpashpash

Copy link
Copy Markdown
Contributor Author

Addressed the build-metadata edge in 8be0a253c4. Same-base build-suffixed versions now fail the stable 0.124.0 floor, with tests for 0.124.0+alpha.2 rejection and 0.125.0+custom acceptance.

@pashpashpash pashpashpash force-pushed the pash/codex-mcp-hook-parity branch from 8be0a25 to 830aa83 Compare April 25, 2026 18:51
@openclaw-barnacle openclaw-barnacle Bot added the cli CLI command changes label Apr 25, 2026
@pashpashpash pashpashpash force-pushed the pash/codex-mcp-hook-parity branch from 830aa83 to 4edf968 Compare April 25, 2026 18:54
@openclaw-barnacle openclaw-barnacle Bot removed the cli CLI command changes label Apr 25, 2026
@pashpashpash

Copy link
Copy Markdown
Contributor Author

Follow-up after moving the Codex floor to 0.125.0: the runtime gate, docs, and Codex live-smoke package pin now all point at 0.125.0.

I also addressed the payload-size concern from the security review by adding per-string and aggregate string budgets to native hook relay JSON validation. The other point about forwarding native tool results to plugin hooks is intentional for parity: OpenClaw after_tool_call plugins are privileged hook participants and need the tool result to enforce policy, audit, or maintain state. Redacting that path in the relay would make Codex-native tools weaker than OpenClaw-owned tools. If we want sanitized telemetry later, that should be a separate surface from the plugin hook contract.

@pashpashpash

Copy link
Copy Markdown
Contributor Author

I also checked a set of real third-party plugins against the MCP hook question.

The useful direct analogue is Knostic Shield: its L3 path registers before_tool_call, scans tool params for dangerous command/secret patterns, and blocks. I added a relay test for that exact class of behavior using a Codex MCP PreToolUse payload, so a native MCP call like mcp__shell__run_command with a destructive command is denied before execution through the same OpenClaw hook surface.

Most of the other sampled plugins are not native MCP hook consumers. WeCom uses before_tool_call, but only to rewrite OpenClaw's message dynamic tool params. MCP Adapter exposes external MCP servers as OpenClaw-owned tools, so it exercises the dynamic-tool parity path rather than the Codex-native MCP relay. A2A Gateway, Codex App Server bridge, Web Search Plus, and Apify are useful broader contract tests, but they do not prove native MCP hook interception.

@pashpashpash pashpashpash force-pushed the pash/codex-mcp-hook-parity branch from 1e21826 to d1aa22f Compare April 25, 2026 19:07
@pashpashpash

Copy link
Copy Markdown
Contributor Author

Addressed the remaining Aisle relay DoS finding too. Native hook JSON validation now counts object key lengths against the relay string budget, and the PermissionRequest fallback key no longer joins every tool-input key into an unbounded string. The new test covers a large object-key payload on a native MCP PermissionRequest.

@pashpashpash

Copy link
Copy Markdown
Contributor Author

On the Aisle PermissionRequest fail-open note: I checked Codex core before changing behavior. Codex PermissionRequest hooks return an optional decision; if hooks return no decision, Codex falls through to its normal guardian/user approval path. In current Codex, tools/orchestrator.rs and mcp_tool_call.rs both treat None from run_permission_request_hooks as "continue to normal approval," not "allow."

So the relay intentionally returns a no-op when OpenClaw chooses to defer or approval is unavailable. That preserves the v1 contract of allow, deny, or defer to Codex's own approval path. I updated the relay comments/log wording to make this explicit. The object-key DoS finding was a real hardening issue and is fixed in the current head.

@pashpashpash

Copy link
Copy Markdown
Contributor Author

Current head addresses the two Aisle hardening findings that were reported against earlier commits.

  • Pending PermissionRequest decisions are now keyed by relay/run/tool identity plus a stable content fingerprint, so a reused tool id with different tool input cannot inherit an old pending decision.
  • Invocation history now retains bounded payload snapshots instead of full native hook payloads. The active invocation still receives the full validated payload, but retained debug history is capped.

I added regression coverage for both cases in src/agents/harness/native-hook-relay.test.ts.

@pashpashpash pashpashpash force-pushed the pash/codex-mcp-hook-parity branch 2 times, most recently from bf08b5b to 577ec1e Compare April 25, 2026 19:59
@steipete steipete force-pushed the pash/codex-mcp-hook-parity branch from 577ec1e to b1eab67 Compare April 25, 2026 20:01
@steipete

Copy link
Copy Markdown
Contributor

Codex maintainer pass pushed a rebased fixup on top of this PR.

What changed:

  • Rebased onto current main and resolved the obsolete channel-entry-contract.test.ts conflict by keeping the .cjs sidecar fix that already landed in fix(github-copilot): preserve reasoning IDs for Copilot Codex models #71684.
  • Added one more hardening fix for the Aisle JSON-budget finding: native hook relay validation now checks the pending work budget before enqueueing array/object children, so broad payloads cannot allocate an oversized pending stack before the 20k node cap trips.
  • Added regression coverage that proves values beyond the JSON node budget are not even read.

Local verification:

  • pnpm test src/agents/harness/native-hook-relay.test.ts extensions/codex/src/app-server/client.test.ts extensions/codex/src/app-server/native-hook-relay.test.ts src/plugin-sdk/channel-entry-contract.test.ts
  • pnpm tsgo:core:test && pnpm tsgo:extensions:test
  • targeted core + extension oxlint on touched files
  • pnpm check:changed

@pashpashpash pashpashpash force-pushed the pash/codex-mcp-hook-parity branch from b1eab67 to fd83a7b Compare April 25, 2026 20:11
@pashpashpash pashpashpash force-pushed the pash/codex-mcp-hook-parity branch from fd83a7b to c7b4402 Compare April 25, 2026 20:24
@aisle-research-bot

aisle-research-bot Bot commented Apr 25, 2026

Copy link
Copy Markdown

🔒 Aisle Security Analysis

We found 1 potential security issue(s) in this PR:

# Severity Title
1 🟡 Medium DoS via unbounded key sorting/hashing of attacker-controlled toolInput in native hook relay permission requests
1. 🟡 DoS via unbounded key sorting/hashing of attacker-controlled toolInput in native hook relay permission requests
Property Value
Severity Medium
CWE CWE-400
Location src/agents/harness/native-hook-relay.ts:506-566

Description

The native hook relay builds a permission-approval deduplication key by fingerprinting request.toolInput. This fingerprinting performs full key enumeration and sorting on attacker-controlled objects.

  • permissionRequestToolInputKeyFingerprint() calls Object.keys(toolInput).toSorted() before applying the MAX_PERMISSION_FALLBACK_KEYS/char truncation. An attacker can send a tool_input object with ~20k keys (still within the JSON node budget) and force allocation + sorting of a large string array.
  • permissionRequestContentFingerprint() calls updateJsonHash(), which recursively hashes toolInput and again sorts keys for every object via Object.keys(value).toSorted(). This work occurs before the per-relay approval budget check, so an attacker can repeatedly invoke permission_request events to consume CPU even if approvals are later deferred.

Vulnerable code (hot path for permission_request):

for (const key of Object.keys(toolInput).toSorted()) {
  if (processed >= MAX_PERMISSION_FALLBACK_KEYS) break;
  ...
}

for (const key of Object.keys(value).toSorted()) {
  updateJsonHash(hash, value[key]);
}

Impact: CPU and memory exhaustion (algorithmic complexity) by sending JSON payloads with very large numbers of object keys / nested objects.

Recommendation

Avoid sorting/enumerating all keys of attacker-controlled objects.

Recommended changes:

  1. Key fingerprinting: stop after a bounded number of keys without sorting the full set. For example, iterate with for...in/Object.keys() but cap before sorting, or collect up to MAX_PERMISSION_FALLBACK_KEYS keys and then sort just that subset.
function permissionRequestToolInputKeyFingerprint(toolInput: Record<string, unknown>): string {
  const keys: string[] = [];
  for (const key in toolInput) {
    if (!Object.prototype.hasOwnProperty.call(toolInput, key)) continue;
    keys.push(key);
    if (keys.length >= MAX_PERMISSION_FALLBACK_KEYS) break;
  }
  keys.sort();// then apply MAX_PERMISSION_FALLBACK_KEY_CHARS truncation
  ...
}
  1. Content fingerprinting: either
  • impose a per-object key cap during hashing (and encode truncation into the hash), or
  • hash keys without sorting (if deterministic ordering is not strictly required), or
  • reuse the already-validated traversal from isJsonValue while tracking limits.

Also consider moving expensive fingerprinting behind consumeNativeHookRelayPermissionBudget() if it is only needed for deduplication of pending approvals.


Analyzed PR: #71707 at commit c7b4402

Last updated on: 2026-04-25T20:27:27Z

@steipete steipete merged commit 34fb966 into main Apr 25, 2026
66 of 67 checks passed
@steipete steipete deleted the pash/codex-mcp-hook-parity branch April 25, 2026 20:35
@steipete

Copy link
Copy Markdown
Contributor

Landed. Thanks @pashpashpash!

  • Source SHA: c7b4402
  • Merge commit: 34fb966
  • Gate: PR CI green on source SHA; local pnpm check:changed; focused relay/Codex/plugin-sdk tests; core and extension type/lint checks. The transient parity-gate red was a QA session lock in the Opus mock half and cleared on rerun.

ayesha-aziz123 pushed a commit to ayesha-aziz123/openclaw that referenced this pull request Apr 26, 2026
* codex harness mcp hook parity

* tighten codex hook parity floor

* prove security-style mcp hook blocking

* bound native hook relay key handling

* clarify permission relay defers to provider

* harden native hook relay approvals

* fix(agents): bound native hook relay JSON work budget

---------

Co-authored-by: Peter Steinberger <steipete@gmail.com>
ogt-redknie pushed a commit to ogt-redknie/OPENX that referenced this pull request May 2, 2026
* codex harness mcp hook parity

* tighten codex hook parity floor

* prove security-style mcp hook blocking

* bound native hook relay key handling

* clarify permission relay defers to provider

* harden native hook relay approvals

* fix(agents): bound native hook relay JSON work budget

---------

Co-authored-by: Peter Steinberger <steipete@gmail.com>
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 9, 2026
* codex harness mcp hook parity

* tighten codex hook parity floor

* prove security-style mcp hook blocking

* bound native hook relay key handling

* clarify permission relay defers to provider

* harden native hook relay approvals

* fix(agents): bound native hook relay JSON work budget

---------

Co-authored-by: Peter Steinberger <steipete@gmail.com>
globalcaos pushed a commit to globalcaos/tinkerclaw that referenced this pull request May 13, 2026
* codex harness mcp hook parity

* tighten codex hook parity floor

* prove security-style mcp hook blocking

* bound native hook relay key handling

* clarify permission relay defers to provider

* harden native hook relay approvals

* fix(agents): bound native hook relay JSON work budget

---------

Co-authored-by: Peter Steinberger <steipete@gmail.com>
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 24, 2026
* codex harness mcp hook parity

* tighten codex hook parity floor

* prove security-style mcp hook blocking

* bound native hook relay key handling

* clarify permission relay defers to provider

* harden native hook relay approvals

* fix(agents): bound native hook relay JSON work budget

---------

Co-authored-by: Peter Steinberger <steipete@gmail.com>
jameslcowan pushed a commit to jameslcowan/openclaw that referenced this pull request Jun 2, 2026
* codex harness mcp hook parity

* tighten codex hook parity floor

* prove security-style mcp hook blocking

* bound native hook relay key handling

* clarify permission relay defers to provider

* harden native hook relay approvals

* fix(agents): bound native hook relay JSON work budget

---------

Co-authored-by: Peter Steinberger <steipete@gmail.com>
sablehead pushed a commit to sablehead/openclaw that referenced this pull request Jun 10, 2026
* codex harness mcp hook parity

* tighten codex hook parity floor

* prove security-style mcp hook blocking

* bound native hook relay key handling

* clarify permission relay defers to provider

* harden native hook relay approvals

* fix(agents): bound native hook relay JSON work budget

---------

Co-authored-by: Peter Steinberger <steipete@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling docs Improvements or additions to documentation extensions: codex gateway Gateway runtime maintainer Maintainer-authored PR size: L

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants