fix(codex): quarantine unreadable dynamic tool descriptors by vincentkoc · Pull Request #89397 · openclaw/openclaw

vincentkoc · 2026-06-02T09:16:28Z

Summary

Quarantines unreadable Codex app-server dynamic tool descriptors before bridge registration builds specs, maps, or wrappers.
Snapshots readable descriptor fields once so a plugin getter cannot pass schema projection and crash later during Codex tool spec construction.
Preserves healthy sibling tools and existing dynamic-tool quarantine warnings/diagnostics.
Out of scope: changing plugin SDK registration contracts or provider-side schema normalization.

Linked context

Related to the active runtime tool-schema fuzzing sweep for unsupported plugin tool schemas crashing turns before assistant content.

Real behavior proof (required for external PRs)

Behavior or issue addressed: A malformed Codex dynamic tool descriptor can no longer crash bridge registration before the assistant turn reaches content.
Real environment tested: Local OpenClaw Codex extension Vitest harness in a Codex worktree.
Exact steps or command run after this patch: node scripts/run-vitest.mjs extensions/codex/src/app-server/dynamic-tools.test.ts -- --reporter=dot; AWS Crabbox fresh-PR pnpm check:changed
Evidence after fix (screenshot, recording, terminal capture, console output, redacted runtime log, linked artifact, or copied live output): extensions/codex/src/app-server/dynamic-tools.test.ts passed, 37 tests.
Observed result after fix: unreadable dynamic tool name, parameters, and description descriptors are quarantined; a one-shot readable name is captured once and reused without a later getter crash; healthy message remains registered.
What was not tested: live Telegram/Codex app-server turn with an installed hostile plugin.
Proof limitations or environment constraints: local disk is very low, so broad changed proof ran remotely instead of in the local Codex worktree.
Before evidence (optional but encouraged): autoreview confirmed the earlier guarded projection still crashed when downstream registration reread raw descriptors.

Tests and validation

./node_modules/.bin/oxfmt --check --threads=1 extensions/codex/src/app-server/dynamic-tools.ts extensions/codex/src/app-server/dynamic-tools.test.ts
git diff --check origin/main...HEAD
node scripts/run-vitest.mjs extensions/codex/src/app-server/dynamic-tools.test.ts -- --reporter=dot
.agents/skills/autoreview/scripts/autoreview --mode local
AWS Crabbox fresh-PR changed gate:
/opt/homebrew/bin/crabbox run --provider aws --fresh-pr openclaw/openclaw#89397 --idle-timeout 30m --ttl 90m --timing-json --shell -- "set -e; if ! command -v node >/dev/null 2>&1; then curl -fsSL https://deb.nodesource.com/setup_24.x | sudo -E bash -; sudo apt-get install -y nodejs; fi; node -v; npm -v; sudo corepack enable; env OPENCLAW_CHECK_CHANGED_REMOTE_CHILD=1 OPENCLAW_CHANGED_LANES_RAW_SYNC=1 corepack pnpm check:changed" passed. Provider: aws; run: run_b533b3b7e58c; lease: cbx_8180c941abce; slug: brisk-krill; exit 0; lanes: extensions, extensionTests; command 4m26.143s; total 4m42.141s; lease stopped.

Regression coverage added:

createCodexDynamicToolBridge(...) quarantines unreadable dynamic tool descriptor fields before bridge registration.
The same regression verifies a one-shot dynamic tool name is not reread during spec construction.

Known validation note:

Targeted local extension oxlint currently reports pre-existing sparse/type-resolution errors in extensions/codex/src/app-server/dynamic-tools.ts, so lint/typecheck proof is being routed through remote changed validation.

Risk checklist

Did user-visible behavior change? (Yes/No) Yes.
Did config, environment, or migration behavior change? (Yes/No) No.
Did security, auth, secrets, network, or tool execution behavior change? (Yes/No) Yes, tool registration now quarantines malformed dynamic descriptors earlier.
What is the highest-risk area? Accidentally changing Codex dynamic tool registration/wrapping semantics.
How is that risk mitigated? Existing dynamic-tool tests still pass, including direct/searchable registration and wrapped-tool behavior; the fix snapshots only registration descriptors and keeps execution dispatch tied to the captured name.

Current review state

Next action: maintainer review.
Still waiting on: nothing known from author.
Bot or reviewer comments addressed: local autoreview first found the raw-descriptor reread issue; the patch now snapshots descriptor fields and final autoreview is clean.

clawsweeper · 2026-06-02T09:24:10Z

Codex review: needs changes before merge. Reviewed June 2, 2026, 5:29 AM ET / 09:29 UTC.

Summary
The PR snapshots Codex dynamic tool descriptor fields before bridge registration and adds regression coverage for unreadable descriptor getters.

PR surface: Source +121, Tests +84. Total +205 across 2 files.

Reproducibility: yes. from source inspection: current main reads dynamic tool descriptor getters directly during projection/spec construction, so throwing getters can abort bridge registration. I did not run the targeted Vitest in this read-only review.

Review metrics: none identified.

Merge readiness
Overall: 🦐 gold shrimp
Proof: 🌊 off-meta tidepool
Patch quality: 🦐 gold shrimp
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

Preserve original tool metadata through descriptor snapshotting and wrapping.
[P2] Add a regression for plugin owner diagnostics or code-mode marker preservation, then rerun the targeted Codex dynamic-tools test.

Risk before merge

[P1] Merging as-is can make unwrapped plugin or channel dynamic tools emit tool.execution diagnostics as core tools, and code-mode control tools can lose their WeakSet marker before before_tool_call normalization.

Maintainer options:

Preserve metadata before merge (recommended)
Keep the descriptor snapshots for Codex specs, but ensure the object passed through before_tool_call wrapping retains plugin/channel/code-mode metadata and add focused tests for that behavior.
Pause until owner review
If maintainers want a different tool snapshot contract, pause this PR until the Codex/plugin runtime owner confirms the permanent metadata-preservation shape.

Copy recommended automerge instruction

@clawsweeper automerge

Special instructions:
Preserve plugin, channel, and code-mode tool metadata when Codex dynamic tool descriptor snapshots are wrapped for execution; add regression coverage for plugin owner diagnostics and code-mode control marker preservation; rerun the Codex dynamic-tools targeted test.

Next step before merge

[P2] There is a narrow, actionable PR-branch repair: preserve object-keyed tool metadata while keeping descriptor snapshots for Codex spec construction.

Security
Needs attention: The patch touches tool execution registration and currently risks dropping owner/kind metadata used by security-relevant diagnostics and hook normalization.

Review findings

[P2] Preserve tool metadata when wrapping snapshots — extensions/codex/src/app-server/dynamic-tools.ts:115-117

Review details

Best possible solution:

Keep the early descriptor quarantine, but preserve execution-object metadata and add a regression that proves plugin/channel owner diagnostics and code-mode marker behavior survive descriptor snapshotting.

Do we have a high-confidence way to reproduce the issue?

Yes from source inspection: current main reads dynamic tool descriptor getters directly during projection/spec construction, so throwing getters can abort bridge registration. I did not run the targeted Vitest in this read-only review.

Is this the best way to solve the issue?

No as written: the quarantine belongs at the Codex bridge boundary, but the execution snapshot must preserve metadata that current wrappers derive from the original tool object.

Full review comments:

[P2] Preserve tool metadata when wrapping snapshots — extensions/codex/src/app-server/dynamic-tools.ts:115-117
The new snapshot object is what gets passed into wrapToolWithBeforeToolCallHook, but wrapper behavior depends on metadata stored on the original tool object. Plugin/channel owner metadata is WeakMap-backed and code-mode control markers are WeakSet-backed, so wrapping the fresh snapshot can reclassify plugin tools as core in tool.execution.* diagnostics and skip code-mode exec param normalization compared with current main wrapping the original tool. Preserve or copy that metadata before wrapping, or keep the original execution object and use the snapshot only for Codex spec fields.
Confidence: 0.87

Overall correctness: patch is incorrect
Overall confidence: 0.88

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 601ab84f35e1.

Label changes

Label changes:

add P2: The PR addresses a real Codex dynamic-tool crash path, but the remaining blocker has limited blast radius and is confined to tool registration/wrapping behavior.
add merge-risk: 🚨 security-boundary: The diff can drop tool owner and code-mode execution metadata before the before_tool_call/diagnostic wrapper observes the tool.
add rating: 🦐 gold shrimp: Overall readiness is 🦐 gold shrimp; proof is 🌊 off-meta tidepool and patch quality is 🦐 gold shrimp.
add status: ⏳ waiting on author: ClawSweeper has contributor-facing work open and is waiting for author action. Not applicable: The external-contributor proof gate does not apply to this maintainer PR; the body reports targeted Vitest and Crabbox changed-gate validation as supplemental proof.

Label justifications:

P2: The PR addresses a real Codex dynamic-tool crash path, but the remaining blocker has limited blast radius and is confined to tool registration/wrapping behavior.
merge-risk: 🚨 security-boundary: The diff can drop tool owner and code-mode execution metadata before the before_tool_call/diagnostic wrapper observes the tool.
rating: 🦐 gold shrimp: Overall readiness is 🦐 gold shrimp; proof is 🌊 off-meta tidepool and patch quality is 🦐 gold shrimp.
status: ⏳ waiting on author: ClawSweeper has contributor-facing work open and is waiting for author action. Not applicable: The external-contributor proof gate does not apply to this maintainer PR; the body reports targeted Vitest and Crabbox changed-gate validation as supplemental proof.

Evidence reviewed

PR surface:

Source +121, Tests +84. Total +205 across 2 files.

View PR surface stats

Area	Files	Added	Removed	Net
Source	1	142	21	+121
Tests	1	84	0	+84
Docs	0	0	0	0
Config	0	0	0	0
Generated	0	0	0	0
Other	0	0	0	0
Total	2	226	21	+205

Security concerns:

[medium] Tool identity metadata can be dropped — extensions/codex/src/app-server/dynamic-tools.ts:115
The execution wrapper is built around a fresh snapshot object, while plugin ownership and code-mode-control state are object-keyed metadata on the original tool; losing them can weaken audit classification and code-mode before_tool_call normalization.
Confidence: 0.84

Acceptance criteria:

[P1] node scripts/run-vitest.mjs extensions/codex/src/app-server/dynamic-tools.test.ts -- --reporter=dot.
[P1] node scripts/run-vitest.mjs extensions/codex/src/app-server/openclaw-owned-tool-runtime-contract.test.ts -- --reporter=dot.

What I checked:

Current main crash path: Current main projects dynamic tools by reading tool.parameters and tool.name directly, so an unreadable descriptor getter can throw during bridge creation before any sibling tool can remain registered. (extensions/codex/src/app-server/dynamic-tools.ts:315, 601ab84f35e1)
PR snapshot/wrap path: The PR wraps projectedTool.tool after projectedTool.tool has been replaced with a fresh snapshot object, so wrapper metadata is resolved from the snapshot rather than the original tool object. (extensions/codex/src/app-server/dynamic-tools.ts:115, a33d86e20ecb)
Wrapper metadata contract: wrapToolWithBeforeToolCallHook resolves diagnostic identity and code-mode hook metadata from the tool object, then copies plugin/channel metadata from that same object onto the wrapper. (src/agents/agent-tools.before-tool-call.ts:1094, 601ab84f35e1)
Weak metadata source: Plugin tool ownership is stored in a WeakMap keyed by the original tool object, so replacing the object without copying metadata loses plugin owner classification. (src/plugins/tools.ts:77, 601ab84f35e1)
Codex dynamic tool contract: Upstream Codex requires dynamic tool specs to include name, description, and inputSchema and validates names/schemas before converting them into core DynamicToolSpec entries. (../codex/codex-rs/app-server/src/request_processors/thread_processor.rs:204, 67b805fc1117)

Likely related people:

WJzz1: Current blame attributes the Codex dynamic tool bridge and wrapper/helper lines in this checkout snapshot to this author, making them a useful routing candidate for the immediate surface. (role: recent area contributor; confidence: medium; commits: 6349af650240; files: extensions/codex/src/app-server/dynamic-tools.ts, src/agents/agent-tools.before-tool-call.ts)
steipete: Git history shows the Codex app-server dynamic-tools module was added/moved in a recent Codex app-server commit by Peter Steinberger. (role: Codex app-server feature history; confidence: medium; commits: d5698038d71c; files: extensions/codex/src/app-server/dynamic-tools.ts, extensions/codex/src/app-server/run-attempt.ts)
vincentkoc: Prior merged history includes plugin provenance and live hook registry fixes, which are directly adjacent to preserving tool identity metadata across runtime wrapping. (role: adjacent plugin/runtime provenance owner; confidence: medium; commits: 982383338373, d22279d2e873; files: src/plugins/tools.ts, src/plugins/runtime.ts, src/plugins/hooks.ts)

What the crustacean ranks mean

🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works

ClawSweeper keeps one durable marker-backed review comment per issue or PR.
Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
Maintainers can also comment @clawsweeper review to request a fresh review only.
Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

vincentkoc · 2026-06-08T01:30:43Z

Closing this as folded into the landed canonical PR: #90022

Reason: this was an intermediate schema-surface draft for the same issue. The merged PR is the cleaner superset: it preserves healthy tool execution, quarantines unreadable/invalid descriptor and wrapper failures, and prevents Codex from receiving registered durable specs for tools that are unavailable in the current turn.

Landed squash commit: 3ffb360

fix(codex): quarantine unreadable dynamic tool descriptors

a33d86e

vincentkoc self-assigned this Jun 2, 2026

openclaw-barnacle Bot added extensions: codex size: M maintainer Maintainer-authored PR labels Jun 2, 2026

clawsweeper Bot added rating: 🦐 gold shrimp Decent PR readiness signal, but merge confidence is limited. status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action. labels Jun 2, 2026

vincentkoc mentioned this pull request Jun 8, 2026

fix(codex): quarantine unreadable dynamic tools #90022

Merged

vincentkoc closed this Jun 8, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(codex): quarantine unreadable dynamic tool descriptors#89397

fix(codex): quarantine unreadable dynamic tool descriptors#89397
vincentkoc wants to merge 1 commit into
mainfrom
fuzz-tool-schema-surfaces-next13-20260602

vincentkoc commented Jun 2, 2026 •

edited

Loading

Uh oh!

clawsweeper Bot commented Jun 2, 2026 •

edited

Loading

Uh oh!

vincentkoc commented Jun 8, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

vincentkoc commented Jun 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Linked context

Real behavior proof (required for external PRs)

Tests and validation

Risk checklist

Current review state

Uh oh!

clawsweeper Bot commented Jun 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

vincentkoc commented Jun 8, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

vincentkoc commented Jun 2, 2026 •

edited

Loading

clawsweeper Bot commented Jun 2, 2026 •

edited

Loading