fix(codex): quarantine unreadable dynamic tools#89063
Conversation
|
Codex review: needs changes before merge. Reviewed June 1, 2026, 7:33 AM ET / 11:33 UTC. Summary PR surface: Source +83, Tests +29. Total +112 across 2 files. Reproducibility: yes. source-reproducible: current main reads tool.parameters and tool.name directly while constructing the bridge, so a throwing accessor can abort startup before any quarantine path runs. I did not run the repro because this review was kept read-only. Review metrics: none identified. Merge readiness Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch. Rank-up moves:
Risk before merge
Maintainer options:
Next step before merge
Security Review findings
Review detailsBest possible solution: Keep the quarantine design, but make the bridge wrap or execute tools through a stable projected descriptor so name, description, and parameters are not re-read after validation, then add a regression test for a throwing description or one-shot parameters getter. Do we have a high-confidence way to reproduce the issue? Yes, source-reproducible: current main reads tool.parameters and tool.name directly while constructing the bridge, so a throwing accessor can abort startup before any quarantine path runs. I did not run the repro because this review was kept read-only. Is this the best way to solve the issue? No, not yet: quarantine-before-registration is the right layer, but the proposed implementation still re-reads descriptor accessors through wrapToolWithBeforeToolCallHook. The safer fix is to pass a stable projected tool shape into wrapping/spec creation or teach the wrapper to use already-read descriptor fields. Full review comments:
Overall correctness: patch is incorrect AGENTS.md: found and applied where relevant. Codex review notes: model gpt-5.5, reasoning high; reviewed against a595aba60e3b. Label changesLabel changes:
Label justifications:
Evidence reviewedPR surface: Source +83, Tests +29. Total +112 across 2 files. View PR surface stats
Acceptance criteria:
What I checked:
Likely related people:
What the crustacean ranks mean
Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics. How this review workflow works
|
|
Closing this as folded into the landed canonical PR: #90022 Reason: this draft was an earlier route for the same Codex dynamic-tool hardening, but the review blocker was that wrapped execution could still re-read unsafe descriptors. The merged PR fixes the bridge boundary by caching readable descriptor data, quarantining wrapping-time failures, and filtering specs for unavailable tools. Landed squash commit: 3ffb360 |
Summary
Verification
node scripts/run-vitest.mjs extensions/codex/src/app-server/dynamic-tools.test.ts --reporter=dotnode_modules/.bin/oxlint extensions/codex/src/app-server/dynamic-tools.ts extensions/codex/src/app-server/dynamic-tools.test.tsgit diff --check HEAD~1..HEADnode --import tsxrepro: bad dynamic toolparametersgetter no longer abortscreateCodexDynamicToolBridge; observed available tool listmessageand quarantine reasonbad_dynamic_probe.inputSchema could not be read: parameters getter exploded.agents/skills/autoreview/scripts/autoreview --mode branch --base origin/mainReal behavior proof
Behavior addressed: Codex dynamic-tool startup can continue with healthy tools when a plugin/runtime tool exposes an unreadable schema accessor.
Real environment tested: local OpenClaw Codex extension test lane via repo wrapper; direct
tsxbridge-construction repro.Exact steps or command run after this patch:
node scripts/run-vitest.mjs extensions/codex/src/app-server/dynamic-tools.test.ts --reporter=dot; directnode --import tsxrepro invokingcreateCodexDynamicToolBridge;node scripts/crabbox-wrapper.mjs run --provider aws --idle-timeout 90m --ttl 240m --timing-json --shell -- "pnpm check:changed".Evidence after fix: focused test passed, 1 file / 37 tests; direct repro printed
messageplus the expected quarantine reason; oxlint and diff check passed; autoreview reported no accepted/actionable findings.Observed result after fix: the bridge exposes the healthy
messagetool and records the bad dynamic tool intelemetry.quarantinedToolsinstead of throwing during startup.What was not tested: remote
pnpm check:changeddid not complete. Blacksmith Testbox is not authenticated in this shell. AWS Crabbox did not reach a remote lease because the wrapper needed a temporary full checkout for sync and failed locally withNo space left on device.