Bug type
Regression (worked before, now fails)
Beta release blocker
No
Summary
In OpenClaw 2026.5.24-beta.2, a native Codex tool invocation can be recorded as tool.call without a matching tool.result in the persisted transcript/trajectory. This has been observed when the Codex turn is denied, interrupted by a gateway restart, or finishes with a terminal assistant message before the tool result is emitted.
This appears to be a runtime invariant bug, not a configuration issue. The safety behavior should be: if a tool call was started but no real result is available, OpenClaw should synthesize a failed tool.result, mark the turn/job as failed, and preserve terminal proof for cron/reporting consumers.
Steps to reproduce
- Run an OpenClaw cron job or isolated agent turn that uses the Codex native runtime and invokes a local command/tool.
- Cause the turn to be denied or interrupted before the native tool result is emitted. Observed cases include:
  - a denied shell/tool command that results in final assistant text containing SYSTEM_RUN_DENIED
  - a gateway restart while the cron session has already emitted tool.call
- Inspect the persisted cron/session trajectory.
- Observe that tool.call exists, but no matching tool.result exists for the same tool call id.
- Observe that downstream cron/read-model/reporting logic has incomplete terminal proof and cannot reliably distinguish “runtime failed before producing proof” from “tool completed but report was lost”.
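The inspection step can be automated with a small check over the persisted events. This is a minimal sketch, not OpenClaw code: the event shape (`type`, `toolCallId`) and the function name `findUnmatchedToolCalls` are assumptions made for illustration, and the real trajectory schema may differ.

```typescript
// Hypothetical event shape; the real persisted schema may differ.
type TrajectoryEvent = {
  type: string; // e.g. "session.started", "tool.call", "tool.result"
  toolCallId?: string; // assumed to be present on tool.call and tool.result
};

// Returns the ids of tool.call events that have no tool.result with the same id.
function findUnmatchedToolCalls(events: TrajectoryEvent[]): string[] {
  const resolved = new Set<string>();
  for (const e of events) {
    if (e.type === "tool.result" && e.toolCallId !== undefined) {
      resolved.add(e.toolCallId);
    }
  }
  const unmatched: string[] = [];
  for (const e of events) {
    if (
      e.type === "tool.call" &&
      e.toolCallId !== undefined &&
      !resolved.has(e.toolCallId)
    ) {
      unmatched.push(e.toolCallId);
    }
  }
  return unmatched;
}

// Mirrors the interrupted run described in this report:
// session.started, then tool.call, then nothing else.
const interruptedRun: TrajectoryEvent[] = [
  { type: "session.started" },
  { type: "tool.call", toolCallId: "call-1" },
];

const unmatchedIds = findUnmatchedToolCalls(interruptedRun);
```

A non-empty return value for a finalized turn indicates the invariant was violated for that run.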
Expected behavior
Every persisted tool.call should have exactly one matching terminal tool.result before the turn result is finalized.
If the real tool result cannot be emitted because the turn was denied, interrupted, or terminated, OpenClaw should fail closed by recording a synthetic failed tool.result, for example:
status: failed
reason: missing_tool_result
error: OpenClaw recorded a tool.call without a matching tool.result before the Codex turn completed
The turn/job should then be classified as failed or unproven, and cron/reporting/dashboard consumers should have durable terminal proof instead of an incomplete trajectory.
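One possible shape for this fail-closed step is sketched below. It is an illustration under assumptions, not the actual runtime code: the type names, the `toolCallId` field, and the functions `syntheticFailedResult` and `failClosed` are hypothetical. Only the status, reason, and error values come from the example above.

```typescript
// Hypothetical event shapes; the real OpenClaw transcript schema may differ.
type ToolCall = { type: "tool.call"; toolCallId: string };
type ToolResult = {
  type: "tool.result";
  toolCallId: string;
  status: "ok" | "failed";
  reason?: string;
  error?: string;
};
type ToolEvent = ToolCall | ToolResult;

// Builds the synthetic failed record with the field values proposed in this report.
function syntheticFailedResult(toolCallId: string): ToolResult {
  return {
    type: "tool.result",
    toolCallId,
    status: "failed",
    reason: "missing_tool_result",
    error:
      "OpenClaw recorded a tool.call without a matching tool.result before the Codex turn completed",
  };
}

// Appends one synthetic failed result per unanswered tool.call and
// reports whether the turn must be treated as failed.
function failClosed(events: ToolEvent[]): {
  events: ToolEvent[];
  turnFailed: boolean;
} {
  const answered = new Set(
    events.filter((e) => e.type === "tool.result").map((e) => e.toolCallId)
  );
  const synthesized = events
    .filter((e) => e.type === "tool.call" && !answered.has(e.toolCallId))
    .map((e) => syntheticFailedResult(e.toolCallId));
  const all = [...events, ...synthesized];
  const turnFailed = all.some(
    (e) => e.type === "tool.result" && e.status === "failed"
  );
  return { events: all, turnFailed };
}
```

Running such a step just before the turn result is finalized would give every persisted tool.call a terminal record. The sketch only covers the missing-result case; it does not reject duplicate results for the same id.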
Actual behavior
OpenClaw can persist a tool.call without a matching tool.result.
Observed outcomes:
- cron detects SYSTEM_RUN_DENIED in final assistant text and marks the job as failed
- another cron run is marked failed because it was interrupted by gateway restart
- the persisted session trajectory contains tool.call but no matching tool.result
- there is no model.completed / session.ended proof in at least one interrupted case
- downstream reporting may have no user-visible failure report because no terminal runtime proof exists
- dashboard/read-model layers can only classify the symptom after the fact; they cannot reconstruct the missing tool result invariant
OpenClaw version
2026.5.24-beta.2
Operating system
macOS 26.5
Install method
npm global
Model
openai/gpt-5.5
Provider / routing chain
openclaw -> codex -> openai/gpt-5.5
Additional provider/model setup details
No response
Logs, screenshots, and evidence
Impact and severity
Severity: High for cron/workflow reliability.
Impact:
- Cron jobs can fail without durable terminal tool proof.
- User-visible failure reports can be lost because reporting has no completed runtime result to report from.
- Read models and dashboards must infer from incomplete state instead of relying on a strict transcript invariant.
- Automation can become ambiguous after gateway restarts or denied tool invocations.
- Operators may be tempted to bypass cron with direct launchers, but the correct fix is to restore the invariant in the shared Codex/OpenClaw runtime layer.
This is especially risky for approval/publish/report workflows, where OpenClaw must distinguish successful publish, failed publish, interrupted runtime, and missing proof without guessing.
Additional information
Environment
- OpenClaw version: 2026.5.24-beta.2
- Runtime: OpenAI Codex native runtime via OpenClaw app-server
- Affected area: cron / isolated agent turns / Codex native tool projection / persisted transcript and trajectory
- Observed with scheduled approval/content automation jobs that invoke local runtime wrappers through the normal cron-owned path
Logs, screenshots, and evidence
Cron warning example:
warn cron {"module":"cron"} {"jobName":"agent-ops-approval-daily","error":"cron classifier: denial token \"SYSTEM_RUN_DENIED\" detected in final text","diagnosticsSummary":"cron classifier: denial token \"SYSTEM_RUN_DENIED\" detected in final text"}
cron: job run returned error status
Another observed cron run state:
jobName: agent-ops-approval-daily
status: error
error: cron: job interrupted by gateway restart
deliveryStatus: unknown
Trajectory evidence from the interrupted run:
session.started
tool.call emitted for the expected command
no matching tool.result
no model.completed
no session.ended
Observed denial final text shape:
SYSTEM_RUN_DENIED ... did not run the owned wrapper
Local diagnostic patch direction:
Fail closed on missing Codex tool results
The patch approach was to update the Codex app-server event projector so terminal result construction checks for any recorded tool call without a matching tool result. Missing results are synthesized as failed tool.result records, and the turn is marked fail-closed with SYSTEM_RUN_DENIED / missing_tool_result proof.