fix(agents): suppress Write/Edit failed warning on response-timeout false-failure (#55424)#86855
Conversation
|
Codex review: needs changes before merge. Reviewed May 27, 2026, 4:00 AM ET / 08:00 UTC. Summary PR surface: Source +256, Tests +387. Total +643 across 10 files. Reproducibility: yes. for the PR-introduced blocker: source inspection shows the outer host guard validates the original Review metrics: 1 noteworthy metric.
Merge readiness Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch. Rank-up moves:
Risk before merge
Maintainer options:
Copy recommended automerge instructionNext step before merge Security Review findings
Review detailsBest possible solution: Keep the readback-confirmed timeout recovery, but make recovery consume the same guarded path flow as the workspace root guard or skip file URL recovery unless the guard has safely mapped it; add workspaceOnly regression coverage for that boundary. Do we have a high-confidence way to reproduce the issue? Yes for the PR-introduced blocker: source inspection shows the outer host guard validates the original Is this the best way to solve the issue? No, not yet. Readback-confirmed recovery is the right shape, but this implementation must first reuse the guarded path flow or skip unsafe file URL recovery under workspaceOnly. Full review comments:
Overall correctness: patch is incorrect AGENTS.md: found and applied where relevant. Codex review notes: model gpt-5.5, reasoning high; reviewed against c95d348bb5ef. Label changesLabel justifications:
Evidence reviewedPR surface: Source +256, Tests +387. Total +643 across 10 files. View PR surface stats
Security concerns:
Acceptance criteria:
What I checked:
Likely related people:
What the crustacean ranks mean
Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics. How this review workflow works
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a3d9de9692
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| const isFileWriteResponseTimeoutFalseFailure = | ||
| params.lastToolError.timedOut === true && params.lastToolError.fileTarget !== undefined; | ||
| if (isFileWriteResponseTimeoutFalseFailure) { | ||
| return { showWarning: false, includeDetails }; |
There was a problem hiding this comment.
Preserve timeout warning when no user-facing reply exists
This new gate suppresses warnings for any timedOut && fileTarget error before considering whether the user received any assistant output. In turns where a write/edit call times out and the assistant produces no user-facing reply, buildEmbeddedRunPayloads now returns no error payload at all, so users get silence instead of the previous warning path (showWarning fallback when hasUserFacingReply is false). That can hide real or ambiguous file-mutation failures, not just the false-failure case this change targets.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Thanks for the concern — wanted to lay out the technical reasoning here:
Why the gate is intentionally before hasUserFacingReply:
This is the response-timeout false-failure class (#55424). The mutation actually completed on disk; only the tool-ack reply timed out. Showing "Write failed" for a successful write is the bug.
Why fileTarget !== undefined is the right boundary, not a leaky filter:
-
extractFileTarget()(src/agents/tool-mutation.ts:253-266) only returns a value for thewrite/editfamily (FILE_MUTATING_TOOL_NAMES = {"edit", "write"}). Exec, message, cron, gateway, sessions_send timeouts never match — they keep the existing visible-warning path (covered by the new "still shows exec tool errors when timedOut is true" regression). -
The narrow combination is
timedOut === true && fileTarget !== undefined. A real failed write (permission denied, ENOSPC, EACCES) does not settimedOut: true; those errors keep the existing mutating-error warning path at lines 196-201.
The remaining edge — "disk write timed out AND no assistant reply":
For write/edit specifically, an ack-reply timeout still leaves the partial bytes on disk: the OS does not roll back a partial write when the JS-level ack times out. The reporter explicitly documented 8/8 affected writes completed successfully on disk on v2026.3.13 Windows 11. The product choice is: if the file is there, do not lie to the user that it failed.
If you'd prefer a tighter boundary, I can layer this with hasUserFacingReply:
const isFileWriteResponseTimeoutFalseFailure =
params.lastToolError.timedOut === true &&
params.lastToolError.fileTarget !== undefined &&
params.hasUserFacingReply;That would preserve the warning only in the narrow case where ack-timed-out AND no assistant text exists. Happy to push that variant if you'd prefer the stricter shape — let me know and I'll add a regression test for the "no reply + write timeout" path.
Verification of current shape: pnpm test src/agents/pi-embedded-runner/run/payloads.errors.test.ts — 33 passed (4 new cases pin write+fileTarget suppression, write-without-fileTarget still warns, exec timeout still warns).
|
ClawSweeper PR egg 🔥 Warming up: real-behavior proof passed; findings, security review, or rank-up moves are still in progress. Hatch commandComment Hatchability rules:
What is this egg doing here?
|
a3d9de9 to
d8c0e7f
Compare
d8c0e7f to
d10cf26
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d10cf265be
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if (expanded.startsWith("file://")) { | ||
| try { | ||
| return fileURLToPath(expanded); |
There was a problem hiding this comment.
Keep file-URL recovery reads inside the workspace guard
When host file tools run with tools.fs.workspaceOnly=true, the outer wrapToolWorkspaceRootGuard does not parse file:// paths unless a container workdir is configured, so a path like file:///etc/passwd is validated as a relative literal under the workspace. This new normalization then converts the same argument to /etc/passwd inside the edit/write recovery wrappers before their pre-error stat/readFile, so an edit or write attempt with a local file URL can touch arbitrary host files before the underlying workspace-only operation rejects the write. Align the guard and recovery normalization, or avoid resolving file:// here unless it has already been mapped by the guarded path flow.
Useful? React with 👍 / 👎.
…alse-failure (openclaw#55424) Reporter sees '⚠️ Write failed' / '⚠️ Edit failed' warnings on Feishu (and other channels) even though the file was 100% saved successfully (8 of 8 verified writes succeeded; warning shown for all 8). Source path: tool-mutation records lastToolError.timedOut=true with a fileTarget when a write/edit tool ack reply times out after the disk mutation has already completed, then resolveToolErrorWarningPolicy goes through the default mutating-tool branch and emits the misleading failure summary. Add a narrow gate inside resolveToolErrorWarningPolicy that suppresses the warning only when both lastToolError.timedOut is true AND lastToolError.fileTarget is defined. fileTarget is set by tool-mutation.ts only for the write/edit family (FILE_MUTATING_TOOL_NAMES), so this branch never matches exec/message/cron/gateway mutating-tool timeouts where the disk-write idempotency reasoning does not apply. Real file failures (no timeout) and timeouts without recorded fileTarget keep their visible warnings.
d8a72de to
c5b9a53
Compare
|
Verification before merge: Behavior addressed: Write/Edit response-timeout false failures now suppress the user-facing failed-warning only when the completed file change is proven from the filesystem. Timed-out payloads that only attempted a target path still warn. Real environment tested: local macOS checkout on head c5b9a53, plus GitHub CI for the same SHA. Exact steps or command run after this patch: Evidence after fix: focused local tests passed, 138 tests across 3 shards. Full core support-boundary shard passed, 178 files passed, 1996 tests passed, 10 skipped. Autoreview reported clean with no accepted/actionable findings. GitHub Main CI run 26498401197 passed. GitHub Critical Quality CodeQL run 26498401102 passed. Observed result after fix: write-timeout recovery succeeds only after proving requested content reached disk and pre-state proves the operation was not a no-op; attempted-path-only timeout payloads still produce the warning. E2E timeout fixtures no longer depend on ambient PATH ordering. What was not tested: Crabbox changed-check run did not complete because the synced workspace lacked .git metadata, so git diff origin/main...HEAD failed before checks in run_ce50a1ea3ee0. Local support-boundary and GitHub CI/Critical Quality covered the changed surfaces instead. |
…alse-failure (openclaw#55424) (openclaw#86855) * fix(agents): suppress Write/Edit failed warning on response-timeout false-failure (openclaw#55424) Reporter sees '⚠️ Write failed' / '⚠️ Edit failed' warnings on Feishu (and other channels) even though the file was 100% saved successfully (8 of 8 verified writes succeeded; warning shown for all 8). Source path: tool-mutation records lastToolError.timedOut=true with a fileTarget when a write/edit tool ack reply times out after the disk mutation has already completed, then resolveToolErrorWarningPolicy goes through the default mutating-tool branch and emits the misleading failure summary. Add a narrow gate inside resolveToolErrorWarningPolicy that suppresses the warning only when both lastToolError.timedOut is true AND lastToolError.fileTarget is defined. fileTarget is set by tool-mutation.ts only for the write/edit family (FILE_MUTATING_TOOL_NAMES), so this branch never matches exec/message/cron/gateway mutating-tool timeouts where the disk-write idempotency reasoning does not apply. Real file failures (no timeout) and timeouts without recorded fileTarget keep their visible warnings. * fix: recover completed write timeouts safely * fix: bound write timeout recovery precheck * fix: type write recovery precheck fallback * test: complete write recovery result mock * test: isolate e2e timeout fixture shims * test: stabilize e2e timeout fixture path --------- Co-authored-by: Peter Steinberger <steipete@gmail.com>
…026.5.27) (#698) This PR contains the following updates: | Package | Update | Change | |---|---|---| | [ghcr.io/openclaw/openclaw](https://openclaw.ai) ([source](https://github.com/openclaw/openclaw)) | patch | `2026.5.26` → `2026.5.27` | --- ### Release Notes <details> <summary>openclaw/openclaw (ghcr.io/openclaw/openclaw)</summary> ### [`v2026.5.27`](https://github.com/openclaw/openclaw/blob/HEAD/CHANGELOG.md#2026527) [Compare Source](openclaw/openclaw@v2026.5.26...v2026.5.27) ##### Highlights - Safer local/runtime boundaries: OpenClaw now rejects unsafe command wrappers, malformed CLI numeric options, unsafe Node runtime env overrides, no-auth Tailscale exposure, and non-admin device-role pairing approvals before they can affect live runs. ([#​87308](openclaw/openclaw#87308), [#​87305](openclaw/openclaw#87305), [#​87292](openclaw/openclaw#87292), [#​87146](openclaw/openclaw#87146)) - Matrix and auto-reply delivery are steadier: mention previews stay inert, final mention replies deliver normally, shared-DM notices are awaited, MXID parsing ignores filenames, and reasoning-prefixed `NO_REPLY` responses stay suppressed. - Provider and agent reliability improved across OpenAI-compatible embeddings, cached token usage, Anthropic/Codex/Claude runtime state, unsupported tool-schema quarantine, heartbeat templates, and session fallback errors. ([#​85269](openclaw/openclaw#85269), [#​82062](openclaw/openclaw#82062), [#​85416](openclaw/openclaw#85416), [#​86855](openclaw/openclaw#86855)) - Plugin and package release paths got tighter: Pixverse ships as an external video plugin with region selection, package exclusions and shrinkwrap inventory match the published npm shape, and release/package smoke commands fail bounded instead of hanging. - Gateway hot paths do less rediscovery by reusing current plugin metadata fingerprints, stable plugin index fingerprints, read-only session metadata, active working stores, status fast paths, and auth/env snapshots. ([#​86439](openclaw/openclaw#86439)) ##### Changes - Memory: add a core OpenAI-compatible embedding provider for local and hosted OpenAI-style endpoints, with config, doctor, and docs support. ([#​85269](openclaw/openclaw#85269)) Thanks [@​dutifulbob](https://github.com/dutifulbob). - Plugin SDK: mark memory-specific embedding provider registration as deprecated compatibility and surface non-bundled usage in plugin compatibility diagnostics. ([#​85072](openclaw/openclaw#85072)) Thanks [@​mbelinky](https://github.com/mbelinky). - Pixverse: add video generation provider support, API region selection, and external plugin publishing. - Plugins: expose approval action metadata for plugin-driven approval surfaces. ##### Fixes - Security/CLI/runtime: harden hostname normalization for repeated trailing dots, block side-effecting command wrappers, reject unsafe Node runtime env overrides, reject loose numeric CLI and gateway options, require admin approval for node device-role pairing, and reject no-auth Tailscale exposure. ([#​87305](openclaw/openclaw#87305), [#​87292](openclaw/openclaw#87292), [#​87308](openclaw/openclaw#87308), [#​87146](openclaw/openclaw#87146)) Thanks [@​pgondhi987](https://github.com/pgondhi987). - Doctor: validate runtime tool schemas for every configured embedded agent while skipping ACP-only profiles, so bad non-default plugin or MCP tools are reported before assistant turns. - Telegram: route `sendMessage` action replies through durable outbound delivery so completed agent responses remain retryable when the gateway send path times out. ([#​87261](openclaw/openclaw#87261)) Thanks [@​mbelinky](https://github.com/mbelinky). - Matrix/auto-reply: keep draft previews mention-inert, preserve final mention delivery, send mention finals normally, await shared DM notices, ignore filename-embedded MXIDs, and suppress reasoning-prefixed `NO_REPLY` responses. - Agents/providers: add OpenAI-compatible cache retention, forward cached token usage in chat completions, preserve runtime context before active user turns, strip stale Anthropic thinking, load Claude CLI OAuth for Pi auth profiles, avoid false Codex runtime live switches, and quarantine unsupported tool schemas. ([#​82062](openclaw/openclaw#82062), [#​87167](openclaw/openclaw#87167), [#​86855](openclaw/openclaw#86855)) - Gateway/performance: cache plugin metadata fingerprints and stable plugin index fingerprints, borrow read-only session metadata safely, keep the active session working store hot, keep status on a bounded fast path, and preserve model auth profile suffixes. ([#​86439](openclaw/openclaw#86439)) - Package/install/release: align npm package exclusions and inventory, omit unpacked test helpers, skip Homebrew until macOS packages need it, cap tsdown heap in containers, bound install/release smoke waits, and harden post-publish verification. - Codex/Auth: bound ChatGPT OAuth token exchange and refresh requests, and honor cancellation across Codex and Anthropic OAuth login flows. - QA/E2E/CI: bound Telegram, kitchen-sink, Open WebUI, ClawHub, MCP, Discord, realtime, labeler, and GitHub API waits; fail empty explicit test, live-media, gateway CPU, startup benchmark, plugin gauntlet, and beta-smoke runs instead of false-greening. - Agents/Codex: keep spawned agent bootstrap files rooted in the agent workspace while running task commands, transcripts, and compaction from the requested cwd. ([#​87218](openclaw/openclaw#87218)) Thanks [@​mbelinky](https://github.com/mbelinky). </details> --- ### Configuration 📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined). 🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied. ♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox. 🔕 **Ignore**: Close this PR and you won't be reminded about these updates again. --- - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box --- This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate). <!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiI0My4xMDEuMSIsInVwZGF0ZWRJblZlciI6IjQzLjEwMS4xIiwidGFyZ2V0QnJhbmNoIjoibWFpbiIsImxhYmVscyI6WyJyZW5vdmF0ZS9jb250YWluZXIiLCJ0eXBlL3BhdGNoIl19--> Reviewed-on: https://git.erwanleboucher.dev/eleboucher/homelab/pulls/698
…alse-failure (openclaw#55424) (openclaw#86855) * fix(agents): suppress Write/Edit failed warning on response-timeout false-failure (openclaw#55424) Reporter sees '⚠️ Write failed' / '⚠️ Edit failed' warnings on Feishu (and other channels) even though the file was 100% saved successfully (8 of 8 verified writes succeeded; warning shown for all 8). Source path: tool-mutation records lastToolError.timedOut=true with a fileTarget when a write/edit tool ack reply times out after the disk mutation has already completed, then resolveToolErrorWarningPolicy goes through the default mutating-tool branch and emits the misleading failure summary. Add a narrow gate inside resolveToolErrorWarningPolicy that suppresses the warning only when both lastToolError.timedOut is true AND lastToolError.fileTarget is defined. fileTarget is set by tool-mutation.ts only for the write/edit family (FILE_MUTATING_TOOL_NAMES), so this branch never matches exec/message/cron/gateway mutating-tool timeouts where the disk-write idempotency reasoning does not apply. Real file failures (no timeout) and timeouts without recorded fileTarget keep their visible warnings. * fix: recover completed write timeouts safely * fix: bound write timeout recovery precheck * fix: type write recovery precheck fallback * test: complete write recovery result mock * test: isolate e2e timeout fixture shims * test: stabilize e2e timeout fixture path --------- Co-authored-by: Peter Steinberger <steipete@gmail.com>
…alse-failure (openclaw#55424) (openclaw#86855) * fix(agents): suppress Write/Edit failed warning on response-timeout false-failure (openclaw#55424) Reporter sees '⚠️ Write failed' / '⚠️ Edit failed' warnings on Feishu (and other channels) even though the file was 100% saved successfully (8 of 8 verified writes succeeded; warning shown for all 8). Source path: tool-mutation records lastToolError.timedOut=true with a fileTarget when a write/edit tool ack reply times out after the disk mutation has already completed, then resolveToolErrorWarningPolicy goes through the default mutating-tool branch and emits the misleading failure summary. Add a narrow gate inside resolveToolErrorWarningPolicy that suppresses the warning only when both lastToolError.timedOut is true AND lastToolError.fileTarget is defined. fileTarget is set by tool-mutation.ts only for the write/edit family (FILE_MUTATING_TOOL_NAMES), so this branch never matches exec/message/cron/gateway mutating-tool timeouts where the disk-write idempotency reasoning does not apply. Real file failures (no timeout) and timeouts without recorded fileTarget keep their visible warnings. * fix: recover completed write timeouts safely * fix: bound write timeout recovery precheck * fix: type write recovery precheck fallback * test: complete write recovery result mock * test: isolate e2e timeout fixture shims * test: stabilize e2e timeout fixture path --------- Co-authored-by: Peter Steinberger <steipete@gmail.com>
Summary
Fixes the false "Write failed" / "Edit failed" warning class without trusting argument-derived payload metadata. Instead of suppressing payload warnings when
lastToolError.fileTargetexists, write recovery now treats timeout/abort failures as successful only after reading the target file back and proving the requested content is present.The recovery path keeps normal successful writes cheap: it uses stat metadata first, only reads small same-size existing files before the write, and defers full readback to the timeout/abort path. Large same-size sandbox writes use sub-second stat mtimes so completed writes can still recover without a normal-path full-file read.
Changes
fileTargetonly proves the attempted path.Real behavior proof
Behavior addressed: response/ack timeout after a completed file write no longer surfaces as a failed write when disk readback proves the requested content was written; argument-derived
fileTargetalone still does not suppress payload warnings.Real environment tested: local macOS checkout plus broad support-boundary shard reproduction. Maintainer applied
proof: overridebecause this is a shared write-tool/payload-layer fix covered by focused runtime tests and changed-surface CI proof, without a channel-specific live Feishu/Telegram run.Exact steps or command run after this patch:
Evidence after fix: autoreview clean with no accepted/actionable findings; focused Vitest passed 138 tests across payload, write recovery, sandbox stat coverage, and the E2E timeout fixture; the previously failing support-boundary shard passed locally with 178 files passed, 1 skipped, 1996 tests passed.
Observed result after fix: write recovery returns success only after timeout/abort readback matches requested content and either pre-state differed, metadata changed, or the target was known missing before the write. Payload warning tests still show timed-out
writeerrors whenfileTargetonly proves the attempted path. The broad support-boundary timeout fixture no longer depends on inherited PATH lookup after its fake timeout wrapper records the command.What was not tested: live Feishu/Telegram end-to-end delivery was not rerun; the fix is covered at the shared write-tool and payload construction layers used by those channels. Remote Crabbox
pnpm check:changedwas attempted on runrun_ce50a1ea3ee0but the raw synced workspace had no.git, so it failed before running checks.Closes #55424