Skip to content

Fix edit tool false positive failure when oldText is substring of newText#49639

Closed
ayushozha wants to merge 4 commits intoopenclaw:mainfrom
ayushozha:fix/edit-tool-false-positive-error
Closed

Fix edit tool false positive failure when oldText is substring of newText#49639
ayushozha wants to merge 4 commits intoopenclaw:mainfrom
ayushozha:fix/edit-tool-false-positive-error

Conversation

@ayushozha
Copy link
Copy Markdown
Contributor

Summary

  • Fixes the edit tool post-write recovery logic that incorrectly reports "Edit failed" when oldText is a substring of newText (e.g. replacing "foo" with "foobar").
  • After a successful write, oldText naturally appears inside newText in the file content, causing the stillHasOld check to incorrectly prevent recovery.
  • The fix strips newText occurrences from the file content before checking for residual oldText, so only genuinely unreplaced oldText triggers the error.
  • Adds two test cases: one verifying recovery succeeds for the substring scenario, and one verifying it still correctly rejects when oldText appears outside newText.

Fixes #49363

Test plan

  • Existing 3 recovery tests pass (no regressions)
  • New test: recovery succeeds when oldText is substring of newText and edit wrote correctly
  • New test: recovery correctly rejects when oldText appears both inside newText and separately (incomplete edit)

🤖 Generated with Claude Code

Ayush Ojha and others added 2 commits March 18, 2026 01:29
…wText

When oldText is a substring of newText (e.g. replacing "foo" with "foobar"),
the post-write recovery logic incorrectly detected oldText as still present
in the file because it appears within the replacement newText. Strip newText
occurrences before checking for residual oldText to avoid false positives.

Fixes openclaw#49363

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add min-width: 0 to .cron-workspace-main so the grid child can shrink
below its content size, preventing cron job entries from spilling past
the list boundary into the "New Job" panel. Also add overflow-wrap /
text-overflow to job titles, detail values, and state values so long
text wraps or truncates instead of expanding the container.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@openclaw-barnacle openclaw-barnacle Bot added agents Agent runtime and tooling size: XS labels Mar 18, 2026
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 99a1be6bee

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/agents/pi-tools.host-edit.ts Outdated
Comment on lines +65 to +68
const contentForOldCheck =
oldText !== undefined && oldText.length > 0 && newText.includes(oldText)
? content.replaceAll(newText, "")
: content;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Preserve pre-write failure detection for substring replacements

When oldText is a substring of newText, this branch strips every newText occurrence before checking for residual oldText, so a pre-write failure can be misreported as success if the file already contained newText (for example, replacing foo with foobar when the file already has foobar and no standalone foo). In that case hasNew is true and stillHasOld becomes false even though no replacement happened, which reintroduces the false-success behavior this recovery guard was added to prevent.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 89d88c5. Instead of only stripping newText from the content (which erases pre-existing occurrences too), the recovery now uses before/after occurrence counting via contentBefore: editAddedNewText requires the newText count to increase, and oldTextDecreasedOrAbsent checks oldText count decreased. When contentBefore is available, recovery requires both conditions. This closes the false-success gap for pre-existing newText. Also added a dedicated test for this scenario.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Mar 18, 2026

Greptile Summary

This PR fixes a false-positive failure in the edit tool's post-write recovery logic where replacing oldText with a newText that contains oldText as a substring (e.g. "foo""foobar") caused the recovery check to incorrectly detect residual old text and rethrow instead of returning success. The fix strips all occurrences of newText from the file content before checking for leftover oldText, ensuring only genuinely unreplaced instances trigger the error.

Key changes:

  • src/agents/pi-tools.host-edit.ts: When oldText ⊂ newText, compute contentForOldCheck by calling content.replaceAll(newText, "") before the stillHasOld test. Falls back to the original content when there is no substring relationship.
  • src/agents/pi-tools.read.host-edit-recovery.test.ts: Two new test cases covering (1) successful recovery for the substring scenario and (2) correct rejection when oldText still appears outside of newText occurrences.

Minor concern: The replaceAll strip slightly erodes the pre-existing protection against false success when newText was already in the file before the edit (and the upstream threw before writing). In that specific combination — oldText ⊂ newText, pre-populated newText, throw-before-write — recovery will now incorrectly report success. This is an unlikely scenario in practice given how the underlying edit tool works, but worth noting as the original code comment explicitly called it out as a known guard.

Confidence Score: 4/5

  • Safe to merge; fixes a real and reproducible false-positive with well-targeted test coverage and minimal regression risk.
  • The fix is logically sound for the primary failure mode and both new tests accurately verify the intended behavior. The only concern is a narrow edge case (pre-existing newText + oldText ⊂ newText + throw-before-write) where the pre-existing guard is slightly weakened, but this combination is unlikely in practice given the tool's simple string-matching strategy.
  • No files require special attention beyond the inline comment on pi-tools.host-edit.ts lines 65–68.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/agents/pi-tools.host-edit.ts
Line: 65-68

Comment:
**Edge case: pre-existing `newText` in file before write**

The `replaceAll` strip works correctly when the edit succeeded, but there is a subtle regression for a specific scenario: if the file already contained `newText` (as pre-existing content, before any edit took place) and the upstream tool then threw *before* writing (e.g. because `oldText` was not found as a standalone match), this new path will strip that pre-existing `newText` occurrence, leave no residual `oldText`, and incorrectly return success.

**Before this fix** in that scenario:
- `content = "foobar"` (pre-existing)
- `content.includes("foo")``true``stillHasOld = true` → rethrow ✓ (correct)

**After this fix** in that scenario:
- `contentForOldCheck = "foobar".replaceAll("foobar", "")``""`
- `stillHasOld = false` → false success ✗

This requires `oldText ⊂ newText`, the file having `newText` pre-populated, *and* the upstream tool using context-aware matching that didn't treat the substring as a valid match. In practice the edit tool does simple `String.includes` matching, so this combination is rare — but it does erode the protection the original comment specifically called out: *"avoids false success when upstream threw before writing but the file already contained newText."*

One way to make the guard tighter would be to only apply the strip when `hasNew` is true (already guaranteed at this point) and to document the remaining limitation explicitly.

How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: "Fix edit tool false ..."

Comment thread src/agents/pi-tools.host-edit.ts Outdated
Comment on lines +65 to +68
const contentForOldCheck =
oldText !== undefined && oldText.length > 0 && newText.includes(oldText)
? content.replaceAll(newText, "")
: content;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Edge case: pre-existing newText in file before write

The replaceAll strip works correctly when the edit succeeded, but there is a subtle regression for a specific scenario: if the file already contained newText (as pre-existing content, before any edit took place) and the upstream tool then threw before writing (e.g. because oldText was not found as a standalone match), this new path will strip that pre-existing newText occurrence, leave no residual oldText, and incorrectly return success.

Before this fix in that scenario:

  • content = "foobar" (pre-existing)
  • content.includes("foo")truestillHasOld = true → rethrow ✓ (correct)

After this fix in that scenario:

  • contentForOldCheck = "foobar".replaceAll("foobar", "")""
  • stillHasOld = false → false success ✗

This requires oldText ⊂ newText, the file having newText pre-populated, and the upstream tool using context-aware matching that didn't treat the substring as a valid match. In practice the edit tool does simple String.includes matching, so this combination is rare — but it does erode the protection the original comment specifically called out: "avoids false success when upstream threw before writing but the file already contained newText."

One way to make the guard tighter would be to only apply the strip when hasNew is true (already guaranteed at this point) and to document the remaining limitation explicitly.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/pi-tools.host-edit.ts
Line: 65-68

Comment:
**Edge case: pre-existing `newText` in file before write**

The `replaceAll` strip works correctly when the edit succeeded, but there is a subtle regression for a specific scenario: if the file already contained `newText` (as pre-existing content, before any edit took place) and the upstream tool then threw *before* writing (e.g. because `oldText` was not found as a standalone match), this new path will strip that pre-existing `newText` occurrence, leave no residual `oldText`, and incorrectly return success.

**Before this fix** in that scenario:
- `content = "foobar"` (pre-existing)
- `content.includes("foo")``true``stillHasOld = true` → rethrow ✓ (correct)

**After this fix** in that scenario:
- `contentForOldCheck = "foobar".replaceAll("foobar", "")``""`
- `stillHasOld = false` → false success ✗

This requires `oldText ⊂ newText`, the file having `newText` pre-populated, *and* the upstream tool using context-aware matching that didn't treat the substring as a valid match. In practice the edit tool does simple `String.includes` matching, so this combination is rare — but it does erode the protection the original comment specifically called out: *"avoids false success when upstream threw before writing but the file already contained newText."*

One way to make the guard tighter would be to only apply the strip when `hasNew` is true (already guaranteed at this point) and to document the remaining limitation explicitly.

How can I resolve this? If you propose a fix, please make it concise.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also fixed by the same commit 89d88c5. The exact scenario you describe (pre-existing newText, pre-write failure, oldText is substring of newText) is now handled: when contentBefore is available, recovery requires editAddedNewText (newText count must increase), so if the file was unchanged or only had pre-existing newText, the count stays the same and we correctly rethrow. The existing test "rethrows when oldText is substring of newText and file already had newText (pre-write failure)" covers this case. Also reverted the unrelated CSS changes from this PR.

…e edit

Address review feedback on openclaw#49639: when oldText is a substring of newText
and the file already contained newText before the edit, the replaceAll
approach could incorrectly report success even though the upstream tool
threw before writing.

Fix: capture file content before calling base.execute and compare with
post-state. If the file is unchanged, the tool threw before writing and
we rethrow immediately. Also update tests to properly simulate
throw-after-write vs throw-before-write scenarios, and add a dedicated
test for the pre-existing newText edge case.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@openclaw-barnacle openclaw-barnacle Bot added app: web-ui App: web-ui size: S and removed size: XS labels Mar 18, 2026
minupla pushed a commit to minupla/openclaw that referenced this pull request Mar 18, 2026
… is substring of newText

Prevents 'Edit failed' when oldText is a substring of newText (e.g.
replacing 'foo' with 'foobar'). Also prevents false success when the
file content was identical before the edit attempt.

Source: openclaw#49639
@openclaw-barnacle openclaw-barnacle Bot removed the app: web-ui App: web-ui label Mar 18, 2026
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 89d88c5021

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +117 to +118
const recovered =
hasNew && oldTextDecreasedOrAbsent && (contentBefore === undefined || editAddedNewText);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Drop strict newText-count increase requirement in recovery

The recovery gate now requires newText occurrences to increase (editAddedNewText) before returning success, which causes false failures when a successful replacement does not raise that count. A common case is replacing a larger string that already contains newText (for example oldText="foofoobar" to newText="foobar"): after a real write, oldText is gone and newText is present, but the before/after newText counts can be equal, so this branch rethrows the original error instead of recovering from the post-write failure.

Useful? React with 👍 / 👎.

@ayushozha
Copy link
Copy Markdown
Contributor Author

Hey @progamman — this PR fixes the edit tool false positive you reported in #49363 (oldText is substring of newText causing false "Edit failed").

The fix uses before/after occurrence counting: it snapshots the file content before the edit, then after the tool throws, compares how many times newText and oldText appear before vs after. This correctly handles:

  • Substring replacements (e.g. "foo""foobar")
  • Pre-existing newText in the file
  • Pre-write failures (tool threw before writing)

All 7 test cases pass including the exact scenario from your issue. Could you confirm this addresses what you experienced? Thanks!

@ayushozha
Copy link
Copy Markdown
Contributor Author

@joshavant @obviyus Would appreciate a review when you get a chance. This fixes a false "Edit failed" when oldText is a substring of newText (e.g. replacing "foo" with "foobar"). Uses before/after occurrence counting for robust detection. All 7 tests pass including edge cases for pre-existing content. Fixes #49363.

@mbelinky
Copy link
Copy Markdown
Contributor

This has been superseded by the merged version in #52516.

Merged PR: #52516
Merge commit: 922f4e6

That landed slice keeps the same exact-match edit contract, but consolidates the user-visible mismatch-content improvement with the related recovery hardening that was needed to ship it safely on current main.

Closing as superseded by the merged implementation above.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling size: S

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: Edit tool returns false positive "failed" error when edit actually succeeds

2 participants