Skip to content

fix: Edit tool false positive 'failed' error#60664

Closed
progamman wants to merge 2 commits intoopenclaw:mainfrom
progamman:fix/edit-tool-false-positive
Closed

fix: Edit tool false positive 'failed' error#60664
progamman wants to merge 2 commits intoopenclaw:mainfrom
progamman:fix/edit-tool-false-positive

Conversation

@progamman
Copy link
Copy Markdown

Root cause: Recovery logic used simple 'includes' checks which failed when:

  1. newText pre-existed in file (false positive success)
  2. oldText was substring of newText (e.g. 'foo' -> 'foobar')

Fix: Use before/after occurrence counts for robust detection:

  1. Count newText occurrences before and after edit
  2. Verify newText count increased (not just present)
  3. Strip newText when checking oldText to handle substring case
  4. Verify oldText count decreased when original available

Fixes #54455, fixes #49363

admin and others added 2 commits April 4, 2026 09:46
Root cause: UI used lifetime accumulated tokens (totalTokens) for
context percentage, which stays high after compaction. Should use
current window tokens (post-compaction) instead.

Fix:
1. Add currentWindowTokens field to GatewaySessionRow type
2. Gateway computes it: when totalTokensFresh=false (post-compaction),
   use transcriptUsage.totalTokens (current window), else use totalTokens
3. UI uses currentWindowTokens for context % calculation, with
   fallback to totalTokens for backwards compatibility
4. Remove early return when totalTokensFresh=false - now shows
   correct post-compaction percentage via currentWindowTokens

Fixes openclaw#48252
…g of newText

Root cause: Recovery logic used simple 'includes' checks which failed when:
1. newText pre-existed in file (false positive success)
2. oldText was substring of newText (e.g. 'foo' -> 'foobar')

Fix: Use before/after occurrence counts for robust detection:
1. Count newText occurrences before and after edit
2. Verify newText count increased (not just present)
3. Strip newText when checking oldText to handle substring case
4. Verify oldText count decreased when original available

Fixes openclaw#54455, fixes openclaw#49363
@openclaw-barnacle openclaw-barnacle Bot added app: web-ui App: web-ui gateway Gateway runtime agents Agent runtime and tooling size: S labels Apr 4, 2026
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Apr 4, 2026

Greptile Summary

This PR fixes two false-positive "edit failed" error cases in the Edit tool recovery path by switching from simple substring checks to before/after occurrence counting. It also adds a currentWindowTokens field to GatewaySessionRow so the UI context-usage notice reflects post-compaction token counts instead of hiding entirely when the token data is stale.

The intent is correct and the two targeted false-positive cases are well-addressed. However, the rewrite of didEditLikelyApply introduces a regression for deletion edits — that is, edits where the replacement text is an empty string. Because countOccurrences short-circuits to 0 for an empty needle, editAddedNewText is always false for deletions, causing every deletion recovery attempt to return false even when the original text was successfully removed. The old code handled this case naturally.

Key changes:

  • src/agents/pi-tools.host-edit.ts — replaces the includes-based recovery heuristic with occurrence counting; fixes false positives but regresses pure-deletion edits.
  • src/gateway/session-utils.ts / session-utils.types.ts — adds currentWindowTokens computed from transcript usage after compaction.
  • ui/src/ui/views/chat.ts — uses currentWindowTokens for the context-fill notice, removing the stale early-exit guard.

Confidence Score: 2/5

Not safe to merge as-is: the edit-recovery rewrite regresses deletion edits (empty newText), re-surfacing errors that should be silently recovered.

The session-utils and chat.ts changes are solid. The core fix in pi-tools.host-edit.ts correctly addresses the two stated false-positive cases, but the occurrence-counting approach breaks deletion edits entirely because countOccurrences returns 0 for an empty needle, making editAddedNewText always false for that class of edit. This is a functional regression on a code path the PR does not cover with tests.

src/agents/pi-tools.host-edit.ts — specifically the empty-newText (deletion) path in didEditLikelyApply.

Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/agents/pi-tools.host-edit.ts
Line: 113-151

Comment:
**Deletion edits (empty `newText`) are never detected as applied**

When `edit.newText` is an empty string (a pure deletion — `oldText` replaced with nothing), `countOccurrences` returns `0` for any empty needle (see the early return at line 101–103). This makes:

- `newTextCountAfter = 0`
- `newTextCountBefore = 0`
- `editAddedNewText = 0 > 0 → false`
- `recovered = false && ... → false`

So the function always returns `false` for deletion edits, even when the `oldText` was successfully removed. This is a regression from the original logic, which handled the empty-`newText` case naturally: it skipped the inclusion check and skipped stripping, then only checked whether `oldText` was still present.

A deletion edit is valid (and accepted by `readEditReplacements`, which only requires `oldText` to be non-empty). This scenario would cause the recovery wrapper to re-throw the original error even though the edit actually succeeded.

To preserve parity with the old behaviour, the empty-`newText` path needs a dedicated branch — for example:

```typescript
// Handle deletion edits (empty newText) separately
if (normalizedNew.length === 0) {
  const stillHasOld = normalizedOld.length > 0 && normalizedCurrent.includes(normalizedOld);
  if (stillHasOld) {
    return false;
  }
  continue; // oldText is gone — deletion likely applied
}
```

Place this block right after the `normalizedOld` declaration (before the occurrence-count logic) and the rest of the loop can assume `normalizedNew.length > 0`.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: src/agents/pi-tools.host-edit.ts
Line: 140-145

Comment:
**Redundant count check inside `oldTextDecreasedOrAbsent`**

The outer `!stillHasOld` guard already guarantees that `countOccurrences(stripNewText(normalizedCurrent), normalizedOld)` is `0` (since `stillHasOld` is derived from `contentAfterStrip.includes(normalizedOld)`, which is the same string). Therefore the subsequent `<= countOccurrences(...)` comparison is always `true` when `!stillHasOld` holds, making the second conjunct dead code. The variable can be simplified to:

```suggestion
    const oldTextDecreasedOrAbsent = !stillHasOld;
```

This preserves identical runtime behaviour while making the intent clearer.

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "fix: Edit tool false positive 'failed' e..." | Re-trigger Greptile

Comment on lines 113 to 151
for (const edit of params.edits) {
const normalizedNew = normalizeToLF(edit.newText);
const normalizedOld = normalizeToLF(edit.oldText);

// Check if newText exists in current content
if (normalizedNew.length > 0 && !normalizedCurrent.includes(normalizedNew)) {
return false;
}
withoutInsertedNewText =
normalizedNew.length > 0
? removeExactOccurrences(withoutInsertedNewText, normalizedNew)
: withoutInsertedNewText;
}

for (const edit of params.edits) {
const normalizedOld = normalizeToLF(edit.oldText);
if (withoutInsertedNewText.includes(normalizedOld)) {
// Count occurrences before and after
const newTextCountAfter = countOccurrences(normalizedCurrent, normalizedNew);
const newTextCountBefore =
normalizedOriginal !== undefined ? countOccurrences(normalizedOriginal, normalizedNew) : 0;

// Evidence the edit added at least one new occurrence of newText
const editAddedNewText = newTextCountAfter > newTextCountBefore;

// For oldText checking, strip newText occurrences to handle the case where
// oldText is a substring of newText (e.g. appending/wrapping, #49363)
const stripNewText = (s: string): string =>
normalizedOld.length > 0 && normalizedNew.includes(normalizedOld)
? s.split(normalizedNew).join("")
: s;

const contentAfterStrip = stripNewText(normalizedCurrent);
const stillHasOld = normalizedOld.length > 0 && contentAfterStrip.includes(normalizedOld);

// When original is available, check that oldText count decreased
const oldTextDecreasedOrAbsent =
!stillHasOld &&
(normalizedOriginal === undefined ||
countOccurrences(stripNewText(normalizedCurrent), normalizedOld) <=
countOccurrences(stripNewText(normalizedOriginal), normalizedOld));

// Recover only when: newText added AND oldText decreased/absent
const recovered = editAddedNewText && oldTextDecreasedOrAbsent;
if (!recovered) {
return false;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Deletion edits (empty newText) are never detected as applied

When edit.newText is an empty string (a pure deletion — oldText replaced with nothing), countOccurrences returns 0 for any empty needle (see the early return at line 101–103). This makes:

  • newTextCountAfter = 0
  • newTextCountBefore = 0
  • editAddedNewText = 0 > 0 → false
  • recovered = false && ... → false

So the function always returns false for deletion edits, even when the oldText was successfully removed. This is a regression from the original logic, which handled the empty-newText case naturally: it skipped the inclusion check and skipped stripping, then only checked whether oldText was still present.

A deletion edit is valid (and accepted by readEditReplacements, which only requires oldText to be non-empty). This scenario would cause the recovery wrapper to re-throw the original error even though the edit actually succeeded.

To preserve parity with the old behaviour, the empty-newText path needs a dedicated branch — for example:

// Handle deletion edits (empty newText) separately
if (normalizedNew.length === 0) {
  const stillHasOld = normalizedOld.length > 0 && normalizedCurrent.includes(normalizedOld);
  if (stillHasOld) {
    return false;
  }
  continue; // oldText is gone — deletion likely applied
}

Place this block right after the normalizedOld declaration (before the occurrence-count logic) and the rest of the loop can assume normalizedNew.length > 0.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/pi-tools.host-edit.ts
Line: 113-151

Comment:
**Deletion edits (empty `newText`) are never detected as applied**

When `edit.newText` is an empty string (a pure deletion — `oldText` replaced with nothing), `countOccurrences` returns `0` for any empty needle (see the early return at line 101–103). This makes:

- `newTextCountAfter = 0`
- `newTextCountBefore = 0`
- `editAddedNewText = 0 > 0 → false`
- `recovered = false && ... → false`

So the function always returns `false` for deletion edits, even when the `oldText` was successfully removed. This is a regression from the original logic, which handled the empty-`newText` case naturally: it skipped the inclusion check and skipped stripping, then only checked whether `oldText` was still present.

A deletion edit is valid (and accepted by `readEditReplacements`, which only requires `oldText` to be non-empty). This scenario would cause the recovery wrapper to re-throw the original error even though the edit actually succeeded.

To preserve parity with the old behaviour, the empty-`newText` path needs a dedicated branch — for example:

```typescript
// Handle deletion edits (empty newText) separately
if (normalizedNew.length === 0) {
  const stillHasOld = normalizedOld.length > 0 && normalizedCurrent.includes(normalizedOld);
  if (stillHasOld) {
    return false;
  }
  continue; // oldText is gone — deletion likely applied
}
```

Place this block right after the `normalizedOld` declaration (before the occurrence-count logic) and the rest of the loop can assume `normalizedNew.length > 0`.

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +140 to +145
// When original is available, check that oldText count decreased
const oldTextDecreasedOrAbsent =
!stillHasOld &&
(normalizedOriginal === undefined ||
countOccurrences(stripNewText(normalizedCurrent), normalizedOld) <=
countOccurrences(stripNewText(normalizedOriginal), normalizedOld));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Redundant count check inside oldTextDecreasedOrAbsent

The outer !stillHasOld guard already guarantees that countOccurrences(stripNewText(normalizedCurrent), normalizedOld) is 0 (since stillHasOld is derived from contentAfterStrip.includes(normalizedOld), which is the same string). Therefore the subsequent <= countOccurrences(...) comparison is always true when !stillHasOld holds, making the second conjunct dead code. The variable can be simplified to:

Suggested change
// When original is available, check that oldText count decreased
const oldTextDecreasedOrAbsent =
!stillHasOld &&
(normalizedOriginal === undefined ||
countOccurrences(stripNewText(normalizedCurrent), normalizedOld) <=
countOccurrences(stripNewText(normalizedOriginal), normalizedOld));
const oldTextDecreasedOrAbsent = !stillHasOld;

This preserves identical runtime behaviour while making the intent clearer.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/pi-tools.host-edit.ts
Line: 140-145

Comment:
**Redundant count check inside `oldTextDecreasedOrAbsent`**

The outer `!stillHasOld` guard already guarantees that `countOccurrences(stripNewText(normalizedCurrent), normalizedOld)` is `0` (since `stillHasOld` is derived from `contentAfterStrip.includes(normalizedOld)`, which is the same string). Therefore the subsequent `<= countOccurrences(...)` comparison is always `true` when `!stillHasOld` holds, making the second conjunct dead code. The variable can be simplified to:

```suggestion
    const oldTextDecreasedOrAbsent = !stillHasOld;
```

This preserves identical runtime behaviour while making the intent clearer.

How can I resolve this? If you propose a fix, please make it concise.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9e67ae1058

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +147 to +149
// Recover only when: newText added AND oldText decreased/absent
const recovered = editAddedNewText && oldTextDecreasedOrAbsent;
if (!recovered) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Handle deletion replacements in recovery check

didEditLikelyApply now requires editAddedNewText, but for deletion edits (newText === "") both occurrence counts are always zero, so editAddedNewText is always false and recovery can never succeed. This means a successful delete followed by a post-write throw is now reported as a failure, regressing the recovery behavior for valid { oldText, newText: "" } edits.

Useful? React with 👍 / 👎.

progamman pushed a commit to progamman/openclaw that referenced this pull request Apr 6, 2026
1. openclaw#60667 - Properly drop incomplete-thinking (unsigned) to avoid API errors
2. openclaw#60664 - Handle deletion edits (empty newText) in edit recovery
3. openclaw#60649 - Fix currentWindowTokens to use transcriptUsage?.totalTokensFresh

All fixes address Greptile review feedback.
@progamman
Copy link
Copy Markdown
Author

Update: v2 branch with fix

Created v2 branch: progamman/fix/edit-tool-false-positive-v2

Fixed: Deletion edits (empty newText) now work correctly. The issue was that countOccurrences returns 0 for empty needle, making editAddedNewText always false for deletions.

The fix adds an isDeletion check to skip the newText addition verification when newText is empty.

@progamman
Copy link
Copy Markdown
Author

👋 Gentle reminder - this PR is awaiting review. Please let us know if any changes are needed. Thanks!

@clawsweeper
Copy link
Copy Markdown
Contributor

clawsweeper Bot commented Apr 28, 2026

Thanks for the context here. I did a careful shell check against current main, and this is already implemented.

Current main already covers the PR's intended edit-tool false-positive recovery with source and regression tests, and the submitted PR is not safe to land as-is because its occurrence-count rewrite regresses deletion edits. The unrelated Control UI token field in the PR is also not required by current main's documented stale-usage behavior.

Best possible solution:

Close this PR as obsolete because current main already implements the edit-tool recovery with tests. Keep the shipped implementation, leave the Control UI stale-token behavior aligned with current docs/tests, and handle any remaining false-positive warning-policy work through #54635 or a focused replacement PR.

What I checked:

  • Current edit recovery implementation: didEditLikelyApply snapshots original/current content, rejects unchanged readback, requires inserted newText to be present, strips inserted text before checking oldText, and only converts a post-write throw to success when the replacement likely happened. (src/agents/pi-tools.host-edit.ts:76, bdfb408ce6ee)
  • Regression tests cover the PR's edit cases: Tests cover oldText being a substring of newText, pre-existing replacement text with no file change, deletion edits with empty newText, multi-edit payloads, tilde path recovery, and sandboxed edit recovery. (src/agents/pi-tools.read.host-edit-recovery.test.ts:112, bdfb408ce6ee)
  • Submitted PR diff is obsolete and regressive: The open PR still rewrites the same recovery helper to occurrence counting, while review comments identify the deletion-edit regression; current main keeps explicit deletion recovery coverage. (src/agents/pi-tools.host-edit.ts:95, 9e67ae1058e4)
  • Latest release already contains the current edit recovery surface: There is no diff between v2026.4.26 and current main for src/agents/pi-tools.host-edit.ts or its recovery test file, so the implemented edit recovery is already present in the latest release snapshot provided for review. (src/agents/pi-tools.host-edit.ts:76, be8c24633aaa)
  • Unrelated Control UI token change is not needed for this PR: Current main intentionally hides stale context notices when totalTokensFresh is false, and tests assert stale rows do not render a context warning; docs describe stale token snapshots as hidden until fresh usage returns. (ui/src/ui/chat/context-notice.ts:64, bdfb408ce6ee)
  • Broader warning-policy follow-up remains separate: The mutating-tool warning policy still decides warnings from user-facing failure acknowledgement, while the related False positive 'Edit failed' notifications for mutating tools despite successful writes #54635 report remains open for the broader false notification path; this PR's edit recovery rewrite should not be used to close that remaining issue. (src/agents/pi-embedded-runner/run/payloads.ts:127, bdfb408ce6ee)

So I’m closing this as already implemented rather than keeping a duplicate issue open.

Codex review notes: model gpt-5.5, reasoning high; reviewed against bdfb408ce6ee; fix evidence: release v2026.4.26, commit bdfb408ce6ee.

@clawsweeper clawsweeper Bot closed this Apr 28, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling app: web-ui App: web-ui gateway Gateway runtime size: S

Projects

None yet

1 participant