edits: fix file corruption issue with replace_string tool #1134

Merged
connor4312 merged 1 commit into main from connor4312/similarity-edit-correction
Sep 24, 2025

Conversation

@connor4312
Member

The similarity matching in the replace_string tool was incorrectly using
line numbers rather than string offsets (since inception!), which caused
file corruption issues. I initially thought it was only in the multi
edit tool, but it happens in all replace_string variants.
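
To illustrate why mixing up the two units corrupts a file, here is a minimal sketch. It is not code from this PR: `lineIndexToOffset` and the sample text are hypothetical, and a single "\n" line ending is assumed.

```typescript
// Hypothetical helper (not from the PR): converts a zero-based line index into
// the character offset at which that line starts in the full text.
function lineIndexToOffset(lines: string[], lineIndex: number, eol: string = "\n"): number {
  let offset = 0;
  for (let k = 0; k < lineIndex; k++) {
    // Each preceding line contributes its own length plus one line ending.
    offset += lines[k].length + eol.length;
  }
  return offset;
}

const text = "alpha\nbeta\ngamma";
const lines = text.split("\n");

// "gamma" sits on line index 2, but it starts at character offset 11.
const lineIndex = 2;
const startOffset = lineIndexToOffset(lines, lineIndex);

// Mistake: using the line index as if it were a character offset splices the
// replacement into the middle of "alpha".
const corrupted = text.slice(0, lineIndex) + "GAMMA" + text.slice(lineIndex + "gamma".length);

// Intended: splice at the character offset of the matched line.
const fixed = text.slice(0, startOffset) + "GAMMA" + text.slice(startOffset + "gamma".length);

console.log(JSON.stringify(corrupted));
console.log(JSON.stringify(fixed));
```

The two results differ only in the unit used for the splice position, which is the kind of mismatch described above.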

Closes microsoft/vscode#265842

Copilot AI review requested due to automatic review settings September 24, 2025 17:20
@connor4312 connor4312 self-assigned this Sep 24, 2025
Contributor

Copilot AI left a comment

Pull Request Overview

This PR fixes a critical file corruption issue in the replace_string tool where similarity matching was incorrectly using line numbers instead of string offsets, causing text replacements to occur at wrong positions and corrupting files.

  • Fixes the similarity matching algorithm to use string offsets instead of line indices
  • Adds comprehensive test coverage for the multi-replace string tool functionality
  • Exports the applyEdits function from simulation workspace for testing purposes

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/platform/test/node/simulationWorkspace.ts | Exports applyEdits function for use in tests |
| src/extension/tools/node/test/multiReplaceStringTool.spec.tsx | Adds comprehensive test suite for multi-replace string tool with regression test |
| src/extension/tools/node/test/editFileToolUtilsFixtures/multi-sr-bug-original.txt | Test fixture file containing original content for regression testing |
| src/extension/tools/node/test/editFileToolUtilsFixtures/multi-sr-bug-actual.txt | Test fixture file containing expected output after replacements |
| src/extension/tools/node/editFileToolUtils.tsx | Fixes similarity matching to use string offsets instead of line numbers |

@vs-code-engineering vs-code-engineering bot added this to the September 2025 milestone Sep 24, 2025
@connor4312 connor4312 force-pushed the connor4312/similarity-edit-correction branch from a771d20 to 42d9800 Compare September 24, 2025 17:26
for (let j = 0; j < oldLines.length; j++) {
    const similarity = calculateSimilarity(oldLines[j], lines[i + j]);
    totalSimilarity += similarity;
    oldLength += lines[i + j].length;
Contributor

Based on the name alone: should this be adding the length of the oldLines entry rather than the lines entry?

Member Author

If it were an exact match, yes, but this is similarity matching, so the oldLine might not exactly equal the 'actual' old line we're matching against. The oldLength here is the actual length of the old data within the text document.
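
A minimal sketch of that distinction follows. It is not the code in editFileToolUtils.tsx: `findSimilarRange`, the 0.8 threshold, the positional `calculateSimilarity` stand-in, and the "\n" line ending are all assumptions made for illustration. The point is that the replaced span is measured from the document's lines, not from the search string.

```typescript
// Stand-in similarity metric (assumption): share of positions with equal characters.
function calculateSimilarity(a: string, b: string): number {
  const maxLen = Math.max(a.length, b.length);
  if (maxLen === 0) { return 1; }
  let same = 0;
  for (let k = 0; k < Math.min(a.length, b.length); k++) {
    if (a[k] === b[k]) { same++; }
  }
  return same / maxLen;
}

interface SimilarityMatch { startOffset: number; oldLength: number; }

// Hypothetical helper: finds the first window of document lines that is similar
// enough to oldText and reports it as a character offset plus a length.
function findSimilarRange(text: string, oldText: string, threshold = 0.8, eol = "\n"): SimilarityMatch | undefined {
  const lines = text.split(eol);
  const oldLines = oldText.split(eol);
  let startOffset = 0; // character offset of lines[i]
  for (let i = 0; i + oldLines.length <= lines.length; i++) {
    let totalSimilarity = 0;
    let oldLength = 0;
    for (let j = 0; j < oldLines.length; j++) {
      totalSimilarity += calculateSimilarity(oldLines[j], lines[i + j]);
      // Length of the text actually present in the document, not of oldLines[j].
      oldLength += lines[i + j].length;
    }
    oldLength += (oldLines.length - 1) * eol.length; // line endings inside the window
    if (totalSimilarity / oldLines.length >= threshold) {
      return { startOffset, oldLength };
    }
    startOffset += lines[i].length + eol.length;
  }
  return undefined;
}

// The document line has two trailing spaces that the search string lacks.
const doc = "const a = 1;\nconst b = 22;  \nconst c = 3;";
const match = findSimilarRange(doc, "const b = 22;");
const result = match
  ? doc.slice(0, match.startOffset) + "const b = 2;" + doc.slice(match.startOffset + match.oldLength)
  : doc;
console.log(JSON.stringify(result));
```

Had `oldLength` been summed from `oldLines` instead, the two trailing spaces of the matched document line would have been left behind after the replacement.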

@connor4312 connor4312 added this pull request to the merge queue Sep 24, 2025
Merged via the queue into main with commit e15af59 Sep 24, 2025
16 checks passed
@connor4312 connor4312 deleted the connor4312/similarity-edit-correction branch September 24, 2025 17:44
Development

Successfully merging this pull request may close these issues.

repalce_string_in_file failure
