edits: fix file corruption issue with replace_string tool#1134
edits: fix file corruption issue with replace_string tool#1134connor4312 merged 1 commit intomainfrom
Conversation
There was a problem hiding this comment.
Pull Request Overview
This PR fixes a critical file corruption issue in the replace_string tool where similarity matching was incorrectly using line numbers instead of string offsets, causing text replacements to occur at wrong positions and corrupting files.
- Fixes the similarity matching algorithm to use string byte offsets instead of line indices
- Adds comprehensive test coverage for the multi-replace string tool functionality
- Exports the
applyEditsfunction from simulation workspace for testing purposes
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
src/platform/test/node/simulationWorkspace.ts |
Exports applyEdits function for use in tests |
src/extension/tools/node/test/multiReplaceStringTool.spec.tsx |
Adds comprehensive test suite for multi-replace string tool with regression test |
src/extension/tools/node/test/editFileToolUtilsFixtures/multi-sr-bug-original.txt |
Test fixture file containing original content for regression testing |
src/extension/tools/node/test/editFileToolUtilsFixtures/multi-sr-bug-actual.txt |
Test fixture file containing expected output after replacements |
src/extension/tools/node/editFileToolUtils.tsx |
Fixes similarity matching to use string offsets instead of line numbers |
The similarity matching in the replace_string tool incorrect was using line numbers rather than string offsets (since inception!) which caused file corruption issues. I initially thought it was only in the multi edit tool, but it happens in all replace_string variants. Closes microsoft/vscode#265842
a771d20 to
42d9800
Compare
| for (let j = 0; j < oldLines.length; j++) { | ||
| const similarity = calculateSimilarity(oldLines[j], lines[i + j]); | ||
| totalSimilarity += similarity; | ||
| oldLength += lines[i + j].length; |
There was a problem hiding this comment.
Based on name only - should this be adding length of oldLines and not lines array?
There was a problem hiding this comment.
If it were an exact match, yes, but this is similarity matching and so the oldLine might not exactly equal the 'actual' old line we're matching against. The oldLength here is the actual length of the old data within the text document.
The similarity matching in the replace_string tool incorrect was using
line numbers rather than string offsets (since inception!) which caused
file corruption issues. I initially thought it was only in the multi
edit tool, but it happens in all replace_string variants.
Closes microsoft/vscode#265842