Fix race condition in WriteLinesToFile transactional mode (#13323) #13477
huulinhnguyen-dev wants to merge 10 commits into dotnet:main
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Pull request overview
Fixes a race in the built-in WriteLinesToFile task’s transactional overwrite path where concurrent writers could cause File.Move to fail with “file already exists”, by falling back to File.Replace when the destination appears mid-operation.
Changes:
- Add a transactional overwrite recovery path: if File.Move fails and the destination now exists, retry via File.Replace with a small bounded retry loop.
- Rename the flaky/misleading concurrent overwrite test to reflect actual Overwrite="true" semantics.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| src/Tasks/FileIO/WriteLinesToFile.cs | Adds a File.Replace-based fallback when Move fails due to a concurrent create, improving reliability under parallel builds. |
| src/Tasks.UnitTests/WriteLinesToFile_Tests.cs | Renames a test to better describe the concurrent overwrite behavior being validated. |
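The fallback summarized in this overview can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the class and method names (`TransactionalWriteSketch`, `MoveOrReplace`) are hypothetical, and the retry count and 10 ms delay are assumptions modeled on the snippets quoted in the review.

```csharp
using System;
using System.IO;
using System.Threading;

// Hypothetical helper for illustration only; not the actual MSBuild implementation.
static class TransactionalWriteSketch
{
    public static void MoveOrReplace(string temporaryFilePath, string filePath)
    {
        try
        {
            // File.Move without an overwrite flag throws IOException
            // when the destination already exists.
            File.Move(temporaryFilePath, filePath);
            return;
        }
        catch (IOException) when (File.Exists(filePath))
        {
            // Another writer created the destination in the meantime;
            // fall through to the Replace-based path below.
        }

        const int maxAttempts = 3; // assumed bound
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                // File.Replace requires the destination to exist.
                File.Replace(temporaryFilePath, filePath, null, true);
                return;
            }
            catch (IOException) when (attempt < maxAttempts)
            {
                // Brief pause before retrying; the last failure propagates.
                Thread.Sleep(10);
            }
        }
    }
}
```

Note that the check in the exception filter and the later `File.Replace` call are still separate steps, so a window between them remains.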
JanProvaznik left a comment
This is a band-aid on a more fundamental problem. Please consider how to do it properly using Move with overwrite, based on my comments.
Learnings from reviewing PR #13477: the 24-dimension checklist missed design-level issues that required stepping back from the diff.

### Changes
- **Dim 4 (Tests)**: Flag weak assertions that would pass with incorrect output
- **Dim 10 (Design)**: Check alignment with original design intent, flag workarounds when better APIs exist in existing dependencies, validate borrowed patterns
- **Dim 22 (Correctness)**: Generalize 2-participant fixes to N participants, flag symptom-only fixes
- **Workflow**: Add historical context step (read linked issue + original feature PR) before dispatching dimension agents
- **Tasks instructions**: Document `Microsoft.IO.Redist` API availability for .NET Framework

### Why
The review agents evaluated the diff mechanically but missed:
1. The fix patches a TOCTOU symptom; a better API (`Microsoft.IO.File.Move(overwrite: true)`) exists in an already-referenced package
2. The fix handles 2 concurrent writers but fails with 3+
3. Test assertions were too weak to verify correct behavior
4. The approach was borrowed from the VS editor (which never has the 'file doesn't exist' case) without validating assumptions

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
https://github.com/huulinhnguyen-dev/msbuild into dev/huulinhnguyen/issue13323_flaky_test_transactional
    // Using overwrite: true handles concurrent writes without a race condition:
    // both "target doesn't exist" and "target already exists" cases are covered
    // by a single operation, with no window between them.
    const int maxAttempts = 3;
This retry mechanism seems reasonable.
    try
    {
        System.Threading.Thread.Sleep(10);
        System.IO.File.Replace(temporaryFilePath, filePath, null, true);
This completely gets rid of the Replace path.
Why was it used in the original PR, and why did you decide not to implement it here?
File.Replace was used in the original PR because it uses the Windows ReplaceFile API which preserves file identity (attributes, security, timestamps). However, it requires the destination to already exist — throwing FileNotFoundException otherwise — and the fallback File.Move in that catch block was exactly where the race condition lived (another thread could create the destination in that window). File.Move(overwrite: true) handles both the "exists" and "not exists" cases atomically in a single call, eliminating the race window entirely. The tradeoff is losing attribute preservation, but correctness under concurrent writes takes priority here.
In what case would the file identity matter? Aren't you creating the file right before moving it anyway?
Sorry for the confusion in my previous reply. Saying "the tradeoff is losing attribute preservation" was misleading: since we always create a fresh temp file first, there's no meaningful identity to preserve regardless. The File.Replace approach was unnecessary complexity.
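The single-call approach discussed in this thread might look roughly like the sketch below. It assumes a modern .NET target where `System.IO.File.Move(string, string, bool overwrite)` is available; the review notes mention `Microsoft.IO.File.Move(overwrite: true)` from Microsoft.IO.Redist as the .NET Framework counterpart. The class and method names and the temp-file naming scheme are hypothetical, and the bounded retry shown in the diff is omitted for brevity.

```csharp
using System.IO;

// Hypothetical helper for illustration only; not the actual MSBuild implementation.
static class AtomicOverwriteSketch
{
    public static void WriteAllTextTransactional(string filePath, string contents)
    {
        // Create the temp file next to the target so the move stays on one volume.
        string directory = Path.GetDirectoryName(Path.GetFullPath(filePath)) ?? ".";
        string temporaryFilePath = Path.Combine(directory, Path.GetRandomFileName() + "~");

        File.WriteAllText(temporaryFilePath, contents);
        try
        {
            // One call covers both "target exists" and "target does not exist",
            // so there is no separate existence check to race against.
            File.Move(temporaryFilePath, filePath, overwrite: true);
        }
        catch
        {
            // Do not leave the temp file behind if the move fails.
            File.Delete(temporaryFilePath);
            throw;
        }
    }
}
```

Because the temp file is always freshly created, nothing about the previous target file is carried over, which matches the point made above that there is no meaningful identity to preserve.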
Fixes #13323
Context
Concurrent parallel builds writing to the same file with Overwrite="true" and transactional mode could throw IOException: Cannot create a file when that file already exists. This happened because File.Move failed when another thread created the target file between the Replace check and the Move call.
Changes Made
- WriteLinesToFile.cs: when File.Move fails with IOException, retry using File.Replace since the target file now exists
- WriteLinesToFile_Tests.cs: rename TransactionalModePreservesAllData → TransactionalModeSucceedsWithConcurrentOverwrites to accurately reflect test behavior (Overwrite="true" means only the last writer survives, not all data)
Testing
Existing test TransactionalModeSucceedsWithConcurrentOverwrites covers the concurrent overwrite scenario with parallel MSBuild projects.
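As a rough illustration of what "only the last writer survives" means for assertions, the sketch below runs N concurrent writers and checks that the final file equals exactly one writer's full payload rather than a mix. It is not the actual MSBuild test: it does not invoke the WriteLinesToFile task, `WriteTransactional` is a hypothetical stand-in, and it assumes a .NET target with `File.Move(..., overwrite: true)`. On Windows, a real test may also need the bounded retry from the diff to tolerate transient sharing failures.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical check for illustration only; not the actual MSBuild test.
static class ConcurrentOverwriteCheck
{
    // Stand-in for the task under test: write a temp file, then move it over the target.
    static void WriteTransactional(string filePath, string contents)
    {
        string temp = filePath + "." + Guid.NewGuid().ToString("N") + ".tmp";
        File.WriteAllText(temp, contents);
        File.Move(temp, filePath, overwrite: true);
    }

    public static string RunWriters(string filePath, int writerCount)
    {
        // Each writer gets a distinct multi-line payload so a torn or merged
        // file would not match any of them.
        string[] payloads = Enumerable.Range(0, writerCount)
            .Select(i => string.Concat(Enumerable.Repeat($"writer-{i}\n", 100)))
            .ToArray();

        Parallel.For(0, writerCount, i => WriteTransactional(filePath, payloads[i]));

        string finalContents = File.ReadAllText(filePath);
        if (!payloads.Contains(finalContents))
        {
            throw new InvalidOperationException("Final file does not match any single writer's payload.");
        }
        return finalContents;
    }
}
```

Running it with more than two writers (for example `RunWriters(path, 8)`) exercises the N-participant case raised in the review notes.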