Fix deterministic MountVolume test failures on ARM64 Helix machines #126660
danmoseley merged 4 commits into dotnet:main from
/azp run runtime-ioslikesimulator
/azp run runtime-android
/azp run runtime-maccatalyst
Azure Pipelines successfully started running 1 pipeline(s).
Tagging subscribers to this area: @dotnet/area-system-io
Pull request overview
Addresses deterministic failures in Windows MountVolume-related filesystem tests on ARM64 Helix by improving second-drive selection and adding Helix diagnostics to make drive/volume issues easier to diagnose from logs.
Changes:
- Filter candidate “other NTFS drives” to only those that support volume mount point operations (via GetVolumeNameForVolumeMountPoint).
- Add a Helix-only diagnostic test that logs drive details and volume GUID availability.
- Improve Mount/Unmount error messages by using P/Invoke error messages instead of raw codes, and add a Helix guard for missing second-drive scenarios.
Show a summary per file
| File | Description |
|---|---|
| src/libraries/System.Runtime/tests/System.IO.FileSystem.Tests/System.IO.FileSystem.Tests.csproj | Adds the new DumpDriveInformation.cs test file to the project. |
| src/libraries/System.Runtime/tests/System.IO.FileSystem.Tests/PortedCommon/IOServices.cs | Enhances NTFS drive selection by skipping drives without a volume GUID/mount-point support. |
| src/libraries/System.Runtime/tests/System.IO.FileSystem.Tests/PortedCommon/DllImports.cs | Adds GetVolumeNameForVolumeMountPointW P/Invoke used to validate mount-point support. |
| src/libraries/System.Runtime/tests/System.IO.FileSystem.Tests/DumpDriveInformation.cs | New Helix-only diagnostic test that prints drive and volume GUID information to the console log. |
| src/libraries/System.Runtime/tests/System.IO.FileSystem.Tests/Directory/ReparsePoints_MountVolume.cs | Adds a Helix check to fail when a suitable second NTFS drive can’t be found. |
| src/libraries/System.Runtime/tests/System.IO.FileSystem.Tests/Directory/Delete_MountVolume.cs | Adds a Helix check to fail when a suitable second NTFS drive can’t be found. |
| src/libraries/Common/tests/System/IO/ReparsePointUtilities.cs | Improves mount/unmount exception messages by using P/Invoke error message APIs. |
Copilot's findings
Comments suppressed due to low confidence (1)
src/libraries/Common/tests/System/IO/ReparsePointUtilities.cs:260
Marshal.GetLastPInvokeError() / Marshal.GetLastPInvokeErrorMessage() are not available on .NET Framework. Since this file is compiled for NETFRAMEWORK in some test projects, this change will break those builds unless guarded (e.g., keep GetLastWin32Error() on NETFRAMEWORK and format the message via Win32Exception).
bool r = DeleteVolumeMountPoint(mountPoint);
if (!r)
{
int error = Marshal.GetLastPInvokeError();
// Ignore expected cleanup errors: 4390 (ERROR_NOT_A_REPARSE_POINT),
// 3 (ERROR_PATH_NOT_FOUND), 2 (ERROR_FILE_NOT_FOUND)
if (error != 4390 && error != 3 && error != 2)
throw new Exception(string.Format("DeleteVolumeMountPoint({0}) failed: {1}", mountPoint, Marshal.GetLastPInvokeErrorMessage()));
Console.WriteLine(string.Format("Ignoring expected error while unmounting {0}: {1}", mountPoint, Marshal.GetLastPInvokeErrorMessage()));
- Files reviewed: 7/7 changed files
- Comments generated: 4
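The guard suggested in the suppressed comment above could look roughly like the following. This is a minimal sketch, not code from the PR: the helper name PInvokeErrorHelper and its shape are hypothetical, and it assumes the non-Framework build targets .NET 7 or later, where Marshal.GetPInvokeErrorMessage is available.

```csharp
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;

// Hypothetical helper; the name and shape are illustrative, not taken from the PR.
internal static class PInvokeErrorHelper
{
    // Call immediately after a failed P/Invoke declared with SetLastError = true,
    // before any other call can overwrite the thread's last-error value.
    internal static string GetLastErrorMessage(out int error)
    {
#if NETFRAMEWORK
        // .NET Framework: read the raw code and let Win32Exception format it.
        error = Marshal.GetLastWin32Error();
        return new Win32Exception(error).Message;
#else
        // .NET 7+ (assumed): the Marshal APIs return the code and its system message.
        error = Marshal.GetLastPInvokeError();
        return Marshal.GetPInvokeErrorMessage(error);
#endif
    }
}
```

Capturing the code once and formatting that same value keeps the "ignore expected errors" check and the logged message consistent, since both come from a single read of the last-error state.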
e6a88d1 to cafd910
Pull request overview
This PR addresses deterministic failures in the System.IO mount-volume tests on Windows 11 ARM64 Helix machines by avoiding “fixed/NTFS/ready” drives that don’t actually support volume mount-point operations (no volume GUID), and by adding Helix-focused diagnostics/guards to keep cross-drive scenarios from silently becoming dead code.
Changes:
- Update IOServices.GetNtfsDriveOtherThan(...) to also require a successful GetVolumeNameForVolumeMountPoint (volume GUID present) before selecting a “usable” NTFS drive.
- Add Helix-only hard-fail guards in both mount-volume tests when no suitable second NTFS drive is available.
- Add a Helix-only diagnostic test to dump drive and volume GUID information into Helix logs.
Show a summary per file
| File | Description |
|---|---|
| src/libraries/System.Runtime/tests/System.IO.FileSystem.Tests/System.IO.FileSystem.Tests.csproj | Includes the new DumpDriveInformation.cs test file in the test project. |
| src/libraries/System.Runtime/tests/System.IO.FileSystem.Tests/PortedCommon/IOServices.cs | Filters candidate NTFS drives by requiring a volume GUID (mount-point capable). |
| src/libraries/System.Runtime/tests/System.IO.FileSystem.Tests/PortedCommon/DllImports.cs | Adds GetVolumeNameForVolumeMountPointW P/Invoke via LibraryImport. |
| src/libraries/System.Runtime/tests/System.IO.FileSystem.Tests/DumpDriveInformation.cs | Adds Helix-only diagnostic logging of drive details + selected “other NTFS drive”. |
| src/libraries/System.Runtime/tests/System.IO.FileSystem.Tests/Directory/ReparsePoints_MountVolume.cs | Adds Helix guard that fails when no suitable second NTFS drive is found. |
| src/libraries/System.Runtime/tests/System.IO.FileSystem.Tests/Directory/Delete_MountVolume.cs | Adds Helix guard that fails when no suitable second NTFS drive is found. |
Copilot's findings
- Files reviewed: 6/6 changed files
- Comments generated: 3
IOServices.GetNtfsDriveOtherThan() now verifies drives have a volume GUID via GetVolumeNameForVolumeMountPoint before returning them. This filters out SUBST drives and Azure resource disks that report as Fixed/NTFS/Ready but don't support volume mount point operations (error 87).
Also:
- Assert.Fail on Helix when no suitable second drive found (prevents silent skip)
- Added DumpDriveInformation diagnostic test for Helix console logs
Fixes dotnet#125295
Fixes dotnet#125624
Fixes dotnet#126627
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
cafd910 to 8ccebe9
… Assert.Fail
LibraryImport source generator does not support StringBuilder parameters - the CI dump showed VolumeGUID=\ because the output buffer was never populated. Switch to char[] which LibraryImport handles correctly.
Remove Assert.Fail on Helix when no second drive found - some ARM64 Helix machines legitimately have only C:\ and a CD-ROM. The cross-drive scenarios skip gracefully in that case (existing behavior). The DumpDriveInformation test still provides diagnostics in the Helix log.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
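To illustrate the char[] approach this commit describes, here is a hedged sketch of a LibraryImport declaration and a drive filter built on it. The type name VolumeProbe, the helper names, the parameter names, and the 50-character buffer size are assumptions for illustration; the actual signature in DllImports.cs and the actual logic in IOServices.cs are not shown on this page. It is Windows-only and needs AllowUnsafeBlocks enabled, because the LibraryImport generator emits unsafe code.

```csharp
#nullable enable
using System;
using System.IO;
using System.Runtime.InteropServices;

// Hypothetical type; names are illustrative, not the PR's actual code.
internal static partial class VolumeProbe
{
    // Assumption: 50 chars holds a volume GUID path such as
    // \\?\Volume{xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx}\ plus the null terminator.
    private const int MaxVolumeNameLength = 50;

    // char[] output buffer instead of StringBuilder, as the commit message describes.
    [LibraryImport("kernel32.dll", EntryPoint = "GetVolumeNameForVolumeMountPointW",
        StringMarshalling = StringMarshalling.Utf16, SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    private static partial bool GetVolumeNameForVolumeMountPoint(
        string volumeMountPoint, char[] volumeName, int bufferLength);

    // Returns the volume GUID path for a drive root such as "E:\",
    // or null when the call fails (for example with error 87).
    internal static string? TryGetVolumeGuid(string driveRoot)
    {
        char[] buffer = new char[MaxVolumeNameLength];
        if (!GetVolumeNameForVolumeMountPoint(driveRoot, buffer, buffer.Length))
            return null;

        // Trim at the null terminator written by the native call.
        int length = Array.IndexOf(buffer, '\0');
        return new string(buffer, 0, length < 0 ? buffer.Length : length);
    }

    // Fixed/Ready/NTFS checks first, then the volume GUID requirement.
    internal static bool IsMountPointCapable(DriveInfo drive) =>
        drive.DriveType == DriveType.Fixed
        && drive.IsReady
        && drive.DriveFormat == "NTFS"
        && TryGetVolumeGuid(drive.Name) is not null;
}
```

DriveInfo.Name already ends with a backslash (for example "E:\"), which the native function requires for its mount point argument, so it can be passed through unchanged.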
Pull request overview
This PR addresses deterministic failures in the MountVolume-related filesystem tests on Windows 11 ARM64 Helix machines by ensuring the “other NTFS drive” selection skips fixed/NTFS/ready drives that don’t actually support volume mount point operations (e.g., drives without a volume GUID).
Changes:
- Update IOServices.GetNtfsDriveOtherThan(...) to additionally require that candidate drives have a volume GUID via GetVolumeNameForVolumeMountPoint.
- Add a GetVolumeNameForVolumeMountPointW LibraryImport to the test DllImports helpers.
- Add a Helix-only diagnostic test (DumpDriveInformation) and wire it into the test project to print drive/volume details to the Helix console log.
Show a summary per file
| File | Description |
|---|---|
| src/libraries/System.Runtime/tests/System.IO.FileSystem.Tests/System.IO.FileSystem.Tests.csproj | Includes the new diagnostic test source file in the test project compilation list. |
| src/libraries/System.Runtime/tests/System.IO.FileSystem.Tests/PortedCommon/IOServices.cs | Filters candidate NTFS drives by requiring successful GetVolumeNameForVolumeMountPoint (i.e., a volume GUID exists). |
| src/libraries/System.Runtime/tests/System.IO.FileSystem.Tests/PortedCommon/DllImports.cs | Adds GetVolumeNameForVolumeMountPointW P/Invoke for drive capability detection and diagnostics. |
| src/libraries/System.Runtime/tests/System.IO.FileSystem.Tests/DumpDriveInformation.cs | Adds Helix-only logging test that dumps drive properties and (on Windows) volume GUID availability to aid CI diagnosis. |
Copilot's findings
- Files reviewed: 4/4 changed files
- Comments generated: 2
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request overview
This PR aims to address deterministic MountVolume-related test failures on Windows ARM64 Helix agents by ensuring the “other NTFS drive” selection excludes drives that don’t support volume mount point operations, and by adding Helix-only diagnostics to quickly identify problematic drives.
Changes:
- Update IOServices.GetNtfsDriveOtherThan() to require a successful GetVolumeNameForVolumeMountPoint call (filters out drives without a usable volume GUID).
- Add a new kernel32 P/Invoke (GetVolumeNameForVolumeMountPointW) to the test DllImports.
- Add a Helix-only diagnostic “test” that prints drive properties and volume GUID availability to the console, and include it in the test project.
Show a summary per file
| File | Description |
|---|---|
| src/libraries/System.Runtime/tests/System.IO.FileSystem.Tests/System.IO.FileSystem.Tests.csproj | Includes the new diagnostic test file in the test project compilation list. |
| src/libraries/System.Runtime/tests/System.IO.FileSystem.Tests/PortedCommon/IOServices.cs | Filters candidate NTFS drives by requiring GetVolumeNameForVolumeMountPoint success. |
| src/libraries/System.Runtime/tests/System.IO.FileSystem.Tests/PortedCommon/DllImports.cs | Adds P/Invoke for GetVolumeNameForVolumeMountPointW. |
| src/libraries/System.Runtime/tests/System.IO.FileSystem.Tests/DumpDriveInformation.cs | Adds Helix-only logging of drive info + volume GUID presence for future diagnosis. |
Copilot's findings
- Files reviewed: 4/4 changed files
- Comments generated: 1
adamsitnik left a comment
@copilot please address my feedback
adamsitnik left a comment
LGTM, thank you @danmoseley!
Note
This PR was created with Copilot assistance.
Fix deterministic MountVolume test failures on ARM64 Helix machines
Fixes #125295, fixes #125624, fixes #126627
Problem
Directory_Delete_MountVolume.RunTest and Directory_ReparsePoints_MountVolume.runTest fail deterministically (~100% of the time, ~750ms duration) on the Windows.11.Arm64.Open Helix machine pool. This is not timing-related and was not addressed by the delay/polling fixes in #125914 or the Unmount resilience fix in #125625 (those PRs fixed real timing issues -- pre-fix failures on other configurations have since expired from AzDO retention, so we can't verify directly, but there is no evidence they were ineffective for their intended purpose).
Root cause: The ARM64 Helix machines have an E:\ drive (likely an Azure resource/temp disk) that passes all DriveInfo checks -- DriveType=Fixed, DriveFormat=NTFS, IsReady=True -- but GetVolumeNameForVolumeMountPoint fails with ERROR_INVALID_PARAMETER (87). The drive has no volume GUID and doesn't support volume mount point operations. IOServices.GetNtfsDriveOtherThanCurrent() returns this drive, and the test crashes trying to use it.
Some ARM64 Helix machines have only C:\ and a CD-ROM (no second drive at all). On those machines, the cross-drive scenarios already skip gracefully and only same-drive scenarios 3.x run.
Evidence
Analyzed Helix console logs from 5 post-fix builds (all arm64-NativeAOT-Win11, same C:\ volume GUID). Every failure shows the identical pattern:
- GetVolumeNameForVolumeMountPoint("E:\") -> error 87
- SetVolumeMountPoint onto E:\ succeeds but path traversal through the mount point fails with DirectoryNotFoundException
Reproduced locally by removing the real E: drive letter and creating SUBST E: which exhibits identical error 87 behavior.
Changes
- IOServices.GetNtfsDriveOtherThan(): After the existing Fixed/Ready/NTFS checks, also verify the drive has a volume GUID via GetVolumeNameForVolumeMountPoint. Drives without one (SUBST drives, Azure resource disks) are skipped.
- DumpDriveInformation diagnostic test: New Helix-only test (following the DescriptionNameTests.DumpRuntimeInformationToConsole pattern) that dumps all drives with their volume GUIDs to the console log. Makes future drive-related CI issues immediately diagnosable from the same Helix work item log.
- GetVolumeNameForVolumeMountPoint P/Invoke in DllImports.cs: Uses char[] (not StringBuilder) because this file uses LibraryImport which does not support StringBuilder.
Local validation