Fix TerminalLogger IndexOutOfRangeException when replaying binlog with fewer nodes#12809
Conversation
|
Hello @@copilot, I noticed that you’re changing an .swr file or any file under src/Package/MSBuild.VSSetup.. Please make sure to validate this change by an experimental VS insertion. This is accomplished by pushing to an exp/* branch, which requires write permissions to this repo. |
…with fewer nodes Co-authored-by: YuliiaKovalova <95473390+YuliiaKovalova@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
This PR fixes an IndexOutOfRangeException that occurs when replaying a binary log with the TerminalLogger using fewer nodes than the original build. The issue arose because the logger initializes its _nodes array based on the replay's node count, but events in the binlog reference NodeIds from the original build with potentially higher values.
Key changes:
- Added dynamic array resizing via
EnsureNodeCapacity()method - Protected all
_nodesarray accesses in event handlers to ensure capacity before indexing - Added test case for binlog replay with mismatched node counts
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/Build/Logging/TerminalLogger/TerminalLogger.cs | Added EnsureNodeCapacity() method with exponential growth strategy and updated all three array access points (ProjectStarted, UpdateNodeStatus, MessageRaised) to call it before indexing |
| src/Build.UnitTests/TerminalLogger_Tests.cs | Added regression test that builds with /m:4 and replays with nodeCount: 1 to verify the fix works |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…lel builds Co-authored-by: YuliiaKovalova <95473390+YuliiaKovalova@users.noreply.github.com>
…h fewer nodes (dotnet#12809) Replaying a binary log with terminal logger fails when the replay uses fewer nodes (`/m:1`) than the original build. The logger initializes its `_nodes` array based on the replay's node count, but events in the binlog reference NodeIds from the original build, causing an `IndexOutOfRangeException`. ### Context During binary log replay, the TerminalLogger is initialized with the node count from the replay command (e.g., `/m:1`), but the binary log contains events with NodeIds from the original build which may have used more nodes (e.g., `/m:4` or `/m:8`). This mismatch causes an IndexOutOfRangeException when the logger tries to access array indices that don't exist. ### Changes Made - Added `_isReplayMode` flag to detect when running in binary log replay mode by checking if `IEventSource` is `IBinaryLogReplaySource` - Added `EnsureNodeCapacity()` method to dynamically resize `_nodes` array **only during replay mode** when encountering NodeIds beyond current capacity - Protected array resizing with **double-checked locking** (`lock (_lock)`) to prevent race conditions with the refresher thread - Updated all `_nodes` array accesses (in `UpdateNodeStatus`, `ProjectStarted`, `MessageRaised`) to call `EnsureNodeCapacity` before indexing - Enhanced test case to build **three separate projects in parallel** using `MSBuild` task with `BuildInParallel='true'` to ensure NodeIds > 1 are actually present in the binlog ```csharp private void EnsureNodeCapacity(int nodeIndex) { // Only resize in replay mode - during normal builds, the node count is fixed if (_isReplayMode && nodeIndex >= _nodes.Length) { lock (_lock) { if (nodeIndex >= _nodes.Length) { int newSize = Math.Max(nodeIndex + 1, _nodes.Length * 2); Array.Resize(ref _nodes, newSize); } } } } ``` ### Testing - ✅ All 68 existing TerminalLogger tests pass - ✅ New test validates binlog replay with fewer nodes using multi-project parallel build scenario - ✅ Manual testing confirmed: - Multi-project parallel build with `/m:8` → replay with `/m:1` using terminal logger: SUCCESS - Normal builds without replay continue to work correctly with no performance impact - Terminal logger only applies dynamic resizing during replay mode - Thread-safe resizing prevents race conditions with refresher thread ### Notes The fix is scoped specifically to replay scenarios - normal builds are completely unaffected by the dynamic resizing logic, ensuring zero performance impact on regular build operations. The replay mode detection (`_isReplayMode` flag) ensures the array resizing only occurs when actually needed, and the double-checked locking pattern ensures thread safety without impacting performance during normal array accesses. <!-- START COPILOT CODING AGENT SUFFIX --> <details> <summary>Original prompt</summary> > > ---- > > *This section details on the original issue you should resolve* > > <issue_title>Replaying binary log with terminal logger on sometimes may fail.</issue_title> > <issue_description>### Issue Description > > Replaying a binary log with terminal logger fails when the number of nodes specified for replay command is less than the number of nodes originally used during the build. > > ### Steps to Reproduce > > 1. Collect a binary log for a multi-project solution build. > 2. Attempt to replay it with /m:1 > > > ### Expected Behavior > > Replay succeeds. > > ### Actual Behavior > > Getting error "There was an exception while reading the log file: Index was outside the bounds of the array." from the terminal logger. > > Stack trace: > ``` > at Microsoft.Build.Logging.TerminalLogger.UpdateNodeStatus(BuildEventContext buildEventContext, TerminalNodeStatus nodeStatus) in C:\Users\alinama\work\msbuild\msbuild\src\Build\Logging\TerminalLogger\TerminalLogger.cs:line 990 > at Microsoft.Build.Logging.TerminalLogger.TargetStarted(Object sender, TargetStartedEventArgs e) in C:\Users\alinama\work\msbuild\msbuild\src\Build\Logging\TerminalLogger\TerminalLogger.cs:line 977 > at Microsoft.Build.Logging.EventArgsDispatcher.Dispatch(BuildEventArgs buildEvent) in C:\Users\alinama\work\msbuild\msbuild\src\Build\Logging\BinaryLogger\EventArgsDispatcher.cs:line 123 > at Microsoft.Build.Logging.BinaryLogReplayEventSource.Replay(BuildEventArgsReader reader, CancellationToken cancellationToken) in C:\Users\alinama\work\msbuild\msbuild\src\Build\Logging\BinaryLogger\BinaryLogReplayEventSource.cs:line 264 > at Microsoft.Build.Logging.BinaryLogReplayEventSource.Replay(String sourceFilePath, CancellationToken cancellationToken) in C:\Users\alinama\work\msbuild\msbuild\src\Build\Logging\BinaryLogger\BinaryLogReplayEventSource.cs:line 201 > at Microsoft.Build.CommandLine.MSBuildApp.ReplayBinaryLog(String binaryLogFilePath, ILogger[] loggers, IEnumerable`1 distributedLoggerRecords, Int32 cpuCount, Boolean isBuildCheckEnabled) in C:\Users\alinama\work\msbuild\msbuild\src\MSBuild\XMake.cs:line 4602 > ``` > ### Analysis > > This happens because of how terminal logger determines number of nodes to display. During the replay mode it uses currently specified number of nodes, instead of using the number of nodes that was used during the binary log collection. > > Workaround: use /m:big_enough_number during replaying. > > ### Versions & Configurations > > _No response_</issue_description> > > ## Comments on the Issue (you are @copilot in this section) > > <comments> > </comments> > </details> - Fixes dotnet#12806 <!-- START COPILOT CODING AGENT TIPS --> --- 💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more [Copilot coding agent tips](https://gh.io/copilot-coding-agent-tips) in the docs. --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: YuliiaKovalova <95473390+YuliiaKovalova@users.noreply.github.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: Jan Provazník <janprovaznik@microsoft.com>
Replaying a binary log with terminal logger fails when the replay uses fewer nodes (
/m:1) than the original build. The logger initializes its_nodesarray based on the replay's node count, but events in the binlog reference NodeIds from the original build, causing anIndexOutOfRangeException.Context
During binary log replay, the TerminalLogger is initialized with the node count from the replay command (e.g.,
/m:1), but the binary log contains events with NodeIds from the original build which may have used more nodes (e.g.,/m:4or/m:8). This mismatch causes an IndexOutOfRangeException when the logger tries to access array indices that don't exist.Changes Made
_isReplayModeflag to detect when running in binary log replay mode by checking ifIEventSourceisIBinaryLogReplaySourceEnsureNodeCapacity()method to dynamically resize_nodesarray only during replay mode when encountering NodeIds beyond current capacitylock (_lock)) to prevent race conditions with the refresher thread_nodesarray accesses (inUpdateNodeStatus,ProjectStarted,MessageRaised) to callEnsureNodeCapacitybefore indexingMSBuildtask withBuildInParallel='true'to ensure NodeIds > 1 are actually present in the binlogTesting
/m:8→ replay with/m:1using terminal logger: SUCCESSNotes
The fix is scoped specifically to replay scenarios - normal builds are completely unaffected by the dynamic resizing logic, ensuring zero performance impact on regular build operations. The replay mode detection (
_isReplayModeflag) ensures the array resizing only occurs when actually needed, and the double-checked locking pattern ensures thread safety without impacting performance during normal array accesses.Original prompt
💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more Copilot coding agent tips in the docs.