Skip to content

Fix race condition in LoggingService.ProcessLoggingEvent after shutdown#13450

Merged
YuliiaKovalova merged 2 commits intomainfrom
fix/logging-service-race-condition
Mar 27, 2026
Merged

Fix race condition in LoggingService.ProcessLoggingEvent after shutdown#13450
YuliiaKovalova merged 2 commits intomainfrom
fix/logging-service-race-condition

Conversation

@YuliiaKovalova
Copy link
Copy Markdown
Member

@YuliiaKovalova YuliiaKovalova commented Mar 26, 2026

Summary

Fix a race condition crash (access violation 0xC0000005 / NullReferenceException) in LoggingService.ProcessLoggingEvent when the method is called after the logging service has been shut down.

Problem

ProcessLoggingEvent directly accesses _eventQueue, _dequeueEvent, and _enqueueEvent without any shutdown guard. During ShutdownComponent(), CleanLoggingEventProcessing() disposes and nullifies all of these fields. If an external callback (e.g., Process.Exited from a ProjectCachePlugin) attempts to log a message after shutdown has completed, the null field access causes an unrecoverable crash.

Reproduction Scenario

  1. A build runs with a ProjectCachePlugin
  2. The build completes and LoggingService.ShutdownComponent() is called
  3. The cache plugin's external process exits, triggering Process.OnExited on a ThreadPool thread
  4. The callback logs "Project cache service process exited" via LoggingServiceToPluginLoggerAdapter.LogMessage
  5. This calls ProcessLoggingEvent() which accesses _eventQueue.Countnull reference crash

Debugger Evidence

_serviceState = Shutdown
_eventQueue = null
_dequeueEvent = null
_enqueueEvent = null
_emptyQueueEvent = null
_loggingEventProcessingCancellation = null

buildEvent.Message = "Project cache service process exited"
Call: System.Diagnostics.Process.OnExited -> ProjectCachePlugin.dll -> LoggingServiceToPluginLoggerAdapter.LogMessage

Root Cause

ProcessLoggingEvent lacked two protections that already exist elsewhere in the codebase:

  1. Shutdown state guard - Other public methods (RegisterLogger, RegisterDistributedLogger, InitializeNodeLoggers, InitializeComponent) all check _serviceState != LoggingServiceState.Shutdown. ProcessLoggingEvent did not.

  2. Local field capture - The LoggingEventProc queue pump thread already captures _eventQueue, _dequeueEvent, _emptyQueueEvent, and _enqueueEvent into local variables to prevent races with CleanLoggingEventProcessing(). ProcessLoggingEvent accessed the fields directly.

Fix

Three layers of defense, all following patterns already established in the same file:

  1. Early return when _serviceState == LoggingServiceState.Shutdown - silently drops the event, since the service is no longer operational.
  2. Capture fields to locals before use - prevents null dereference if CleanLoggingEventProcessing() runs concurrently between the state check and the field reads.
  3. Null-check the captured locals - second defense against the TOCTOU race between step 1 and step 2.

Risk

Low. The fix only adds early-exit guards - no behavioral change for the normal (non-shutdown) code path. The patterns used are already established in the same file.

ProcessLoggingEvent accessed _eventQueue, _dequeueEvent, and _enqueueEvent
directly without checking if the service had been shut down. When an external
callback (e.g., Process.Exited from a ProjectCachePlugin) attempted to log
after ShutdownComponent() had nullified these fields, a NullReferenceException
(access violation 0xC0000005) would occur.

The fix:
1. Early return when _serviceState == LoggingServiceState.Shutdown, matching
   the guard pattern already used by RegisterLogger, RegisterDistributedLogger,
   InitializeNodeLoggers, and InitializeComponent.
2. Capture field references to local variables before use, matching the
   existing pattern in LoggingEventProc (the queue pump thread).
3. Null-check the captured locals as a second defense against the race
   between the _serviceState check and field reads during concurrent shutdown.
Copilot AI review requested due to automatic review settings March 26, 2026 15:17
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes a race where LoggingService.ProcessLoggingEvent could be invoked after shutdown (e.g., by external callbacks holding references) and crash due to shutdown-cleaned resources.

Changes:

  • Add an early-return guard in ProcessLoggingEvent when the service is already shut down, and use local copies of async-queue fields to reduce null races.
  • Add unit tests covering late logging after shutdown (sync/async) and concurrent shutdown vs. logging calls.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
src/Build/BackEnd/Components/Logging/LoggingService.cs Adds shutdown guard + local captures in async path to mitigate post-shutdown races.
src/Build.UnitTests/BackEnd/LoggingService_Tests.cs Adds regression tests for logging-after-shutdown and a concurrent shutdown/logging scenario.

…tion, Shouldly consistency

- Wrap wait-handle operations (WaitOne/Set) in try/catch for
  ObjectDisposedException, since shutdown can dispose handles after they
  are captured into locals but before they are used.
- Assert logThread.Join result to prevent silent test hangs.
- Switch Assert.Equal to Shouldly ShouldBe for consistency with the rest
  of the test file.
@YuliiaKovalova YuliiaKovalova merged commit db5eca3 into main Mar 27, 2026
10 checks passed
@YuliiaKovalova YuliiaKovalova deleted the fix/logging-service-race-condition branch March 27, 2026 15:08
SimaTian pushed a commit that referenced this pull request Apr 7, 2026
…wn (#13450)

## Summary

Fix a race condition crash (access violation 0xC0000005 /
NullReferenceException) in `LoggingService.ProcessLoggingEvent` when the
method is called after the logging service has been shut down.

## Problem

`ProcessLoggingEvent` directly accesses `_eventQueue`, `_dequeueEvent`,
and `_enqueueEvent` without any shutdown guard. During
`ShutdownComponent()`, `CleanLoggingEventProcessing()` disposes and
nullifies all of these fields. If an external callback (e.g.,
`Process.Exited` from a `ProjectCachePlugin`) attempts to log a message
after shutdown has completed, the null field access causes an
unrecoverable crash.

### Reproduction Scenario

1. A build runs with a `ProjectCachePlugin`
2. The build completes and `LoggingService.ShutdownComponent()` is
called
3. The cache plugin's external process exits, triggering
`Process.OnExited` on a ThreadPool thread
4. The callback logs "Project cache service process exited" via
`LoggingServiceToPluginLoggerAdapter.LogMessage`
5. This calls `ProcessLoggingEvent()` which accesses `_eventQueue.Count`
→ **null reference crash**

### Debugger Evidence

```
_serviceState = Shutdown
_eventQueue = null
_dequeueEvent = null
_enqueueEvent = null
_emptyQueueEvent = null
_loggingEventProcessingCancellation = null

buildEvent.Message = "Project cache service process exited"
Call: System.Diagnostics.Process.OnExited -> ProjectCachePlugin.dll -> LoggingServiceToPluginLoggerAdapter.LogMessage
```

## Root Cause

`ProcessLoggingEvent` lacked two protections that already exist
elsewhere in the codebase:

1. **Shutdown state guard** - Other public methods (`RegisterLogger`,
`RegisterDistributedLogger`, `InitializeNodeLoggers`,
`InitializeComponent`) all check `_serviceState !=
LoggingServiceState.Shutdown`. `ProcessLoggingEvent` did not.

2. **Local field capture** - The `LoggingEventProc` queue pump thread
already captures `_eventQueue`, `_dequeueEvent`, `_emptyQueueEvent`, and
`_enqueueEvent` into local variables to prevent races with
`CleanLoggingEventProcessing()`. `ProcessLoggingEvent` accessed the
fields directly.

## Fix

Three layers of defense, all following patterns already established in
the same file:

1. **Early return** when `_serviceState == LoggingServiceState.Shutdown`
- silently drops the event, since the service is no longer operational.
2. **Capture fields to locals** before use - prevents null dereference
if `CleanLoggingEventProcessing()` runs concurrently between the state
check and the field reads.
3. **Null-check the captured locals** - second defense against the
TOCTOU race between step 1 and step 2.

## Risk

**Low.** The fix only adds early-exit guards - no behavioral change for
the normal (non-shutdown) code path. The patterns used are already
established in the same file.
dfederm pushed a commit to dfederm/msbuild that referenced this pull request Apr 9, 2026
…wn (dotnet#13450)

## Summary

Fix a race condition crash (access violation 0xC0000005 /
NullReferenceException) in `LoggingService.ProcessLoggingEvent` when the
method is called after the logging service has been shut down.

## Problem

`ProcessLoggingEvent` directly accesses `_eventQueue`, `_dequeueEvent`,
and `_enqueueEvent` without any shutdown guard. During
`ShutdownComponent()`, `CleanLoggingEventProcessing()` disposes and
nullifies all of these fields. If an external callback (e.g.,
`Process.Exited` from a `ProjectCachePlugin`) attempts to log a message
after shutdown has completed, the null field access causes an
unrecoverable crash.

### Reproduction Scenario

1. A build runs with a `ProjectCachePlugin`
2. The build completes and `LoggingService.ShutdownComponent()` is
called
3. The cache plugin's external process exits, triggering
`Process.OnExited` on a ThreadPool thread
4. The callback logs "Project cache service process exited" via
`LoggingServiceToPluginLoggerAdapter.LogMessage`
5. This calls `ProcessLoggingEvent()` which accesses `_eventQueue.Count`
→ **null reference crash**

### Debugger Evidence

```
_serviceState = Shutdown
_eventQueue = null
_dequeueEvent = null
_enqueueEvent = null
_emptyQueueEvent = null
_loggingEventProcessingCancellation = null

buildEvent.Message = "Project cache service process exited"
Call: System.Diagnostics.Process.OnExited -> ProjectCachePlugin.dll -> LoggingServiceToPluginLoggerAdapter.LogMessage
```

## Root Cause

`ProcessLoggingEvent` lacked two protections that already exist
elsewhere in the codebase:

1. **Shutdown state guard** - Other public methods (`RegisterLogger`,
`RegisterDistributedLogger`, `InitializeNodeLoggers`,
`InitializeComponent`) all check `_serviceState !=
LoggingServiceState.Shutdown`. `ProcessLoggingEvent` did not.

2. **Local field capture** - The `LoggingEventProc` queue pump thread
already captures `_eventQueue`, `_dequeueEvent`, `_emptyQueueEvent`, and
`_enqueueEvent` into local variables to prevent races with
`CleanLoggingEventProcessing()`. `ProcessLoggingEvent` accessed the
fields directly.

## Fix

Three layers of defense, all following patterns already established in
the same file:

1. **Early return** when `_serviceState == LoggingServiceState.Shutdown`
- silently drops the event, since the service is no longer operational.
2. **Capture fields to locals** before use - prevents null dereference
if `CleanLoggingEventProcessing()` runs concurrently between the state
check and the field reads.
3. **Null-check the captured locals** - second defense against the
TOCTOU race between step 1 and step 2.

## Risk

**Low.** The fix only adds early-exit guards - no behavioral change for
the normal (non-shutdown) code path. The patterns used are already
established in the same file.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants