Description
When using a custom host (nethost/hostfxr) on Windows, I experience intermittent crashes when my program is under heavy GC load (in my case, specifically in a code path that creates a lot of short-lived byte arrays). This was also independently observed at Reloaded-Project/Reloaded-II#588.
Reproduction Steps
I don't have exact reproduction steps available for this crash, as it's not 100% consistent, but here's what I've observed so far:
- This seems to only happen on custom hosts - I can't get this to happen in a standard C# console application. In all cases I've observed this crash, it's been from a C# library loaded in a C++ application using hostfxr.
- Applications crash semi-consistently when making a large number of allocations, which leads me to think it's GC-related. While I haven't had much success trying to trigger the crash intentionally, @Sewer56 was able to reproduce a crash in their codebase fairly easily with this snippet:
for (int x = 0; x < 10000; x++)
{
    var buf = new byte[1337];
    GC.Collect();
}
- I was unable to produce an MRE in a C++ console application using hostfxr, so there's likely something more at play here.
I apologize for the vague reproduction steps; this bug is quite elusive, and I hope to find a proper MRE soon. If anyone on the .NET team has ideas on how to reproduce this, I'd be glad to hear them.
Expected behavior
The runtime should not crash.
Actual behavior
The program prints the message Fatal error. Internal CLR error. (0x80131506) (COR_E_EXECUTIONENGINE) to standard output and then crashes with STATUS_ACCESS_VIOLATION while dereferencing a null pointer. I can't upload the entire minidump here because it contains sensitive information, but here's a snippet from WinDbg (the referenced source is the commit linked below):
FAULTING_SOURCE_LINE: D:\code\csharp\crashrepro\runtime\src\coreclr\gc\gc.cpp
FAULTING_SOURCE_FILE: D:\code\csharp\crashrepro\runtime\src\coreclr\gc\gc.cpp
FAULTING_SOURCE_LINE_NUMBER: 41333
FAULTING_SOURCE_CODE:
11606: inline size_t my_get_size (Object* ob)
11607: {
11608:     MethodTable* mT = header(ob)->GetMethodTable();
11609:
>11610:     return (mT->GetBaseSize() +
11611:             (mT->HasComponentSize() ?
11612:              ((size_t)((CObjectHeader*)ob)->GetNumComponents() * mT->RawGetComponentSize()) : 0));
11613: }
11614:
11615: #define size(i) my_get_size (header(i))
SYMBOL_NAME: coreclr!WKS::gc_heap::find_first_object+e8
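For readers unfamiliar with this part of the GC, here is a minimal sketch of the arithmetic the faulting line performs: an object's size is its method table's base size plus, for arrays, the element count times the component size. If the method table pointer read from the object is null, as the access violation suggests, the first mT-> access is where the fault would occur. The types FakeMethodTable and FakeObject and the function checked_size are hypothetical stand-ins, not the runtime's real definitions, and the base size of 24 bytes used in the usage note is an assumed value for illustration only.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>

// Hypothetical, simplified stand-ins for the runtime's types. The real
// MethodTable and object layouts in coreclr are considerably more involved.
struct FakeMethodTable {
    std::uint32_t base_size;       // fixed part of the object, in bytes
    std::uint16_t component_size;  // per-element size; 0 for non-arrays
};

struct FakeObject {
    const FakeMethodTable* method_table;  // pointer the GC reads first
    std::uint32_t num_components;         // array length (ignored for non-arrays)
};

// Same arithmetic as my_get_size, except that a null method table pointer
// yields std::nullopt instead of being dereferenced.
std::optional<std::size_t> checked_size(const FakeObject& ob) {
    const FakeMethodTable* mt = ob.method_table;
    if (mt == nullptr) {
        return std::nullopt;  // the unguarded version would fault here
    }
    std::size_t size = mt->base_size;
    if (mt->component_size != 0) {
        size += static_cast<std::size_t>(ob.num_components) * mt->component_size;
    }
    return size;
}
```

With an assumed base size of 24 and a component size of 1, a 1337-element byte array comes out at 24 + 1337 = 1361 bytes under this model, while an object whose method table pointer is null produces no size at all. This only illustrates the failing computation; it does not explain why find_first_object would encounter such an object in the first place.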
Regression?
This started happening with .NET 9.0.5 and previously worked on .NET 9.0.4.
Known Workarounds
None, beyond pinning to 9.0.4.
Configuration
- Version: .NET 9.0.5, .NET 9.0.6, .NET 10 Preview 5
- OS: Windows 11 23H2 (22631.5189)
- Architecture: x64
Other information
A git bisect led me to 0fb125a as the first bad commit, which matches my testing (the commit before 9226298 is good). I've also heard from a friend that this started with .NET 10 Preview 3, and while I haven't tested the reproduction case myself, that lines up (fa3beb9).