Description
When porting our application from .NET 8 to .NET 10, we hit a crash: one of our global COM-like objects was destroyed while we were still using its interface from C#. The interface is stored in a static field of a static class.
Initially I thought our own code had corrupted the reference count by calling Release. After many hours of debugging I found that it is the COM stub that calls Release on this interface under certain circumstances, without a matching AddRef. As a result, after 2 such calls the COM object is destroyed and the next call crashes.
My understanding of the problem is as follows. When the project is compiled in the Release configuration, the App!ILStubClass.IL_STUB_CLRtoCOM() function contains the following code:
00007ffb`035b4c64 e84724b55f call coreclr!StubHelpers::GetCOMIPFromRCW (7ffb631070b0)
00007ffb`035b4c69 4c8bf0 mov r14, rax
00007ffb`035b4c6c 4d85f6 test r14, r14
00007ffb`035b4c6f 755d jne 00007FFB035B4CCE
00007ffb`035b4c71 c745b401000000 mov dword ptr [rbp-4Ch], 1
If coreclr!StubHelpers::GetCOMIPFromRCW returns NULL, this code sets a flag at [rbp-4Ch]. Then the following code:
00007ffb`035b4caf 488b45c0 mov rax, qword ptr [rbp-40h]
00007ffb`035b4cb3 ffd0 call rax
calls the native COM function. Eventually the following code is executed:
00007ffb`035b4cf3 837db400 cmp dword ptr [rbp-4Ch], 0
00007ffb`035b4cf7 741e je 00007FFB035B4D17
00007ffb`035b4cf9 4d85ff test r15, r15
00007ffb`035b4cfc 7419 je 00007FFB035B4D17
00007ffb`035b4cfe 4d8b17 mov r10, qword ptr [r15]
00007ffb`035b4d01 4d8b5210 mov r10, qword ptr [r10+10h]
00007ffb`035b4d05 498bcf mov rcx, r15
00007ffb`035b4d08 49bb30d457a5d8010000 mov r11, 1D8A557D430h
00007ffb`035b4d12 e839f4c25f call coreclr!GenericPInvokeCalliHelper (7ffb631e4150)
This code checks the [rbp-4Ch] flag. If the flag is set and the interface pointer in r15 is not NULL, it calls the method at vtable offset 10h, which is the third IUnknown slot, that is, Release.
But in my case the stub never calls AddRef. Everything works when the thread was created by .NET, but fails when the thread was created by native code.
The same code works fine on .NET 8 and .NET 9.
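To make the suspected imbalance concrete, here is a small standalone C++ model of the control flow described above. It is only a sketch and not the CoreCLR code: FakeComObject, SimulateStubCall, CallsUntilDestroyed and the slowPathAddsRef switch are hypothetical names, the way the pointer is obtained after a cache miss is an assumption (the excerpts do not show it), and the starting count of 2 is chosen only to mirror the "after 2 such calls" observation.

```cpp
#include <cassert>

// Hypothetical stand-in for the native COM-like object. Only the
// reference count matters for this model.
struct FakeComObject {
    explicit FakeComObject(int initialRefs) : refCount(initialRefs) {}

    void AddRef() { ++refCount; }
    void Release() {
        // The real object would run its destructor at this point.
        if (--refCount == 0) destroyed = true;
    }

    int refCount;
    bool destroyed = false;
};

// Models one call through the stub as read from the disassembly:
//   1. ask for the cached interface pointer (GetCOMIPFromRCW);
//   2. on NULL, set the cleanup flag ([rbp-4Ch] = 1) and obtain the
//      pointer some other way (ASSUMPTION: details are not in the excerpts);
//   3. call the native method;
//   4. if the flag is set and the pointer is non-NULL, call Release.
// slowPathAddsRef is an assumed switch: true models a balanced stub,
// false models the behavior suspected in this report.
void SimulateStubCall(FakeComObject& obj, bool cacheHit, bool slowPathAddsRef) {
    FakeComObject* ip = cacheHit ? &obj : nullptr;
    bool releaseAfterCall = false;

    if (ip == nullptr) {
        releaseAfterCall = true;
        ip = &obj;
        if (slowPathAddsRef) ip->AddRef();
    }

    // The native COM method would be invoked here.

    if (releaseAfterCall && ip != nullptr) ip->Release();
}

// Returns the 1-based index of the call that destroys the object,
// or -1 if it survives maxCalls cache-miss calls.
int CallsUntilDestroyed(int initialRefs, bool slowPathAddsRef, int maxCalls) {
    assert(initialRefs > 0);
    FakeComObject obj(initialRefs);
    for (int call = 1; call <= maxCalls; ++call) {
        SimulateStubCall(obj, /*cacheHit=*/false, slowPathAddsRef);
        if (obj.destroyed) return call;
    }
    return -1;
}
```

Under these assumptions, CallsUntilDestroyed(2, false, 100) returns 2, which matches the crash pattern, while CallsUntilDestroyed(2, true, 100) returns -1 because every Release is paired with an AddRef.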
Reproduction Steps
Download the attached archive and extract it to any location. Open and build the solution at Cpp\Project17.sln (I used VS 2022). Then open and run the solution at DotNet\App.slnx. After some time the application stops at the ::DebugBreak(); call in the ~ClientApiAgile() destructor.
Problem.zip
Expected behavior
The attached project should run indefinitely without ever reaching the ~ClientApiAgile() destructor, because no code releases a reference to this interface. It does run indefinitely on .NET 8 and .NET 9.
Actual behavior
The attached project sometimes calls IUnknown::Release during the call to clientApiAgile.Empty() when that call is made from a native thread. Once the last reference is released, the object is destroyed and the application crashes.
Regression?
Yes. .NET 6, .NET 8 and .NET 9 work fine.
Known Workarounds
None
Configuration
.NET 10 (10.0.3 and 10.0.5). Windows 11 x64.
Other information
The C++ project uses only the plain Windows API and has no other dependencies. It contains a very basic implementation of the COM-like object that is called from the .NET side. This object implements the IAgileObject interface. Please note that this is not production code, just something I put together quickly to demonstrate the problem.
It exports the GetClientApiAgile function, which creates and returns a new instance of the IClientApiAgile interface. The RunTests function creates 15 threads, each of which calls the callback passed to it, and then repeats this again and again.
The C# code calls GetClientApiAgile and stores the result in the clientApiAgile static field. It then creates a function pointer from a delegate and passes that function pointer to RunTests.
The delegate calls the DotNetCallback function, which in turn calls clientApiAgile.Empty(). As you can see, no code calls Release in any shape or form.
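For readers who do not want to open the archive, the shape of the repro can be sketched in portable C++ roughly as follows. This is not the attached code: ClientApiSketch, RunTestsSketch, CallbackSketch and g_clientApi are hypothetical names, std::thread stands in for the natively created threads, the callback here is plain C++ rather than a function pointer created from a C# delegate, and the loop is bounded by a rounds parameter instead of repeating forever.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical stand-in for the COM-like object from the native project.
struct ClientApiSketch {
    std::atomic<int> refCount{1};
    std::atomic<long> emptyCalls{0};
    std::atomic<bool> destroyed{false};

    int AddRef() { return ++refCount; }
    int Release() {
        int remaining = --refCount;
        // The attached project breaks into the debugger in its destructor;
        // this sketch only records that the last reference went away.
        if (remaining == 0) destroyed = true;
        return remaining;
    }
    void Empty() { ++emptyCalls; }  // the method invoked from the callback
};

using Callback = void (*)();

// Sketch of the test driver: each round starts 15 threads, every thread
// invokes the callback once, and the round ends when all threads finish.
void RunTestsSketch(Callback callback, int rounds) {
    assert(callback != nullptr);
    constexpr int kThreads = 15;
    for (int round = 0; round < rounds; ++round) {
        std::vector<std::thread> threads;
        for (int i = 0; i < kThreads; ++i) threads.emplace_back(callback);
        for (auto& t : threads) t.join();
    }
}

// Plays the role of the static field that holds the interface.
ClientApiSketch g_clientApi;

// Plays the role of the managed callback: it only calls Empty().
void CallbackSketch() { g_clientApi.Empty(); }
```

Calling RunTestsSketch(CallbackSketch, 3) invokes Empty() 45 times and leaves refCount at 1, since nothing in this path touches AddRef or Release. That invariant is what the real repro expects to hold as well.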