Skip to content

.NET 6.0.4 incorrect profiler callback sequence (CRITICAL) #68760

@ww898

Description

@ww898

The bug is very critical for us. Our customers are getting a lot crashes under profiler.

.NET 6.0.4 x64, Windows 10.0.19044

Our profiler crashed in case when Enter/Leave/Tailcall was called on the thread where ICorProfilerCallback::ThreadCreated wasn't called before. I used the Time Travel Debugger (TTD) to make the situation clear. The investigating thread is 0x7dec. The beginning state show us that the thread is native:

0:028> ~.
. 28  Id: 664.7dec Suspend: 4096 Teb: 00000040`af9a0000 Unfrozen
0:028> !threads
ThreadCount:      18
UnstartedThread:  0
BackgroundThread: 17
PendingThread:    0
DeadThread:       0
Hosted Runtime:   no
                                                                                                            Lock  
 DBG   ID     OSID ThreadOBJ           State GC Mode     GC Alloc Context                  Domain           Count Apt Exception
   0    1     40c8 00000210E28F4000    26020 Preemptive  0000021080098D80:0000021080099FD0 00000210e284b430 -00001 STA 
  10    2     7130 00000210E43DB2A0    2b220 Preemptive  0000000000000000:0000000000000000 00000210e284b430 -00001 MTA (Finalizer) 
  11    3     4250 00000210E85A3F60    2b220 Preemptive  000002108003D940:000002108003DFD0 00000210e284b430 -00001 MTA 
  15    4     31fc 00000210E8E8DAA0    2b220 Preemptive  000002108001D440:000002108001DFD0 00000210e284b430 -00001 MTA 
  19    5     b224 00000210E9221650  102a220 Preemptive  0000000000000000:0000000000000000 00000210e284b430 -00001 MTA (Threadpool Worker) 
  21    6     7498 00000210E92A7380  102b220 Preemptive  0000021080078248:00000210800791E8 00000210e284b430 -00001 MTA (Threadpool Worker) 
  22    7     865c 00000210E94744F0  302b220 Preemptive  0000021080064DF8:0000021080065FD0 00000210e284b430 -00001 MTA (Threadpool Worker) 
  23    8     9fc0 00000210E94A6BF0  102b220 Preemptive  0000021080070FC8:0000021080071FD0 00000210e284b430 -00001 MTA (Threadpool Worker) 
  24    9     6590 00000210E94CB850  102b220 Preemptive  0000021080086FE8:0000021080087FD0 00000210e284b430 -00001 MTA (Threadpool Worker) 
  25   10     a750 00000210EEC72F20    20220 Preemptive  0000021080079300:000002108007B1E8 00000210e284b430 -00001 Ukn 
  29   11     c7fc 00000210EEC7C8C0    20220 Preemptive  0000021080072358:0000021080073FD0 00000210e284b430 -00001 Ukn 
  32   12     c1d0 00000210EECC26B0  102b220 Preemptive  000002108007D768:000002108007F1E8 00000210e284b430 -00001 MTA (Threadpool Worker) 
  31   13     c3c4 00000210EECC2D40  102b220 Preemptive  000002108007C088:000002108007D1E8 00000210e284b430 -00001 MTA (Threadpool Worker) 
  33   14     8174 00000210EECC33D0  102b220 Preemptive  000002108007FE18:00000210800811E8 00000210e284b430 -00001 MTA (Threadpool Worker) 
  34   15     b320 00000210EECC2020  8029220 Preemptive  00000210800916F8:0000021080091FD0 00000210e284b430 -00001 MTA (Threadpool Completion Port) 
  35   16     93bc 00000210EED4C210  8029220 Preemptive  0000021080094360:0000021080095FD0 00000210e284b430 -00001 MTA (Threadpool Completion Port) 
  36   17     2af8 00000210EED4BB80  8029220 Preemptive  0000021080092A30:0000021080093FD0 00000210e284b430 -00001 MTA (Threadpool Completion Port) 
  37   18     b140 00000210EED4D5C0  8029220 Preemptive  0000000000000000:0000000000000000 00000210e284b430 -00001 MTA (Threadpool Completion Port) 

We are in ntdll!LdrpDispatchUserCallTarget:

    ntdll!LdrpDispatchUserCallTarget:
00007ff9`bb3fc620 4c8b1d812d0f00 mov     r11, qword ptr [ntdll!LdrSystemDllInitBlock+0xb8 (00007ff9`bb4ef3a8)]
00007ff9`bb3fc627 4c8bd0         mov     r10, rax
00007ff9`bb3fc62a 49c1ea09       shr     r10, 9
00007ff9`bb3fc62e 4f8b1cd3       mov     r11, qword ptr [r11+r10*8]
00007ff9`bb3fc632 4c8bd0         mov     r10, rax
00007ff9`bb3fc635 49c1ea03       shr     r10, 3
00007ff9`bb3fc639 a80f           test    al, 0Fh
00007ff9`bb3fc63b 7509           jne     ntdll!LdrpDispatchUserCallTarget+0x26 (00007ff9`bb3fc646)
00007ff9`bb3fc63d 4d0fa3d3       bt      r11, r10
00007ff9`bb3fc641 730e           jae     ntdll!LdrpDispatchUserCallTarget+0x31 (00007ff9`bb3fc651)
00007ff9`bb3fc643 48ffe0         jmp     rax {00007ff9`04485ab0} ;  <===================== RIP
00007ff9`bb3fc646 490fbaf200     btr     r10, 0
00007ff9`bb3fc64b 4d0fa3d3       bt      r11, r10
00007ff9`bb3fc64f 730d           jae     ntdll!LdrpDispatchUserCallTarget+0x3e (00007ff9`bb3fc65e)
00007ff9`bb3fc651 4983ca01       or      r10, 1
00007ff9`bb3fc655 4d0fa3d3       bt      r11, r10
00007ff9`bb3fc659 7303           jae     ntdll!LdrpDispatchUserCallTarget+0x3e (00007ff9`bb3fc65e)
00007ff9`bb3fc65b 48ffe0         jmp     rax
00007ff9`bb3fc65e 41ba01000000   mov     r10d, 1
00007ff9`bb3fc664 e957feffff     jmp     ntdll!LdrpHandleInvalidUserCallTarget (00007ff9`bb3fc4c0)

With stack:

0:028> k
 # Child-SP          RetAddr               Call Site
00 00000040`b417f178 00007ff9`b9e7688a     ntdll!LdrpDispatchUserCallTarget+0x23
01 00000040`b417f180 00007ff9`b9e78ad3     WS2_32!NSMASTERJOB::SignalCompleted+0x282
02 00000040`b417f200 00007ff9`b9e78713     WS2_32!NSJOB::SignalCompleted+0xa3
03 00000040`b417f240 00007ff9`b9e78559     WS2_32!NSSUBJOB::SignalCompletedQDNS+0x143
04 00000040`b417f330 00007ff9`b9e77e1f     WS2_32!NSSUBJOB::SignalCompletedQDNS4F+0xf9
05 00000040`b417f380 00007ff9`b9e7a78c     WS2_32!NSSUBJOB::SignalCompletedQDNS4F+0x3f
06 00000040`b417f3c0 00007ff9`b9e7a6e5     WS2_32!WSALookupServiceNextAsyncWCallback+0x2c
07 00000040`b417f400 00007ff9`b9e79f49     WS2_32!NSQUERY::LookupServiceNextAsyncCoalesce+0x141
08 00000040`b417f480 00007ff9`bb3e0bf9     WS2_32!NSQUERY::LookupServiceNextAsyncCallback+0x229
09 00000040`b417f590 00007ff9`bb3c315a     ntdll!TppSimplepExecuteCallback+0x99
0a 00000040`b417f5e0 00007ff9`b9447034     ntdll!TppWorkerThread+0x68a
0b 00000040`b417f8e0 00007ff9`bb3c2651     KERNEL32!BaseThreadInitThunk+0x14
0c 00000040`b417f910 00000000`00000000     ntdll!RtlUserThreadStart+0x21

After executing jumps we are in coreclr.dll:

00007ff9`04488c90 55                   push    rbp
00007ff9`04488c91 57                   push    rdi
00007ff9`04488c92 56                   push    rsi
00007ff9`04488c93 4883ec30             sub     rsp, 30h
00007ff9`04488c97 488d6c2440           lea     rbp, [rsp+40h]
00007ff9`04488c9c 894d10               mov     dword ptr [rbp+10h], ecx
00007ff9`04488c9f 895518               mov     dword ptr [rbp+18h], edx
00007ff9`04488ca2 4c894520             mov     qword ptr [rbp+20h], r8
00007ff9`04488ca6 b986005002           mov     ecx, 2500086h
00007ff9`04488cab 488d5510             lea     rdx, [rbp+10h]
00007ff9`04488caf ff1523d7945f         call    qword ptr [coreclr!hlpDynamicFuncTable+0xe8 (00007ff9`63dd63d8)]  ; <================================= RPI
00007ff9`04488cb5 8b4d10               mov     ecx, dword ptr [rbp+10h]
00007ff9`04488cb8 8b5518               mov     edx, dword ptr [rbp+18h]
00007ff9`04488cbb 4c8b4520             mov     r8, qword ptr [rbp+20h]
00007ff9`04488cbf 8bf1                 mov     esi, ecx
00007ff9`04488cc1 498bf8               mov     rdi, r8
00007ff9`04488cc4 488d4de0             lea     rcx, [rbp-20h]
00007ff9`04488cc8 48baf0415804f97f0000 mov     rdx, 7FF9045841F0h
00007ff9`04488cd2 4533c0               xor     r8d, r8d
00007ff9`04488cd5 e8f6ca605f           call    coreclr!JIT_ReversePInvokeEnterTrackTransitions (00007ff9`63a957d0)
00007ff9`04488cda 8bce                 mov     ecx, esi
00007ff9`04488cdc 488bd7               mov     rdx, rdi
00007ff9`04488cdf e8d4cdffff           call    00007ff9`04485ab8
00007ff9`04488ce4 488d4de0             lea     rcx, [rbp-20h]
00007ff9`04488ce8 e8f3e9605f           call    coreclr!JIT_ReversePInvokeExitTrackTransitions (00007ff9`63a976e0)
00007ff9`04488ced b986005002           mov     ecx, 2500086h
00007ff9`04488cf2 488d5510             lea     rdx, [rbp+10h]
00007ff9`04488cf6 ff15e4d6945f         call    qword ptr [coreclr!hlpDynamicFuncTable+0xf0 (00007ff9`63dd63e0)]
00007ff9`04488cfc 90                   nop     
00007ff9`04488cfd 4883c430             add     rsp, 30h
00007ff9`04488d01 5e                   pop     rsi
00007ff9`04488d02 5f                   pop     rdi
00007ff9`04488d03 5d                   pop     rbp
00007ff9`04488d04 c3                   ret     

The call qword ptr [coreclr!hlpDynamicFuncTable+0xe8] to our profiler code:

00007df4`71270000 50                   push    rax
00007df4`71270001 51                   push    rcx
00007df4`71270002 52                   push    rdx
00007df4`71270003 4150                 push    r8
00007df4`71270005 4151                 push    r9
00007df4`71270007 4152                 push    r10
00007df4`71270009 4153                 push    r11
00007df4`7127000b 4883ec30             sub     rsp, 30h
00007df4`7127000f 660f7f442420         movdqa  xmmword ptr [rsp+20h], xmm0
00007df4`71270015 488bd1               mov     rdx, rcx
00007df4`71270018 48b980fe39e410020000 mov     rcx, 210E439FE80h
00007df4`71270022 48b810474b71f97f0000 mov     rax, offset JetBrains_Profiler_Core!jbprof::performance_tracing_profiler::Rdtsc_fast_enter (00007ff9`714b4710)
00007df4`7127002c ffd0                 call    rax
00007df4`7127002e 660f6f442420         movdqa  xmm0, xmmword ptr [rsp+20h]
00007df4`71270034 4883c430             add     rsp, 30h
00007df4`71270038 415b                 pop     r11
00007df4`7127003a 415a                 pop     r10
00007df4`7127003c 4159                 pop     r9
00007df4`7127003e 4158                 pop     r8
00007df4`71270040 5a                   pop     rdx
00007df4`71270041 59                   pop     rcx
00007df4`71270042 58                   pop     rax
00007df4`71270043 c3                   ret 

Diagnosis: No any "unmanaged to manage transition" / "thread creating" calls was made before the profiler get the "Enter" callback.

We have 100% repro in attachment WinFormsApp1.zip (run application under profiler with ELT callbacks, press the button till crash, at least 2 times)
TTD recording: https://drive.google.com/file/d/1MoxiP4Y28N7vHiR_33_3HWS5daGuMhy4/view?usp=sharing (276Mb, SHA-256: fa7b670bcf279f79c0759450138bcd08eee831041d625452641f17fccf519b24)

P.S. Connected issue in our tracker https://youtrack.jetbrains.com/issue/DTRC-28226

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions