AV in Release build of C implementation of EventPipe #46158

@josalem

I started collecting end-to-end throughput measurements, but I'm getting an access violation (AV) in a Release macos-x64 build with the C implementation of EventPipe. The same test doesn't crash with the C++ implementation.

Process 69371 resuming
Process 69371 stopped
* thread #11, stop reason = EXC_BAD_ACCESS (code=EXC_I386_GPFLT)
    frame #0: 0x00000001041bb967 libcoreclr.dylib`ep_buffer_manager_write_event(_EventPipeBufferManager*, Thread*, _EventPipeSession*, _EventPipeEvent*, _EventPipeEventPayload*, unsigned char const*, unsigned char const*, Thread*, _EventPipeStackContents*) + 151
libcoreclr.dylib`ep_buffer_manager_write_event:
->  0x1041bb967 <+151>: movq   0x10(%rbx), %rdi
    0x1041bb96b <+155>: testq  %rdi, %rdi
    0x1041bb96e <+158>: jne    0x1041bba0b               ; <+315>
    0x1041bb974 <+164>: jmp    0x1041bba4e               ; <+382>
Target 0: (corescaletest) stopped.
(lldb) bt
* thread #11, stop reason = EXC_BAD_ACCESS (code=EXC_I386_GPFLT)
  * frame #0: 0x00000001041bb967 libcoreclr.dylib`ep_buffer_manager_write_event(_EventPipeBufferManager*, Thread*, _EventPipeSession*, _EventPipeEvent*, _EventPipeEventPayload*, unsigned char const*, unsigned char const*, Thread*, _EventPipeStackContents*) + 151
    frame #1: 0x00000001041c3d34 libcoreclr.dylib`ep_session_write_event(_EventPipeSession*, Thread*, _EventPipeEvent*, _EventPipeEventPayload*, unsigned char const*, unsigned char const*, Thread*, _EventPipeStackContents*) + 292
    frame #2: 0x00000001041c6b8a libcoreclr.dylib`write_event_2(Thread*, _EventPipeEvent*, _EventPipeEventPayload*, unsigned char const*, unsigned char const*, Thread*, _EventPipeStackContents*) + 234
    frame #3: 0x00000001041c0fff libcoreclr.dylib`ep_write_event_2(_EventPipeEvent*, _EventData*, unsigned int, unsigned char const*, unsigned char const*) + 191
    frame #4: 0x0000000103f3191d libcoreclr.dylib`EventPipeInternal::WriteEventData(long, _EventData*, unsigned int, _GUID const*, _GUID const*) + 61
    frame #5: 0x000000011abdea9d
    frame #6: 0x000000011abde62d
    frame #7: 0x000000011abde1da
    frame #8: 0x000000011abddc85
    frame #9: 0x000000011abddc44
    frame #10: 0x000000011a732912
    frame #11: 0x000000011a73dab1
    frame #12: 0x000000011a732a2e
    frame #13: 0x0000000104097019 libcoreclr.dylib`CallDescrWorkerInternal + 124
    frame #14: 0x0000000103efe00f libcoreclr.dylib`MethodDescCallSite::CallTargetWorker(unsigned long const*, unsigned long*, int) + 1519
    frame #15: 0x0000000103f0e984 libcoreclr.dylib`ThreadNative::KickOffThread_Worker(void*) + 404
    frame #16: 0x0000000103ec859b libcoreclr.dylib`ManagedThreadBase_DispatchOuter(ManagedThreadCallState*) + 315
    frame #17: 0x0000000103ec8b50 libcoreclr.dylib`ManagedThreadBase::KickOff(void (*)(void*), void*) + 32
    frame #18: 0x0000000103f0eaff libcoreclr.dylib`ThreadNative::KickOffThread(void*) + 191
    frame #19: 0x0000000103dbc14a libcoreclr.dylib`CorUnix::CPalThread::ThreadEntry(void*) + 426
    frame #20: 0x00007fff719bf109 libsystem_pthread.dylib`_pthread_start + 148
    frame #21: 0x00007fff719bab8b libsystem_pthread.dylib`thread_start + 15

I'll try switching to a checked build and see if I can't pin down what's happening.

For reference, the test I'm running writes events across N threads (N == number of cores on the machine, 4 in my case) without pausing in between. It's meant to saturate the buffer system and drop events. I have levers for reducing that stream to more realistic values and for simulating bursts and periodic behavior.
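
To illustrate the shape of that workload, here is a minimal sketch in Python. It is not the actual test and does not use the EventPipe API: `BoundedEventBuffer` and `run_saturation_test` are hypothetical names, and the sketch assumes no reader drains the buffer, so every write after the buffer fills is counted as a drop.

```python
import os
import threading
from collections import deque


class BoundedEventBuffer:
    """Hypothetical stand-in for an event buffer that drops writes when full."""

    def __init__(self, capacity):
        self._events = deque()
        self._capacity = capacity
        self._lock = threading.Lock()
        self.written = 0
        self.dropped = 0

    def write(self, event):
        # Count a drop instead of blocking when the buffer is full.
        with self._lock:
            if len(self._events) >= self._capacity:
                self.dropped += 1
                return False
            self._events.append(event)
            self.written += 1
            return True


def run_saturation_test(thread_count, events_per_thread, capacity):
    """Write events from thread_count threads with no pause between writes."""
    buffer = BoundedEventBuffer(capacity)
    # The barrier releases all writers together so they contend from the start.
    start = threading.Barrier(thread_count)

    def writer(thread_index):
        start.wait()
        for sequence in range(events_per_thread):
            buffer.write((thread_index, sequence))

    threads = [
        threading.Thread(target=writer, args=(index,))
        for index in range(thread_count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return buffer.written, buffer.dropped


if __name__ == "__main__":
    # Assumption: one writer thread per core, as in the description above.
    cores = os.cpu_count() or 4
    written, dropped = run_saturation_test(cores, 100_000, 1_024)
    print(f"threads={cores} written={written} dropped={dropped}")
```

Because nothing drains the buffer in this sketch, the number of accepted events equals the capacity and everything else is dropped. A more realistic variant would add a reader thread and a delay between writes, which corresponds to the levers mentioned above.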

Belongs to #46079

CC - @lateralusX
