grpc_completion_queue_next sometimes hangs when cq is shut down #988

@jtattermusch

Context:
In C#, I have a test, Grpc.Core.Tests.ClientServerTest.UnknownMethodHandler, that runs a client and a server. The client calls an unimplemented RPC and the server responds with an "Unimplemented" status. So far so good.

My test passes, but when shutting down the thread pool (4 threads calling next() on the same cq), one of the threads hangs about 10% of the time. I found that all threads but one receive the shutdown event as expected, but for one thread grpc_completion_queue_next never returns.
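To make the expected behavior concrete, here is a minimal sketch of the shutdown contract described above: several threads block in a next()-style call on one shared queue, and a single shutdown must release all of them. This is a toy model built on a pthread condition variable, not gRPC's implementation (the real code path goes through epoll, as the stack trace shows). The names `toy_cq`, `toy_cq_next`, `toy_cq_shutdown`, and `run_pool` are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Toy stand-in for a completion queue. NOT the gRPC implementation. */
typedef struct {
  pthread_mutex_t mu;
  pthread_cond_t cv;
  bool shutdown;     /* set once by toy_cq_shutdown */
  int shutdown_seen; /* number of workers that observed the shutdown */
} toy_cq;

/* Blocks until the queue is shut down, like a next() call on a queue
   that has no other events pending. */
static void toy_cq_next(toy_cq *cq) {
  pthread_mutex_lock(&cq->mu);
  while (!cq->shutdown) {
    pthread_cond_wait(&cq->cv, &cq->mu);
  }
  cq->shutdown_seen++;
  pthread_mutex_unlock(&cq->mu);
}

/* Marks the queue as shut down and wakes every blocked worker. */
static void toy_cq_shutdown(toy_cq *cq) {
  pthread_mutex_lock(&cq->mu);
  cq->shutdown = true;
  /* Broadcast rather than signal: all waiters must wake, not just one. */
  pthread_cond_broadcast(&cq->cv);
  pthread_mutex_unlock(&cq->mu);
}

static void *worker(void *arg) {
  toy_cq_next((toy_cq *)arg);
  return NULL;
}

/* Starts n workers (1 <= n <= 16) on one queue, shuts the queue down,
   joins all workers, and returns how many of them saw the shutdown. */
int run_pool(int n) {
  toy_cq cq;
  pthread_t threads[16];
  int rc, i, seen;

  assert(n > 0 && n <= 16);
  rc = pthread_mutex_init(&cq.mu, NULL);
  assert(rc == 0);
  rc = pthread_cond_init(&cq.cv, NULL);
  assert(rc == 0);
  cq.shutdown = false;
  cq.shutdown_seen = 0;

  for (i = 0; i < n; i++) {
    rc = pthread_create(&threads[i], NULL, worker, &cq);
    assert(rc == 0);
  }
  toy_cq_shutdown(&cq);
  for (i = 0; i < n; i++) {
    rc = pthread_join(threads[i], NULL);
    assert(rc == 0);
  }

  seen = cq.shutdown_seen;
  pthread_cond_destroy(&cq.cv);
  pthread_mutex_destroy(&cq.mu);
  (void)rc; /* silence unused-variable warnings when NDEBUG is defined */
  return seen;
}
```

In this model, `run_pool(4)` returns only after all four workers have joined, so a worker that never woke up would show up as a hang in `pthread_join`, which is the same symptom as the one reported here.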

Here's my stack trace for the thread that hangs:

(gdb) where
#0 0x00007f59598b2b13 in epoll_wait () at ../sysdeps/unix/syscall-template.S:81
#1 0x00007f59554b7cf6 in multipoll_with_epoll_pollset_maybe_work (pollset=0x7f594400f760, deadline=..., now=..., allow_synchronous_callback=1) at src/core/iomgr/pollset_multipoller_with_epoll.c:112
#2 0x00007f59554b8562 in grpc_pollset_work (pollset=0x7f594400f760, deadline=...) at src/core/iomgr/pollset_posix.c:176
#3 0x00007f59554c84dd in grpc_completion_queue_next (cc=0x7f594400f750, deadline=...) at src/core/surface/completion_queue.c:312
#4 0x00007f5955adb94c in grpcsharp_completion_queue_next_with_callback (cq=0x7f594400f750) at src/csharp/ext/grpc_csharp_ext.c:230
#5 0x0000000041cd00a7 in ?? ()
#6 0x00007f592c002540 in ?? ()
#7 0x00007f59582644d0 in ?? ()
#8 0x00007f594c0025c0 in ?? ()
#9 0x00007f593fdfdc00 in ?? ()
#10 0x00007f593fdfdaf0 in ?? ()
#11 0x0000000000000000 in ?? ()

(gdb) info threads
Id Target Id Frame
12 Thread 0x7f5956b1f700 (LWP 15791) "cli" sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:85
11 Thread 0x7f59564ff700 (LWP 15792) "cli" 0x00007f5959b8c6dd in accept () at ../sysdeps/unix/syscall-template.S:81
10 Thread 0x7f59562fe700 (LWP 15793) "cli" pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
9 Thread 0x7f59560fd700 (LWP 15794) "cli" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
8 Thread 0x7f5955efc700 (LWP 15795) "cli" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
7 Thread 0x7f595523b700 (LWP 15796) "cli" 0x00007f59598b2b13 in epoll_wait () at ../sysdeps/unix/syscall-template.S:81
6 Thread 0x7f5954a3a700 (LWP 15797) "cli" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
* 5 Thread 0x7f593fdfe700 (LWP 15800) "cli" 0x00007f59598b2b13 in epoll_wait () at ../sysdeps/unix/syscall-template.S:81
4 Thread 0x7f593f9fc700 (LWP 15802) "cli" 0x00007f5959b8cb9d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
3 Thread 0x7f593f9bb700 (LWP 15803) "cli" pthread_cond_timedwait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
2 Thread 0x7f593efb9700 (LWP 15805) "cli" sem_timedwait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_timedwait.S:101
1 Thread 0x7f595a6ac7c0 (LWP 15790) "cli" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
