
Race condition in ev_epollex_linux.cc for client connections #19161

@philsc

Description

What version of gRPC and what language are you using?

v1.20.1

What operating system (Linux, Windows,...) and version?

Linux Debian 8 and Linux Debian 9

What runtime / compiler are you using (e.g. python version or version of gcc)

clang 3.7

What did you do?

I have a client that attempts to connect to a server. If the connection is unsuccessful, the client tries again 2 seconds later.

What did you expect to see?

I expected the client to connect to the server on the first try.

What did you see instead?

Occasionally the client times out on the connection attempt. Sometimes multiple times in a row.
On the server side it looks like this:

D0525 16:30:14.790699091 21074 tcp_posix.cc:1258] cannot set inq fd=154 errno=92
D0525 16:30:16.791713096 21074 tcp_posix.cc:1258] cannot set inq fd=157 errno=92
D0525 16:30:18.791578080 21074 tcp_posix.cc:1258] cannot set inq fd=158 errno=92
D0525 16:30:20.791485232 21074 tcp_posix.cc:1258] cannot set inq fd=159 errno=92
D0525 16:30:22.791244970 21074 tcp_posix.cc:1258] cannot set inq fd=160 errno=92

Looking at tcpdump on the server side, I see that for every "cannot set inq" above:

  • the TCP handshake succeeds
  • roughly 2 seconds after the handshake, the client sends a FIN,ACK followed by a RST,ACK
  • shortly after that (within a couple of milliseconds), the client establishes another connection
  • the cycle then possibly repeats

Looking at strace, it looks to me like the client calls connect() and then subsequently adds the socket to an edge-triggered epoll set.

I suspect this is racy: if the server establishes the connection quickly enough, the client appears to lose the writability event.

As a test, I added the following patch; it causes the client to fail to connect indefinitely:

diff --git a/third_party/grpc/src/core/lib/iomgr/ev_epollex_linux.cc b/third_party/grpc/src/core/lib/iomgr/ev_epollex_linux.cc
index 01be46c..1a758d3 100644
--- a/third_party/grpc/src/core/lib/iomgr/ev_epollex_linux.cc
+++ b/third_party/grpc/src/core/lib/iomgr/ev_epollex_linux.cc
@@ -635,6 +635,7 @@ static grpc_error* pollable_add_fd(pollable* p, grpc_fd* fd) {
   ev_fd.data.ptr = reinterpret_cast<void*>(reinterpret_cast<intptr_t>(fd) |
                                            (fd->track_err ? 2 : 0));
   GRPC_STATS_INC_SYSCALL_EPOLL_CTL();
+  sleep(1);
   if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd->fd, &ev_fd) != 0) {
     switch (errno) {
       case EEXIST:

Alternatively, when I ran the following inside the bazel sandbox, the connections succeeded 100% of the time (without the patch above):

# https://stackoverflow.com/questions/40196730/simulate-network-latency-on-specific-port-using-tc
/sbin/tc qdisc add dev lo root handle 1: prio priomap 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
/sbin/tc qdisc add dev lo parent 1:2 handle 20: netem delay 150ms
/sbin/tc filter add dev lo parent 1:0 protocol ip u32 match ip sport 51053 0xffff flowid 1:2

Port 51053 is the port I happen to be using for this experiment.

Anything else we should know about your project / environment?

Problem was exposed in a LAN environment.
It can be reproduced in a sandboxed bazel test.

I will update the ticket when I have a standalone test I can post.
