EPOLLRDHUP unsupported (and needed for Rust's tokio async runtime) #3575
Description
From https://gitlab.torproject.org/tpo/core/arti/-/issues/1972#note_3193046, where I was debugging why `arti-bench` hangs forever after reading all of the data in a connection, seemingly missing/ignoring EOF:
The problem appears to be that in shadow, the other end of the socket closing doesn't generate EPOLLRDHUP events.
After a couple of small local modifications to simplify (in particular, batching the write of the JSON into a single 298-byte write) and enabling strace logging, here is a snippet from the strace log:
```
00:01:19.055000000 [tid 1100] sendto(10, "{\"accepted_ts\":{\"secs_since_epoch\":9"..., 298, MsgFlags(MSG_NOSIGNAL), 0x0, 0) = 298
000000079055000000 [tid 1100] clock_gettime(...) = 0
00:01:19.055000000 [tid 1100] write(1, "\x1b[2m2000-01-01T00:01:19.055000Z\x1b[0"..., 120) = 120
00:01:19.055000000 [tid 1100] close(10) = 0
00:01:19.055000000 [tid 1100] sigaltstack(0x7ffff6786818, 0x0) = 0
00:01:19.055000000 [tid 1100] munmap(0x7ffff7c48000, 12288) = <native>
00:01:19.055000000 [tid 1100] ^^^ = SyscallReg { as_i64: 0, as_u64: 0, as_ptr: 0x0 }
00:01:19.055000000 [tid 1100] rt_sigprocmask(0, 0x7ffff6787f30, 0x0, 8) = 0
00:01:19.055000000 [tid 1100] madvise(...) = <native>
00:01:19.055000000 [tid 1100] ^^^ = SyscallReg { as_i64: 0, as_u64: 0, as_ptr: 0x0 }
00:01:19.055000000 [tid 1099] epoll_wait(3, 0x555555cd67d0, 1024, -1) = 1
000000079055000000 [tid 1099] clock_gettime(...) = 0
00:01:19.055000000 [tid 1099] futex(0 (0x7ffff7c88090), 129, 1, 0x0, 0x0 <null>, 1439300144) = 1
000000079055000000 [tid 1099] clock_gettime(...) = 0
00:01:19.055000000 [tid 1099] epoll_wait(3, 0x555555cd67d0, 1024, -1) = <blocked>
00:01:19.055000000 [tid 1097] futex(0 (0x7ffff7c88090), 128, 1, 0x0, 0x2 <invalid-read>, 1439524704) = 0
00:01:19.055000000 [tid 1097] recvfrom(9, 0x555555cda410, 64, MsgFlags(0x0), 0x0, 0x0 <null>) = 64
000000079055000000 [tid 1097] clock_gettime(...) = 0
00:01:19.055000000 [tid 1097] write(1, "\x1b[2m2000-01-01T00:01:19.055000Z\x1b[0"..., 114) = 114
00:01:19.055000000 [tid 1097] recvfrom(9, 0x555555cda450, 64, MsgFlags(0x0), 0x0, 0x0 <null>) = 64
000000079055000000 [tid 1097] clock_gettime(...) = 0
00:01:19.055000000 [tid 1097] write(1, "\x1b[2m2000-01-01T00:01:19.055000Z\x1b[0"..., 115) = 115
00:01:19.055000000 [tid 1097] recvfrom(9, 0x555555cd1f70, 128, MsgFlags(0x0), 0x0, 0x0 <null>) = 128
000000079055000000 [tid 1097] clock_gettime(...) = 0
00:01:19.055000000 [tid 1097] write(1, "\x1b[2m2000-01-01T00:01:19.055000Z\x1b[0"..., 116) = 116
00:01:19.055000000 [tid 1097] recvfrom(9, 0x555555cd4290, 256, MsgFlags(0x0), 0x0, 0x0 <null>) = 42
000000079055000000 [tid 1097] clock_gettime(...) = 0
00:01:19.055000000 [tid 1097] write(1, "\x1b[2m2000-01-01T00:01:19.055000Z\x1b[0"..., 115) = 115
00:01:19.055000000 [tid 1097] futex(1 (0x7ffff7c88090), 128, 1, 0x0, 0x2 <invalid-read>, 1439524704) = <blocked>
```
Summarizing:

- tid-1100 sends the data and closes the socket. That thread then exits, but exit syscalls don't appear in shadow's strace logs. (TODO: file an issue for that)
- tid-1099, which was previously blocked on an `epoll_wait`, gets a result from that `epoll_wait`, wakes a futex, and then blocks again on `epoll_wait`.
- tid-1097, which was previously blocked on a futex, wakes up, `recvfrom`'s exactly all of the data without doing the one more `recvfrom` that would have gotten EOF, and then waits on the futex again. My best guess is that the task is "yielding" to the runtime after doing some amount of reading. (I also tried changing the reader here to do one big read, in which case it blocks on the futex after reading all the data in that one big read, again without doing another `recvfrom`.)
Neither thread 1099 nor thread 1097 runs again for the rest of the simulation.
Running `arti-bench` natively under strace, I see that its setup of the epoll descriptor includes requesting EPOLLRDHUP, which shadow doesn't support.
My theory is that if shadow correctly included EPOLLRDHUP in the results returned from the first `epoll_wait` in this trace, the runtime would know not to call `epoll_wait` again, and we wouldn't get stuck here.