rgw: workaround for deadlock in early versions of curl_multi_wait() #10467
cethikdata wants to merge 1 commit into ceph:master from cethikdata:hikdata/rgw
radosgw consumes too much CPU time while synchronizing metadata or data between multisite zones, and then stops synchronizing after a while. The reason is that the RGWHTTPManager thread doesn't read data from the pipe. The pipe fills up with zeroes, and RGWDataSyncProcessorThread blocks in its write() call.
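The stall described above can be reproduced outside of rgw. The following is a minimal sketch, not code from this PR: `tokens_until_pipe_full` is a hypothetical helper that writes 4-byte zero tokens into a pipe that nobody reads. The write end is made non-blocking only so that the demo sees EAGAIN once the pipe is full; a blocking writer would hang at that point instead.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical demo helper (not part of rgw): fills a pipe that has no reader
// with 4-byte zero tokens and returns how many tokens fit before it is full.
// Returns -1 if the pipe could not be set up.
long tokens_until_pipe_full() {
  int fds[2];
  if (pipe(fds) < 0) {
    return -1;
  }
  // Non-blocking write end: a full pipe shows up as EAGAIN instead of a hang.
  int flags = fcntl(fds[1], F_GETFL);
  if (flags < 0 || fcntl(fds[1], F_SETFL, flags | O_NONBLOCK) < 0) {
    close(fds[0]);
    close(fds[1]);
    return -1;
  }
  const uint32_t token = 0;
  long tokens = 0;
  for (;;) {
    ssize_t r = write(fds[1], &token, sizeof(token));
    if (r == static_cast<ssize_t>(sizeof(token))) {
      ++tokens;          // token queued, nobody will ever consume it
    } else if (r < 0 && errno == EINTR) {
      continue;          // interrupted by a signal, retry
    } else {
      break;             // pipe is full: a blocking write() would stall here
    }
  }
  close(fds[0]);
  close(fds[1]);
  return tokens;
}
```

The returned count depends on the platform's pipe capacity, so it is only meaningful as "finite and eventually reached".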
  uint32_t buf;
  ret = read(signal_fd, (void *)&buf, sizeof(buf));
- if (ret < 0) {
+ if (ret < 0 && ret!=EAGAIN) {
if (ret < 0 && errno != EAGAIN) {
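To spell out why the check has to use errno: read() reports failure by returning -1 and storing the cause in errno, so comparing the return value against EAGAIN never matches. A small sketch under that reading follows; `read_signal_once` and its return convention are made up for illustration and are not the function in this PR.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: one read of a 4-byte wakeup token from a non-blocking fd.
// Returns the number of bytes read (0 if the pipe is empty or at EOF),
// or -errno on a real error.
int read_signal_once(int signal_fd) {
  uint32_t buf;
  ssize_t ret = read(signal_fd, &buf, sizeof(buf));
  if (ret < 0) {
    // read() returns -1 on failure; the reason lives in errno, not in ret.
    if (errno == EAGAIN || errno == EWOULDBLOCK) {
      return 0;  // nothing pending, not an error in non-blocking mode
    }
    return -errno;
  }
  return static_cast<int>(ret);
}
```

With this convention an empty pipe yields 0, a pending token yields 4, and an invalid descriptor yields -EBADF.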
since we're in non-blocking mode, we could try to read more than 4 bytes as a potential optimization to avoid later wakeups. something like this?
std::array<char, 256> buf;
ret = read(signal_fd, buf.data(), buf.size());
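One way to take the larger-buffer idea a step further is to keep reading until the pipe is empty. The sketch below is an assumption layered on the suggestion above, not what the PR does: `drain_signal_pipe` is a hypothetical name, the loop is an addition to the single read() shown, and the descriptor is assumed to already have O_NONBLOCK set (on a blocking descriptor the loop would hang once the pipe is empty).

```cpp
#include <array>
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: discards all pending wakeup bytes on a non-blocking fd.
// Returns the total number of bytes drained, or -errno on a real error.
long drain_signal_pipe(int signal_fd) {
  std::array<char, 256> buf;
  long total = 0;
  for (;;) {
    ssize_t ret = read(signal_fd, buf.data(), buf.size());
    if (ret > 0) {
      total += ret;      // consumed up to 256 bytes, there may be more
      continue;
    }
    if (ret == 0) {
      return total;      // EOF: the write end was closed
    }
    if (errno == EINTR) {
      continue;          // interrupted by a signal, retry
    }
    if (errno == EAGAIN || errno == EWOULDBLOCK) {
      return total;      // pipe is empty
    }
    return -errno;
  }
}
```

Each read() can consume up to 64 of the 4-byte tokens, so a burst of wakeups costs a few system calls rather than one per token.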
@cethikdata could you please update your commit message to match this format?
@cethikdata thanks for looking into this! tracking down this bug in
  r=-errno;
  ldout(cct, 0) << "ERROR: fcntl() returned errno=" << r << dendl;
  return r;
}
i think that we only need the read side thread_pipe[0] to be non-blocking, can you try without the second call to fcntl()?
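For reference, setting O_NONBLOCK on the read end alone could look like the sketch below. `make_signal_pipe` is a hypothetical helper, not the code under review; it relies on the two ends of a pipe having separate file status flags, so the write end stays blocking.

```cpp
#include <cassert>
#include <cerrno>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: creates a pipe and makes only the read end (fds[0])
// non-blocking. Returns 0 on success or -errno on failure.
int make_signal_pipe(int fds[2]) {
  if (pipe(fds) < 0) {
    return -errno;
  }
  int flags = fcntl(fds[0], F_GETFL);
  if (flags < 0 || fcntl(fds[0], F_SETFL, flags | O_NONBLOCK) < 0) {
    int r = -errno;      // save errno before close() can overwrite it
    close(fds[0]);
    close(fds[1]);
    return r;
  }
  return 0;              // fds[1] keeps its default blocking behaviour
}
```

With the write end left blocking, a writer can still stall if the pipe ever fills, so this arrangement depends on the reader draining the pipe.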
@cethikdata ping
@cethikdata ping. if you're busy, i can pick this up
closing in favor of #10998. thanks for your contribution!