Skip to content

rgw: workaround for deadlock in early versions of curl_multi_wait()#10467

Closed
cethikdata wants to merge 1 commit intoceph:masterfrom
cethikdata:hikdata/rgw
Closed

rgw: workaround for deadlock in early versions of curl_multi_wait()#10467
cethikdata wants to merge 1 commit intoceph:masterfrom
cethikdata:hikdata/rgw

Conversation

@cethikdata
Copy link
Contributor

radosgw Consumes too much CPU time to synchronize metadata or data between multisite and then stop synchronizing after a while.The reason is that RGWHTTPManager thread doesn't read data from pipe. This makes the pipe full of zeroes and RGWDataSyncProcessorThread suspend on write() call.

radosgw Consumes too much CPU time to synchronize metadata or data between multisite and then stop synchronizing after a while.The reason is that RGWHTTPManager thread doesn't read data from pipe. This makes the pipe full of zeroes and RGWDataSyncProcessorThread suspend on write() call.
uint32_t buf;
ret = read(signal_fd, (void *)&buf, sizeof(buf));
if (ret < 0) {
if (ret < 0 && ret!=EAGAIN) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (ret < 0 && errno !=EAGAIN) {

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since we're in non-blocking mode, we could try to read more than 4 bytes as a potential optimization to avoid later wakeups. something like this?

     std::array<char, 256> buf;
     ret = read(signal_fd, buf.data(), buf.size());

@cbodley cbodley self-assigned this Aug 2, 2016
@cbodley
Copy link
Contributor

cbodley commented Aug 2, 2016

@cethikdata could you please update your commit message to match this format?

rgw: <short description of change>

<long description is great as it is>

Fixes: http://tracker.ceph.com/issues/16695

Signed-off-by: Name <email>

@cbodley
Copy link
Contributor

cbodley commented Aug 2, 2016

@cethikdata thanks for looking into this!

tracking down this bug in curl_multi_wait() was painful. we were able to backport the relevant fixes to curl 7.29 in RHEL (https://bugzilla.redhat.com/show_bug.cgi?id=1347904), but that might take a while for centos (https://bugs.centos.org/view.php?id=11242). so i'm happy to have a workaround in the meantime, and your approach looks like a good one

r=-errno;
ldout(cct, 0) << "ERROR: fcntl() returned errno=" << r << dendl;
return r;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think that we only need the read side thread_pipe[0] to be non-blocking, can you try without the second call to fcntl()?

@cbodley cbodley changed the title fix BUG #16695 rgw: workaround for deadlock in early versions of curl_multi_wait() Aug 2, 2016
@cbodley
Copy link
Contributor

cbodley commented Aug 16, 2016

@cethikdata ping

@cbodley
Copy link
Contributor

cbodley commented Aug 25, 2016

@cethikdata ping. if you're busy, i can pick this up

@cbodley
Copy link
Contributor

cbodley commented Sep 7, 2016

closing in favor of #10998. thanks for your contribution!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants