Skip to content

rgw: work around curl_multi_wait bug with non-blocking reads#10998

Merged
cbodley merged 4 commits intomasterfrom
wip-16695
Sep 14, 2016
Merged

rgw: work around curl_multi_wait bug with non-blocking reads#10998
cbodley merged 4 commits intomasterfrom
wip-16695

Conversation

@cbodley
Copy link
Contributor

@cbodley cbodley commented Sep 6, 2016

@cbodley
Copy link
Contributor Author

cbodley commented Sep 6, 2016

this made it through teuthology with no new failures: http://pulpito.ceph.com/cbodley-2016-09-02_15:45:38-rgw-wip-16695---basic-mira

@oritwas
Copy link
Member

oritwas commented Sep 8, 2016

this means we do a read every time do_curl_wait is called, correct?
can we detect what curl version is running and only for older versions call read without checking wait_fd.revents?

@cbodley
Copy link
Contributor Author

cbodley commented Sep 8, 2016

@oritwas yeah, that was my first inclination. i wasn't going to require this of the original contributor in #10467, but i agree that it's worth doing

Signed-off-by: Casey Bodley <cbodley@redhat.com>
@cbodley
Copy link
Contributor Author

cbodley commented Sep 9, 2016

tested and verified using unittest_http_manager and an old version of libcurl 7.29 located at /home/cbodley/local-curl/lib/libcurl.so

when run against master, the bug leads to deadlock:

$ LD_PRELOAD=/home/cbodley/local-curl/lib/libcurl.so ./unittest_http_manager
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from HTTPManager
[ RUN      ] HTTPManager.SignalThread
^C

when run against this branch, the bug is detected and the workaround avoids deadlock:

$ LD_PRELOAD=/home/cbodley/local-curl/lib/libcurl.so ./unittest_http_manager
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from HTTPManager
[ RUN      ] HTTPManager.SignalThread
2016-09-09 16:55:25.457454 7eff9c96e900  0 WARNING: detected a version of libcurl which contains a bug in curl_multi_wait(). enabling a workaround that may degrade performance slightly. please upgrade to a more recent version of libcurl.
[       OK ] HTTPManager.SignalThread (71 ms)
[----------] 1 test from HTTPManager (71 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (71 ms total)
[  PASSED  ] 1 test.

and with my system's libcurl 7.43, the workaround is not applied and the test passes normally:

$ ./unittest_http_manager
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from HTTPManager
[ RUN      ] HTTPManager.SignalThread
[       OK ] HTTPManager.SignalThread (74 ms)
[----------] 1 test from HTTPManager (74 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (74 ms total)
[  PASSED  ] 1 test.

static int clear_signal(int fd)
{
uint32_t buf;
int ret = ::read(fd, &buf, sizeof(buf));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cbodley need to cast to void * like we had originally? might be that needed for clearing a warning

@yehudasa
Copy link
Member

@cbodley see my comments, otherwise lgtm

Signed-off-by: Casey Bodley <cbodley@redhat.com>
Signed-off-by: Casey Bodley <cbodley@redhat.com>
@cbodley
Copy link
Contributor Author

cbodley commented Sep 12, 2016

@yehudasa thanks for the feedback, updated clear_signal() to preserve the (void *) cast and return 0 on EAGAIN

@cbodley cbodley merged commit c2ac199 into master Sep 14, 2016
@cbodley cbodley deleted the wip-16695 branch September 14, 2016 13:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants