test/rgw: fix use of poll() with timers in unittest_rgw_dmclock_scheduler #42425
Merged
tchaikov merged 2 commits into ceph:master on Jul 21, 2021
the AsyncScheduler uses an asio timer to dispatch work to its executor
with an optional delay. when no delay is requested, it waits on the
timer with an expiration time in the past (crimson::dmclock::TimeZero).

tests are failing here because poll() is returning without executing the
handlers of those expired timers.

asio implements these timers with timerfd and epoll. debugging with
strace, i see that these timers armed with timerfd_settime() are not
always immediately ready according to epoll_wait():
eventfd2(0, EFD_CLOEXEC|EFD_NONBLOCK) = 3
epoll_create1(EPOLL_CLOEXEC) = 4
timerfd_create(CLOCK_MONOTONIC, TFD_CLOEXEC) = 5
epoll_ctl(4, EPOLL_CTL_ADD, 3, {events=EPOLLIN|EPOLLERR|EPOLLET, data={u32=14164052, u64=14164052}}) = 0
epoll_ctl(4, EPOLL_CTL_ADD, 5, {events=EPOLLIN|EPOLLERR, data={u32=14164064, u64=14164064}}) = 0
timerfd_settime(5, TFD_TIMER_ABSTIME, {it_interval={tv_sec=0, tv_nsec=0}, it_value={tv_sec=0, tv_nsec=1}}, {it_interval={tv_sec=0, tv_nsec=0}, it_value={tv_sec=0, tv_nsec=0}}) = 0
epoll_wait(4, [{events=EPOLLIN, data={u32=14164052, u64=14164052}}], 128, 0) = 1
epoll_wait(4, [], 128, 0) = 0
epoll_wait(4, [], 128, 0) = 0
epoll_wait(4, [], 128, 0) = 0
epoll_wait(4, [], 128, 0) = 0
epoll_wait(4, [{events=EPOLLIN, data={u32=14164064, u64=14164064}}], 128, 0) = 1
in this example, it took 6 calls to context.poll() before the timerfd was
reported ready and the timer's handler could be executed.
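
The behavior in the trace can be explored outside of asio with a small Linux-only sketch. It issues the same sequence of syscalls shown above (an absolute timerfd expiration of 1ns, then epoll_wait() with a zero timeout) and counts how many non-blocking polls pass before the timer is reported readable. The helper name `polls_until_expired_timer_ready` and the `max_polls` cap are made up for illustration and are not part of Ceph or asio; the count it returns depends on the kernel and machine, so treat it as a rough probe rather than a faithful reproduction of the test failure.

```cpp
#include <sys/epoll.h>
#include <sys/timerfd.h>
#include <unistd.h>
#include <cerrno>

// Hypothetical helper (not from Ceph or asio): arm a timerfd with an absolute
// expiration that is already in the past, then count zero-timeout
// epoll_wait() calls until the timerfd is reported readable.
// Returns the number of polls needed, or -1 on error or if max_polls is hit.
int polls_until_expired_timer_ready(int max_polls) {
  int epfd = epoll_create1(EPOLL_CLOEXEC);
  if (epfd < 0) return -1;
  int tfd = timerfd_create(CLOCK_MONOTONIC, TFD_CLOEXEC);
  if (tfd < 0) { close(epfd); return -1; }

  epoll_event ev{};
  ev.events = EPOLLIN | EPOLLERR;
  ev.data.fd = tfd;

  // Same arming as in the strace output: absolute time, 1ns after the
  // CLOCK_MONOTONIC epoch, which is always in the past.
  itimerspec ts{};
  ts.it_value.tv_nsec = 1;

  int result = -1;
  if (epoll_ctl(epfd, EPOLL_CTL_ADD, tfd, &ev) == 0 &&
      timerfd_settime(tfd, TFD_TIMER_ABSTIME, &ts, nullptr) == 0) {
    for (int polls = 1; polls <= max_polls; ++polls) {
      epoll_event out{};
      int n = epoll_wait(epfd, &out, 1, 0);  // timeout 0: never blocks
      if (n < 0 && errno != EINTR) break;    // real error: give up
      if (n == 1) { result = polls; break; } // timer finally readable
    }
  }
  close(tfd);
  close(epfd);
  return result;
}
```

Calling `polls_until_expired_timer_ready(1000000)` returns 1 when the timer is visible on the first poll and a larger number when the kernel needs a moment to mark the timerfd readable, which is the situation the trace above captures.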
to work around this, replace calls to context.poll() with calls to
context.run_for() with a very short duration, so the context keeps
processing events for up to that duration instead of returning immediately
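
Why a bounded wait helps can be sketched at the syscall level, under the assumption that run_for() ends up blocking in epoll_wait() with a nonzero timeout where poll() passes a timeout of 0. The sketch below is a Linux-only analogy and not the change made in this PR: the helper name `expired_timer_ready_within` and its `timeout_ms` parameter are hypothetical, and the real test uses asio's io_context rather than raw file descriptors.

```cpp
#include <sys/epoll.h>
#include <sys/timerfd.h>
#include <unistd.h>
#include <cerrno>

// Hypothetical helper (not from Ceph or asio): arm an already-expired
// timerfd, then wait on it once with a bounded, nonzero timeout.
// Returns true if the timer was reported readable within timeout_ms.
bool expired_timer_ready_within(int timeout_ms) {
  int epfd = epoll_create1(EPOLL_CLOEXEC);
  if (epfd < 0) return false;
  int tfd = timerfd_create(CLOCK_MONOTONIC, TFD_CLOEXEC);
  if (tfd < 0) { close(epfd); return false; }

  epoll_event ev{};
  ev.events = EPOLLIN | EPOLLERR;
  ev.data.fd = tfd;

  // Absolute expiration 1ns after the CLOCK_MONOTONIC epoch: already past.
  itimerspec ts{};
  ts.it_value.tv_nsec = 1;

  bool ready = false;
  if (epoll_ctl(epfd, EPOLL_CTL_ADD, tfd, &ev) == 0 &&
      timerfd_settime(tfd, TFD_TIMER_ABSTIME, &ts, nullptr) == 0) {
    epoll_event out{};
    int n;
    do {
      // Blocks until the timerfd is readable or timeout_ms elapses.
      // A retry after EINTR restarts the full timeout.
      n = epoll_wait(epfd, &out, 1, timeout_ms);
    } while (n < 0 && errno == EINTR);
    ready = (n == 1);
  }
  close(tfd);
  close(epfd);
  return ready;
}
```

With a blocking wait the caller no longer has to guess how many polls are enough: `expired_timer_ready_within(1)` sleeps in the kernel until the timer becomes readable, and only gives up if the whole timeout passes first.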
Fixes: https://tracker.ceph.com/issues/42788
Signed-off-by: Casey Bodley <cbodley@redhat.com>
@cbodley I tested this on my local machine and the test passed successfully. Thanks!
tchaikov approved these changes on Jul 21, 2021