test/rgw: use io_context::run() to drain the task queue #42237
tchaikov wants to merge 1 commit into ceph:master
cc @t-msn |
|
@tchaikov Thanks for proceeding with the fix. Actually, I saw other subtests fail on my machine too.
Updating the other poll() calls as follows solves all the problems:
diff --git a/src/test/rgw/test_rgw_dmclock_scheduler.cc b/src/test/rgw/test_rgw_dmclock_scheduler.cc
index d13c4fca69e..ac4546f7a39 100644
--- a/src/test/rgw/test_rgw_dmclock_scheduler.cc
+++ b/src/test/rgw/test_rgw_dmclock_scheduler.cc
@@ -105,7 +105,7 @@ TEST(Queue, RateLimit)
EXPECT_EQ(1u, counters(client_id::admin)->get(queue_counters::l_qlen));
EXPECT_EQ(1u, counters(client_id::auth)->get(queue_counters::l_qlen));
- context.poll();
+ EXPECT_GT(context.run(), 0);
EXPECT_TRUE(context.stopped());
ASSERT_TRUE(ec1);
@@ -163,7 +163,7 @@ TEST(Queue, AsyncRequest)
EXPECT_EQ(1u, counters(client_id::admin)->get(queue_counters::l_qlen));
EXPECT_EQ(1u, counters(client_id::auth)->get(queue_counters::l_qlen));
- context.poll();
+ EXPECT_GT(context.run(), 0);
EXPECT_TRUE(context.stopped());
ASSERT_TRUE(ec1);
@@ -217,7 +217,7 @@ TEST(Queue, Cancel)
EXPECT_FALSE(ec1);
EXPECT_FALSE(ec2);
- context.poll();
+ EXPECT_GT(context.run(), 0);
EXPECT_TRUE(context.stopped());
ASSERT_TRUE(ec1);
@@ -265,7 +265,7 @@ TEST(Queue, CancelClient)
EXPECT_FALSE(ec1);
EXPECT_FALSE(ec2);
- context.poll();
+ EXPECT_GT(context.run(), 0);
EXPECT_TRUE(context.stopped());
ASSERT_TRUE(ec1);
@@ -315,7 +315,7 @@ TEST(Queue, CancelOnDestructor)
EXPECT_FALSE(ec1);
EXPECT_FALSE(ec2);
- context.poll();
+ EXPECT_GT(context.run(), 0);
EXPECT_TRUE(context.stopped());
ASSERT_TRUE(ec1);
@@ -376,13 +376,13 @@ TEST(Queue, CrossExecutorRequest)
EXPECT_FALSE(ec1);
EXPECT_FALSE(ec2);
- queue_context.poll();
+ EXPECT_GT(queue_context.run(), 0);
EXPECT_TRUE(queue_context.stopped());
EXPECT_FALSE(ec1); // no callbacks until callback executor runs
EXPECT_FALSE(ec2);
- callback_context.poll();
+ EXPECT_GT(callback_context.run(), 0);
EXPECT_TRUE(callback_context.stopped());
ASSERT_TRUE(ec1);
@@ -421,7 +421,7 @@ TEST(Queue, SpawnAsyncRequest)
EXPECT_EQ(PhaseType::priority, p2);
});
- context.poll();
+ EXPECT_GT(context.run(), 0);
EXPECT_TRUE(context.stopped());
}
Could you update the patch? Thanks. |
per https://www.boost.org/doc/libs/1_76_0/doc/html/boost_asio/reference/io_context/run/overload1.html,

> The run() function blocks until all work has finished and there are no more handlers to be dispatched

while `poll()` does not ensure that the handlers are all dispatched,

> Run the io_context object's event processing loop to execute ready handlers.

see https://www.boost.org/doc/libs/1_76_0/doc/html/boost_asio/reference/io_context/poll/overload1.html

there is a chance that some requests are not scheduled when `io_context::poll()` gets called, so a safer change would be to call `io_context::run()` to ensure that all the handlers are processed.

Fixes: https://tracker.ceph.com/issues/42788
Signed-off-by: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com>
Signed-off-by: Kefu Chai <kchai@redhat.com>
|
@t-msn thank you! i also added your Signed-off-by, hope it's fine by you, as i think you are indeed the author of this changeset. i just channeled your suggestions to github as a pull request. =) |
|
hi @tchaikov, everything you said about from https://tracker.ceph.com/issues/42788, i see "on aarch64 centos 7".. is this only happening on arch? |
|
@cbodley no, actually it happens in our "make check" runs on both pacific and master, and we only run this check on ubuntu focal + amd64 nowadays. |
|
ok. i'll try running the test with |
@cbodley and @t-msn i am closing this PR. as per Casey's comment, this change does not address the issue. on the contrary, it practically defeats some of the tests by waiting on handlers which are supposed to be ready immediately. |
per https://www.boost.org/doc/libs/1_76_0/doc/html/boost_asio/reference/io_context/run/overload1.html,

> The run() function blocks until all work has finished and there are no more handlers to be dispatched

while `poll()` does not ensure that the handlers are all dispatched,

> Run the io_context object's event processing loop to execute ready handlers.

see https://www.boost.org/doc/libs/1_76_0/doc/html/boost_asio/reference/io_context/poll/overload1.html

there is a chance that some requests are not scheduled when `io_context::poll()` gets called, so a safer change would be to call `io_context::run()` to ensure that all the handlers are processed.

Fixes: https://tracker.ceph.com/issues/42788
Signed-off-by: Kefu Chai <kchai@redhat.com>