Fixed test_events_executors in zenoh (#2643)
Signed-off-by: Alejandro Hernández Cordero <ahcorde@gmail.com>
  endif()

- ament_add_gtest(test_events_executor executors/test_events_executor.cpp TIMEOUT 5)
+ ament_add_gtest(test_events_executor executors/test_events_executor.cpp TIMEOUT 60)
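For context, the `TIMEOUT` argument of `ament_add_gtest` applies to the whole test binary, not to each test case: it is forwarded down to CTest as the test's `TIMEOUT` property (in seconds), so all eleven cases in test_events_executor.cpp share the single limit. A sketch of what the new line amounts to:

```cmake
# TIMEOUT covers the entire gtest binary, not each test case individually.
ament_add_gtest(test_events_executor executors/test_events_executor.cpp TIMEOUT 60)
# Roughly equivalent, in plain CTest terms, to:
#   set_tests_properties(test_events_executor PROPERTIES TIMEOUT 60)
```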
I am not sure about this... each test in executors/test_events_executor.cpp is a basic pub/sub or client/server exercise with the EventsExecutor. I would not expect any of them to take more than 5 seconds, maybe 10; if one takes longer than that, I would suspect something wrong in the libraries rather than in the test. By the way, do you happen to know which test takes a long time to run?
From my testing, it seems the cancel_while_timers_running test is timing out.
I agree with @fujitatomoya that we should find out why this times out with rmw_zenoh and try to address it.
105: [==========] Running 11 tests from 1 test suite.
105: [----------] Global test environment set-up.
105: [----------] 11 tests from TestEventsExecutor
105: [ RUN ] TestEventsExecutor.run_pub_sub
105: [INFO] [1728076414.221154335] [rmw_zenoh_cpp]: Successfully connected to a Zenoh router with id 976ea020597cd8d73eaf8ddf5222a054.
105: [ OK ] TestEventsExecutor.run_pub_sub (694 ms)
105: [ RUN ] TestEventsExecutor.run_clients_servers
105: [INFO] [1728076414.902934390] [rmw_zenoh_cpp]: Successfully connected to a Zenoh router with id 976ea020597cd8d73eaf8ddf5222a054.
105: [ OK ] TestEventsExecutor.run_clients_servers (671 ms)
105: [ RUN ] TestEventsExecutor.spin_once_max_duration_timeout
105: [INFO] [1728076415.575311398] [rmw_zenoh_cpp]: Successfully connected to a Zenoh router with id 976ea020597cd8d73eaf8ddf5222a054.
105: [ OK ] TestEventsExecutor.spin_once_max_duration_timeout (655 ms)
105: [ RUN ] TestEventsExecutor.spin_once_max_duration_timer
105: [INFO] [1728076416.229936940] [rmw_zenoh_cpp]: Successfully connected to a Zenoh router with id 976ea020597cd8d73eaf8ddf5222a054.
105: [ OK ] TestEventsExecutor.spin_once_max_duration_timer (660 ms)
105: [ RUN ] TestEventsExecutor.spin_some_max_duration
105: [INFO] [1728076416.890282041] [rmw_zenoh_cpp]: Successfully connected to a Zenoh router with id 976ea020597cd8d73eaf8ddf5222a054.
105: [ OK ] TestEventsExecutor.spin_some_max_duration (691 ms)
105: [ RUN ] TestEventsExecutor.spin_some_zero_duration
105: [INFO] [1728076417.581796625] [rmw_zenoh_cpp]: Successfully connected to a Zenoh router with id 976ea020597cd8d73eaf8ddf5222a054.
105: [ OK ] TestEventsExecutor.spin_some_zero_duration (659 ms)
105: [ RUN ] TestEventsExecutor.spin_all_max_duration
105: [INFO] [1728076418.240770747] [rmw_zenoh_cpp]: Successfully connected to a Zenoh router with id 976ea020597cd8d73eaf8ddf5222a054.
1/1 Test #105: test_events_executor .............***Timeout 5.09 sec
So the problem here is the time it takes for rmw_zenoh_cpp to create a Context. As it stands, it takes ~600ms for rmw_zenoh_cpp to create a Context. That time is dominated by the time to create a "session" in Zenoh, which takes ~500ms. The other ~100ms is taken checking that our connection to the router is valid.
There isn't a whole lot we can do about this, at least not without some tradeoffs. Switching to Zenoh 1.0 will, I believe, allow us to configure this value down. However, if we tune it to a smaller number, a ros2 topic list when using rmw_zenoh_cpp will not always return all of the data on the first try.
Thus, my suggestion is that we do actually increase the timeout here. In my local testing, 20 seconds was enough to avoid the timeout, but in order to make this not flaky (particularly on Windows), I think setting 60 seconds here seems reasonable.
(however, I do have a comment about the use of the rmw_zenoh_cpp identifier, which I'll post separately)
std::string rmw_implementation_str = std::string(rmw_get_implementation_identifier());
if (rmw_implementation_str == "rmw_zenoh_cpp") {
  EXPECT_EQ("", pub_log_msg);
  EXPECT_EQ("", sub_log_msg);
} else {
  EXPECT_EQ(
    "New subscription discovered on topic '/test_topic', requesting incompatible QoS. "
    "No messages will be sent to it. Last incompatible policy: DURABILITY_QOS_POLICY",
    pub_log_msg);
  EXPECT_EQ(
    "New publisher discovered on topic '/test_topic', offering incompatible QoS. "
    "No messages will be sent to it. Last incompatible policy: DURABILITY_QOS_POLICY",
    sub_log_msg);
}
I'm not a big fan of hard-coding rmw_zenoh_cpp in here, particularly since it is not yet in the core.
I think we can do this better by using the QoS compatibility checker:
rclcpp/src/rclcpp/qos.cpp, line 349 (at commit 0be8aa0)
Thus, I think something like the following would work:
rclcpp::QoSCheckCompatibleResult qos_compatible = rclcpp::qos_check_compatible(
  publisher->get_actual_qos(), subscription->get_actual_qos());
if (qos_compatible.compatibility == rclcpp::QoSCompatibility::Error) {
EXPECT_EQ(
"New subscription discovered on topic '/test_topic', requesting incompatible QoS. "
"No messages will be sent to it. Last incompatible policy: DURABILITY_QOS_POLICY",
pub_log_msg);
EXPECT_EQ(
"New publisher discovered on topic '/test_topic', offering incompatible QoS. "
"No messages will be sent to it. Last incompatible policy: DURABILITY_QOS_POLICY",
sub_log_msg);
} else {
EXPECT_EQ("", pub_log_msg);
EXPECT_EQ("", sub_log_msg);
}
(This is similar to what I did in #2653.)
clalancette left a comment:
This looks good to me with green CI.
Pulls: #2643