Fixed test_events_executors in zenoh (#2643)
Signed-off-by: Alejandro Hernández Cordero <ahcorde@gmail.com>
  endif()

- ament_add_gtest(test_events_executor executors/test_events_executor.cpp TIMEOUT 5)
+ ament_add_gtest(test_events_executor executors/test_events_executor.cpp TIMEOUT 60)
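For context, the `TIMEOUT` argument of `ament_add_gtest` applies to the whole test binary, not to each test case: it is forwarded down to CTest as the test's `TIMEOUT` property (in seconds), so all eleven cases in test_events_executor.cpp share the single limit. A sketch of what the new line amounts to:

```cmake
# TIMEOUT covers the entire gtest binary, not each test case individually.
ament_add_gtest(test_events_executor executors/test_events_executor.cpp TIMEOUT 60)
# Roughly equivalent, in plain CTest terms, to:
#   set_tests_properties(test_events_executor PROPERTIES TIMEOUT 60)
```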
I am not sure about this... each test in executors/test_events_executor.cpp is a basic pub/sub or client/server exercise with the EventsExecutor. I would not expect any of them to take more than 5 seconds, maybe 10; if one takes longer than that, I would suspect something wrong in the libraries rather than in the test. By the way, do you happen to know which test takes a long time to run?
From my testing, it seems the cancel_while_timers_running test is timing out.
I agree with @fujitatomoya that we should find out why this times out with rmw_zenoh and try to address it.
105: [==========] Running 11 tests from 1 test suite.
105: [----------] Global test environment set-up.
105: [----------] 11 tests from TestEventsExecutor
105: [ RUN ] TestEventsExecutor.run_pub_sub
105: [INFO] [1728076414.221154335] [rmw_zenoh_cpp]: Successfully connected to a Zenoh router with id 976ea020597cd8d73eaf8ddf5222a054.
105: [ OK ] TestEventsExecutor.run_pub_sub (694 ms)
105: [ RUN ] TestEventsExecutor.run_clients_servers
105: [INFO] [1728076414.902934390] [rmw_zenoh_cpp]: Successfully connected to a Zenoh router with id 976ea020597cd8d73eaf8ddf5222a054.
105: [ OK ] TestEventsExecutor.run_clients_servers (671 ms)
105: [ RUN ] TestEventsExecutor.spin_once_max_duration_timeout
105: [INFO] [1728076415.575311398] [rmw_zenoh_cpp]: Successfully connected to a Zenoh router with id 976ea020597cd8d73eaf8ddf5222a054.
105: [ OK ] TestEventsExecutor.spin_once_max_duration_timeout (655 ms)
105: [ RUN ] TestEventsExecutor.spin_once_max_duration_timer
105: [INFO] [1728076416.229936940] [rmw_zenoh_cpp]: Successfully connected to a Zenoh router with id 976ea020597cd8d73eaf8ddf5222a054.
105: [ OK ] TestEventsExecutor.spin_once_max_duration_timer (660 ms)
105: [ RUN ] TestEventsExecutor.spin_some_max_duration
105: [INFO] [1728076416.890282041] [rmw_zenoh_cpp]: Successfully connected to a Zenoh router with id 976ea020597cd8d73eaf8ddf5222a054.
105: [ OK ] TestEventsExecutor.spin_some_max_duration (691 ms)
105: [ RUN ] TestEventsExecutor.spin_some_zero_duration
105: [INFO] [1728076417.581796625] [rmw_zenoh_cpp]: Successfully connected to a Zenoh router with id 976ea020597cd8d73eaf8ddf5222a054.
105: [ OK ] TestEventsExecutor.spin_some_zero_duration (659 ms)
105: [ RUN ] TestEventsExecutor.spin_all_max_duration
105: [INFO] [1728076418.240770747] [rmw_zenoh_cpp]: Successfully connected to a Zenoh router with id 976ea020597cd8d73eaf8ddf5222a054.
1/1 Test #105: test_events_executor .............***Timeout 5.09 sec
So the problem here is the time it takes for rmw_zenoh_cpp to create a Context. As it stands, it takes ~600ms for rmw_zenoh_cpp to create a Context. That time is dominated by the time to create a "session" in Zenoh, which takes ~500ms. The other ~100ms is taken checking that our connection to the router is valid.
There isn't a whole lot we can do about this, at least not without some tradeoffs. Switching to Zenoh 1.0 will, I believe, allow us to configure this value down. However, if we tune it to a smaller number, a ros2 topic list when using rmw_zenoh_cpp will not always return all of the data on the first try.
Thus, my suggestion is that we do actually increase the timeout here. In my local testing, 20 seconds was enough to avoid the timeout, but in order to make this not flaky (particularly on Windows), I think setting 60 seconds here seems reasonable.
(however, I do have a comment about the use of the rmw_zenoh_cpp identifier, which I'll post separately)
std::string rmw_implementation_str = std::string(rmw_get_implementation_identifier());
if (rmw_implementation_str == "rmw_zenoh_cpp") {
  EXPECT_EQ("", pub_log_msg);
  EXPECT_EQ("", sub_log_msg);
} else {
  EXPECT_EQ(
    "New subscription discovered on topic '/test_topic', requesting incompatible QoS. "
    "No messages will be sent to it. Last incompatible policy: DURABILITY_QOS_POLICY",
    pub_log_msg);
  EXPECT_EQ(
    "New publisher discovered on topic '/test_topic', offering incompatible QoS. "
    "No messages will be sent to it. Last incompatible policy: DURABILITY_QOS_POLICY",
    sub_log_msg);
}
I'm not a big fan of hard-coding rmw_zenoh_cpp in here, particularly since it is not yet in the core.
I think we can do this better by using the QoS compatibility checker:
rclcpp/src/rclcpp/qos.cpp, line 349 (at commit 0be8aa0)
Thus, I think something like the following would work:
rclcpp::QoSCheckCompatibleResult qos_compatible = rclcpp::qos_check_compatible(
  publisher->get_actual_qos(), subscription->get_actual_qos());
if (qos_compatible.compatibility == rclcpp::QoSCompatibility::Error) {
EXPECT_EQ(
"New subscription discovered on topic '/test_topic', requesting incompatible QoS. "
"No messages will be sent to it. Last incompatible policy: DURABILITY_QOS_POLICY",
pub_log_msg);
EXPECT_EQ(
"New publisher discovered on topic '/test_topic', offering incompatible QoS. "
"No messages will be sent to it. Last incompatible policy: DURABILITY_QOS_POLICY",
sub_log_msg);
} else {
EXPECT_EQ("", pub_log_msg);
EXPECT_EQ("", sub_log_msg);
}
(This is similar to what I did in #2653.)
clalancette left a comment:
This looks good to me with green CI.
Pulls: #2643