Description
Bug report
Required Info:
- Operating System:
- Ubuntu 24.04 (Docker)
- Installation type:
- Binaries
- Version or commit hash:
- 28.3.1
- DDS implementation:
- Fast-DDS, Cyclone DDS both
- Client library (if applicable):
- rclcpp
Expected behavior
`spin_all` and `spin_some` process pending work on the first call, as they did previously.
Actual behavior
`spin_some` and `spin_all` now require a warmup spin that does nothing before a later spin processes useful work.
Additional information
Nav2's CI recently broke on a unit test for a behavior tree node, which I started looking into yesterday. I was chatting with @clalancette about it in a comment thread and we realized that this is an rclcpp regression.
goal_updater_node spins an internal executor to process a subscription callback and get some data. This always worked until recently: https://github.com/ros-navigation/navigation2/blob/main/nav2_behavior_tree/plugins/decorator/goal_updater_node.cpp#L60. While debugging, I found that the node wasn't receiving the data the unit test was sending.
Looking at the git blame for Executor::spin_some_impl(std::chrono::nanoseconds max_duration, bool exhaustive), I see there has been a lot of activity there in the last few months, including a PR in May to fix another regression, which makes me think another regression may be lurking in that corner.
Test A:
callback_group_executor_.spin_all(std::chrono::milliseconds(200));
Raising the timeout from 50 ms to 200 ms still doesn't work.
Test B:
callback_group_executor_.spin_all(std::chrono::milliseconds(50));
callback_group_executor_.spin_all(std::chrono::milliseconds(1));
This works: adding a second, short spin_all call after the first one is enough for the data to be processed.
Test C:
callback_group_executor_.spin_some(std::chrono::milliseconds(50));
Using spin_some instead of spin_all also fails.
Test D:
Let's try ticking the node multiple times while keeping the single spin_all(50ms) call, which causes the spin to be called once per tick:
tree_->rootNode()->executeTick();
tree_->rootNode()->executeTick();
tree_->rootNode()->executeTick();
This works
These experiments tell me that something is odd with spinning behavior right now: it takes multiple calls (whether back to back or as part of the broader application loop) to process data, and the first call has no effect. I see this in Nav2's CI, which runs Cyclone DDS, and I can replicate it locally in a Docker container with Fast-DDS, so I think it's a ROS 2 side issue, not a DDS one.
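Tests B and D suggest a possible stopgap for affected tests while the regression is investigated: spin repeatedly until the expected data shows up, rather than relying on a single call. The following is a minimal sketch of that idea, not a fix and not part of rclcpp. `spin_all_until` is a hypothetical helper name, and it assumes only that the executor exposes `spin_all(std::chrono::nanoseconds)`, as `rclcpp::Executor` does.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical helper (not part of rclcpp): calls spin_all() with a short
// budget until done() reports that the expected data arrived, or until the
// overall timeout expires. Returns true if done() became true.
// ExecutorT is assumed to provide spin_all(std::chrono::nanoseconds).
template<typename ExecutorT>
bool spin_all_until(
  ExecutorT & executor,
  const std::function<bool()> & done,
  std::chrono::nanoseconds total_timeout,
  std::chrono::nanoseconds per_spin_budget = std::chrono::milliseconds(1))
{
  // A non-positive budget is rejected here to keep each spin bounded.
  assert(per_spin_budget > std::chrono::nanoseconds::zero());

  const auto deadline = std::chrono::steady_clock::now() + total_timeout;
  do {
    // At least one spin always happens, even with a zero total_timeout.
    executor.spin_all(per_spin_budget);
    if (done()) {
      return true;
    }
  } while (std::chrono::steady_clock::now() < deadline);
  return false;
}
```

In the goal_updater_node case this might be called as `spin_all_until(callback_group_executor_, [&]() { return goal_received; }, std::chrono::milliseconds(50));`, where `goal_received` is an assumed flag set by the subscription callback. If `spin_all` returns immediately when there is no work, the loop busy-waits until the deadline, which is tolerable in a unit test but not something to ship in a node.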