Added static single threaded executor functionality#873
Added static single threaded executor functionality#873ishugoel20 wants to merge 16 commits intoros2:masterfrom
Conversation
|
@alsora @fujitatomoya @ivanpauno @gbiggs @Karsten1987 can you have a look at this? |
|
This current implementation seems like it would be great to operate in environments such as when using the lifecycle nodes: https://design.ros2.org/articles/node_lifecycle.html Currently the implementation requires that the all the setup is done before starting the spin which is relatively limiting. Would it be possible to modify this to react to events such as new subscriptions at which time it will rebuild the index that it is then iterating over until there's a new event triggering a reevaluation of the list to iterate. There could be flags to keep it in permissive mode versus enforcing mode that tools like the lifecycle management and realtime environments might like to setup. But it seems that this would allow us to use this as a replacement for the current executor and get the benefits of faster iteration without requiring the developer to switch to a different executor. |
|
We did already think about this. When a trigger occurs we can simply collect the handles, resize the wait-set and add the handles again. The list of executables can probably just be appended and may not require rebuilding. The only thing to consider in this case is waitables. Waitables may interfere with the order of things since they are added in a slightly different way(no longer possible to simply use indices to figure out what to execute). We could look into making a separate waitable struct, or figure out a way to retain a mapping between the wait-set and the list of executables. I think right now waitables work because they are assessed last and are always added at the end of the lists(I may be wrong here). |
can we just recreate entities and executable list via user's request for reconstruction? i believe that it is easier to implement, and besides this user event or request for reconstruction would not be happening every single time just like SingleThreadExecutor does. so i think that it is reasonable. |
| refresh_wait_set(timeout); | ||
| //Execute all the ready subscriptions | ||
| for (size_t i = 0; i < wait_set_.size_of_subscriptions; ++i) { | ||
| if (wait_set_.size_of_subscriptions && i < exec_list.number_of_subscriptions) { |
There was a problem hiding this comment.
check exec_list.number_of_subscriptions at 1st, so we do not do that every single loop of wait_set_.size_of_subscriptions ? also this applies to time, service and clients.
There was a problem hiding this comment.
check exec_list.number_of_subscriptions at 1st, so we do not do that every single loop of wait_set_.size_of_subscriptions ? also this applies to time, service and clients.
@fujitatomoya thanks for the suggestion, we would consider making this change.
We are working on this, would share when this is ready. |
ivanpauno
left a comment
There was a problem hiding this comment.
First pass, I left some minimal comments (I have to do a more detailed review of the implementation).
The idea looks great!
It would be good to add some testing.
I think that these are the current executor tests:
|
|
||
| #include "rclcpp/executors/multi_threaded_executor.hpp" | ||
| #include "rclcpp/executors/single_threaded_executor.hpp" | ||
| #include "rclcpp/executors/static_single_threaded_executor.hpp" |
There was a problem hiding this comment.
Maybe, also add an using statement similar to lines 54-55. It's not strictly needed though.
There was a problem hiding this comment.
I do not use StaticSingleThreadedExecutor in any of the below spin functions, so it is better not to include it in my view.
There was a problem hiding this comment.
The MultiThreadedExecutor is also not used, and there is an using statement for it in line 54.
I'm not sure, but probably the original idea was to re-declare all executors in rclcpp::executors namespace.
You can ignore this comment for the moment, as it's a really not important and I'm not sure what was the idea of the original author.
| for (size_t i = 0; i < wait_set_.size_of_guard_conditions; ++i) { | ||
| if (wait_set_.guard_conditions[i]) { | ||
| // rebuild the wait_set and ExecutableList struct | ||
| run_collect_entities(); |
There was a problem hiding this comment.
Why collecting entities again? Isn't this an static executor where the entities are collected once and then you just refresh the wait set?
There was a problem hiding this comment.
At first we implemented the executor to be fully static, but this was quite restrictive. Now we use the node guard_conditions to check if something was added to a node at runtime. If this happens then we rebuild. This means that if nothing is added at runtime (everything is in place before .spin() ), this executor still behaves as a fully static one. So this check and rebuilding step only adds functionality. This means the executor can now be used for a wider variety of source code and will most likely still perform better than the original executor, since only rebuilding when necessary to me seems better than always rebuilding.
There was a problem hiding this comment.
can we just recreate entities and executable list via user's request for reconstruction? i believe that it is easier to implement, and besides this user event or request for reconstruction would not be happening every single time just like SingleThreadExecutor does. so i think that it is reasonable.
Instead of rebuilding on request, we handled it like this (event trigger from node).
There was a problem hiding this comment.
i think that your idea is better than giving the responsibility to user for reconstruction.
There was a problem hiding this comment.
i believe that this reconstruction only needs to be done here once, so find out if anything is notified, and then reconstruct just once and exit from this loop. this is performance concern if some of entities are updated.
There was a problem hiding this comment.
Yes this is true, I believe we did realize this, but apparently haven't updated it here. We can add a "break;" after get_executable_list, that should get the desired result right?
There was a problem hiding this comment.
A guard condition is for "waking up" the executor, and I think that shouldn't be coupled to re-collecting entities (I'm not sure if you always want to re-collect entities when a guard condition is triggered).
IMO, it's better to provide a way of doing it under user request.
There was a problem hiding this comment.
One of the problems that we found is that this condition causes run_collect_entities to be called every time a guard condition is triggered, even if no nodes/pubs/subs/etc are added
Signed-off-by: Michel Hidalgo <michel@ekumenlabs.com> Signed-off-by: Ishu Goel <ishu.goel@nobleo.nl>
Signed-off-by: Michel Hidalgo <michel@ekumenlabs.com> Signed-off-by: Ishu Goel <ishu.goel@nobleo.nl>
* Fail on invalid and unknown ROS specific arguments. Signed-off-by: Michel Hidalgo <michel@ekumenlabs.com> * Revert changes to utilities.hpp in rclcpp Signed-off-by: Michel Hidalgo <michel@ekumenlabs.com> * Fully revert change to utilities.hpp Signed-off-by: Michel Hidalgo <michel@ekumenlabs.com> Signed-off-by: Ishu Goel <ishu.goel@nobleo.nl>
"it's" instead of its Signed-off-by: Luca Della Vedova <luca@openrobotics.org> Signed-off-by: Ishu Goel <ishu.goel@nobleo.nl>
* added throwing parameter name if parameter is not set Signed-off-by: Alex <cvbn127@gmail.com> Signed-off-by: ivanpauno <ivanpauno@ekumenlabs.com> Signed-off-by: Ishu Goel <ishu.goel@nobleo.nl>
…s2#841) Signed-off-by: Tomoya.Fujita <Tomoya.Fujita@sony.com> Signed-off-by: Ishu Goel <ishu.goel@nobleo.nl>
6705790 to
28a7cbe
Compare
Signed-off-by: Ishu Goel <ishu.goel@nobleo.nl> executor enhanced to run clients and waitable Signed-off-by: Ishu Goel <ishu.goel@nobleo.nl> tested executor Signed-off-by: Ishu Goel <ishu.goel@nobleo.nl> added semi-dynamic feature to the executor Signed-off-by: Ishu Goel <ishu.goel@nobleo.nl> Jenkins error fixes Signed-off-by: Ishu Goel <ishu.goel@nobleo.nl> Added static single threaded executor functionality Signed-off-by: Ishu Goel <ishu.goel@nobleo.nl>
Signed-off-by: Ishu Goel <ishu.goel@nobleo.nl>
043a412 to
fee1a91
Compare
rclcpp/src/rclcpp/node_options.cpp
Outdated
| try { | ||
| std::vector<std::string> unparsed_ros_args; | ||
| for (int i = 0; i < unparsed_ros_args_count; ++i) { | ||
| unparsed_ros_args.push_back(c_argv[unparsed_ros_args_indices[i]]); | ||
| } | ||
| throw exceptions::UnknownROSArgsError(std::move(unparsed_ros_args)); | ||
| } catch (...) { | ||
| this->allocator_.deallocate(unparsed_ros_args_indices, this->allocator_.state); | ||
| throw; | ||
| } |
There was a problem hiding this comment.
i am not sure this try & catch is required here, can we just do the following?
| try { | |
| std::vector<std::string> unparsed_ros_args; | |
| for (int i = 0; i < unparsed_ros_args_count; ++i) { | |
| unparsed_ros_args.push_back(c_argv[unparsed_ros_args_indices[i]]); | |
| } | |
| throw exceptions::UnknownROSArgsError(std::move(unparsed_ros_args)); | |
| } catch (...) { | |
| this->allocator_.deallocate(unparsed_ros_args_indices, this->allocator_.state); | |
| throw; | |
| } | |
| } else { | |
| std::vector<std::string> unparsed_ros_args; | |
| for (int i = 0; i < unparsed_ros_args_count; ++i) { | |
| unparsed_ros_args.push_back(c_argv[unparsed_ros_args_indices[i]]); | |
| } | |
| this->allocator_.deallocate(unparsed_ros_args_indices, this->allocator_.state); | |
| throw exceptions::UnknownROSArgsError(std::move(unparsed_ros_args)); | |
| } |
| * This executor is a static version of original single threaded executor. | ||
| * This executor makes the assumption that system does not change during runtime. | ||
| * This means all nodes, callbackgroups, timers, subscriptions etc. are created | ||
| * before .spin() is called. |
There was a problem hiding this comment.
this explanation is not consistent, compared to the current implementation. i believe that run_collect_entities will be called if anything added during runtime.
|
|
||
| void | ||
| StaticSingleThreadedExecutor::get_executable_list( | ||
| ExecutableList & executable_list, std::chrono::nanoseconds timeout) |
There was a problem hiding this comment.
second argument std::chrono::nanoseconds timeout is not really needed. we can call refresh_wait_set() instead.
| for (size_t i = 0; i < wait_set_.size_of_guard_conditions; ++i) { | ||
| if (wait_set_.guard_conditions[i]) { | ||
| // rebuild the wait_set and ExecutableList struct | ||
| run_collect_entities(); |
There was a problem hiding this comment.
i believe that this reconstruction only needs to be done here once, so find out if anything is notified, and then reconstruct just once and exit from this loop. this is performance concern if some of entities are updated.
ivanpauno
left a comment
There was a problem hiding this comment.
Sorry for the delay, I did a more detailed review this time.
My main questions are:
- Why is
ExecutableListneeded? (see my other comment below) - Why updating the wait_set when a guard condition is triggered? I don't think that both things should be coupled.
- When should an user use the existing
SingleThreadedExecutorinstead of this one?
If this one provides a way of recollecting entities, I think that both of them could be used in the same use-cases. In that case, I will simple drop the old one and replace it with this one.
As a first step, we could merge this as a different option, which will also ease backporting without breaking API. But in the long term, I don't think we should keep both.
PS: Rebasing is needed.
| }; | ||
|
|
||
| /// Created when the ret is RCL_RET_INVALID_ROS_ARGS. | ||
| class RCLInvalidROSArgsError : public RCLErrorBase, public std::runtime_error |
There was a problem hiding this comment.
The PR seems to not be correctly rebased with master.
|
|
||
| #include "rclcpp/executors/multi_threaded_executor.hpp" | ||
| #include "rclcpp/executors/single_threaded_executor.hpp" | ||
| #include "rclcpp/executors/static_single_threaded_executor.hpp" |
There was a problem hiding this comment.
The MultiThreadedExecutor is also not used, and there is an using statement for it in line 54.
I'm not sure, but probably the original idea was to re-declare all executors in rclcpp::executors namespace.
You can ignore this comment for the moment, as it's a really not important and I'm not sure what was the idea of the original author.
| * exec.spin(); | ||
| * exec.remove_node(node); | ||
| */ | ||
| class StaticSingleThreadedExecutor : public executor::Executor |
There was a problem hiding this comment.
This is inheriting from executor::Executor, but it's not using some of the "protected" method it provides.
IMO, the code has to be reorganized a little bit (I don't have any suggestion yet, I will think a little bit more about it).
| prepare_wait_set(); | ||
|
|
||
| // add handles to the wait_set and wait for work | ||
| refresh_wait_set(timeout); |
There was a problem hiding this comment.
refresh_wait_set is always called in execute_ready_executables, why it has to be called here too?
|
|
||
| void | ||
| StaticSingleThreadedExecutor::execute_ready_executables( | ||
| ExecutableList & exec_list, std::chrono::nanoseconds timeout) |
There was a problem hiding this comment.
Why do we need both the exec_list and the memory_strategy_?
I think there is some duplication there, as they are mostly collecting the same stuff.
Why not using directly get_next_ready_executable instead of re-implementing this?
My idea is that this should look quite similar to the current SingleThreadedExecutor, but avoiding calling clear_handles and collect_entities each time. I don't understand the need of another collection.
|
Hello Everyone, Ishu and I are very happy to see that so many people have responded to this PR. The intention of this version was to show the inefficiencies of the original executor. With a few changes to the code, we managed to gain quite some performance. However, the executor is now sitting on top of old data-structures and functions that do not perfectly fit the implementation. A proper refactored executor would require a bigger rewrite and re-organization of rclcpp. This is why we rather wait until we are close to the Foxy release and then rebase and help with the refactored executor. |
@MartinCornelis2 I would suggest to try to get your proposed changes reviewed and merged rather earlier than later in the release cycle. If you need certain changes or decisions to happen before that please push on those ones as well. If you wait until close to the Foxy release it becomes likely that a significant PR like this won't be accepted in time and misses the freeze deadline and therefore can only be considered for the next ROS distro. |
|
@dirk-thomas please read up on the discussion: https://discourse.ros.org/t/singlethreadedexecutor-creates-a-high-cpu-overhead-in-ros-2/10077/21 One option is to get this reviewed and merged first, but it's not clear to me that is the best way. If @MartinCornelis2 wants to do it that way, then I'm happy to try and get it merged though. Also if, after discussing the roadmap for Foxy, we will not have time for the executor improvements (very bad idea, imo) then that might be another reason to resurrect this pr early, since the aforementioned improvements won't come anyways. |
|
Any updates on this? |
I don't have any updates right now, but that is still the plan. |
…hreaded_executor.cpp w.r.t. intra-process change since last commit Signed-off-by: MartinCornelis2 <martin.cornelis@nobleo.nl>
095af9d to
2b6a0a2
Compare
|
Hi Everyone, I merged master with this PR (resolved conflicts) and made some small changes to get the executor to compile again. I haven't had time to thoroughly test the executor functionality, but some quick ros2 topic inspections appear to indicate it is in working condition again. I am OK with reviewing further additions to this PR, but like stated before don't have time to commit fully to further development on this. @alsora had some suggestions to improve the rebuilding of the wait-set and executable list further. Right now we consider too many guard-conditions instead of just the "add stuff to a node"-related ones so there is still some unnecessary rebuilding going on. Changes and small fixes like the one alsora mentioned can still quite easily be added to the executor before the Foxy release (alsora offered to contribute, I offered to review). Additionally, the static executor in this PR is implemented in such a way that it is optional next to the original executor, so (aside from a slightly bigger install) there is no harm in including it. From @iluetkeb and @wjwwood s input it looks like tackling "proper" scheduling is (currently) out of scope. In short: Let me know what you guys think. If I am needed for anything please @ me or send me an e-mail. Kind regards! |
wait_set_.size_of_* is always different than '0' if we are inside the for loop Signed-off-by: Mauro <mpasserino@irobot.com>
|
@MartinCornelis2 I support including your executor. Regarding "proper" scheduling: IMHO this is an absolute MUST for Foxy. The current situation is very problematic in that it mostly works but breaks down in important edge cases. Everybody I talked to that uses ROS 2 for serious applications is using workarounds. |
|
Of course scheduling is very important, but do you think it is realistic given the timescale to still get something in before Foxy? Is it possible to implement one of these workarounds you mention before April? Additionally, the improved scheduling will cost "some" performance. I think the discussion on performance vs determinism (is determinism the correct term here?) is difficult and (for me) still full of question marks. Of course if we only lose a small amount of performance and suddenly have (fully) predictable scheduling then by all means shoot in a PR! @iluetkeb I think your experience can maybe give some clarity on this subject? Do you know of any fixes that are, like this executor, a good short-term improvement that can be implemented somewhat quickly? Or will this just lead to a half-solution that benefits nobody and will still require a huge rewrite? |
Certainly. It's not that hard. So far, I had assumed OpenRobotics is already working on it, and since this goes across many layers, they are in the best position to do so. I personally have some IP-related restrictions that would make it harder for me to do this, but having learned of the situation, I will certainly discuss internally how we could proceed here. |
Now we check ONLY node guard_conditions_ Some possible guard conditions to be triggered HERE are: 1. Ctrl+C guard condition 2. Executor interrupt_guard_condition_ 3. Node guard_conditions_ 4. Waitables guard conditions 5. ..more The previous approach was only checking if NOT (1 & 2), so if a Waitable was triggered, it would re-collect all entities, even if no new node entity was added. This was the case of the intra process manager, who relies on waitables. Every time a subscriber got a message, all the entities were collected. Signed-off-by: Mauro <mpasserino@irobot.com>
|
Just opened a PR with some StaticSingleThreadedExecutor improvements: nobleo#5
The gain in CPU performance is about ~13% compared to the StaticSingleThreadedExecutor, and 32% compared to the SingleThreadedExecutor. |
Collect entities when a new node entity is added/removed
|
I recently started looking into spurious awakes with the ROS 2 executor. I want to move the discussion here to get some feedback from more people on the static executor issue. Results from recent experiments:
I think that this is a relevant issue that must be addressed. |
Signed-off-by: Mauro <mpasserino@irobot.com>
|
@ivanpauno @wjwwood |
|
This pull request has been mentioned on ROS Discourse. There might be relevant details there: https://discourse.ros.org/t/singlethreadedexecutor-creates-a-high-cpu-overhead-in-ros-2/10077/23 |
|
@wjwwood, Hi William I just saw your post on Discourse and your additions here. Besides the copyright lines, what is expected from our side? My two main questions are:
I don't want to accidentally make changes that impede your work. Or have you wait on an action or response from us (which we might not be aware we are supposed to be taking/giving). Clear and concise instructions (even if blunt) are much appreciated :) |
Nothing at this point. If I have questions, being available to answer them quickly will help me get it in quicker (and be much appreciated). Sometimes we ask or expect contributors to address test failures and such, but given the accelerated timeline for getting changes in, I'll be editing the pr directly if I can.
You don't have to do anything. You could, however, fix this pull request using that pull request's branch as a source. This pull request currently contains commits from master which typically indicates some sort of issue from past rebases done incorrectly, but I'm not sure in this specific case. I could have fixed this one directly, but since you had issues and pull requests on your fork I didn't want to disrupt those without asking.
I'll have to have a look at it, but in general I'm fine to consider additional changes now or later (as a follow up pull request to rclcpp).
I made the other pull request out of a similar concern, I wouldn't worry about it, I can work around anything you do.
No, I've just decided I want to get this in and then change it based on my changes to the executor I have prepared rather than the other way around. |
…ntities_collector Implement static executor entities collector
Signed-off-by: Ishu Goel <ishu.goel@nobleo.nl>
|
Closing since I merged #1034 as an alternative. Thanks everyone for their contributions and patience with us. Please feel free to open new issues to continue discussions that have been taking place here. Also, I've been working on other structural changes to the executor design in rclcpp, so these new classes may be update soon and/or subsumed into other classes. |
* Use Reader seek() method for seeking/jumping in Player Signed-off-by: Sonia Jin <diegothemuich@gmail.com> * make sure playback stays paused (does not finish even when no messages left) while in pause state Signed-off-by: Sonia Jin <diegothemuich@gmail.com>
|
This pull request has been mentioned on ROS Discourse. There might be relevant details there: |
Signed-off-by: Ishu Goel ishu.goel@nobleo.nl
This pull request aims to provide a static alternative to the existing single-threaded executor. In response to a question posted on ROS answers, we did research on the high CPU overhead of the existing implementation. The results of the research are here. Based on the results we started a discussion on ROS discourse here and we posted an issue on the rclcpp GitHub.
We contribute this static executor as an intermediate solution until another alternative is present or the existing executor is optimized. Our executor can be used by ROS users that have a static system to reduce CPU overhead.
An explanation of our static single-threaded executor is not included in this pull request but can instead be found here. This is an older Dashing version we made for testing. The README provides an overview in the form of a flowchart with a short description of functionality.