set signal_status to 0 in rclcpp::init by jacquelinekay · Pull Request #152 · ros2/rclcpp

jacquelinekay · 2015-11-12T18:21:14Z

shutdown should be the inverse function of init. This PR sets state that is set in shutdown back to what its starting value should be in init, to make the functions more symmetric.

Ideally, calling init, shutdown, init should have the same effect as calling init once.

dirk-thomas · 2015-11-12T18:36:34Z

Node::init() calls rmw_init. Therefore a symmetric function must do something similar.

Also the signal handling is not restored.

And there is a temporal gap after calling shutdown() and being able to call init() again (since it takes a while until it is actually shutdown).

wjwwood · 2015-11-12T19:27:11Z

(since it takes a while until it is actually shutdown)

What are you referring to?

dirk-thomas · 2015-11-12T19:58:32Z

Afaik inoking shutdown is not instant. It might take a while until all running threads notice that and exit.

wjwwood · 2015-11-12T20:27:57Z

You mean in the middleware or within our code (like the multi threaded executor)?

dirk-thomas · 2015-11-12T20:34:31Z

Any code calling shutdown() is in the middle of doing something, e.g.:

wjwwood · 2015-11-12T20:42:31Z

Yeah but in most cases I think it would be ok, since init is not going to be called again until they are done (until what ever spin function called is done).

A bigger concern for me is what an rmw_shutdown should do. It can't destroy subscriptions and publishers and participants, etc. yet since, as you pointed out, things are still in progress. We probably need a finalize or something like that in addition to shutdown. Where shutdown notifies, but until the blocking finalize call is called and returns the system is still going. I dunno it all part of a large lifecycle which we don't have a clear definition of yet. This is also related to something I'm dealing with in rcl which is how to handle the case where a subscription handle is trying to be used, but the corresponding node has been destroyed.

jacquelinekay · 2015-11-12T21:24:49Z

Right now, rmw_init just checks if the DomainParticipantFactory instance exists. It does not have any side effects. rmw_create_node and rmw_destroy_node allocate and deallocate the DomainParticipant and rclcpp::Node manages its own lifecycle by calling rmw_destroy_node in a custom deleter. Similarly, publishers and subscribers create and destroy their rmw representations in the destructor of the C++ object.

https://github.com/ros2/rmw_connext/blob/master/rmw_connext_shared_cpp/src/shared_functions.cpp#L119
https://github.com/ros2/rmw_opensplice/blob/master/rmw_opensplice_cpp/src/rmw_init.cpp#L26
https://github.com/ros2/rclcpp/blob/master/rclcpp/src/rclcpp/node.cpp#L108

Do we intend for rmw_init to ever change shared state? I don't see the need for an rmw_shutdown right now, and I think planning for it is a premature optimization.

dirk-thomas · 2015-11-12T21:28:29Z

If our API does not contain a rmw_shutdown function that prohibits any rmw implementation to perform anything stateful stuff in rmw_init.

We could also argue to remove rmw_init since currently we don't do anything in this function.

But having only one side seems wrong to me.

wjwwood · 2015-11-12T21:28:55Z

It's not clear to me that rmw_init will need to change shared state, but since we want others to be able to write their own middleware we can't assume they don't need a symmetric call or that rmw_init can be called repeatedly. We can choose to make that the case, but we have to make that clear in the documentation and then all future middlewares would have to deal with it.

jacquelinekay · 2015-11-12T22:08:03Z

OK, this is what I'm hearing:

We should add an rmw_shutdown function to the rmw API, even if it is just a stub for now.

We should decide if calling init, shutdown, init should even be permitted, or if init and shutdown are meant to be called once each in one process.

If multiple calls to init/shutdown is permitted, we should add rmw_shutdown to rclcpp::shutdown.

This might be more controversial: if multiple calls to init/shutdown are NOT permitted, there should be a function (rclcpp::interrupt?) that sends the signal to interrupt spin without tearing down the global rclcpp state or the rmw global state.

I'm not sure what the path I should take in this PR is. I just want a way of interrupting spin in a gtest with multiple cases. Right now calling shutdown ruins the global state in rclcpp such that I can't call spin again in the same process.

dirk-thomas · 2015-11-12T22:15:04Z

Imo multiple calls should be permitted. But only alternating - never two consecutive calls of the same function.

One difficult question is: are the calls permitted immediately after each other or might there be some "delay" before they can be called again?

jacquelinekay · 2015-11-12T23:11:42Z

Ideally, shutdown could block until all threads created by rclcpp or the middleware during execution have joined. That way init could be called any time after shutdown finished. It would be complicated to implement this with the codebase in its current state, however. Within rclcpp::spin, you risk running an entire cycle of spin before shutdown takes effect.

Middleware threads are more problematic because we're not sure what threads are spawned in the middleware and when they join. We could make it a requirement of rmw_shutdown that it must block until all the threads in the middleware are joined. But this doesn't fit well with our current implementation, where each rclcpp entity manage the lifecycle of its counterpart rmw object, which often has one or more threads associated with it. So the middleware threads do not get joined until the rclcpp entities are destructed.

(Actually, I'm pretty sure some Opensplice threads don't get joined until the process finishes--like the garbage collector--or that might be a different process.)

We could make it a requirement that the next rclcpp::init must happen in an independent scope of the previous call to rclcpp::init/rclcpp::shutdown. This seems prone to user error, however.

So, I don't know how to answer that questions besides "it seems to work without requiring a wait in our current implementation".

wjwwood · 2015-11-12T23:30:54Z

@jacquelinekay and I talked off line and I suggested that we just add a cancel() function to the executors which will interrupt spin without shutting down rclcpp. That way she can move forward by calling rclcpp::init() once and interrupt spin for her tests using executor.cancel().

Separately we need to come up with a completely strategy for the library lifecycle and implement that.

jacquelinekay · 2015-11-12T23:33:06Z

Should I just close this PR then? The discussion seems like it will be useful later. Maybe start a new issue about init/shutdown and invalidating rclcpp entities and link here?

* remove fprintf, use logging macros * consistent includes

Co-authored-by: Mauro Passerino <mpasserino@irobot.com>

* Always publish inter-process on TRANSIENT_LOCAL pubs (ros2#152) * Backport upstream Action fixes - ros2#2531 * Fix actions feedback race - ros2#2451 * Fix data race in Actions: Part 3 --------- Co-authored-by: Mauro Passerino <mpasserino@irobot.com>

* Always publish inter-process on TRANSIENT_LOCAL pubs (ros2#152) * Fix actions feedback race - ros2#2451 * Fix data race in Actions: Part 3 --------- Co-authored-by: Mauro Passerino <mpasserino@irobot.com>

set signal_status to 0 in rclcpp::init

2998422

jacquelinekay added the in progress Actively being worked on (Kanban column) label Nov 12, 2015

jacquelinekay mentioned this pull request Nov 12, 2015

Test for multithreaded execution ros2/system_tests#72

Merged

jacquelinekay self-assigned this Nov 12, 2015

jacquelinekay added in review Waiting for review (Kanban column) and removed in progress Actively being worked on (Kanban column) labels Nov 12, 2015

jacquelinekay removed the in review Waiting for review (Kanban column) label Nov 12, 2015

jacquelinekay closed this Nov 13, 2015

jacquelinekay mentioned this pull request Nov 13, 2015

rclcpp lifecycle: refine design of init, shutdown #154

Closed

jacquelinekay deleted the reset_signal branch November 13, 2015 01:19

nnmm pushed a commit to ApexAI/rclcpp that referenced this pull request Jul 9, 2022

remove fprintf, use logging macros (ros2#152)

5cdef52

* remove fprintf, use logging macros * consistent includes

mauropasse added a commit to mauropasse/rclcpp that referenced this pull request Aug 21, 2024

Always publish inter-process on TRANSIENT_LOCAL publishers (ros2#152)

c2acb19

Co-authored-by: Mauro Passerino <mpasserino@irobot.com>

mauropasse pushed a commit to mauropasse/rclcpp that referenced this pull request Aug 23, 2024

Always publish inter-process on TRANSIENT_LOCAL pubs (ros2#152)

9435bee

mauropasse pushed a commit to mauropasse/rclcpp that referenced this pull request Aug 23, 2024

Always publish inter-process on TRANSIENT_LOCAL pubs (ros2#152)

3b8e7a2

Conversation

jacquelinekay commented Nov 12, 2015

Uh oh!

dirk-thomas commented Nov 12, 2015

Uh oh!

wjwwood commented Nov 12, 2015

Uh oh!

dirk-thomas commented Nov 12, 2015

Uh oh!

wjwwood commented Nov 12, 2015

Uh oh!

dirk-thomas commented Nov 12, 2015

Uh oh!

wjwwood commented Nov 12, 2015

Uh oh!

jacquelinekay commented Nov 12, 2015

Uh oh!

dirk-thomas commented Nov 12, 2015

Uh oh!

wjwwood commented Nov 12, 2015

Uh oh!

jacquelinekay commented Nov 12, 2015

Uh oh!

dirk-thomas commented Nov 12, 2015

Uh oh!

jacquelinekay commented Nov 12, 2015

Uh oh!

wjwwood commented Nov 12, 2015

Uh oh!

jacquelinekay commented Nov 12, 2015

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants