Skip to content

set signal_status to 0 in rclcpp::init#152

Closed
jacquelinekay wants to merge 1 commit intomasterfrom
reset_signal
Closed

set signal_status to 0 in rclcpp::init#152
jacquelinekay wants to merge 1 commit intomasterfrom
reset_signal

Conversation

@jacquelinekay
Copy link
Copy Markdown
Contributor

shutdown should be the inverse function of init. This PR sets state that is set in shutdown back to what its starting value should be in init, to make the functions more symmetric.

Ideally, calling init, shutdown, init should have the same effect as calling init once.

@jacquelinekay jacquelinekay added the in progress Actively being worked on (Kanban column) label Nov 12, 2015
@jacquelinekay jacquelinekay self-assigned this Nov 12, 2015
@jacquelinekay jacquelinekay added in review Waiting for review (Kanban column) and removed in progress Actively being worked on (Kanban column) labels Nov 12, 2015
@dirk-thomas
Copy link
Copy Markdown
Member

Node::init() calls rmw_init. Therefore a symmetric function must do something similar.

Also the signal handling is not restored.

And there is a temporal gap after calling shutdown() and being able to call init() again (since it takes a while until it is actually shutdown).

@wjwwood
Copy link
Copy Markdown
Member

wjwwood commented Nov 12, 2015

(since it takes a while until it is actually shutdown)

What are you referring to?

@dirk-thomas
Copy link
Copy Markdown
Member

Afaik inoking shutdown is not instant. It might take a while until all running threads notice that and exit.

@wjwwood
Copy link
Copy Markdown
Member

wjwwood commented Nov 12, 2015

You mean in the middleware or within our code (like the multi threaded executor)?

@wjwwood
Copy link
Copy Markdown
Member

wjwwood commented Nov 12, 2015

Yeah but in most cases I think it would be ok, since init is not going to be called again until they are done (until what ever spin function called is done).

A bigger concern for me is what an rmw_shutdown should do. It can't destroy subscriptions and publishers and participants, etc. yet since, as you pointed out, things are still in progress. We probably need a finalize or something like that in addition to shutdown. Where shutdown notifies, but until the blocking finalize call is called and returns the system is still going. I dunno it all part of a large lifecycle which we don't have a clear definition of yet. This is also related to something I'm dealing with in rcl which is how to handle the case where a subscription handle is trying to be used, but the corresponding node has been destroyed.

@jacquelinekay
Copy link
Copy Markdown
Contributor Author

Right now, rmw_init just checks if the DomainParticipantFactory instance exists. It does not have any side effects. rmw_create_node and rmw_destroy_node allocate and deallocate the DomainParticipant and rclcpp::Node manages its own lifecycle by calling rmw_destroy_node in a custom deleter. Similarly, publishers and subscribers create and destroy their rmw representations in the destructor of the C++ object.

https://github.com/ros2/rmw_connext/blob/master/rmw_connext_shared_cpp/src/shared_functions.cpp#L119
https://github.com/ros2/rmw_opensplice/blob/master/rmw_opensplice_cpp/src/rmw_init.cpp#L26
https://github.com/ros2/rclcpp/blob/master/rclcpp/src/rclcpp/node.cpp#L108

Do we intend for rmw_init to ever change shared state? I don't see the need for an rmw_shutdown right now, and I think planning for it is a premature optimization.

@dirk-thomas
Copy link
Copy Markdown
Member

If our API does not contain a rmw_shutdown function that prohibits any rmw implementation to perform anything stateful stuff in rmw_init.

We could also argue to remove rmw_init since currently we don't do anything in this function.

But having only one side seems wrong to me.

@wjwwood
Copy link
Copy Markdown
Member

wjwwood commented Nov 12, 2015

It's not clear to me that rmw_init will need to change shared state, but since we want others to be able to write their own middleware we can't assume they don't need a symmetric call or that rmw_init can be called repeatedly. We can choose to make that the case, but we have to make that clear in the documentation and then all future middlewares would have to deal with it.

@jacquelinekay
Copy link
Copy Markdown
Contributor Author

OK, this is what I'm hearing:

We should add an rmw_shutdown function to the rmw API, even if it is just a stub for now.

We should decide if calling init, shutdown, init should even be permitted, or if init and shutdown are meant to be called once each in one process.

If multiple calls to init/shutdown is permitted, we should add rmw_shutdown to rclcpp::shutdown.

This might be more controversial: if multiple calls to init/shutdown are NOT permitted, there should be a function (rclcpp::interrupt?) that sends the signal to interrupt spin without tearing down the global rclcpp state or the rmw global state.

I'm not sure what the path I should take in this PR is. I just want a way of interrupting spin in a gtest with multiple cases. Right now calling shutdown ruins the global state in rclcpp such that I can't call spin again in the same process.

@dirk-thomas
Copy link
Copy Markdown
Member

Imo multiple calls should be permitted. But only alternating - never two consecutive calls of the same function.

One difficult question is: are the calls permitted immediately after each other or might there be some "delay" before they can be called again?

@jacquelinekay
Copy link
Copy Markdown
Contributor Author

Ideally, shutdown could block until all threads created by rclcpp or the middleware during execution have joined. That way init could be called any time after shutdown finished. It would be complicated to implement this with the codebase in its current state, however. Within rclcpp::spin, you risk running an entire cycle of spin before shutdown takes effect.

Middleware threads are more problematic because we're not sure what threads are spawned in the middleware and when they join. We could make it a requirement of rmw_shutdown that it must block until all the threads in the middleware are joined. But this doesn't fit well with our current implementation, where each rclcpp entity manage the lifecycle of its counterpart rmw object, which often has one or more threads associated with it. So the middleware threads do not get joined until the rclcpp entities are destructed.

(Actually, I'm pretty sure some Opensplice threads don't get joined until the process finishes--like the garbage collector--or that might be a different process.)

We could make it a requirement that the next rclcpp::init must happen in an independent scope of the previous call to rclcpp::init/rclcpp::shutdown. This seems prone to user error, however.

So, I don't know how to answer that questions besides "it seems to work without requiring a wait in our current implementation".

@wjwwood
Copy link
Copy Markdown
Member

wjwwood commented Nov 12, 2015

@jacquelinekay and I talked off line and I suggested that we just add a cancel() function to the executors which will interrupt spin without shutting down rclcpp. That way she can move forward by calling rclcpp::init() once and interrupt spin for her tests using executor.cancel().

Separately we need to come up with a completely strategy for the library lifecycle and implement that.

@jacquelinekay jacquelinekay removed the in review Waiting for review (Kanban column) label Nov 12, 2015
@jacquelinekay
Copy link
Copy Markdown
Contributor Author

Should I just close this PR then? The discussion seems like it will be useful later. Maybe start a new issue about init/shutdown and invalidating rclcpp entities and link here?

@jacquelinekay jacquelinekay deleted the reset_signal branch November 13, 2015 01:19
nnmm pushed a commit to ApexAI/rclcpp that referenced this pull request Jul 9, 2022
* remove fprintf, use logging macros

* consistent includes
mauropasse added a commit to mauropasse/rclcpp that referenced this pull request Aug 21, 2024
Co-authored-by: Mauro Passerino <mpasserino@irobot.com>
mauropasse pushed a commit to mauropasse/rclcpp that referenced this pull request Aug 23, 2024
mauropasse pushed a commit to mauropasse/rclcpp that referenced this pull request Aug 23, 2024
mauropasse added a commit to mauropasse/rclcpp that referenced this pull request Aug 27, 2024
* Always publish inter-process on TRANSIENT_LOCAL pubs (ros2#152)

* Backport upstream Action fixes

 - ros2#2531

* Fix actions feedback race

 - ros2#2451

* Fix data race in Actions: Part 3

---------

Co-authored-by: Mauro Passerino <mpasserino@irobot.com>
mauropasse added a commit to mauropasse/rclcpp that referenced this pull request Aug 27, 2024
* Always publish inter-process on TRANSIENT_LOCAL pubs (ros2#152)

* Fix actions feedback race

 - ros2#2451

* Fix data race in Actions: Part 3

---------

Co-authored-by: Mauro Passerino <mpasserino@irobot.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants