Skip to content

Keep service around until client exits#48

Merged
jacquelinekay merged 7 commits intomasterfrom
service_lifetime
May 3, 2016
Merged

Keep service around until client exits#48
jacquelinekay merged 7 commits intomasterfrom
service_lifetime

Conversation

@jacquelinekay
Copy link
Copy Markdown
Contributor

Connects to ros2/launch#27
The remote services test is flaky for Connext because the service process needs to stick around until the client exits.

@jacquelinekay jacquelinekay added the in progress Actively being worked on (Kanban column) label Apr 19, 2016
@jacquelinekay jacquelinekay self-assigned this Apr 19, 2016
@jacquelinekay jacquelinekay added in review Waiting for review (Kanban column) and removed in progress Actively being worked on (Kanban column) labels Apr 19, 2016
@jacquelinekay
Copy link
Copy Markdown
Contributor Author

the "while true" loop in the service was leading to some problems with the return code, so I've changed that to a sleep for 5 seconds, which shouldn't give an error if interrupted.

http://ci.ros2.org/job/ci_osx/979/
http://ci.ros2.org/job/ci_windows/1257

I believe the alternative is replicating the signal handling code in rclcpp in the rcl test suite.

// Our scope exits should take care of fini for everything
// stick around until we are killed by the client
// To avoid an infinite loop and signal handling, just sleep for a "long" time
std::this_thread::sleep_for(std::chrono::seconds(3));
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not spin here instead? Or sleep in a loop?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess spin is not there.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After discussing with @jacquelinekay offline, we agreed that adding a new exit handler to launch which allows for the "sigterm" return code after the server was interrupted by launch is the right way to go. Then the server can just busy-loop wait forever rather than do a single sleep here.

@wjwwood
Copy link
Copy Markdown
Member

wjwwood commented Apr 20, 2016

@jacquelinekay
Copy link
Copy Markdown
Contributor Author

This change now needs ros2/launch#27

@dirk-thomas
Copy link
Copy Markdown
Member

I think the way this uses the exit handlers doesn't work as you expect it to. If the second executable (which is the client) has a a primary exit handler its return code will determine the rc of the launch file. When the client exits it will sigint / sigkill all other processes. But the new ignore_signal_exit_handler will override the launch file rc with 0 when it is being signaled - independent of what the intended rc from the primary exit handler is.

@jacquelinekay
Copy link
Copy Markdown
Contributor Author

I see. If I only want the launch file to return 0 if the client returned zero and the service returned zero or sigint/sigkill, how would I implement this in launch? Can I access the original return code of the other process containing the primary exit handler from the process that is getting shut down? I assume context.task_state refers to the task state of the current process.

@dirk-thomas
Copy link
Copy Markdown
Member

The ignore signal handler should simply not be a primary exit handler, since you want to use it on the non-primary executable. It never needs to change the launch rc directly.

If context.task_state.returncode is one of the expected signals then just send that one to 0. Then do the default exit handling.

The primary exit handler can then still decide to use ignore_returncode or not.

@jacquelinekay
Copy link
Copy Markdown
Contributor Author

Continuing this discussion in ros2/launch#27

@jacquelinekay
Copy link
Copy Markdown
Contributor Author

I removed the extraneous sleep that @wjwwood pointed out. I don't think it's need since wait_for_service is used. From some local testing, the inter-process services test still passes. Testing on OSX since it failed in the earlier link that I gave (before I made changes to the exit handler).

http://ci.ros2.org/job/ci_osx/1004

@tfoote
Copy link
Copy Markdown
Contributor

tfoote commented May 3, 2016

@jacquelinekay to run CI for other platforms.

@jacquelinekay
Copy link
Copy Markdown
Contributor Author

@jacquelinekay
Copy link
Copy Markdown
Contributor Author

The changed test passes for all platforms. Therefore I think this is good to merge.

@wjwwood
Copy link
Copy Markdown
Member

wjwwood commented May 3, 2016

lgtm, but since @dirk-thomas was previously commenting, I'd wait for feedback from him too.

@dirk-thomas
Copy link
Copy Markdown
Member

My comment about the exit handler was addressed within the launch PR. +1

@jacquelinekay jacquelinekay merged commit 2811eb1 into master May 3, 2016
@jacquelinekay jacquelinekay deleted the service_lifetime branch May 3, 2016 22:55
@jacquelinekay jacquelinekay removed the in review Waiting for review (Kanban column) label May 3, 2016
ivanpauno pushed a commit that referenced this pull request Jan 2, 2020
changes to support rcl refactor
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants