Skip to content

Implement signal handling in Python#728

Closed
sloretz wants to merge 3 commits intomasterfrom
python_signal_handler
Closed

Implement signal handling in Python#728
sloretz wants to merge 3 commits intomasterfrom
python_signal_handler

Conversation

@sloretz
Copy link
Copy Markdown
Contributor

@sloretz sloretz commented Mar 20, 2021

Part of #665 because it removes the signal handler CPython extension

This implements signal handling in Python using the signal library rather than doing it in C. I think the benefits outweigh the drawbacks.

Benefits:

  • Less C code to maintain
  • Don't have to port the atomics usage in the sigint module to C++
  • One less compilation unit depending on rclpy_handle_...
  • Don't need to worry about signal safety in Python signal handler

Drawbacks

  • The signal handler isn't run exactly when the signal happens - The real signal handler is setting a flag that tells CPython to call the Python signal handlers ASAP
  • If pure C code takes a really really long time, the Python signal handler won't get called until the interpreter gets to run again
    • But C code taking a long time without releasing the GIL would already be a problem on its own
  • It adds an extra frame to the stack-trace

With signal handling in C

$ ros2 run examples_rclpy_minimal_publisher publisher_member_function
[INFO] [1616203329.647568651] [minimal_publisher]: Publishing: "Hello World: 0"
[INFO] [1616203330.133528959] [minimal_publisher]: Publishing: "Hello World: 1"
[INFO] [1616203330.633883396] [minimal_publisher]: Publishing: "Hello World: 2"
^CTraceback (most recent call last):
  File "/home/osrf/ws/ros2/install/examples_rclpy_minimal_publisher/lib/examples_rclpy_minimal_publisher/publisher_member_function", li
ne 11, in <module>
    load_entry_point('examples-rclpy-minimal-publisher', 'console_scripts', 'publisher_member_function')()
  File "/home/osrf/ws/ros2/build/examples_rclpy_minimal_publisher/examples_rclpy_minimal_publisher/publisher_member_function.py", line
43, in main
    rclpy.spin(minimal_publisher)
  File "/home/osrf/ws/ros2/install/rclpy/lib/python3.8/site-packages/rclpy/__init__.py", line 196, in spin
    executor.spin_once()
  File "/home/osrf/ws/ros2/install/rclpy/lib/python3.8/site-packages/rclpy/executors.py", line 693, in spin_once
    handler, entity, node = self.wait_for_ready_callbacks(timeout_sec=timeout_sec)
  File "/home/osrf/ws/ros2/install/rclpy/lib/python3.8/site-packages/rclpy/executors.py", line 679, in wait_for_ready_callbacks
    return next(self._cb_iter)
  File "/home/osrf/ws/ros2/install/rclpy/lib/python3.8/site-packages/rclpy/executors.py", line 576, in _wait_for_ready_callbacks
    _rclpy.rclpy_wait(wait_set, timeout_nsec)
KeyboardInterrupt

With signal handling in Python, the last frame in the stack trace printed is a frame inside the Python signal handler.

 ros2 run examples_rclpy_minimal_publisher publisher_member_function
[INFO] [1616203196.522705663] [minimal_publisher]: Publishing: "Hello World: 0"
[INFO] [1616203197.007797048] [minimal_publisher]: Publishing: "Hello World: 1"
[INFO] [1616203197.507855562] [minimal_publisher]: Publishing: "Hello World: 2"
^CTraceback (most recent call last):
  File "/home/osrf/ws/ros2/install/examples_rclpy_minimal_publisher/lib/examples_rclpy_minimal_publisher/publisher_member_function", li
ne 11, in <module>
    load_entry_point('examples-rclpy-minimal-publisher', 'console_scripts', 'publisher_member_function')()
  File "/home/osrf/ws/ros2/build/examples_rclpy_minimal_publisher/examples_rclpy_minimal_publisher/publisher_member_function.py", line
43, in main
    rclpy.spin(minimal_publisher)
  File "/home/osrf/ws/ros2/install/rclpy/lib/python3.8/site-packages/rclpy/__init__.py", line 196, in spin
    executor.spin_once()
  File "/home/osrf/ws/ros2/install/rclpy/lib/python3.8/site-packages/rclpy/executors.py", line 693, in spin_once
    handler, entity, node = self.wait_for_ready_callbacks(timeout_sec=timeout_sec)
  File "/home/osrf/ws/ros2/install/rclpy/lib/python3.8/site-packages/rclpy/executors.py", line 679, in wait_for_ready_callbacks
    return next(self._cb_iter)
  File "/home/osrf/ws/ros2/install/rclpy/lib/python3.8/site-packages/rclpy/executors.py", line 576, in _wait_for_ready_callbacks
    _rclpy.rclpy_wait(wait_set, timeout_nsec)
  File "/home/osrf/ws/ros2/install/rclpy/lib/python3.8/site-packages/rclpy/signals.py", line 62, in __sigint_handler
    cls.__old_handler(signum, frame)
KeyboardInterrupt

@sloretz sloretz self-assigned this Mar 20, 2021
@sloretz
Copy link
Copy Markdown
Contributor Author

sloretz commented Mar 20, 2021

CI (build: --packages-up-to rclpy test: --packages-select rclpy)

  • Linux Build Status
  • Linux-aarch64 Build Status
  • macOS Build Status
  • Windows Build Status

@sloretz sloretz mentioned this pull request Mar 20, 2021
34 tasks
Copy link
Copy Markdown
Contributor

@clalancette clalancette left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have to say that I'm nervous about this in general. I don't have anything specific to point to, but every time we've messed with the signal handling it takes quite a while to iron all of the bugs out.

That's not to say that we shouldn't do this; but I do think we should consider waiting on this one until after Galactic, which will give us a whole year to figure out where the bugs are.

Copy link
Copy Markdown
Contributor

@hidmic hidmic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a fine port, except for the fact this no longer chains signal handlers at the C level (CPython signal handling doesn't do that). If Python signal handlers are setup before any C extension with proper chaining does so, we should be OK I guess.

@sloretz
Copy link
Copy Markdown
Contributor Author

sloretz commented Mar 22, 2021

I don't have anything specific to point to, but every time we've messed with the signal handling it takes quite a while to iron all of the bugs out.
That's not to say that we shouldn't do this; but I do think we should consider waiting on this one until after Galactic, which will give us a whole year to figure out where the bugs are.

Yeah, that's fair. I can leave this PR up until Galactic. If we do that, I'll need to think of a way to be backwards compatible with the use of rclpy_handle for guard conditions in the C code as we eliminate rclpy_handle_t elsewhere.

@sloretz sloretz changed the base branch from master to post_galactic_freeze April 7, 2021 20:44
@sloretz sloretz force-pushed the python_signal_handler branch from a35877a to d940805 Compare April 9, 2021 00:28
@sloretz sloretz changed the base branch from post_galactic_freeze to master April 9, 2021 00:39
@sloretz
Copy link
Copy Markdown
Contributor Author

sloretz commented Apr 9, 2021

Re-re-targeted back to master and re-marking as draft. This should stay in rolling for a while before it's considered for backport (if at all).

@sloretz sloretz changed the title Implement signal handling in Python [Waiting until after Galactic] Implement signal handling in Python Apr 9, 2021
@ahcorde
Copy link
Copy Markdown
Contributor

ahcorde commented Apr 14, 2021

@sloretz is there any way to retarget this to post_galactic_freeze ? I converted GuardCondition to use C++ classes, but I had to cherry-pick these two commits.

@sloretz sloretz changed the title [Waiting until after Galactic] Implement signal handling in Python Implement signal handling in Python Apr 28, 2021
sloretz added 2 commits April 28, 2021 13:45
Signed-off-by: Shane Loretz <sloretz@osrfoundation.org>
Signed-off-by: Shane Loretz <sloretz@osrfoundation.org>
@sloretz sloretz force-pushed the python_signal_handler branch from 09a1c90 to e4c7776 Compare April 28, 2021 20:51
Signed-off-by: Shane Loretz <sloretz@osrfoundation.org>
@sloretz
Copy link
Copy Markdown
Contributor Author

sloretz commented Apr 28, 2021

This PR is ready for review to be merged into master/Rolling. It should not be backported.

CI (build: --packages-up-to rclpy test: --packages-select rclpy)

  • Linux Build Status
  • Linux-aarch64 Build Status
  • macOS Build Status
  • Windows Build Status

@sloretz
Copy link
Copy Markdown
Contributor Author

sloretz commented Apr 28, 2021

@ros-pull-request-builder retest this please

@clalancette
Copy link
Copy Markdown
Contributor

Can I suggest we do a full build+test before we merge this one? That will at least give us an idea if any of the signals that launch files send to processes have problems.

@sloretz
Copy link
Copy Markdown
Contributor Author

sloretz commented May 5, 2021

All downstream packages CI (build: --packages-above-and-dependencies rclpy test: --packages-above rclpy)

  • Linux Build Status
  • Linux-aarch64 Build Status
  • macOS Build Status
  • Windows Build Status

@sloretz sloretz marked this pull request as draft May 11, 2021 18:48
@sloretz
Copy link
Copy Markdown
Contributor Author

sloretz commented May 11, 2021

I converted this back to a draft because the signal handling in Python is not as straight forward as I hoped. the signal() method can only be called from the main thread, which makes it tricky to unregister our handler when the library is shut down :-/

@sloretz
Copy link
Copy Markdown
Contributor Author

sloretz commented Aug 11, 2021

I'm going to close this in favor of #814. Using Python's built-in signal library has a lot of downsides, and being only able to register and unregister a signal handler from the same thread Python was initialized in is especially hard to work around.

@sloretz sloretz closed this Aug 11, 2021
@sloretz sloretz deleted the python_signal_handler branch August 11, 2021 23:24
@sloretz sloretz mentioned this pull request Oct 14, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants