Skip to content

Lifecycle manager deadlock when shutting down during bringup #5437

@cboostjvisser

Description

@cboostjvisser

Bug report

Required Info:

  • Operating System:
    • osrf/ros:jazzy-desktop
  • Computer:
    • PC
  • ROS2 Version:
    • Jazzy Docker
  • Version or commit hash:
  • DDS implementation:
    • Fast DDS

Steps to reproduce issue

When launching an application that is using the lifecycle manager with multiple lifecycle nodes, a deadlock can be triggered.
If during the bringup of the lifecycle nodes, a shutdown is initiated(CTRL+C), the bringup and shutdown sequences of the lifecycle manager will run at the same time.

What then happens is a race condition that could result in the bringup logic deadlocking on spin_until_future_complete of one of the nodes, that will never be completed because shutdown has already been initiated.
Right here:

if (callback_group_executor_.spin_until_future_complete(future_result) !=

Expected behavior

Expected behavior is that the bringup sequence will be aborted, and all the nodes shutdown.

On the jazzy branch I have added a stop() function in service_client.hpp. This stop function cancels any running spin* functions on service_client's internal executor.

  void stop()
  {
    if (client_)
    {
      callback_group_executor_.cancel();
    }
  }

If we then call this stop function in the destructor of LifecycleServiceClient (lifecycle_service_client.hpp), the deadlock is avoided.

  ~LifecycleServiceClient()
  {
    change_state_.stop();
  };

I have not tested this on Kilted or Main. I see there have been changes to the lifecycle manager after Jazzy, I think the deadlock is still in there.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions