FastDDS doesn't report an error if it fails to set up reliable communication through SHM transport #3536

@ksuszka

Is there an already existing issue for this?

  • I have searched the existing issues

Expected behavior

If FastDDS is unable to set up the SHM communication channel for any reason, it should fail with a meaningful error.

Current behavior

FastDDS fails to set up the SHM communication channel but reports no error, so the process has no way of knowing that communication is not working.

Steps to reproduce

An example of this behaviour is described in #3535.

Fast DDS version/commit

ros-humble-fastrtps-cmake-module/now 2.2.0-2jammy.20230112.142430 amd64 [installed,local]
ros-humble-fastrtps/now 2.6.4-1jammy.20230117.223829 amd64 [installed,local]
ros-humble-rmw-fastrtps-cpp/now 6.2.2-1jammy.20230117.225910 amd64 [installed,local]
ros-humble-rmw-fastrtps-shared-cpp/now 6.2.2-1jammy.20230117.225455 amd64 [installed,local]
ros-humble-rosidl-typesupport-fastrtps-c/now 2.2.0-2jammy.20230112.145514 amd64 [installed,local]
ros-humble-rosidl-typesupport-fastrtps-cpp/now 2.2.0-2jammy.20230112.145146 amd64 [installed,local]

Platform/Architecture

Other. Please specify in Additional context section.

Transport layer

Default configuration, UDPv4 & SHM

Additional context

To use FastDDS reliably, a user needs to know when it fails to set up reliable communication.

The sample given in #3535 is just a simple example to demonstrate the issue.

A more realistic scenario looks like this:

  • A DevOps engineer logs into the machine under his own account and tests a dockerized deployment of a system with a few dozen nodes.
  • It works correctly, so he prepares the deployment to run under the service user account.
  • It starts correctly and reports no issues, but it doesn't work: communication between a few of the nodes fails even though the configuration is correct and it worked when he tested it earlier.

Something similar happened in our case. We lost a significant amount of time finding out that, in a large system, messages between two specific nodes were not being exchanged despite a proper configuration and no reported errors.
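
Until the library reports the failure itself, a deployment can run a small pre-flight check. The following is only a sketch under assumptions that this issue does not confirm: that the silent failure comes from shared-memory files left behind by a different user account, that the host is Linux with segments under `/dev/shm`, and that Fast DDS names its files with a `fastrtps_` prefix (with `sem.fastrtps_` for the associated locks). The function name `find_inaccessible_shm_files` is hypothetical and is not part of Fast DDS.

```python
import os
from pathlib import Path


def find_inaccessible_shm_files(shm_dir="/dev/shm",
                                prefixes=("fastrtps_", "sem.fastrtps_")):
    """Return names of matching files the current user cannot read and write.

    Both the directory and the name prefixes are assumptions about how
    Fast DDS lays out its shared-memory files on Linux.
    """
    blocked = []
    directory = Path(shm_dir)
    if not directory.is_dir():
        # Nothing to inspect, for example on a non-Linux host.
        return blocked
    for entry in sorted(directory.iterdir()):
        if not entry.name.startswith(tuple(prefixes)):
            continue
        # os.access checks permissions for the real user ID of this process.
        if not os.access(entry, os.R_OK | os.W_OK):
            blocked.append(entry.name)
    return blocked
```

Calling `find_inaccessible_shm_files()` under the service account before starting the nodes, and logging any names it returns, would at least make a permissions mismatch visible. It does not detect other reasons why the SHM transport might fail.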

XML configuration file

No response

Relevant log output

No response

Network traffic capture

No response
