Fix memory leak in Transform Listener #630

Closed
zygfrydw wants to merge 1 commit into ros2:rolling from zygfrydw:fix-tf-listener

Conversation

@zygfrydw commented Sep 26, 2023

There is a memory/message leak when using a TransformListener constructed with an internal node.

When a TransformListener is constructed in the following way:

auto tf_buffer = tf2_ros::Buffer(node->get_clock());
auto tf_listener = tf2_ros::TransformListener(tf_buffer);

a memory leak occurs, because messages published to /parameter_events are never serviced by the internal node.
We observed this when using CycloneDDS with shared memory (iceoryx), and we have created a repo to reproduce this error.
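For reference, a complete minimal program showing this construction might look like the following (a sketch assembled from the snippet above; the repo mentioned is the authoritative reproduction):

#include <memory>
#include <rclcpp/rclcpp.hpp>
#include <tf2_ros/buffer.h>
#include <tf2_ros/transform_listener.h>

int main(int argc, char ** argv)
{
  rclcpp::init(argc, argv);
  auto node = std::make_shared<rclcpp::Node>("repro_node");

  // The listener creates its own internal node plus a dedicated spin thread;
  // the leak accumulates in that internal node's /parameter_events subscription.
  auto tf_buffer = tf2_ros::Buffer(node->get_clock());
  auto tf_listener = tf2_ros::TransformListener(tf_buffer);

  rclcpp::spin(node);  // idle; watch the process's memory grow
  rclcpp::shutdown();
  return 0;
}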

The reason for this issue: when the TransformListener creates its dedicated thread, the subscriptions for the /tf and /tf_static topics are added to a dedicated callback group. When the internal node is created, it also creates a subscription on the /parameter_events topic. However, this subscription is created in the default callback group, which is never spun by the executor (the executor is explicitly told to run only the callback group with /tf and /tf_static).
The /parameter_events subscription is created with a QoS history of 1000 messages, so unserviced messages live forever in /parameter_events for each TransformListener.
In an application with default CycloneDDS this results in a small, constant memory leak per TransformListener instance; however, when messages are taken from a pool, as when using shared memory, the memory pool is exhausted quickly.
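To make the mechanics concrete, here is a self-contained sketch of the pattern (illustrative names, not the verbatim tf2_ros source):

#include <memory>
#include <rclcpp/rclcpp.hpp>
#include <tf2_msgs/msg/tf_message.hpp>

int main(int argc, char ** argv)
{
  rclcpp::init(argc, argv);
  auto node = std::make_shared<rclcpp::Node>("listener_internal_node");

  // Callback group that is NOT automatically added to executors with the node.
  auto tf_group = node->create_callback_group(
    rclcpp::CallbackGroupType::MutuallyExclusive,
    /* automatically_add_to_executor_with_node = */ false);

  rclcpp::SubscriptionOptions options;
  options.callback_group = tf_group;
  auto tf_sub = node->create_subscription<tf2_msgs::msg::TFMessage>(
    "/tf", rclcpp::QoS(100),
    [](tf2_msgs::msg::TFMessage::ConstSharedPtr) { /* feed the buffer */ },
    options);

  // Only the tf group is handed to the executor. The node's default group,
  // which holds the internal /parameter_events subscription, is never spun,
  // so those messages are delivered but never taken.
  rclcpp::executors::SingleThreadedExecutor executor;
  executor.add_callback_group(tf_group, node->get_node_base_interface());
  executor.spin();

  rclcpp::shutdown();
  return 0;
}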

The /parameter_events subscription is created in rclcpp::TimeSource::NodeState, at rclcpp/src/rclcpp/time_source.cpp:270.

In my opinion this subscription on parameter events is not necessary in this internal node, but I do not see any option to disable it. Additionally, the problem will come back if someone adds other subscriptions to Node in the future; hence I prefer the current solution of adding the default callback group to the executor.

Another possible solution is:

executor_ = std::make_shared<rclcpp::executors::SingleThreadedExecutor>();
if (optional_default_node_) {
  // If the optional default node was created, add the whole node so that
  // internal subscriptions such as /parameter_events are also processed.
  executor_->add_node(node_base_interface_);
} else {
  executor_->add_callback_group(callback_group_, node_base_interface_);
}
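One caveat with the add_node variant, worth double-checking: an executor's add_node only picks up callback groups created with automatically_add_to_executor_with_node = true, so if the tf callback group was created with that flag set to false, it would still need the explicit add_callback_group call in both branches.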

@clalancette changed the base branch from ros2 to rolling on September 27, 2023 at 15:37
@clalancette (Contributor)

> In my opinion this subscription on parameter events is not necessary in this internal node, but I do not see any option to disable it.

So we do actually need to react to parameter events in TimeSource. The reason is so that we can do the right thing when the user enables or disables simulation time via the use_sim_time parameter.

However, we don't exactly need to subscribe to it anymore. Since Iron, we have the ability to get a callback after a parameter has been updated, using the post_set_parameters_callback (like in https://github.com/ros2/demos/blob/36a4c0bd6ae0531ad3096806c41efd7062a1cb31/demo_nodes_cpp/src/parameters/set_parameters_callback.cpp#L86-L106). Additionally, using the post_set_parameters_callback will be more performant than the subscription.

So my suggestion is that instead we open a PR to rclcpp, replacing https://github.com/ros2/rclcpp/blob/9284d7cefa9e37ce3b6926a44e6177d6e4a4b62a/rclcpp/src/rclcpp/time_source.cpp#L279-L288 with an add_post_set_parameters_callback. That will solve this leak.
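For illustration, the mechanism looks like this in a standalone node (a sketch of the Iron-and-newer API, not the actual TimeSource patch; the node and parameter names are made up):

#include <memory>
#include <vector>
#include <rclcpp/rclcpp.hpp>

class PostSetDemo : public rclcpp::Node
{
public:
  PostSetDemo()
  : Node("post_set_demo")
  {
    // Illustrative parameter; TimeSource would watch use_sim_time instead.
    this->declare_parameter("demo_param", "hello");
    handle_ = this->add_post_set_parameters_callback(
      [this](const std::vector<rclcpp::Parameter> & parameters) {
        // Runs in-process after a successful parameter set: there is no
        // /parameter_events subscription, so nothing queues up unserviced.
        for (const auto & p : parameters) {
          RCLCPP_INFO(this->get_logger(), "parameter '%s' was set", p.get_name().c_str());
        }
      });
  }

private:
  rclcpp::node_interfaces::PostSetParametersCallbackHandle::SharedPtr handle_;
};

int main(int argc, char ** argv)
{
  rclcpp::init(argc, argv);
  rclcpp::spin(std::make_shared<PostSetDemo>());
  rclcpp::shutdown();
  return 0;
}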

We can also consider taking this PR, for the reasons you describe, but I think that is a separate issue. @zygfrydw are you willing to open that PR to rclcpp?

@zygfrydw (Author) commented Oct 2, 2023

I am not exactly sure what replacing the parameter_subscription_ with a post_set_parameters_callback changes in terms of this pull request.

In TimeSource::NodeState::attachNode there is already an on_set_parameters_callback registered:

on_set_parameters_callback_ = node_parameters_->add_on_set_parameters_callback(
  std::bind(&TimeSource::NodeState::on_set_parameters, this, std::placeholders::_1));

alongside the subscription to parameter events:

parameter_subscription_ = rclcpp::AsyncParametersClient::on_parameter_event(
  node_topics_,
  [this](std::shared_ptr<const rcl_interfaces::msg::ParameterEvent> event) {
    if (node_base_ != nullptr) {
      this->on_parameter_event(event);
    }
  });

I do not understand why both the on_set_parameters_callback and the parameter_subscription_ are necessary, nor why changing the parameter_subscription_ to a post_set_parameters_callback would change anything.

In my opinion, it is not correct to spin only the /tf and /tf_static subscriptions of the internal Transform Listener node.
I am not aware of other node routines that require spinning, so replacing the parameter subscription with a post_set_parameters_callback may be sufficient for now. However, in my opinion it is a fragile fix, because if someone adds another subscription to Node in the future, the issue will come back.

We are currently locked to ROS Humble, so we hope you merge this pull request into ROS Humble as well.

@jayyoung commented Nov 8, 2023

Hello,
We ran into what looks like this issue and discovered this PR. During our investigation we independently discovered that all of our memory-leaking nodes were ones that instantiated a TransformListener in the way you described in your example. We're testing the fix here now, but I wanted to drop by to show the effects of the memory leak, as we did quite some testing on multiple robots in our fleet, and in simulation, to get a picture of its severity, and what triggers it.

Here is a graph of time vs. memory usage of every node in our system, gathered over 548 minutes (9.1 hours) of robot idle time. The lines with an incline are nodes with TransformListener instantiations that leaked memory. The flat lines are the rest of our system.

[results_gh: time vs. memory usage for every node in the system; the rising lines are the nodes with TransformListener instantiations]

From eyeballing these results, it looks like every TransformListener leaks roughly 20 MB of memory over that ~9 hour period, and nodes with multiple TransformListener instantiations leak harder than those with just one, or none. This fits your explanation of a never-decreasing, immortal message queue per instance of that object. In our case, this accumulation added up to an extra ~1.2 GB of memory usage by our system. That is pretty severe for those of us doing long-term robotics tasks.

Just wanted to add some experimental support for your diagnosis. We're not sure yet whether the fix here helps us, but we strongly suspect this object is the leaky one, and we will report back.

@nachovizzo (Contributor)

I also created a small repository to run experiments. I spawn a dummy node that does nothing except advertise a parameter, which I later use to send a big string payload. After putting the payload into the parameter, I started seeing memory leaks. The confusing part is that I observe leaks both with and without a TransformListener, and with two different DDS implementations.

Indeed, if a tf2_ros::TransformListener object is part of the node, the memory leak is notable.

LeakyNode - No tf stuff

class LeakyNode : public rclcpp::Node {
public:
    explicit LeakyNode(const std::string &name) : Node(name) {
        this->declare_parameter("my_parameter", "world");
    }
};

[leaky_node: memory usage plot for LeakyNode without a TransformListener]

LeakyNode - with dummy TransformListener

class LeakyNode : public rclcpp::Node {
public:
    explicit LeakyNode(const std::string &name) : Node(name) {
        this->declare_parameter("my_parameter", "world");
        tf_buffer = std::make_unique<tf2_ros::Buffer>(this->get_clock());
        tf_listener = std::make_unique<tf2_ros::TransformListener>(*tf_buffer);
    }
    std::unique_ptr<tf2_ros::TransformListener> tf_listener;
    std::unique_ptr<tf2_ros::Buffer> tf_buffer;
};

[image: memory usage plot for LeakyNode with the dummy TransformListener; the growth is noticeably steeper]
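For completeness, a minimal driver for these experiments could look like this (a sketch; the actual experiment repository may differ):

#include <memory>
#include <rclcpp/rclcpp.hpp>

int main(int argc, char ** argv)
{
  rclcpp::init(argc, argv);
  // Spin the node while idle and record its RSS externally (e.g. top, psrecord).
  rclcpp::spin(std::make_shared<LeakyNode>("leaky_node"));
  rclcpp::shutdown();
  return 0;
}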

Conclusions

I don't have any yet 😢 So far, all the experiments I have done are in some way contradictory (common in memory-leak hunting). Any leads on how to nail down this problem would be highly appreciated.

@ahcorde (Contributor) commented May 15, 2024

@zygfrydw Was this issue fixed by @nachovizzo's PR #636?

Feel free to reopen it if the problem is still there.

@ahcorde closed this on May 15, 2024
@ralwing commented Aug 4, 2025

I've just tested the current ros2:humble (ros-humble-geometry2/jammy,now 0.25.15-1jammy.20250719.035645 amd64 [installed,automatic]) by running the test in Docker, and the problem remains:

Mempool [m_chunkSize = 1064, numberOfChunks = 100, used_chunks = 100 ] has no more space left
2025-08-04 13:28:23.888 [ Error ]: MemoryManager: unable to acquire a chunk with a chunk-payload size of 152The following mempools are available:  MemPool [ ChunkSize = 1064, ChunkPayloadSize = 1024, ChunkCount = 100 ]
2025-08-04 13:28:23.889 [Warning]: ICEORYX error! MEPOO__MEMPOOL_GETCHUNK_POOL_IS_RUNNING_OUT_OF_CHUNKS
Mempool [m_chunkSize = 1064, numberOfChunks = 100, used_chunks = 100 ] has no more space left
2025-08-04 13:28:23.890 [ Error ]: MemoryManager: unable to acquire a chunk with a chunk-payload size of 152The following mempools are available:  MemPool [ ChunkSize = 1064, ChunkPayloadSize = 1024, ChunkCount = 100 ]
2025-08-04 13:28:23.890 [Warning]: ICEORYX error! MEPOO__MEMPOOL_GETCHUNK_POOL_IS_RUNNING_OUT_OF_CHUNKS
Mempool [m_chunkSize = 1064, numberOfChunks = 100, used_chunks = 100 ] has no more space left
2025-08-04 13:28:23.891 [ Error ]: MemoryManager: unable to acquire a chunk with a chunk-payload size of 152The following mempools are available:  MemPool [ ChunkSize = 1064, ChunkPayloadSize = 1024, ChunkCount = 100 ]

I also tested other ROS 2 distributions:

  • jazzy - memory leak
  • kilted - fixed
  • rolling - fixed

@clalancette @zygfrydw @ahcorde I'm kindly asking to reopen this issue.

@clalancette (Contributor)

> @clalancette @zygfrydw @ahcorde I'm kindly asking to reopen this issue.

This is an old issue that we think we've solved a long time ago. Please open a new issue with your findings and how to reproduce.

@ralwing commented Aug 6, 2025

With all respect, it doesn't make sense to open a new issue: the bug wasn't fixed, and a new report would contain the same content as this one. I've only reproduced the steps that @zygfrydw described, additionally testing them on different ROS 2 distros.
