Skip to content
This repository was archived by the owner on May 31, 2025. It is now read-only.

Resolve memory leak.#1503

Merged
dirk-thomas merged 15 commits intoros:melodic-develfrom
tahsinkose:kinetic-devel
Apr 15, 2019
Merged

Resolve memory leak.#1503
dirk-thomas merged 15 commits intoros:melodic-develfrom
tahsinkose:kinetic-devel

Conversation

@tahsinkose
Copy link
Copy Markdown
Contributor

Delete g_rosout_appender explicitly instead of assigning it to NULL.

@mgrrx
Copy link
Copy Markdown
Contributor

mgrrx commented Sep 10, 2018

I've stumbled upon the exact same thing recently but I haven't found time to publish it. I solved it a bit different:

-  g_rosout_appender = 0;
+  if (g_rosout_appender)
+      delete g_rosout_appender;
+  g_rosout_appender = NULL;

@tahsinkose
Copy link
Copy Markdown
Contributor Author

Thank you very much for your contribution @mgrrx.

As far as I know, delete checks whether pointer is NULL or not; thus deleting a null pointer is a safe operation. Also, assigning NULL after deleting is a controversial practice in terms of correctness.

Nevertheless, I think it is about time migrating to smart pointers from raw pointers to automate memory management in such a crucial package as this one. Especially in applications that run on-board nodes with distant missions (e.g. a UAV in a SAR mission), this might result in critical failures.

@dirk-thomas dirk-thomas changed the base branch from kinetic-devel to melodic-devel September 10, 2018 18:27
@dirk-thomas
Copy link
Copy Markdown
Member

I changed the target branch to melodic-devel (the current default branch). Please rebase you patch against that one.

During the initialization the appender is also registered with rosconsole:

ros::console::register_appender(g_rosout_appender);

In order to safely delete this it should also be unregistered before.

Delete g_rosout_appender explicitly instead of assigning it to NULL.
Follow deletion with NULL assignment.
@tahsinkose
Copy link
Copy Markdown
Contributor Author

@dirk-thomas I couldn't find any function call for de-registration either in ros::console or ros::console::impl. Could you point to a specific location if there exists one?

@dirk-thomas
Copy link
Copy Markdown
Member

Could you point to a specific location if there exists one?

There might not be one yet?

@tahsinkose
Copy link
Copy Markdown
Contributor Author

OK, I will add a deregistry function to rosconsole repository soon.

@dirk-thomas
Copy link
Copy Markdown
Member

Please update the package manifest that the next upcoming patch release of rosconsole is required which will be version 1.13.8.

@tahsinkose
Copy link
Copy Markdown
Contributor Author

I have updated the version number of ros_comm. I don't know a way to specify a release as the dependency in the package.xml file. If I have made anything wrong or insufficient, please correct me.

<package>
<name>ros_comm</name>
<version>1.14.3</version>
<version>1.14.4</version>
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I should have been more specific. Please revert this change and instead modify the file clients/roscpp/package.xml and replace these two lines:

<build_depend>rosconsole</build_depend>
<run_depend>rosconsole</run_depend>

with:

<build_depend version_gte="1.13.8">rosconsole</build_depend>
<run_depend version_gte="1.13.8">rosconsole</run_depend>

This basically expresses that roscpp needs a newer version of rosconsole since it requires the new function to exist. [CI will fail until a new release of rosconsole is being made.]

dirk-thomas
dirk-thomas previously approved these changes Sep 11, 2018
Copy link
Copy Markdown
Member

@dirk-thomas dirk-thomas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - thank you for working on this.

CI will need to be rerun after rosconsole has been released.

@dirk-thomas
Copy link
Copy Markdown
Member

@ros-pull-request-builder retest this please

@dirk-thomas dirk-thomas dismissed their stale review November 13, 2018 17:43

Doesn't work

@tahsinkose
Copy link
Copy Markdown
Contributor Author

Hi, did this time out?

@dirk-thomas
Copy link
Copy Markdown
Member

@ros-pull-request-builder retest this please

@dirk-thomas
Copy link
Copy Markdown
Member

I retriggered the PR jobs since the previous builds have been removed due to the age. The latest builds show numerous test failures.

@tahsinkose
Copy link
Copy Markdown
Contributor Author

I believe that is the result of changes in the upstream. I will try to have a stable build with current upstream + my changes.

@tahsinkose
Copy link
Copy Markdown
Contributor Author

tahsinkose commented Apr 11, 2019

I have found the problem. Line at https://github.com/ros/rosconsole/blob/ba4e1ae06b5bce3039ac5d20853248b51198e431/src/rosconsole/impl/rosconsole_log4cxx.cpp#L365 has an inherent assumption on the registration of the appender. However there are tests wherein any log appender isn't registered. Please see the below discussion for this phenomena.


ros::console::deregister_appender(g_rosout_appender);
delete g_rosout_appender;
g_rosout_appender = 0;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe try wrapping these three lines in if (g_rosout_appender) { ... }.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, it won't work since g_rosout_appender is a valid pointer. However, in rosconsole side g_log4cxx_appender remains NULL.

Copy link
Copy Markdown
Contributor Author

@tahsinkose tahsinkose Apr 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm preparing to submit a PR for non-NULL check of g_log4cxx_appender in rosconsole repository. However, it is a little bit awkward while g_rosout_appender is not NULL and g_log4cxx_appender is NULL. Is it the correct behavior?

Ok, I have found where g_log4cxx_appender is zeroed. It is assigned to NULL in the shutdown() function of rosconsole::log4cxx, which is called before deregistration. What is necessary is to remove that line. It requires the rebuild of rosconsole repository.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From reading this file g_rosout_appender is only set when also being passed to register_appender (within the start() function). Therefore I think it needs a detailed stacktrace when / why that is the case.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi again Dirk,

I have inspected shutdown and deregister_appender functions in rosconsole. Following are the functions and my diagnosis:

void shutdown()
{
  if(g_log4cxx_appender)
  {
    const log4cxx::LoggerPtr& logger = log4cxx::Logger::getLogger(ROSCONSOLE_ROOT_LOGGER_NAME);
    logger->removeAppender(g_log4cxx_appender);
    g_log4cxx_appender = 0; // **Nullification**
  }
  // reset this so that the logger doesn't get crashily destroyed
  // again during global destruction.  
  //
  // See https://code.ros.org/trac/ros/ticket/3271
  //
  log4cxx::Logger::getRootLogger()->getLoggerRepository()->shutdown();
}

,

void deregister_appender(LogAppender* appender){
  if(g_log4cxx_appender->getAppender() == appender) //** Trying NULL->getAppender() --> SEGFAULT **/
  {
    const log4cxx::LoggerPtr& logger = log4cxx::Logger::getLogger(ROSCONSOLE_ROOT_LOGGER_NAME);
    logger->removeAppender(g_log4cxx_appender);
    delete g_log4cxx_appender;
    g_log4cxx_appender = 0;
  }
}

In my PR, I was trying to call deregister_appender after shutdown, which directly results in a Segmentation Fault that makes the tests fail. In fact, I have implemented this feature in here and here. Since all the tests had passed, I thought there was no problem. However shutdown() and deregister_appender() results in segmentation fault in any context when used subsequently.
Furthermore, logger->removeAppender(g_log4cxx_appender); already deletes g_log4cxx_appender. Therefore deregister_appender is problematic since the beginning.

So, we don't need to intervene in rosconsole level. What is necessary is to just delete g_rosout_appender in init.cpp and then nullify it according to conventions.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finally, the result of valgrind --leak-check=full ./devel_isolated/test_roscpp/lib/test_roscpp/test_roscpp-left_right is below:

==5754== Memcheck, a memory error detector
==5754== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==5754== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==5754== Command: ./devel_isolated/test_roscpp/lib/test_roscpp/test_roscpp-left_right
==5754== 
[ INFO] [1555272531.408215746]: ping!
[ INFO] [1555272531.886234315]: ping!
[ INFO] [1555272532.383400126]: ping!y los
[ INFO] [1555272532.883404174]: ping!
[ INFO] [1555272533.383366457]: ping!
==5754== 
==5754== HEAP SUMMARY:
==5754==     in use at exit: 3,457 bytes in 44 blocks
==5754==   total heap usage: 2,016 allocs, 1,972 frees, 296,086 bytes allocated
==5754== 
==5754== LEAK SUMMARY:
==5754==    definitely lost: 0 bytes in 0 blocks
==5754==    indirectly lost: 0 bytes in 0 blocks
==5754==      possibly lost: 0 bytes in 0 blocks
==5754==    still reachable: 3,457 bytes in 44 blocks
==5754==         suppressed: 0 bytes in 0 blocks
==5754== Reachable blocks (those to which a pointer was found) are not shown.
==5754== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==5754== 
==5754== For counts of detected and suppressed errors, rerun with: -v
==5754== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

You can check current repository for the exact same test node. There will be a number of possibly lost bytes. This version produces the above result.

@tahsinkose
Copy link
Copy Markdown
Contributor Author

tahsinkose commented Apr 14, 2019

Could you please rerun the build for bionic? Honestly, I don't understand why it fails to produce results. In my local, it passes as follows:

[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from Subscriptions
[ RUN      ] Subscriptions.pubSubNFast
received message 0
received message 1
received message 2
received message 3
received message 4
received message 5
received message 6
received message 7
received message 8
received message 9
received message 10
received message 11
received message 12
received message 13
received message 14
received message 15
received message 16
received message 17
received message 18
received message 19
received message 20
received message 21
received message 22
received message 23
received message 24
received message 25
received message 26
received message 27
received message 28
received message 29
received message 30
received message 31
received message 32
received message 33
received message 34
received message 35
received message 36
received message 37
received message 38
received message 39
received message 40
received message 41
received message 42
received message 43
received message 44
received message 45
received message 46
received message 47
received message 48
received message 49
received message 50
received message 51
received message 52
received message 53
received message 54
received message 55
received message 56
received message 57
received message 58
received message 59
received message 60
received message 61
received message 62
received message 63
received message 64
received message 65
received message 66
received message 67
received message 68
received message 69
received message 70
received message 71
received message 72
received message 73
received message 74
received message 75
received message 76
received message 77
received message 78
received message 79
received message 80
received message 81
received message 82
received message 83
received message 84
received message 85
received message 86
received message 87
received message 88
received message 89
received message 90
received message 91
received message 92
received message 93
received message 94
received message 95
received message 96
received message 97
received message 98
received message 99
received message 100
received message 101
success
msgs_received == 101
[       OK ] Subscriptions.pubSubNFast (4323 ms)
[----------] 1 test from Subscriptions (4323 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (4323 ms total)
[  PASSED  ] 1 test.

@dirk-thomas
Copy link
Copy Markdown
Member

@ros-pull-request-builder retest this please

@dirk-thomas
Copy link
Copy Markdown
Member

Thanks for the patch.

@dirk-thomas dirk-thomas merged commit 9e49232 into ros:melodic-devel Apr 15, 2019
@tahsinkose
Copy link
Copy Markdown
Contributor Author

Thanks for the support all along the way :)

@tahsinkose tahsinkose deleted the kinetic-devel branch April 15, 2019 19:02
tahsinkose added a commit to tahsinkose/ros_comm that referenced this pull request Apr 15, 2019
* Resolve memory leak.

Delete g_rosout_appender explicitly instead of assigning it to NULL.
Follow deletion with NULL assignment.

* Deregister g_rosout_appender

* revert unrelated whitespace change

* Update init.cpp

* Increment version number.

* space in comments

* Merge

* Revert "Increment version number."

This reverts commit 795c8fd.

* Add newer rosconsole dependencies

* Update rosconsole dependencies in package.xml

* Sync with upstream completely

* Add changes

* Remove deregister function, since it is already done in shutdown().

* Remove unnecessary .catkin_workspace file.
dirk-thomas pushed a commit that referenced this pull request Aug 4, 2020
* Resolve memory leak.

Delete g_rosout_appender explicitly instead of assigning it to NULL.
Follow deletion with NULL assignment.

* Deregister g_rosout_appender

* revert unrelated whitespace change

* Update init.cpp

* Increment version number.

* space in comments

* Merge

* Revert "Increment version number."

This reverts commit 795c8fd.

* Add newer rosconsole dependencies

* Update rosconsole dependencies in package.xml

* Sync with upstream completely

* Add changes

* Remove deregister function, since it is already done in shutdown().

* Remove unnecessary .catkin_workspace file.
dirk-thomas pushed a commit that referenced this pull request Aug 4, 2020
* Resolve memory leak.

Delete g_rosout_appender explicitly instead of assigning it to NULL.
Follow deletion with NULL assignment.

* Deregister g_rosout_appender

* revert unrelated whitespace change

* Update init.cpp

* Increment version number.

* space in comments

* Merge

* Revert "Increment version number."

This reverts commit 795c8fd.

* Add newer rosconsole dependencies

* Update rosconsole dependencies in package.xml

* Sync with upstream completely

* Add changes

* Remove deregister function, since it is already done in shutdown().

* Remove unnecessary .catkin_workspace file.
josephduchesne pushed a commit to josephduchesne/ros_comm that referenced this pull request Jan 24, 2024
* Resolve memory leak.

Delete g_rosout_appender explicitly instead of assigning it to NULL.
Follow deletion with NULL assignment.

* Deregister g_rosout_appender

* revert unrelated whitespace change

* Update init.cpp

* Increment version number.

* space in comments

* Merge

* Revert "Increment version number."

This reverts commit 795c8fd.

* Add newer rosconsole dependencies

* Update rosconsole dependencies in package.xml

* Sync with upstream completely

* Add changes

* Remove deregister function, since it is already done in shutdown().

* Remove unnecessary .catkin_workspace file.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants