Conversation
```diff
   return NULL;
 }
-rcl_wait_set_t * wait_set = (rcl_wait_set_t *)PyCapsule_GetPointer(pywait_set, "rcl_wait_set_t");
+rcl_wait_set_t * wait_set = PyCapsule_GetPointer(pywait_set, "rcl_wait_set_t");
```
@ivanpauno why doesn't rcl_wait_set_t follow the same treatment that other rcl types' allocations do i.e. using rclpy_handle_t too?
Good question. Previously, the rcl_wait_set_t capsule didn't have a deleter, and deletion was handled manually in Python.
To avoid making this PR bigger, I didn't refactor how wait sets are used. It should be done in a follow-up PR.
Let's open an issue to track that work.
hidmic left a comment:
LGTM but we should follow up on leaking resources ASAP
sloretz left a comment:
Partial review. Will continue later.
```
The managed `rclpy_handle_t` object will not be destructed immediately if another handle
called `other.requires(this)`.
In that case, the managed object will be destroyed after all its
dependents are destroyed.
```
Can Handle call destroy() on all the downstream handles too? Without that, there must be something else that has duplicate information of which handle requires what so they can be destroyed in the right order. If I want to destroy Node, I must destroy the publishers, subscribers, etc created from it first. In practice this responsibility is owned by Node.destroy_node(), but that only works because Node also knows which handles require its handle.
That duplicate knowledge is hard to do for Context. Without that knowledge duplicated it means rclpy.shutdown() doesn't actually shutdown the library; instead everything must wait for the garbage collector.
That duplicate knowledge is hard to do for Context. Without that knowledge duplicated it means rclpy.shutdown() doesn't actually shutdown the library; instead everything must wait for the garbage collector.
Up to now, shutting down rclpy wasn't destroying all the rcl-created objects, as Context wasn't using a Handle. If we want that feature, I would rather do it in a follow-up PR, and limit this one to fixing the destruction-order problem (it can be done by keeping track of all the nodes created by a context).
There's also some inconsistency in what happens when a node is destroyed:
- Publishers, Subscriptions, Clients, Services, Timers, Guards created by a node are destructed.
- Action clients and action servers aren't.
We can also rely on things being explicitly destroyed in reversed order and fail if destruction of an object other objects still depend on is attempted.
If I may, I have to ask, where does this need of explicitly destroying objects come from? It's somewhat incompatible with Python, it creates invalid objects, it makes resource management more complex and it could be easily circumvented by calling gc.collect if need be. It almost seems to me like we're fighting the language to do things in a way it was not meant to.
If I may, I have to ask, where does this need of explicitly destroying objects come from? It's somewhat incompatible with Python, it creates invalid objects, it makes resource management more complex and it could be easily circumvented by calling gc.collect if need be. It almost seems to me like we're fighting the language to do things in a way it was not meant to.
I hear you. This is an old discussion. I've been on that side before, and I'm still mostly on that side. I agree we don't need to free memory right away. There's nothing special about our ram usage vs python's, so if the python garbage collector is good enough for some then it is good enough for all.
The reason to destroy these objects is not about their memory usage, it's about freeing system resources. Explicitly freeing system resources without waiting for the garbage collector is not fighting the language; it's pretty normal.
- https://docs.python.org/3/library/socket.html#socket.close
- https://docs.python.org/3/library/io.html#io.IOBase.close
- https://docs.python.org/3/library/subprocess.html#subprocess.Popen.kill
Maybe it could be circumvented by calling gc.collect(), but that's not how python objects usually solve this situation: https://docs.python.org/3/tutorial/inputoutput.html .
If you’re not using the with keyword, then you should call f.close() to close the file and immediately free up any system resources used by it. If you don’t explicitly close a file, Python’s garbage collector will eventually destroy the object and close the open file for you, but the file may stay open for a while. Another risk is that different Python implementations will do this clean-up at different times.
Yes rcl objects have to be destroyed in order, but that has nothing to do with why we want to destroy them. We want to destroy them because that's the only way the rcl API gives us to free system resources they use. Depending on the rmw implementation, nodes/pubs/subs/etc open sockets, they're using network bandwidth by continuing to participate in discovery, subscribers are receiving data from other publishers on the network, service servers may be accepting requests even if the intent is for another server to take their place, etc. That's where the need of explicitly destroying objects comes from.
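The explicit-release pattern in the docs linked above can be shown with a small, self-contained sketch. Plain files stand in for rcl objects here; the file path and names are purely illustrative:

```python
import os
import tempfile

# A plain file stands in for any object that holds a system resource.
path = os.path.join(tempfile.mkdtemp(), "example.txt")

# Preferred: a context manager releases the resource as soon as the block exits.
with open(path, "w") as f:
    f.write("hello")
assert f.closed  # closed deterministically, no garbage collector involved

# Without `with`, call close() explicitly; relying on the garbage collector
# leaves the resource open for an unspecified amount of time.
f = open(path)
data = f.read()
f.close()
assert f.closed
assert data == "hello"
```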
I understand the point of freeing resources. But, my point is that rclpy.shutdown is currently not doing so. I think that the responsibility of doing that should be the Context's itself, and not Handle's. I can add that, but IMO it's subject of a follow-up.
my point is that rclpy.shutdown is currently not doing so. I think that the responsibility of doing that should be the Context's itself, and not Handle's. I can add that, but IMO it's subject of a follow-up.
That's a fair point. I think I'm getting hung up on what I intended Handle to do vs the reality of that never being fully implemented. I do like the approach this PR takes, and I think it makes it pretty easy to get to a state of everything being cleaned up right away.
If h1.requires(h0) caused h0 to store a weakref to h1, then h0.destroy() could call h1.destroy() without worrying about destruction order of h1's dependents, because the reference counting in this PR would take care of that. I think doing that is important for the reasons above, but you're right that it doesn't need to be done right away if the feature doesn't fully exist now. I'll dismiss my review since it seems to be blocking progress. Sorry about that.
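The weakref idea above can be sketched in a few lines of Python. This is a hypothetical toy, not the real rclpy implementation; `Handle`, `requires`, and `destroy` mirror the names in the discussion under the stated assumption that a handle weakly remembers its dependents and tears them down leaf-first:

```python
import weakref

class Handle:
    """Toy handle: destroying a handle first destroys everything that requires it."""

    def __init__(self, name):
        self.name = name
        self.destroyed = False
        self._dependents = []  # weakrefs to handles that require this one

    def requires(self, other):
        # self depends on other, so other keeps a weak reference back to self
        other._dependents.append(weakref.ref(self))

    def destroy(self):
        if self.destroyed:
            return
        # Destroy dependents before self, so teardown runs in reverse
        # creation order without duplicating the dependency graph elsewhere.
        for ref in self._dependents:
            dep = ref()
            if dep is not None:
                dep.destroy()
        self.destroyed = True

context = Handle('context')
node = Handle('node')
pub = Handle('publisher')
node.requires(context)
pub.requires(node)

# Destroying the root tears down node and publisher first.
context.destroy()
assert pub.destroyed and node.destroyed and context.destroyed
```

With this shape, something like `rclpy.shutdown()` would only need to destroy the context handle; the dependency bookkeeping does the rest.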
Force-pushed 267e393 to cd37c5a.
Force-pushed cd37c5a to 034da37.
Force-pushed c0c56b9 to b08ad1b.
@hidmic About the regression test you asked for, I gave it a try, but it's super hard to reproduce the problem. My main reason to move lifetime management to C is that destruction order is not documented in Python, nor is how the gc works. Both of them are an implementation detail of a particular Python interpreter, and users shouldn't make any assumptions about how they work. From PEP 442:
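The nondeterminism being referred to is easy to demonstrate with a reference cycle: on CPython, objects in a cycle are finalized only when the cyclic collector happens to run, not at any documented point (PEP 442 is what made finalizing such cycles possible at all). A minimal illustration:

```python
import gc

finalized = []

class Noisy:
    def __del__(self):
        finalized.append(self.name)

a = Noisy()
b = Noisy()
a.name, b.name = 'a', 'b'
a.other = b
b.other = a      # reference cycle: refcounting alone cannot reclaim these
del a, b
# Nothing is guaranteed to have been finalized at this point...
gc.collect()     # ...until the cyclic collector actually runs.
assert sorted(finalized) == ['a', 'b']
```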
I think listing that PEP to back up
hidmic left a comment:
LGTM but for a question, though some green CI would be nice too
```python
    with node1.handle:
        pass
finally:
    rclpy.shutdown(context=context)
```
@ivanpauno why were these tests dropped? they certainly have to be rewritten but still apply conceptually, no?
Those tests were relying on using requires, which is no longer exposed.
I think that the tests I wrote in https://github.com/ros2/rclpy/blob/39a93ac5cde247b53137441e34f431d68ba211c4/rclpy/src/test/rclpy_common/test_handle.cpp are a good replacement.
Add node handle dependency on context handle
Signed-off-by: Ivan Santiago Paunovic <ivanpauno@ekumenlabs.com>
Force-pushed b08ad1b to 39a93ac.
See 826e6cc.
hidmic left a comment:
LGTM again but for a few nits, and pending green CI
Supersedes #470.
See #470 (comment).
Solves strange bugs related to destruction order.
Included in this PR:
- `rclpy_handle_t` struct, related methods, and usage of it in the `Handle` class.
- Usage of `rclpy_handle_t` where needed.

Not included (subject of a follow-up):

- `rmw_request_id_t` capsule is leaking.
- `rmw_qos_profile_t` capsule is leaking.
- Refactoring of wait set related code, so it can be wrapped by a `Handle`.
- Refactoring of `rclpy_action` objects, so they can be wrapped in a `Handle`.
rmw_request_id_tcapsule is leaking.rmw_qos_profile_tcapsule is leaking.wait setrelated code, so it can be wrapped by aHandle.rclpy_actionobjects, so they can be wrapped in aHandle.