Memory leak and invalid use fixes #2326
Conversation
Delete things that were allocated, be clear about ownership.
…non-CLI bindings).
Otherwise DeleteVisitor will do it.
This solves mlpack#2146 in a better way than mlpack#2234.
Add `ownsLayers` members to AddMerge and Sequential to avoid double-frees or memory leaks.
birm
left a comment
I'm still not sure what's up with the HDF5 file test, but this looks good.
Wondering what it would take to automate the test you ran... (I mean, of course we could just add that flag, but is there something quantifiable we could have Azure spit out?)
favre49
left a comment
This is great, looks like hours of work. Everything LGTM, besides some nitpicks.
src/mlpack/methods/ann/brnn_impl.hpp
```cpp
// Remove mergeLayer from the forward and backward RNNs so it doesn't get
// deleted. This assumes that mergeLayer is the last layer!
forwardRNN.network.pop_back();
backwardRNN.network.pop_back();
```
This doesn't look right. We never add the mergeLayer to the forward and backward RNNs; rather, we add the last layers of the RNNs to the merge layer, if you look here.
I guess we only need to change the comment, though. If reset is true, then we should pop the last layers from the RNNs to avoid deleting them twice; otherwise, we should simply delete the mergeLayer.
Oh, you're right! I read it backwards. In any case, yeah, removing the last layer from the forward and backward RNNs is sufficient to prevent a memory leak. I've updated the comment in c42b9f7. 👍
Thanks for spending so many hours on the issues; I'll take a look at the changes later today.
zoq
left a comment
Nice work, this looks great to me.
```cpp
{
  // Remove the last layers from the forward and backward RNNs, as they are
  // held in mergeLayer. So, when we use DeleteVisitor with mergeLayer, those
  // two layers will be properly (and not doubly) freed.
```
I'll go ahead and merge this then.
Inspired by #2314, #2310, #2300, and others, I built mlpack with `-fsanitize=address` and ran `mlpack_test`. The results were... noisy. But I took some time and fixed everything (except for one issue, detailed later).

- All of the ANN layers are properly freed now; it took a while to get this right, and
`DeleteVisitor` will now delete internally-held layers when the layer has a `Model()` method. This may solve some issues people were having.
- CNN layers don't make an alias of the input from `Forward()` anymore. This should fix "Can't Train a Normal CNN" (#2146).
- Test code properly frees things that it allocates. There were some tricky places where deallocation had been overlooked.
- Bindings handle exceptions properly and deallocate any allocated memory when an exception is thrown. It makes the code a bit uglier, but it's safer.
There is one bug I didn't solve, which surfaces in the reinforcement learning code, in `q_learning_impl.hpp`: that code actually copies an `FFN`, and each individual layer will require a copy constructor to fix that memory leak. That's the same issue as in #2314, and in any case, it isn't solved here. But everything else in the tests is fixed.

I do think it's important that we merge this before a release; otherwise, we'll probably get lots of segfault and invalid memory access reports. :)