Conversation
I came to the same conclusion about the …
Looks like Azure forgot to update the build status for the …
Thanks for looking into the issue; my first impression is that switching to …
@zoq agreed; do you think that's something we should open a separate issue for? (Also, I guess we should change …
Right, change input and output, and yes, we should open a separate issue for that, or maybe I'll just do it myself. I guess the question is: do we want to include the change, switching from …
This code touches the ANN code, and there was recently a big refactoring of the ANN code in #2259, so be sure to merge the master branch into your branch here to make sure that nothing will fail if this PR is merged. 👍 (I'm pasting this message into all possibly relevant PRs, even ones that I personally opened. :))
I don't think I'm going to merge this yet. I'm taking a closer look at the convolution code, and I'm wondering if …
Sounds good.
This solves mlpack#2146 in a better way than mlpack#2234.
Closing in favor of #2326.
This PR is an attempt to fix #2146, although not in the way that I would have hoped. Unfortunately, it seems that for now, for the sake of memory safety, we will need to copy the input matrix.
Specifically, the problem is this: layers like `Convolution`, `AtrousConvolution`, and `TransposedConvolution` (and some others) create an alias of the given input matrix when `Forward()` is called; this is the member `inputTemp`. The alias uses the same memory as the given input matrix; it does not make a copy, and it is not responsible for freeing the memory when the alias's destructor is called. That member `inputTemp` may then be used in `Backward()` and `Gradient()`.

However, there exist layer types that will cause the input matrix of other layers to be freed during the forward pass; this means that any layer that took an alias during the forward pass is now holding a pointer to invalid memory. (I can't remember whether it is individual layers that cause the freeing, or whether that happens as part of the `FFN` and `RNN` classes. In either case, the result is the same.)

I thought that perhaps we could simply use `std::move()` instead of creating an alias, but this turns out not to work, because other parts of the `FFN` and `RNN` forward/backward passes rely on having the layer input still available even after `Forward()` is called on the layer. So, for now, I've simply copied the input. That's not the fastest solution, but I think it's better to have a slow solution that works than a fast one that sometimes segfaults. :)

Some points for discussion in this PR:
- If the layer input is still expected to be available even after `Forward()` is called by the `FFN` class, then should `Forward()` really take `arma::mat&&`s for both input and output? Or should the input be, e.g., `const arma::mat&`?
- Have we written down anywhere the specific expectations about what can and can't be done with the inputs to `Forward()` implementations? (I guess that's kind of related to the first question.)
- Is it possible to refactor `FFN` and `RNN` such that the `std::move()` approach (instead of the copy approach I used here) can work?

I ask these questions because I'm not particularly familiar with the internal design of this part of the code, so maybe there is just something I'm not aware of. :) Relevant folks for this discussion: @zoq, @ShikharJ, @walragatver, @sreenikSS, @saksham189. (Sorry for so many tags; it just seems to me like there are a couple of fundamental issues here that might be worth discussing. 👍)