
[GSoC] Augmented RNN models - benchmarking framework#1005

Merged
zoq merged 66 commits into mlpack:master from sidorov-ks:master
Aug 7, 2017

Conversation

@sidorov-ks
Contributor

This PR is part of my GSoC project "Augmented RNNs".
Implemented:

  • class CopyTask for evaluating models on the sequence copy problem, showcasing the benchmarking framework;
  • a unit test for it (a simple non-ML model, hardcoded to copy the sequence the required number of times, is expected to ace the CopyTask).
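The behavior that hardcoded baseline has to reproduce can be sketched as follows (a minimal stand-in using std::vector rather than the Armadillo types the real code uses; CopyTaskAnswer is a hypothetical name):

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the hardcoded non-ML baseline: the correct
// answer for CopyTask is simply the input bit sequence repeated nRepeats
// times, so a model that aces the task must emit exactly this output.
std::vector<int> CopyTaskAnswer(const std::vector<int>& input,
                                const std::size_t nRepeats)
{
  std::vector<int> output;
  output.reserve(input.size() * nRepeats);
  for (std::size_t r = 0; r < nRepeats; ++r)
    output.insert(output.end(), input.begin(), input.end());
  return output;
}
```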

@mlpack-jenkins

Can one of the admins verify this patch?

}

template<typename ModelType>
double CopyTask::Evaluate(ModelType& model) {
Member

Now that I see the interface, I see a couple of problems that we have to discuss:

Right now, the way we train a model, e.g. RNN or LogisticRegression, is to define all necessary model parameters (layers, regularization constant, etc.) and call the Train function. Train takes the predictors and responses as input, followed by an optional optimizer instance where we define the number of iterations/epochs, the threshold to terminate the optimization before we reach the maximum number of iterations, and so on.

arma::mat predictor, responses;
Model model(...);

Optimizer optimizer(...);

model.Train(predictor, responses, optimizer);

The Evaluate function somehow replicates the functionality of the optimizer, like the number of epochs. I'm not sure we should replicate that behavior, since the way we train a model might differ from class to class. For instance, none of the existing classes can handle arma::field as input, there is no option to specify optimizer parameters, and what if we'd like to use the upcoming cross-validation feature as part of the evaluation pipeline?

So instead of providing an evaluation function that takes a model and trains it, I think we should see the Tasks as generators that produce input/output samples for our model to train on. This lets us train the model independently from the task. Besides the code replication, another problem is that I can't think of an easy way to incorporate curriculum learning (which might be necessary to get stable results) into the class without providing a bunch of parameters for each task. So I propose the following:

class CopyTask
{
 public:
  CopyTask(...) { }

  void Sample(arma::mat& predictor, arma::mat& responses)
  {
    // Generate a predictor/responses pair (or pairs) with the given parameters.
  }
};

CopyTask copyTask;
copyTask.Sample(predictor, responses);

model.Train(predictor, responses, optimizer);

Let me know what you think.

Contributor Author

This sounds like a good idea to me, because I've already run into a similar issue when trying to implement SortTask - it was a huge copy-paste from CopyTask, which failed to work nonetheless 😢.
However, I would also propose creating one more header file with evaluation functions (sklearn-ish), like this:

#include <score.hpp>
auto score_fn = scorers::sequence_precision; // number of sequences the model got right.
// As an option:
// auto precision = scorers::bit_precision;
// = number of dataset bits the model got right.

// ... CopyTask defined as in the previous comment

copyTask.Sample(predictor_train, responses_train);
copyTask.Sample(predictor_test, responses_test);
arma::mat responses_test_pred; // or whatever type is required;

model.Train(predictor_train, responses_train, optimizer);
model.Predict(predictor_test, responses_test_pred);
auto score = score_fn(responses_test, responses_test_pred);

IMHO, this would also be more in accord with the separation-of-concerns principle.

At the moment, though, I'm doing an internship at a local web development company, so I hope to try out the ideas in this thread tonight or in the next few days.

Member

It's just an idea, but since the score function depends on the task, do you think we should integrate the function into the Generator/Task class?

Contributor Author

Maybe, but I am not sure that the score function is part of the task itself. IMHO the task should specify only inputs and their respective outputs. The reason is that there are usually several viable ways to evaluate the distance between predictions and ground truth (e.g., in regression, you can use Euclidean or Manhattan metrics, or an L1/L2 regularizer with different degrees of regularization).

Even here, in the case of CopyTask, there are two plausible ways to evaluate the performance: bit precision and sequence precision.
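As a concrete illustration of the two scores, here is a minimal sketch over plain std::vector bit sequences (the real implementation works on Armadillo field types):

```cpp
#include <cstddef>
#include <vector>

// Fraction of sequences the model reproduced exactly.
double SequencePrecision(const std::vector<std::vector<int>>& truth,
                         const std::vector<std::vector<int>>& pred)
{
  std::size_t hits = 0;
  for (std::size_t i = 0; i < truth.size(); ++i)
    if (truth[i] == pred[i])
      ++hits;
  return double(hits) / truth.size();
}

// Fraction of individual bits the model got right.
double BitPrecision(const std::vector<std::vector<int>>& truth,
                    const std::vector<std::vector<int>>& pred)
{
  std::size_t hits = 0, total = 0;
  for (std::size_t i = 0; i < truth.size(); ++i)
  {
    for (std::size_t j = 0; j < truth[i].size(); ++j)
    {
      ++total;
      if (truth[i][j] == pred[i][j])
        ++hits;
    }
  }
  return double(hits) / total;
}
```

For example, getting one of two sequences exactly right while matching three of four bits gives sequence precision 0.5 and bit precision 0.75.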

Member

That's true, there are often multiple ways to calculate the performance, and there are situations where you'd like to use the same functions during training, so I agree that writing a new class (or classes) for the scoring functions is a good idea.

@sidorov-ks
Contributor Author

Wow, that was unreasonably buggy coding on my side, but it looks like I've finally got it right :)

So now, if I remember the schedule correctly, all it takes for Week 1 is to implement the data generators for SortTask and AddTask. I hope this won't be ridiculously hard, but it still requires overloading the evaluation methods to support sequences of binary numbers - in contrast to sequences of bits.

* @param maxLength Maximum length of sequence that has to be repeated by model.
* @param nRepeats Number of repeats required to solve the task.
*/
CopyTask(int maxLength, int nRepeats);
Member

I would use const size_t maxLength and const size_t nRepeats here; we mostly use size_t instead of int when support for negative values isn't needed.

* @param labels The variable to store output sequences.
* @param batchSize The dataset size.
*/
void GenerateData(
Member

Minor style issue, please take a look at the Method Declarations sections in the style guide: https://github.com/mlpack/mlpack/wiki/DesignGuidelines#method-declarations.

private:
// Maximum length of a sequence.
int maxLength;
// Nomber of repeats the model has to perform to complete the task.
Member

Minor spelling issue: Number instead of Nomber.

int nRepeats;
};
}
}
Member

Can you add a comment for the closing namespace here, to improve the readability?

maxLength(maxLength),
nRepeats(nRepeats)
{
// Just storing task-specific paramters.
Member

Minor spelling issue: parameters instead of paramters.

) {
input = arma::field<arma::irowvec>(batchSize);
labels = arma::field<arma::irowvec>(batchSize);
for (int i = 0; i < batchSize; ++i) {
Member

Can you take a look at the Brace Placement section: https://github.com/mlpack/mlpack/wiki/DesignGuidelines#brace-placement. I wonder why the Style Check doesn't complain about this.

input = arma::field<arma::irowvec>(batchSize);
labels = arma::field<arma::irowvec>(batchSize);
for (int i = 0; i < batchSize; ++i) {
size_t size = maxLength;
Member

I think we can remove size and just use maxLength, what do you think?

Contributor Author

Maybe we can use a random value for size, so that we can feed variable-length data to our models?

Member

Interesting idea, but I'm not sure we should do that inside the Generator class, since the required distribution depends on the user. If someone wants inputs with another length, they could just call with another length, right? Also, maybe it's a good idea to use arma::mat instead of arma::field: that way, someone who wants to train only with a static length can use the output directly, since the model takes arma::mat as input.

labels = arma::field<arma::irowvec>(batchSize);
for (int i = 0; i < batchSize; ++i) {
size_t size = maxLength;
arma::irowvec item = arma::randi<arma::irowvec>(size, arma::distr_param(0, 1));
Member

I think we can avoid the extra copy here and use input(i) instead of item - maybe I missed something?

@zoq
Member

zoq commented Jun 5, 2017

I hope this won't be ridiculously hard, but it still requires overloading the evaluation methods to support sequences of binary numbers - in contrast to sequences of bits.

Not sure what evaluation method you mean here, can you explain further?

@sidorov-ks
Contributor Author

Not sure what evaluation method you mean here, can you explain further?

I mean that even though the other tasks are still evaluated by sequence precision, AddTask and SortTask require processing/outputting sequences of numbers (ain't interesting to sort 0s and 1s 😃), in contrast to CopyTask, which requires copying sequences of 0s and 1s. The current implementation of SequencePrecision takes two field<irowvec>s as arguments - to support the two other tasks, it needs to be able to take two field<imat>s as arguments.

@zoq
Member

zoq commented Jun 5, 2017

Thanks for the clarification, now I get what you mean 👍

In a previous comment you said you are currently at an internship, does the internship end this week or next week or something else? Just wanted to make sure you are able to work full-time on the project.

@sidorov-ks
Contributor Author

In a previous comment you said you are currently at an internship, does the internship end this week or next week or something else? Just wanted to make sure you are able to work full-time on the project.

Yes, both the internship and the entrance exam to the Yandex School of Data Analysis have recently wrapped up (the exam, the last event so far, was held on Sunday). So from here on, I am able to work full-time.

@zoq
Member

zoq commented Jun 5, 2017

Thanks again for the clarification. The course looks interesting: https://yandexdataschool.com/edu-process/program/data-analysis - not sure which one you did, but I guess every course is interesting in its own way; at least the description looks promising. Hopefully, your exam went well.

Just a reminder, would be great to see the first weekly report today or tomorrow.

@sidorov-ks
Contributor Author

Fixed the style issues you've mentioned in the review; now switching to the other two tasks. By the way, what's up with Jenkins? It's being much more severe than it used to be.

Thanks again for the clarification. The course looks interesting: https://yandexdataschool.com/edu-process/program/data-analysis - not sure which one you did, but I guess every course is interesting in its own way; at least the description looks promising. Hopefully, your exam went well.

That (Data Analysis) is precisely the program I've applied to :) However, I'm also going to take some courses from the Big Data program (e.g., the parallel computation course - when you know that LAPACK is superior to just writing a for-loop, but don't know precisely why 😸)

@zoq
Member

zoq commented Jun 6, 2017

The Jenkins Style Checks used to check only files in mlpack/core; now it also checks files in methods and tests. I fixed almost all issues that are not related to any PR here: #1019, so realistically you should only look at the files you changed and try to fix the issues pointed out by cpplint.

That (Data Analysis) is precisely the program I've applied to :) However, I'm also going to take some courses from the Big Data program (e.g., the parallel computation course - when you know that LAPACK is superior to just writing a for-loop, but don't know precisely why 😸)

Right, I think in mlpack's context swapping OpenBLAS for MKL would also result in some noticeable performance improvements, at least in some situations.

@sidorov-ks
Contributor Author

Travis CI has also gone crazy - now VanillaNetworkTest from convolutional_network_test.cpp also fails, getting 50% classification error.

@zoq
Member

zoq commented Jun 6, 2017

Sometimes tests fail because they e.g. didn't reach the expected error in the specified number of iterations from the given starting point; for more information take a look at #922.

Realistically, you don't have to worry about failing tests that are not related to your code.

@sidorov-ks
Contributor Author

Now I have a problem I can't easily solve. I need to generate a random N*5 matrix (N 5-bit numbers). However, when I use randi(N, 5, distr_param(0, 1)), it returns a sequence of 00000 and 11111 rows (in other words, only rows get randomized - not columns).

I also tried setting all values manually to numbers from the RNG (the current implementation), but it has the same problem. How can I fix it?

@zoq
Member

zoq commented Jun 6, 2017

using:

arma::Mat<size_t> output = arma::randi<arma::Mat<size_t>>(10, 5, arma::distr_param(0, 1));

I get:

        0        0        0        1        0
        0        1        1        1        1
        1        1        1        0        1
        0        0        1        0        1
        1        0        1        1        1
        0        1        1        0        0
        0        1        0        1        1
        1        0        1        1        1
        1        0        0        1        0
        1        0        1        0        0

Isn't that what you expected?

@sidorov-ks
Contributor Author

Isn't that what you expected?

Yes, but this one didn't work either :( Speaking of this, can I use vector<vector<int>> to create an imat? So far I can't find any other viable way to fix this, but tomorrow I'll try again to debug it.

Speaking about the weekly report, I think it would be better to finish the third task tomorrow, fix the mentioned bug, and then (once again, tomorrow) write the complete Week 1 report. What do you think?

@zoq
Member

zoq commented Jun 6, 2017

So, do you mean my output is what you'd like, but it's not working for you? If that's the case, can you reduce the problem to a simple case? That would make it easier to debug.

Sending the weekly report tomorrow sounds fine to me.

@sidorov-ks
Contributor Author

So, do you mean my output is what you'd like, but it's not working for you? If that's the case, can you reduce the problem to a simple case? That would make it easier to debug.

Yes, I was trying to achieve your output. I have resolved the problem - it was due to a really silly bug of mine in producing the debug output.

I have also implemented AddTask, so now I have only one question - how do I merge the style fixes from the master branch into my fork?

After that, I'm planning to write the Week 1 report. Then (according to our schedule) we are switching to some real (but not augmented) models - I hope to implement an LSTM baseline and record its scores.

@sidorov-ks
Contributor Author

Whoops, my bad, there were only my errors. Trying to fix them.

@sidorov-ks
Contributor Author

Finally a clear style-check run :)

@zoq
Member

zoq commented Jun 7, 2017

After that, I'm planning to write the Week 1 report. Then (according to our schedule) we are switching to some real (but not augmented) models - I hope to implement an LSTM baseline and record its scores.

Sounds good, I think we could also compare against GRU (once it's merged), without changing a lot of code.

@sidorov-ks
Contributor Author

sidorov-ks commented Jun 7, 2017

AppVeyor is running for a really long time. How can I dequeue my previous commits from here: https://ci.appveyor.com/project/mlpack/mlpack/history

UPDATE: They have just been dequeued. Thanks ;)

@zoq
Member

zoq commented Jun 7, 2017

I'm not sure if you are allowed to cancel your own job at the moment - you could log in to AppVeyor and check - but in the meantime let me do it for you.

class AddTask
{
public:
AddTask(size_t bitLen);
Member

Can you comment each parameter as we did for the other methods? Also, I think it would be a good idea to improve the general task description; e.g., we could add a simple example of how to use the code and show example output. What do you think?

public:
AddTask(size_t bitLen);

void GenerateData(arma::field<arma::irowvec>& input,
Member

Do you think it would be cleaner to rename the function to Generate?

* @param maxLength Maximum length of sequence that has to be repeated by model.
* @param nRepeats Number of repeats required to solve the task.
*/
CopyTask(size_t maxLength, size_t nRepeats);
Member

Can you use the const qualifier here: const size_t maxLength and const size_t nRepeats?

* @param labels The variable to store output sequences.
* @param batchSize The dataset size.
*/
void Generate(arma::field<arma::irowvec>& input,
Member

The data should be column major. Also, I was thinking that even if the data is binary, it might be good to use arma::col instead of arma::icol here, since most of the models take arma::mat as input, and that way we could avoid a cast. What do you think?

*/
void Generate(arma::field<arma::irowvec>& input,
arma::field<arma::irowvec>& labels,
size_t batchSize);
Member

Can we use const size_t batchSize here, to be consistent with the rest of the codebase?

// Just storing task-specific parameters.
}

void CopyTask::Generate(arma::field<arma::irowvec>& input,
Member

Minor style issue, can you align the labels and batchSize parameter with the input parameter.

*
* @param input The variable to store input sequences.
* @param labels The variable to store output sequences.
* @param batchSize The dataset size.
Member

Perhaps batchSize isn't the best parameter name here? Do you think size could work?

Contributor Author

I would like to keep the long name, because with a variable called size it would not be as clear which size is meant (sample size? maximum sequence length? size of the sequence elements?). The long name, in contrast, eliminates any chance of slipping up later.

Member

Okay, sounds reasonable to me.

{
input = arma::field<arma::irowvec>(batchSize);
labels = arma::field<arma::irowvec>(batchSize);
std::srand(unsigned(std::time(0)));
Member

I don't think it's a good idea to set a seed here - what if someone wants to generate the same output, e.g. to debug some problem? If someone needs different results in a single run, they can set math::RandomSeed(std::time(NULL)); before the run.
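The debuggability point can be sketched with the C++ standard library (std::mt19937 stands in for mlpack's RNG; ReproducibleDraws is a hypothetical helper):

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// With a fixed seed the generator reproduces the same draws on every run,
// which is exactly what you want when debugging a failing test case;
// re-seeding from std::time inside Generate() would destroy that property.
std::vector<std::uint32_t> ReproducibleDraws(const std::uint32_t seed,
                                             const std::size_t count)
{
  std::mt19937 rng(seed);
  std::vector<std::uint32_t> draws(count);
  for (std::uint32_t& d : draws)
    d = static_cast<std::uint32_t>(rng());
  return draws;
}
```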

std::srand(unsigned(std::time(0)));
for (size_t i = 0; i < batchSize; ++i) {
// Random uniform length from [2..maxLength]
size_t size = 2 + std::rand() % (maxLength - 1);
Member

We could use RandInt from core/math/random.hpp here.
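For illustration, a stdlib analogue of drawing the sequence length uniformly from [2, maxLength] (mlpack's math::RandInt is the in-tree way to do this; RandomLength is a made-up name):

```cpp
#include <cstddef>
#include <random>

// Draw a sequence length uniformly from the inclusive range [2, maxLength],
// replacing the 2 + std::rand() % (maxLength - 1) pattern shown above.
std::size_t RandomLength(std::mt19937& rng, const std::size_t maxLength)
{
  std::uniform_int_distribution<std::size_t> dist(2, maxLength);
  return dist(rng);
}
```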

// Random uniform length from [2..maxLength]
size_t size = 2 + std::rand() % (maxLength - 1);
input(i) = arma::randi<arma::irowvec>(size, arma::distr_param(0, 1));
arma::irowvec item_ans = arma::irowvec(nRepeats * size);
Member

Do you think we can use repmat here, to simplify the code?

// In case it hasn't been included yet.
#include "copy.hpp"

#include <cstdlib>
Member

Looks like we can remove #include <cstdlib> and #include <ctime>.

// Random uniform length from [2..maxLength]
size_t size = RandInt(2, maxLength+1);
input(i) = arma::randi<arma::colvec>(size, arma::distr_param(0, 1));
arma::colvec item_ans = arma::conv_to<arma::colvec>::from(
Member

Maybe I missed something, but it looks like we could avoid an extra copy here if we directly use labels(i) for the output of arma::conv_to<arma::colvec>::from(...).

arma::field<arma::colvec> predOutputs)
{
double score = 0;
auto testSize = trueOutputs.n_elem;
Member

We should try to avoid auto: I think auto doesn't help general code understanding when people read through the code, since you have to look further to figure out the actual types. Also, here is a quote from the Armadillo page:

Can I use the C++11 auto keyword with Armadillo objects and/or expressions?
Use of C++11 auto is not recommended with Armadillo objects and expressions. Armadillo has a template meta-programming framework which creates lots of short lived temporaries that are not handled by auto.


for (size_t i = 0; i < testSize; i++)
{
auto prediction = trueOutputs.at(i);
Member

Can we work with trueOutputs directly instead of a copy? I guess the compiler would optimize that away, but it would be easy for us to do it explicitly.

}
else
{
for (size_t j = 0; j < prediction.n_elem; ++j) {
Member

Maybe you could simplify the expression here by doing something like arma::accu(output.at(j) == prediction.at(j)) == prediction.n_elem - not sure if that expression is actually faster or slower.

Contributor Author

I've looked through armadillo's docs and found an even better way: http://arma.sourceforge.net/docs.html#approx_equal

This method resolves all our problems in this place:

  • It should be fast - it's a native Armadillo method (and should be parallelized), and it should stop after the first mismatch (an equality check, nothing more).
  • No more code duplication - it accepts all Armadillo objects, whether it's a matrix, vector or whatever else.

Member

The only problem I see is that approx_equal was introduced in version 6.700, but we support >= 4.200.0, so I guess what we could do here is to backport the function, as we did for e.g. ind2sub in core/arma_extend.

return score;
}

double SequencePrecision(arma::field<arma::mat> trueOutputs,
Member

We could combine the functions for arma::field<arma::mat> and arma::field<arma::colvec> and avoid some code duplication by using a template:

template<typename eT>
double SequencePrecision(arma::field<eT> trueOutputs, arma::field<eT> predOutputs)
{
...
}

// Random uniform length from [2..maxLength]
size_t size = RandInt(2, maxLength+1);
input(i) = arma::randi<arma::mat>(bitLen, size, arma::distr_param(0, 1));
arma::mat item_ans = arma::mat(bitLen, size);
Member

Please use camel casing for all names, for more information take a look at: https://github.com/mlpack/mlpack/wiki/DesignGuidelines#naming-conventions

size_t size = RandInt(2, maxLength+1);
input(i) = arma::randi<arma::mat>(bitLen, size, arma::distr_param(0, 1));
arma::mat item_ans = arma::mat(bitLen, size);
vector<pair<int, int>> vals(size);
Member

Really neat code, I was wondering if we could use arma::sort_index to replace the key of std::pair.
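What arma::sort_index computes can be sketched with the standard library - an argsort over indices, which removes the need to carry each value's position in a std::pair:

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Return the permutation of indices that puts values into ascending order,
// analogous to arma::sort_index.
std::vector<std::size_t> SortIndex(const std::vector<double>& values)
{
  std::vector<std::size_t> indices(values.size());
  std::iota(indices.begin(), indices.end(), 0);
  std::sort(indices.begin(), indices.end(),
      [&values](const std::size_t a, const std::size_t b)
      { return values[a] < values[b]; });
  return indices;
}
```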

#include <vector>
#include <algorithm>
#include <utility>
#include <cstdlib>
Member

I think we can remove #include <cstdlib> and #include <ctime> here.

@sidorov-ks
Contributor Author

I keep trying to create an LSTM baseline :) Right now, it kind of works, but there are two big problems:

  • I have no idea how to change the model's rho (for the LSTM layer, I just set its rho to the maximum required value). For this reason, the LSTM model handles sequences of length 2, but is unable to emit a third symbol.
  • There is some weird bug when setting nRepeats to some value > 1; specifically, the model.Train() instruction crashes. I wonder what the difference between these cases is.

Can you help me resolve these issues, please?

The current unit test code is in the last commit.

@zoq
Member

zoq commented Jun 13, 2017

I looked into the issue of setting the correct sequence length, and it turns out we have to do a little bit more than setting the rho value, since the input size is only set once (https://github.com/mlpack/mlpack/blob/master/src/mlpack/methods/ann/rnn_impl.hpp#L206). Also, I think it would be convenient if the model propagated the rho value through the model; that way we would only have to change it once. Meaning we need another routine that does this for us.

@zoq
Member

zoq commented Jun 13, 2017

Here is a quick patch

https://gist.github.com/zoq/24f8b2e4826d837d604f9613615763bb
https://gist.github.com/zoq/01952c2be67bd9fbeeeac5d6c69b39cd
https://gist.github.com/zoq/c7ffc1f16e654c94097fd79d1687fef3

Let me know if I should push the changes to my fork if that's easier to read for you.

and here is my output for nRepeat = 2

Input:
        0   1.0000   1.0000

Model output:
        0   1.0000   1.0000

True output:
        0   1.0000   1.0000

Input:
   1.0000        0   1.0000

Model output:
   1.0000        0   1.0000

True output:
   1.0000        0   1.0000

=======================================
Input:
   1.0000   1.0000

Model output:
   1.0000   1.0000

True output:
   1.0000   1.0000

=======================================
Input:
        0        0

Model output:
        0        0

True output:
        0        0

=======================================
Final score: 1

and the output for nRepeat = 3

Input:
        0   1.0000   1.0000

Model output:
        0        0        0

True output:
        0   1.0000   1.0000        0   1.0000   1.0000

=======================================
Input:
   1.0000   1.0000   1.0000

Model output:
   1.0000   1.0000   1.0000

True output:
   1.0000   1.0000   1.0000   1.0000   1.0000   1.0000

=======================================
Final score: 0

As for the repeat issue, without any further information, I can't see how the model should predict the right output.

@sidorov-ks
Contributor Author

Implemented the changes you described; now it works nicely for nRepeats=1. Interestingly, on my machine nRepeats=3 crashes (though your output suggests that the model has learned to copy but failed to learn to repeat) with the same message:

error: subtraction: incompatible matrix dimensions: 1x1 and 2x1
unknown location(0): fatal error: in "AugmentedRNNsTasks/LSTMBaselineTest": std::logic_error: subtraction: incompatible matrix dimensions: 1x1 and 2x1

Right now I'm implementing similar tests for AddTask and SortTask, after which I'll proceed to record the scores and write the Week 2 report.

However, I would like to know whether a unit test is the appropriate place for running a baseline solution. Maybe I should implement it somewhere else, e.g. in some kind of examples binary?

radical_test.cpp
random_forest_test.cpp
random_test.cpp
random_forest_test.cpp
Member

Looks like there is a merge artifact; if you remove the duplication, Travis should build the PR again.

#include <mlpack/methods/ann/layer/parametric_relu.hpp>
#include <mlpack/methods/ann/layer/reinforce_normal.hpp>
#include <mlpack/methods/ann/layer/select.hpp>
#include <mlpack/methods/ann/layer/cross_entropy_error.hpp>
Member

Not sure what happened here, but we should not remove the recently introduced cross entropy error function.

weights[0] = 2;
// Increasing length by 1 double the number of valid numbers.
for (size_t i = 1; i < bitLen - 1; ++i)
weights[i] = 2 * weights[i - 1];
Member

What do you think, should we use linspace in combination with exp2 here?
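The suggested closed form replaces the doubling loop: arma::exp2(arma::linspace(1, bitLen - 1, bitLen - 1)) yields {2, 4, ..., 2^(bitLen - 1)}. A dependency-free sketch of the equivalence (plain C++ with illustrative function names; the PR itself uses Armadillo):

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Build the weight vector with the doubling loop from the PR.
std::vector<double> weightsByLoop(std::size_t bitLen)
{
  std::vector<double> w(bitLen - 1);
  // We have two binary numbers with exactly two digits (10 and 11).
  w[0] = 2;
  // Increasing the length by 1 doubles the number of valid numbers.
  for (std::size_t i = 1; i < bitLen - 1; ++i)
    w[i] = 2 * w[i - 1];
  return w;
}

// Closed form mirroring arma::exp2(arma::linspace(1, bitLen - 1, bitLen - 1)):
// element i is 2^(i + 1).
std::vector<double> weightsByExp2(std::size_t bitLen)
{
  std::vector<double> w(bitLen - 1);
  for (std::size_t i = 0; i < bitLen - 1; ++i)
    w[i] = std::exp2(static_cast<double>(i + 1));
  return w;
}
```

Since small powers of two are exact in double precision, the two constructions agree elementwise.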

* accepts unnormalized probabilties as long as they are
* non-negative and sum to a positive number.
* @return A random integer sampled from the specified distribution.
*/
Member

I would go with arma::vec for the weights instead of std::vector; it feels more consistent with the rest of the codebase. What do you think?

Member

@zoq zoq left a comment

Mainly minor style issues.

runningSum += el;
slots[i] = runningSum;
}
}
Member

I think we can remove the check here; we already throw an exception if Probabilities < 0.

Contributor Author

That's not the case: this check is needed for correctly processing [0 0 0 ... 0] weights.

Member

Ah, right, thanks for the clarification!

*/
inline int RandInt(const arma::vec& weights)
{
// Build cumulative probabilities from event probabilities.
Member

What about using arma::vec here? That way it's parallelised, and we could write slots /= runningSum; instead of iterating over the vector.

Contributor Author

@sidorov-ks sidorov-ks Jul 31, 2017

True, but is there any "arma-native" implementation of binary search? Since we compute cumulative sums, we won't really get benefits from parallelization - the cumsum is "inherently" sequential.

UPD: Read your next comment and understood that here we can use an Armadillo vector as if it were an STL vector.

Member

True, cumsum is sequential, I guess the only operation that we could parallelise here is the accumulation. The performance boost should be negligible, we mainly save some lines of code.

{
// Build cumulative probabilities from event probabilities.
std::vector<double> slots(weights.size());
double runningSum = 0;
Member

I was wondering whether that check is necessary; since the constraint was made clear in the documentation, we could simplify the expression:

arma::vec slots = cumsum(weights);
slots /= arma::accu(weights);
return std::lower_bound(slots.begin(), slots.end(), Random()) - slots.begin();

Let me know what you think.

Contributor Author

Well, we still need to run that check, just to alert the user if they try to use it in the wrong way (intentionally or otherwise).

Member

You mean the probabilities check? I'm not sure we have to, but I don't mind it here, we could use arma::any if you like or just leave the for loop.

Contributor Author

Reimplemented with arma::any. I think we do need those checks; negative elements in the weight vector would lead to unpredictable consequences. The binary search part is not robust to this case, and it is nonsensical anyway, so I like the idea of cutting it out before the lower_bound run.

Member

Sounds fine to me, thanks for the input.
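The sampler that emerged from this thread (non-negativity check, cumulative sums, binary search) can be sketched in dependency-free C++. This uses std::vector and an explicit uniform draw u in place of arma::vec and mlpack's Random(), so the name and signature are illustrative:

```cpp
#include <algorithm>
#include <numeric>
#include <stdexcept>
#include <vector>

// Deterministic core of the weighted sampler: given unnormalized
// non-negative weights and a uniform draw u in [0, 1), return the
// index of the sampled event.
int weightedSample(const std::vector<double>& weights, double u)
{
  // Reject negative weights up front; the binary search below is not
  // robust to them (this mirrors the arma::any check discussed above).
  for (double w : weights)
    if (w < 0)
      throw std::invalid_argument("weights must be non-negative");

  // Build cumulative sums, then normalize so the last slot equals 1.
  std::vector<double> slots(weights.size());
  std::partial_sum(weights.begin(), weights.end(), slots.begin());
  const double total = slots.back();
  if (total <= 0)
    throw std::invalid_argument("weights must sum to a positive number");
  for (double& s : slots)
    s /= total;

  // Binary search for the first slot whose cumulative mass reaches u.
  return static_cast<int>(
      std::lower_bound(slots.begin(), slots.end(), u) - slots.begin());
}
```

With weights {1, 1, 2} the normalized slots are {0.25, 0.5, 1.0}, so u = 0.3 lands in slot 1 and u = 0.9 in slot 2.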


AddTask::AddTask(const size_t bitLen) : bitLen(bitLen)
{
if (bitLen <= 0) {
Member

Can you go through the code and put the { on a new line?

// We have two binary numbers with exactly two digits (10 and 11).
// weights[0] = 2;
// Increasing length by 1 double the number of valid numbers.
/*for (size_t i = 1; i < bitLen - 1; ++i)
Member

Can we remove the code here?

using namespace mlpack::ann::augmented::tasks;
using namespace mlpack::ann::augmented::scorers;

using namespace mlpack::ann;
Member

I think we can remove the namespaces here.

void Predict(arma::field<arma::mat>& predictors,
arma::field<arma::mat>& labels)
{
auto sz = predictors.n_elem;
Member

I think we can directly use predictors.n_elem instead of sz here. In general we should avoid using auto with Armadillo:

Use of C++11 auto is not recommended with Armadillo objects and expressions. Armadillo has a template meta-programming framework which creates lots of short lived temporaries that are not handled by auto.

In this case it would be just fine, but I think using it directly without an alias is also fine, and we can save another line.

predictors.reshape(3, predictors.n_elem / 3);
assert(predictors.n_rows == 3);
int num_A = 0, num_B = 0;
bool num = false; // true iff we have already seen the separating symbol
Member

I guess you mean if; also, can you use the correct punctuation here?

Contributor Author

No, I mean "iff" in the sense of "if and only if", just as it is normally used in math texts.

Member

Ah, I see, I didn't think about that in the current context.

assert(predictors.n_rows == 3);
int num_A = 0, num_B = 0;
bool num = false; // true iff we have already seen the separating symbol
size_t len = predictors.n_cols;
Member

I think there is no need to create an alias here, what do you think?

BOOST_AUTO_TEST_SUITE(RandomTest);

// Test for RandInt() sampler from discrete uniform distribution.
BOOST_AUTO_TEST_CASE(DiscreteUniformRandomTest)
Member

Thanks for writing a test for the method!

if (el < 0) {
std::ostringstream oss;
// Check that constraints on weights parameter are satisfied.
if (arma::min(weights) < 0) {
Member

I know I'm really picky about the style, can you go through the code and adjust the lines with the following pattern: ) {.

Contributor Author

Ran pcregrep -Mnr --color "\) \{\n" src/mlpack/ - is it okay if I fix this pattern elsewhere?

Member

Sure, nice idea!

@zoq
Member

zoq commented Aug 1, 2017

Temporarily fixed the disk space issue; can you go through the issues listed here: http://masterblaster.mlpack.org/job/pull-requests%20mlpack%20style%20checks/656/cppcheckResult/source.all/?before=5&after=5

Member

@zoq zoq left a comment

I think it looks good and we should merge it. We should let this sit for 3 days before merging, to give anyone else time to comment.

Member

@rcurtin rcurtin left a comment

Thanks so much for the hard work with this PR. I agree that it is ready, if you can take care of the comment about RandInt().

* @param weights Probabilities of individual numbers. Note that the function
* accepts unnormalized probabilties as long as they are
* non-negative and sum to a positive number.
* @return A random integer sampled from the specified distribution.
Member

I think that instead of adding this overload of RandInt() we could use the functionality already available with DiscreteDistribution. Instead of

x = RandInt(weights);

we can create a DiscreteDistribution and use it:

DiscreteDistribution d(weights);
x = d.Random();

What do you think?

Contributor Author

(As I mentioned in the IRC chat) I don't see a concise way to use it in the 1-dim setting, although I admit I didn't fully understand the n-dim API.

Member

Ah sorry, the code I gave above was wrong. You could do:

DiscreteDistribution d(1); // One-dimensional discrete distribution.
d.Probabilities(0) = std::move(weights);
x = d.Random();

I think this will do what you need, let me know if not.
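As a side note, the C++ standard library's std::discrete_distribution provides the same one-dimensional weighted sampling outside mlpack; std::discrete_distribution normalizes unnormalized weights internally, just like DiscreteDistribution. A small dependency-free sketch (function name and seed are illustrative, not part of the PR):

```cpp
#include <algorithm>
#include <cstddef>
#include <random>
#include <vector>

// Draw n samples from unnormalized weights and return the index drawn
// most often; with heavily skewed weights this is the heaviest index.
std::size_t mostFrequentSample(const std::vector<double>& weights,
                               int n, unsigned seed)
{
  std::mt19937 gen(seed);
  // discrete_distribution normalizes the weights internally, so
  // {1, 1, 100} behaves like probabilities {1/102, 1/102, 100/102}.
  std::discrete_distribution<std::size_t> d(weights.begin(), weights.end());

  std::vector<int> counts(weights.size(), 0);
  for (int i = 0; i < n; ++i)
    ++counts[d(gen)];

  return static_cast<std::size_t>(
      std::max_element(counts.begin(), counts.end()) - counts.begin());
}
```

Over many draws the empirical frequencies track weights / sum(weights), so the heaviest index dominates the counts.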

size_t totSize = vecInput.n_elem + addSeparator + vecLabel.n_elem;
input(i) = arma::zeros(totSize, 2);
input(i).col(0).rows(0, vecInput.n_elem-1) =
vecInput;
Member

Some small style issues here---the wrapped lines should be indented 4 spaces, not 2, and vecInput.n_elem-1 should be vecInput.n_elem - 1. :)

input(i).rows(origPtr, origPtr + bitLen - 1);
ptr += bitLen;
origPtr += bitLen;
sepInput.at(ptr, 0) = 0.5;
Member

Ok, I see---thanks for the clarification.

Member

@rcurtin rcurtin left a comment

Thanks for the refactoring. I'm glad we can keep the tests too. :) I only have a couple minor comments that you can address if you like (it'll make the code just a bit cleaner), otherwise I agree this is ready for merge.

weights = arma::exp2(arma::linspace(1, bitLen - 1, bitLen - 1));

mlpack::distribution::DiscreteDistribution d(1);
d.Probabilities(0) = std::move(weights);
Member

You could compress this a bit, I think:

d.Probabilities(0) = arma::exp2(arma::linspace(1, bitLen - 1, bitLen - 1));

then you don't need the weights object. :)

weights = arma::exp2(arma::linspace(1, maxLength - 1, maxLength - 1));

mlpack::distribution::DiscreteDistribution d(1);
d.Probabilities(0) = std::move(weights);
Member

Same here, this can be compressed too.

{
arma::vec armaWeights(weightSet);
mlpack::distribution::DiscreteDistribution d(1); // One-dimensional discrete distribution.
d.Probabilities(0) = std::move(armaWeights);
Member

You might be able to compress this a little bit too---d.Probabilities(0) = arma::vec(weightSet).

{
arma::vec weights(maxLength - 1);

mlpack::distribution::DiscreteDistribution d(1);
Member

We have to include #include <mlpack/core/dists/discrete_distribution.hpp> to avoid an undefined method issue. The same for the Add task.

@zoq zoq merged commit 94907ce into mlpack:master Aug 7, 2017
@zoq
Member

zoq commented Aug 7, 2017

Thanks for the great contribution!

@zoq zoq mentioned this pull request Aug 8, 2017
@coveralls

Coverage Status

Coverage decreased (-43.8%) to 38.366% when pulling 8f4af1e on 17minutes:master into 5c68061 on mlpack:master.


5 participants