
Review Request: Muellerklein#20

Closed
FlorianMuellerklein wants to merge 9 commits into ReScience:master from FlorianMuellerklein:muellerklein

Conversation


@FlorianMuellerklein FlorianMuellerklein commented Jul 1, 2016

Muellerklein

Dear @ReScience/editors,

I request a review for the reproduction of the following paper:

  • He, Kaiming, et al. "Identity mappings in deep residual networks." arXiv preprint arXiv:1603.05027 (2016)

and one result from

  • Sergey Zagoruyko, Nikos Komodakis, "Wide Residual Neural Networks." arXiv preprint arXiv:1605.07146 (2016)

I believe the original results have been faithfully reproduced as explained in the accompanying article.

Repository lives at https://github.com/FlorianMuellerklein/ReScience-submission/tree/muellerklein


EDITOR

  • Editor acknowledgment (@emmanuelle 03 July 2016)
  • Reviewer 1 (@ogrisel 20 July 2016)
  • Reviewer 2 (@ozancaglayan 24 August 2016)
  • Review 1 decision [accept/reject]
  • Review 2 decision [accept] 15 March 2017
  • Editor decision [accept/reject]

@rougier rougier changed the title from "Review Request" to "Review Request: Muellerklein" Jul 1, 2016
@FlorianMuellerklein
Author

FlorianMuellerklein commented Jul 1, 2016

AUTHOR

There are some formatting issues that I will resolve if this gets accepted.

@rougier
Member

rougier commented Jul 18, 2016

EDITOR-IN-CHIEF

@emmanuelle Any update on this ?

@ogrisel

ogrisel commented Jul 20, 2016

@emmanuelle asked me to review this submission. I need to reinstall my GTX 980 Ti to be able to run the code, but hopefully I will do it in the coming week. Meanwhile, @FlorianMuellerklein, is there any change you want to include in the submission? What formatting changes do you plan?

@FlorianMuellerklein
Author

@ogrisel I had some issues with the image sizes and some captions in the pdf that I need to fix.

But, after submitting this pull request, someone commented on my github repo that I might have a mistake in the projection shortcut. They said that the shortcut should take the preactivation as its input. I think they may be correct. Should I update my pull request or wait for reviewer input before changing anything? I linked to the issue on my repo below.

FlorianMuellerklein/Identity-Mapping-ResNet-Lasagne#2
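For context, the point under discussion can be sketched with a toy pre-activation residual block in plain numpy (batch norm omitted; `conv` and `proj` are placeholder callables standing in for the conv stack and the 1x1 projection, not the submission's actual layers). The claim in the linked issue is that in the v2 design the projection shortcut consumes the pre-activated tensor, not the raw block input:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def preact_block(x, conv, proj=None):
    """Toy pre-activation residual block (after He et al. 2016, v2).

    `conv` stands in for the BN-ReLU-conv stack and `proj` for the
    1x1 projection shortcut; both are placeholders, not real layers.
    """
    pre = relu(x)  # batch norm omitted in this sketch
    residual = conv(pre)
    # The point raised in the linked issue: when a projection is
    # needed, it takes the *pre-activated* tensor as input.
    shortcut = x if proj is None else proj(pre)
    return residual + shortcut
```

With the identity shortcut (`proj=None`) the raw input `x` passes through unchanged; only the projection path sees the preactivation.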

@rougier
Member

rougier commented Jul 20, 2016

@FlorianMuellerklein You can push your changes; they should be tracked here. As for the formatting issues in the paper, I will handle them during the editing stage if accepted.

@ogrisel Thank you !

@ogrisel

ogrisel commented Jul 21, 2016

Please @FlorianMuellerklein feel free to push the changes. If you launch a series of new runs please report the new numbers in a new column in your tables so that we can track the impact of that change on the final validation error of the models.

@FlorianMuellerklein
Author

@ogrisel I pushed the changes so everything should be ready to go. I won't have time to check the new numbers for a little while (time to get some more GPUs) but I definitely will.

@ogrisel

ogrisel commented Jul 22, 2016

It seems that the submission is missing a file:

Traceback (most recent call last):
  File "train_nn.py", line 92, in <module>
    train_X, test_X, train_y, test_y = load_pickle_data_cv()
  File "/home/ogrisel/code/ReScience-submission/code/utils.py", line 58, in load_pickle_data_cv
    np.save('data/pixel_mean.npy', pixel_mean)
  File "/home/ogrisel/.virtualenvs/py27/local/lib/python2.7/site-packages/numpy/lib/npyio.py", line 477, in save
    fid = open(file, "wb")
IOError: [Errno 2] No such file or directory: 'data/pixel_mean.npy'

@FlorianMuellerklein
Author

@ogrisel That line should be saving the per-pixel mean so it can be used to preprocess the testing data later on. I don't understand why it would give an error.

@ogrisel

ogrisel commented Jul 22, 2016

Ok, I was in the wrong folder.

Edit: Actually no, when I run the script from the top level folder I get this other error:

$ python code/train_nn.py wide 16 4
Using gpu device 0: GeForce GTX 980 Ti (CNMeM is disabled, cuDNN 4007)
Using wide ResNet with depth 16 and width 4.
Traceback (most recent call last):
  File "code/train_nn.py", line 92, in <module>
    train_X, test_X, train_y, test_y = load_pickle_data_cv()
  File "/home/ogrisel/code/ReScience-submission/code/utils.py", line 19, in load_pickle_data_cv
    fo_1 = open('../data/cifar-10-batches-py/data_batch_1', 'rb')
IOError: [Errno 2] No such file or directory: '../data/cifar-10-batches-py/data_batch_1'

I believe the expectations wrt the folder structure have been broken when converting from the original repo to this submission PR.
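One conventional fix for this class of error (a sketch, not the submission's code) is to resolve data paths relative to the script's own location rather than the current working directory, so the command works from the repository root as well as from `code/`. The `DATA_DIR` and `data_path` names here are hypothetical:

```python
import os

# Resolve the data directory relative to this file, so the script
# works no matter which directory it is launched from.
HERE = os.path.dirname(os.path.abspath(__file__))
DATA_DIR = os.path.normpath(os.path.join(HERE, '..', 'data'))

def data_path(*parts):
    """Build an absolute path under the data directory."""
    return os.path.join(DATA_DIR, *parts)
```

For example, `open(data_path('cifar-10-batches-py', 'data_batch_1'), 'rb')` would replace the hard-coded relative path in `utils.py`.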

@ogrisel

ogrisel commented Jul 22, 2016

Could you please extend the README to give the command-line parameters that reproduce each of the scores reported in the table. I guess those are:

  • python code/train_nn.py wide 16 4
  • python code/train_nn.py normal 110
  • python code/train_nn.py normal 164

but it would be great if you could confirm this.

Also for the sake of reproducibility I think you should include the version number of the libraries you used (and the sha1 digest of the git repo for lasagne).

@FlorianMuellerklein
Author

@ogrisel Ok I'll update the README with the command lines and versions as soon as I can.

The depth is actually the multiplier for how many residual blocks to insert between each downsampling layer. So for ResNet-110 and ResNet-164 depth=18 and for the wide residual network depth=2.

python code/train_nn.py wide 2 4
python code/train_nn.py normal 18
python code/train_nn.py bottleneck 18

@ogrisel

ogrisel commented Jul 22, 2016

Thanks!

@ogrisel

ogrisel commented Jul 23, 2016

Ok, I trained the "wide 2 4" configuration and reached 0.957476 final validation accuracy. However, the weights folder did not exist (probably because git does not check out empty folders), so saving the weights failed. You might want to make sure to create the folders if they don't exist before writing files in this script :)

iter: 198 | TL: 0.192 | VL: 0.202 | Vacc: 0.956 | Ratio: 0.95 | Time: 157.3
iter: 199 | TL: 0.192 | VL: 0.2 | Vacc: 0.955 | Ratio: 0.96 | Time: 157.3
Final Acc: 0.957476
Traceback (most recent call last):
  File "train_nn.py", line 133, in <module>
    f = gzip.open('data/weights/%s%d_resnet.pklz'%(variant,depth), 'wb')
  File "/usr/lib/python2.7/gzip.py", line 34, in open
    return GzipFile(filename, mode, compresslevel)
  File "/usr/lib/python2.7/gzip.py", line 94, in __init__
    fileobj = self.myfileobj = __builtin__.open(filename, mode or 'rb')
IOError: [Errno 2] No such file or directory: 'data/weights/wide2_resnet.pklz'
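A portable guard along these lines would avoid the crash (a sketch only; note Python 2.7 has no `os.makedirs(exist_ok=True)`, hence the `EEXIST` check):

```python
import errno
import os

def ensure_dir(path):
    """Create `path` (and any missing parents); a no-op if it exists.

    Works on both Python 2.7 and 3.x, unlike os.makedirs(exist_ok=True),
    which is Python 3 only.
    """
    try:
        os.makedirs(path)
    except OSError as e:
        if e.errno != errno.EEXIST:
            raise

# e.g. call ensure_dir('data/weights') before the gzip.open(...) above
```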

@FlorianMuellerklein
Author

FlorianMuellerklein commented Jul 23, 2016

@ogrisel Oh no, I'm really sorry about that. I'll add that to my next update.

EDIT: Updated to include python package versions, commands to reproduce. train_nn.py now creates the necessary folders if they don't already exist.

@ogrisel

ogrisel commented Jul 24, 2016

I trained again a "wide 2 4" model and I got:

Accuracy on the testing set,  0.9513

So a 4.86% test error. This is significantly better than what you write in your report. Maybe because of d42627a?

I don't have the plot because of the matplotlib backend issue that I fixed in the meantime. You should probably store the train/val loss data in a file to decouple training from convergence monitoring, but this is not a big deal. Here is the text log of that run if you are interested.

Also, it seems that two runs don't yield the same results despite the numpy rng seeding at the top of the script. I suspect that the init is done by the Theano rng, which is probably not seeded.
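In sketch form: seeding numpy at the top of the script only controls draws from numpy's global RNG; Lasagne draws its initial weights from its own RNG. Lasagne exposes a hook for that, shown here only as a comment since it requires a Lasagne install (the seed value 42 is arbitrary):

```python
import numpy as np

# To also make weight initialization deterministic, Lasagne provides:
#
#     import lasagne
#     lasagne.random.set_rng(np.random.RandomState(42))
#
# Seeding numpy alone makes only numpy's own draws reproducible:

np.random.seed(42)
first = np.random.rand(3)

np.random.seed(42)
second = np.random.rand(3)  # reseeding reproduces the same draw
```

Even with both RNGs seeded, GPU runs may still differ slightly because some cuDNN/CUDA reductions are non-deterministic.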

@ogrisel

ogrisel commented Jul 25, 2016

Another missing folder for the script to work by default: plots

Traceback (most recent call last):
  File "train_nn.py", line 150, in <module>
    pyplot.savefig('plots/%s%d_resnet.png'%(variant,depth))
  File "/home/ogrisel/.virtualenvs/py27/local/lib/python2.7/site-packages/matplotlib/pyplot.py", line 695, in savefig
    res = fig.savefig(*args, **kwargs)
  File "/home/ogrisel/.virtualenvs/py27/local/lib/python2.7/site-packages/matplotlib/figure.py", line 1533, in savefig
    self.canvas.print_figure(*args, **kwargs)
  File "/home/ogrisel/.virtualenvs/py27/local/lib/python2.7/site-packages/matplotlib/backend_bases.py", line 2233, in print_figure
    **kwargs)
  File "/home/ogrisel/.virtualenvs/py27/local/lib/python2.7/site-packages/matplotlib/backends/backend_agg.py", line 521, in print_png
    filename_or_obj = open(filename_or_obj, 'wb')
IOError: [Errno 2] No such file or directory: 'plots/normal18_resnet.png'

Edit: I had launched the run prior to your last change in 0134558. This specific issue should be fixed now. Please ignore this comment.

@ogrisel

ogrisel commented Jul 25, 2016

Here is the result of another run (this time ResNet-110):

$ python train_nn.py normal 16  # ResNet-110
[...]
iter: 197 | TL: 0.159 | VL: 0.276 | Vacc: 0.943 | Ratio: 0.58 | Time: 347.6
iter: 198 | TL: 0.158 | VL: 0.28 | Vacc: 0.942 | Ratio: 0.56 | Time: 347.0
iter: 199 | TL: 0.157 | VL: 0.271 | Vacc: 0.943 | Ratio: 0.58 | Time: 346.8
Final Acc: 0.948774
$ python test_model.py normal 16
[...]
Accuracy on the testing set,  0.9385

So the test error for this run of ResNet-110 is 6.15% which is slightly better than the reported 6.37% in the Identity Mappings paper (median of 5 runs).

@FlorianMuellerklein
Author

@ogrisel Wow that is interesting that the wide ResNet did so much better, how close were the two runs with the wide net?

Your ResNet-110 is after my latest commit as well? As soon as I get some spare GPU time I will re-run these and update everything on my end.

@rougier
Member

rougier commented Oct 7, 2016

@ogrisel @ozancaglayan 🔔

@rougier
Member

rougier commented Oct 19, 2016

Dear @emmanuelle @ogrisel @ozancaglayan,

Is there anything we can do to move this review forward?

@rougier
Member

rougier commented Nov 2, 2016

Dear @emmanuelle @ogrisel @ozancaglayan,

Any update on this review ?

@ozancaglayan

ozancaglayan commented Nov 3, 2016

Sorry for the delay, I am really busy these days.

Code

  • There were some issues previously mentioned by me and @ogrisel about the folder mess. It would be nice if the code once cloned works out-of-the-box for maximum reproducibility.
  • Fixing the numpy rng seed can be good for improved reproducibility but this may change your reported results.

Article

  • In the first paragraph, the sentences starting with "main idea", "basic idea" and "the residual blocks are basically" should be reformulated/merged for ease of reading.
  • The first figure can be enhanced with a higher resolution one. The caption of the figure is not positioned correctly.
  • "Whenever the number of filters are increased [comma here] the first convolution layer..."
  • First sentence after the first table should be the caption of the table or it should be a bold title, etc.
  • The next sentence there is problematic. What is parallel to what? By saying "are added after each block", do you mean "are summed"?
  • The next paragraph, you can say "[1] also introduces a bottleneck architecture" instead of mentioning the full article name
  • When you cite the papers inside brackets in the whole article, I think they should be before the final dot of the sentence, e.g. "foo bar [1]." instead of "foo bar. [1]"
  • Preactivation Residual Networks [2] and Wide Residual Networks [3] may be better than "... [2][3]"
  • Figure title for preactivation resnet is not positioned correctly.
  • The tables need captions for clarity.
  • Paragraph is wrongly wrapped with the figure showing basic-wide and wide-dropout blocks.
  • I think there's no need to put a quoting from an article inside quotes since you cite it just after the sentence. An example is from the Methods section: "32x32 images, with the per-pixel mean subtracted" [1]
  • Data Augmentation last sentence suggestion: "Batch size of 64 is used instead of 128 because of hardware constraints."
  • Training and Regularization: The learning rate schedule between the curly braces can be explained textually for clarity.
  • Hardware and software last sentence -> "but with more memory"
  • Results last paragraph: "...in the cross-entropy during training and also much less of a quadratic trend than seen in other neural networks." is not very clear to me.

@rougier rougier self-assigned this Nov 14, 2016
@rougier
Member

rougier commented Nov 14, 2016

@FlorianMuellerklein Can you make the suggested modifications?
@ogrisel @ozancaglayan What is your decision? Does the paper need more review, or can you make a decision about acceptance/rejection?

@ozancaglayan

Should I wait for the modifications to decide?

@rougier
Member

rougier commented Nov 17, 2016

No, but you can condition your decision (e.g. provided the author makes the suggested changes...)

@FlorianMuellerklein
Author

@rougier Sorry, I've been pretty busy lately as well. I'll have these edits soon.

@ozancaglayan

OK, provided the author makes the suggested changes, my decision is to accept the paper.

@rougier
Member

rougier commented Nov 29, 2016

@ogrisel Can you tell us your decision about the paper, or do you need more time?

@rougier
Member

rougier commented Jan 11, 2017

Dear @FlorianMuellerklein @ogrisel @emmanuelle @ozancaglayan,

This submission is now 6 months old and we need to try to move things forward.
Could you have a look at past messages and possibly answer questions? Thanks.

@rougier
Member

rougier commented Mar 15, 2017

@FlorianMuellerklein
Author

I just pushed an update with most of the changes that Ozan requested.

@rougier
Member

rougier commented Mar 15, 2017

@FlorianMuellerklein Thanks for the update!
@ozancaglayan Can you make your decision based on the new update?

@rougier
Member

rougier commented May 3, 2017

@ReScience/reviewers We need some new reviewers here, if anyone is interested!

@ozancaglayan

As I mentioned before, I accept the paper.

@rougier
Member

rougier commented May 3, 2017

@ozancaglayan Sorry, I didn't see it.

@rougier
Member

rougier commented May 29, 2017

🛎 @grisel Can you tell me if you're able to handle this review or if we need to look for another reviewer ? No problem if you can't but I need to know to try to move this review forward.

@rth

rth commented Jun 1, 2017

🛎️ @grisel Can you tell me if you're able to handle this review or if we need to look for another reviewer ? No problem if you can't but I need to know to try to move this review forward.

With an "o" @ogrisel ..

@ogrisel

ogrisel commented Jun 9, 2017

Sorry for the lack of reply. I will try to find some time next week to review the updated paper / code.

@remram44

It seems that this uses an unreleased version of Lasagne (which requires a development version of Theano as well). It would be great if the precise git hashes used to run this could be included in the README. I applaud your including the precise versions of numpy and Theano, but you left out sklearn and matplotlib.

The pip freeze command might help!

@rougier
Member

rougier commented Oct 31, 2017

@FlorianMuellerklein Are you still following this submission ? We might need to assign a new reviewer to complete the process but first I wanted to know if you're still interested.

@FedericoV

If @FlorianMuellerklein is not available, I can jump in, but I'm not sure if it's ideal to have an additional pair of eyes at this point in the process.

@rougier
Member

rougier commented Oct 31, 2017

Thanks @FedericoV. @FlorianMuellerklein is the author of the replication. @ozancaglayan already accepted the paper and @ogrisel has not responded yet. At this point, if you can review it, that would be great.

@rougier rougier removed the Stalled label Oct 31, 2017
@rougier rougier closed this Feb 14, 2018
