# Review Request: Hathway and Goodman #51
## Conversation
Thanks for your submission, I'll assign an editor soon.
In fact, I can edit it myself unless @otizonaizit, @benoit-girard or @oliviaguest are willing to do it.
I could edit it, especially if the final publication steps are automated...
@benoit-girard They are not (yet). We need to rewrite both the author and editor instructions, get them approved by everybody, and then things should be much simpler.
Talking of author instructions, we found it rather difficult to get the markdown->pdf compilation working. We wrote a Dockerfile and a couple of scripts that automate the process without having to install anything locally (other than docker). We'd be very happy if you wanted to make that available to other authors. Check out the readme, the Dockerfile and the accompanying scripts.
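For readers who want the general idea, here is a minimal sketch of this kind of containerised build driven from Python. The image name `rescience-pandoc` and the file names are placeholders, not the repository's actual scripts, and the real pandoc invocation likely carries more flags:

```python
# Hypothetical driver for a Docker-based markdown -> pdf build.
# Image and file names are placeholders, not the repository's own.
import subprocess
from pathlib import Path

def build_pdf(source="article.md", image="rescience-pandoc"):
    """Compile a markdown article to PDF inside a Docker container."""
    out = Path(source).with_suffix(".pdf").name
    subprocess.run(
        ["docker", "run", "--rm",
         "-v", f"{Path.cwd()}:/data",  # mount the working copy
         "-w", "/data",                # run pandoc inside the mount
         image,
         "pandoc", source, "-o", out],
        check=True,  # raise if the container or pandoc fails
    )

if __name__ == "__main__":
    build_pdf()
```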
@thesamovar That's great. We're also in the process of changing the submission process, because you're not the only one to have had problems with md/pandoc. It will be LaTeX only in a few days, hopefully. Authors should be able to use Overleaf if needed. I'll edit your submission, sorry for the delay.
Thanks!
I've installed the code, just two issues to report for now: on our cluster, matplotlib raises an error because no display is available.
A few issues with the array job submission came up. In the meantime I'll go ahead with a workaround.
Should we reply and make changes as comments come in or wait until reviews are finished? |
I won't start my review before next week, so I think it would be better if you fixed the basic runtime problems already, so that I do not run into them too.
…ter.sh, added relative paths in code/cluster/main_cluster.py and code/cluster/analysis_main_cluster.py
Thank you for your feedback. I fixed the tab error, and updated the section "Running on a cluster" in the README.
Thank you for the modifications and explanations!

Indeed, queuing systems vary and it's unfeasible to provide a script for all of them. Regarding paths, our cluster has a similar behaviour; one workaround is to change into the submission directory at the start of the job script.

I've had further trouble with the compilation of the network by Brian failing on cluster nodes. This seems to be due to concurrent access by several processes. I've tried setting a separate output directory for each process.
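One way to arrange separate directories, assuming Brian 2's `cpp_standalone` device and a Torque array job (`PBS_ARRAYID` is Torque's array index variable), is to key the standalone build directory on the task index so that concurrent tasks never compile into the same folder. This is an illustrative sketch, not the submission's code:

```python
# Give each array-job task its own Brian standalone build directory.
import os
from brian2 import set_device

task_id = os.environ.get("PBS_ARRAYID", "0")  # Torque array job index
set_device("cpp_standalone", directory=f"standalone_{task_id}")
# ... define and run the network as usual; each task now compiles
# into its own folder and cannot race against the other tasks ...
```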
### General impressions

First of all, I want to congratulate the authors for this replication effort: this is clearly not an easy one to tackle. It's impressive that you managed to port all the relevant details from an event-based Matlab codebase to a timestep-driven Brian implementation. This is nice work and, despite some issues which I will detail below, I am inclined to believe that the original article has been replicated. I also really appreciate that you went beyond just replicating the article and had a look at the importance of some implementation details. The point about the self-stabilising properties of the RNN rule is enlightening and adds an important nuance to the original work: not just any Hebbian flavour of STDP will do!

### Running the replication

I performed all the experiments on our university cluster for convenience, even the serial ones. The cluster runs Torque on Scientific Linux 7.5 (Nitrogen). Installation was straightforward, and the instructions based on a conda environment let me run the code. I sometimes had to clean the Brian output directories between runs. Running main.py --new True took a long time on the older cluster nodes.

The parameter exploration code was tricky to get to run. There were frequent but non-deterministic compilation or run failures on cluster nodes: it seems that parallel jobs can't share the same Brian output directory. Do you know why that wasn't an issue on your cluster? Could multiple runs potentially use the same standalone code, or does it have to be recompiled for each parameter combination? My first solution was to set the output directories by hand. In the end I found a combination of settings that worked for me.
With those settings the array job took 3 to 4 hours to complete. Here is my submission script for the record (it requires a few changes to run elsewhere).

### Code

The replication is implemented in Python with few dependencies: it uses Brian for the network simulation and Numpy for creating the input patterns. The authors used Numba to speed up some functions and Brian's standalone mode to compile the network. One advantage of Brian, for a replication, is that all model equations are clearly visible instead of being hidden in a library of standard models. I found the code generally readable, though it is complex in some parts, especially the manipulation of spike trains in create_input.py.

### Random seeding issue

The results are not deterministic: some figures are different after every run. My guess is that it's because the code seeds the simulation but not the data generation. Thus the data generated in Python (i.e. the input spike trains) does not use a fixed seed at the moment. This could be an issue for someone trying to validate their own replication, and also because some of your points involve picking a particular run that displays certain characteristics, and that selection cannot be reproduced reliably at the moment.

### Figures

Here I will be commenting on the replication of the original figures as well as on the reproduction of the figures of the replication. ;-) Overall, the generated figures are clear, well labeled and aesthetically pleasing.

Figure 1 is well replicated.

Figure 2 looks fine except it is missing some annotations: are these added by hand? Not a big deal though. In the caption, the dashes around "in this example" look a bit short (consider longer dashes).

For figures 3A and 4A, I understand that you were able to re-run the original matlab code with your own inputs and pass the results to your plotting routines, hence the exact same spike patterns in 4A and 4B. Correct me if I'm wrong.

Figure 3B looks fine, but shows wide variations between runs (seeding issue). Rarely, it entirely fails to find the pattern (expected).

Figure 4 looks fine apart from the seeding issues. Minor issue: some of the white circles are almost entirely obscured by dark circles, so the apparent density may not reflect the data; maybe some transparency would give better results. I found the caption quite obscure at first: the plot looks like a spike raster, but it says it's a plot of the weights. Only after reading the original paper did I understand that this was a plot of input spikes, coloured by the corresponding weight. The caption could be amended to make this clearer.

Figure 5 can be created in two ways. The restricted version created by figure_5_new() looks fine. The full version created by the cluster has some issues. The legend is missing the 'orig paper' label. The plots also show only one of the two conditions: dt=1e-4 or 1e-6. But while the legend says 1e-4, the plain black lines seem to correspond better to the dotted lines in the replication paper, which labels them 1e-6. Looking into the logs from the cluster run, I find dt=1e-4 there.

Figure 6 is missing its markers, and the variance seems to be half as large as reported for the 1e-4 condition. But the figure still validates the point about the effect of the timestep, with low variance and the expected mean at dt=1e-6.

I'm unable to reproduce figure 7 reliably. For figures 7C and 7D, this seems to be due to the seeding issue. Most of the runs are a lot worse than the one shown in the article, which sort of proves your point, actually! 7B sometimes looks like the one you show, and sometimes RNN converges to a value > 0.5 instead.
7A looks very different: the curves are always straight lines. There must be another problem here.

Figure 8 is well reproduced.

### Model description

The authors chose not to repeat the model equations that didn't differ from Masquelier et al. (2008), for instance those for the STDP rule. I am personally fine with this.

### Case of Δt=0

The replication introduces a variant of the STDP rule for the case where pre- and postsynaptic spikes fall into the same time step.
The variant seems to be implemented in main_cluster.py; therefore it will be used to generate figure 5 when run on the cluster. But the text isn't very clear about the scope of this variant, as it first mentions "the rest of the paper" and only then figure 5 in passing. Maybe you could be more explicit about using the variant to generate figure 5 specifically? Or should the variant have been commented out for generating figure 5 too? Regarding the variant code, I'm not sure why you take the mean of LTP and LTD in this case, rather than just the sum?

### Other results from the original paper

While I am reasonably confident that the replication does reproduce the original work, adding a replication of fig. 1 (original) would confirm the equivalence of the input spike trains. Another figure which could be nice to replicate is fig. 4 (original), showing the output firing patterns over time. Though in this case the equivalence is implied to some degree by your fig. 3 (fig. 5 original), which shows the development of selective responses.

### Other remarks

The following remarks are things that popped into my mind while reading the work, not comments to address for the review!

I'm curious about the fact that fig. 4 in the original paper shows the membrane potential relaxing to baseline over the exact duration of the pattern. I guess this is because the later spikes moved onto the LTD side of the output spike earlier in the run, and were therefore potentiated less?

I also wonder about the importance of the after-spike hyperpolarisation for learning. Is it critical for stabilising the learning, does it have to match the duration of the pattern or the time constants of the EPSPs, and would the model still work with a shorter or longer period?
Thanks @damiendr for the excellent and detailed review. We'll reply in detail later, but for now just wanted to make a quick comment about Brian standalone mode. Your problems with the generated code directories of Brian are unfortunately unavoidable with the current release version of Brian, but I've created an issue to sort this out for the next release (brian-team/brian2#973). In the meantime we could sort it out by explicitly creating a temporary directory (https://docs.python.org/2/library/tempfile.html#tempfile.mkdtemp) and then deleting the directory after we've got the data back (more or less what you did). In the release version of Brian the code needs to be recompiled for each parameter change (something I'm also working on fixing in Brian at the moment).
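The temporary-directory workaround could look roughly like this; a sketch assuming Brian 2 with a stand-in model, using the `tempfile.mkdtemp` and `set_device(..., directory=...)` pieces mentioned above:

```python
# Build the standalone project in a fresh temporary directory and
# delete it once the results have been copied out.
import shutil
import tempfile
from brian2 import *

build_dir = tempfile.mkdtemp(prefix="brian_standalone_")
set_device("cpp_standalone", directory=build_dir)

# Stand-in model; the real simulation code would go here.
G = NeuronGroup(1, "dv/dt = -v / (10*ms) : 1", method="exact")
mon = StateMonitor(G, "v", record=True)
run(100*ms)

voltages = mon.v[:]                           # copy the data we need...
shutil.rmtree(build_dir, ignore_errors=True)  # ...then clean up the build
```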
Thank you for taking the time and effort to get the parameter exploration to run and for providing your submission script, @damiendr! In response to the small issues you reported, I made a few changes to the figures. All other comments will be addressed after the second review.

### Running the replication

Dan already commented on the Brian standalone mode and how to manage the directories until the next release version of Brian. Our cluster runs gcc version 5.4.0 by default; it's good to know that older versions might create an issue. I added a line to the .sh file to load gcc 5.4.0. I only tried main.py --new True on my work computer and laptop and not on the cluster, so I wasn't aware that it took so long on older nodes. I will make a note of that in the README.

### Figures

Figure 5: I fixed the cosmetic problems in figure_5_new() and the labeling in main_cluster.py. The reason the 1e-6 condition is missing in both fig_5_new() and the cluster version is that a time step of 0.001 ms (1e-6 s) instead of 0.1 ms (1e-4 s) takes about 100 times longer to complete. Since the difference between the results for the two time steps is not that large, we thought it might be sufficient to run the larger time step (0.1 ms). It is of course possible to run everything at 0.001 ms as well; the time step can be changed in line 28 of run_simulation.py and line 33 of main_cluster.py.

Figure 6: I fixed the missing markers.

Figure 7A: The plotting resolution was wrong; this has now been corrected. The plotting resolution for these two figures is slightly different, but the general increases/decreases should be clearly visible.

Figure 7B: I cannot reproduce the increase of weights for RNN that you show. Did this occur often?

### Case of Δt=0

You are right, the normal STDP rule was commented out and the modified rule was enabled instead; I have changed this now. This might also be the reason why the performance was a bit better at pattern frequency 0.5 and looks more like the dotted line in the article figure (dt=1e-6 s). While this was a mistake, it shows that this modification only has an impact in very specific circumstances.
## Response to Reviewer 1

Below we comment on the review by @damiendr and the changes made to the article and the code in response.

### Running the replication
The README now states that gcc version 5.4.0 or higher is required.
A note was made in the README file that runs might take longer on older cluster nodes.
Large array jobs on a cluster can be difficult to get running. Issues are difficult to foresee and to debug, so for the moment the code remains as it is, but thank you very much for your solutions and notes - they will definitely be helpful for someone else with a similar cluster!

### Code
Comments were added to the functions in create_input.py. Most of them were adapted from the Matlab code of the original article.

### Random seeding issue
Thank you for pointing this out, the random seed issue is now corrected: each run with a certain seed is now deterministic. Figure 7CD now shows a similar behaviour to the one in the article, with seed 28 as an example.
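For readers checking determinism themselves, such a fix presumably amounts to seeding both random number generators from one value. A minimal sketch, assuming NumPy is used for the input patterns and Brian 2's own `seed()` for the simulation:

```python
# Make a run reproducible by fixing both RNGs from a single seed.
import numpy as np
from brian2 import seed

run_seed = 28             # the example seed mentioned above
np.random.seed(run_seed)  # input spike trains generated in Python
seed(run_seed)            # Brian's random numbers (incl. standalone mode)
```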
### Figures

Yes, the annotations are done by hand.
This has been corrected.
We used the input spikes created by the Matlab code and ran them both in the original code and in ours, therefore the spike patterns are the same.
Indeed, with every seed, Figure 3B will look slightly different.
We acknowledge that it is not immediately apparent that this figure represents a raster plot due to the high density of spikes and the overlaps. Since we were replicating the figure in the original paper (Figure 6a), we did not want to change the format of the plot. A colorbar was added to make the colouring of the dots clearer.
The caption was changed to make it clearer that it shows a raster plot of spikes.
The cosmetic problems in figure_5_new() and the labeling in main_cluster.py were corrected. The reason the 1e-6 condition is missing in both fig_5_new() and the cluster version is that a time step of 0.001 ms (1e-6 s) instead of 0.1 ms (1e-4 s) takes about 100 times longer to complete. Since the difference between the results for the two time steps is not that large, we thought it might be sufficient to run the larger time step (0.1 ms).
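If it helps other readers, the time step of a Brian 2 script is normally controlled through `defaultclock.dt`; an illustrative snippet (the exact variable names in run_simulation.py and main_cluster.py may differ):

```python
# Switching between the two simulation time steps discussed above.
from brian2 import defaultclock, ms

defaultclock.dt = 0.1 * ms      # dt = 1e-4 s (the condition that was run)
# defaultclock.dt = 0.001 * ms  # dt = 1e-6 s, roughly 100x slower to run
```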
Markers were added. The variance might be lower due to the random seeds not being fixed (runs were not the same), but the point is still the same.
This has been addressed: Figure 7CD now shows a behaviour similar to the one shown in the article (seed=28 in this example, which is the default seed for Figure 7CD).
If this behaviour only occurred once, it might have been one of the rare failed runs; for example, seed=0 now fails and shows this behaviour.
The plotting resolution was wrong; this has now been corrected. The plotting resolution for these two figures is slightly different from the article figure, but the general increases/decreases should be clearly visible.

### Model description
We will leave it as it is unless Reviewer 2 has a different opinion.

### Case of Δt=0
This was a mistake; it should have run the standard STDP equations, and this was changed in the main_cluster.py script. Using the modified STDP rule might be the reason why the performance was a bit better at pattern frequency 0.5 and therefore looks more like the dotted line in the article figure (which corresponds to dt=1e-6 s). Apart from that condition, the dotted and the solid lines are very similar. The modification only comes into play at an output neuron spike, when it falls into the same time window as some input neuron spikes. In the pattern frequency = 0.5 condition there are more instances of the pattern and therefore (if successful) more output neuron spikes. It is therefore interesting that modifying the STDP rule for the case of Δt=0 increases the performance. While for Figure 5 the standard STDP rule should have been run (and was run for the article figures), it shows that this modification only has an impact in very specific circumstances.
The article text was changed to explain the modified learning rule better. In essence, the modification accounts for the fact that input neuron and output neuron spikes can fall into the same time window in a time-based simulation, which would not happen in an event-based simulation. If there is one input neuron spike and one output neuron spike in the same time window, there is a 50% chance of the input neuron spike being before the output neuron spike and vice versa. The modification reflects this and adds the mean of LTP and LTD in those cases.
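As a toy illustration of that argument (not the authors' code; placeholder magnitudes and the exponential decay of the STDP window are omitted for brevity), the Δt=0 branch replaces a one-sided update by the mean of the two:

```python
# Placeholder magnitudes only, not the parameters of Masquelier et al.
A_LTP = 0.01     # weight change when pre precedes post
A_LTD = -0.0085  # weight change when post precedes pre

def weight_update(delta_t):
    """Weight change for delta_t = t_post - t_pre (decay terms omitted)."""
    if delta_t == 0:
        # Within one time step either ordering is equally likely,
        # so apply the average of potentiation and depression.
        return 0.5 * (A_LTP + A_LTD)
    return A_LTP if delta_t > 0 else A_LTD
```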
### Other results from the original paper

main.py and run_simulation.py now also create Figure 9 (original Figure 4) and Figure 10 (original Figure 1) as supplemental figures (they have not been added to the article). Note that Figure 9 only looks like the original figure if the pattern finding was successful, and that the two spikes per pattern in original Figure 4b only rarely occur.

### Other remarks
Yes, this is correct: the spikes just after the output spike are mostly depressed, therefore the voltage is slow to reach levels close to threshold again during the pattern.
Yes, there seems to be a minimum time between spikes necessary for the model to work. Increasing the negative afterpotential to create longer periods between spikes works just as well, despite the lower output neuron firing rates that result from the large afterpotential. Decreasing the size of the afterpotential leads to a faster increase in voltage after a spike and therefore higher firing rates, which at some point leads to a failure to find the pattern (very high firing rates lead to an overall depression of all weights and a failure to become specific to the pattern). We did not attempt to quantify how small the afterpotential can be, but it would be interesting to see whether there is a relationship with parameters such as the time constants of the EPSPs.
Sorry for the delayed review. The first review was already quite extensive and I agree with all the points (which are already solved anyway), so I will just make a couple of additional remarks.

The submission is of extremely high quality, be it the correctness/exhaustiveness of the reimplementation, the description of the reimplementation, or the added value of the additional experiments performed on the original model. What I particularly appreciate is the understanding of the conditions and mechanisms that allow the Masquelier model to work or not. I tried last year to reproduce that same model using another neurosimulator (ANNarchy) and failed miserably (it detected patterns, but there were a lot of false positives, i.e. spikes between two patterns; there was not enough depression). Now I understand why: I took dt=0.5 ms, misunderstood what they meant by the nearest spike approximation (I implemented what you call NN, not RNN, so it ended up with too many potentiations) and did not care about simultaneous pre and post spikes. It had no chance to work... Thanks to your work I will try again.

### Running the replication

I ran the preloaded simulations on an arch system without anaconda. I ran the complete simulation on a Mint desktop using anaconda and the suggested package versions.

### Code

The code is clear and well structured. It could have used a few more comments, but for anyone already familiar with Brian it should not be a problem to understand what it is doing.

### Article

The paper is very clear and well written. The results from Masquelier et al. 2008 were successfully reproduced (within a very reasonable margin due to random initializations and a fundamentally different implementation, SRM vs. LIF). I am fine with not repeating the equations from Masquelier which have not changed, but I find Eqs 1, 2, 3 confusing. One should also specify the numerical method used to solve the ODEs: Euler, exact, exponential? It surely plays a role regarding the right values for dt. It would be nice to add some comments on how to go from a SRM model to a discretized ODE-based model. Perhaps recap the idea of SRM, and note that the kernels they use are just alpha functions which can be very well modeled by two coupled ODEs. In SRM, they cut the kernels at +/- 7*tau. Is something similar necessary with ODEs, or are the variables already so close to zero that it does not matter?

I also had remarks on the effect of the learning rule at Δt=0, and noted a few typos (against version 5de5d9a).
The new figures 9 and 10 are nice. Do you plan to ship a supplementary file too, or do they only appear when running the code?
@damiendr @vitay Thanks for your very detailed reviews. I think @damiendr's comments have already been addressed. @pamelahathway @thesamovar Can you address the remaining comments and questions? After this, @damiendr and @vitay, could you tell me if you accept the submission for publication?
## Response to Reviewer 2

Thank you for your insightful comments, @vitay. Here are our answers to the second review and the code and article with the incorporated changes.
These were also errors we made when we started; the original paper is not very clear on the learning rule, does not use discrete time steps, and therefore does not comment on this.
Since we are not completely sure that future versions will work, we will leave the suggested package versions as they are.
It would work on a single machine, but would take days if not weeks, so it is definitely best run in parallel on a cluster.
More comments were added to the Brian code.
It is true that we missed adding the weights to the equations. In Brian, x is a variable of the postsynaptic neuron (see the equations above).
Since the ODEs in this case have an exact solution, we used the exact integration method (in the code it is called exact). We ran 10 repetitions for each condition.
This was added to the text: "Brian on the other hand uses differential equations to model the system parameters and evaluates those equations for each time step. The SRM kernels can be modelled using alpha functions [@Brette2007], so we converted the postsynaptic potential and the spike afterpotential into the following differential equations:" There is a detailed explanation of how to go from SRM to ODEs in the Brian documentation: http://brian2.readthedocs.io/en/stable/user/converting_from_integrated_form.html
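The linked conversion, in sketch form (a generic alpha-function PSP with an arbitrary time constant, not the paper's exact model): each input spike bumps an auxiliary variable `x`, and `v` then traces an alpha-shaped response.

```python
# Alpha-function kernel as two coupled ODEs, after the Brian docs above.
from brian2 import *

tau = 10*ms
eqs = """
dv/dt = (x - v) / tau : 1   # traces an alpha function after each jump in x
dx/dt = -x / tau      : 1   # auxiliary variable kicked by input spikes
"""
G = NeuronGroup(1, eqs, method="exact")
inp = SpikeGeneratorGroup(1, [0], [5]*ms)  # one input spike at 5 ms
S = Synapses(inp, G, on_pre="x += 1")      # each spike injects one alpha PSP
S.connect()
mon = StateMonitor(G, "v", record=0)
run(50*ms)
```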
Using ODEs eliminates the necessity for such cut-offs as in the original paper. In any case, the variables are already practically zero at +/- 7*tau.
We removed that half sentence since most modelling papers probably do ignore this effect.
This was corrected, thank you.
We think this is correct, so we will leave out the 'the'.
The two figures do not add much more information about the equivalence of the two models, so they will only appear when running the code.
Thanks for your detailed reply. I am satisfied with the changes and recommend the acceptance of the submission. |
Thank you for your replies. I have been able to re-run the updated code on our cluster and I can confirm that the results for figure 5 now match. The fact that the modified rule caused the original discrepancy is interesting; I did not expect the Δt=0 case to make so much difference in practice. I learned something here about these corner cases of neural simulations, which is good! The modifications for the other figures and the text also address my remarks, and my recommendation is therefore to accept the article.
Congratulations @thesamovar, the article is accepted for publication. It will take a few days until it appears online. Don't hesitate to ping me if it is not the case.
The article has been published and will soon appear at https://rescience.github.io/read/. The repository has been archived at https://github.com/ReScience-Archives/Hathway-Goodman-2018.









## AUTHOR
Dear @ReScience/editors,
I request a review for the following replication:
Original article
Title: Spike Timing Dependent Plasticity Finds the Start of Repeating Patterns in Continuous Spike Trains
Author(s): Masquelier T, Guyonneau R, Thorpe SJ
Journal (or Conference): PLoS ONE
Year: 2008
DOI: 10.1371/journal.pone.0001377
PDF: https://doi.org/10.1371/journal.pone.0001377
Replication
Author(s): Hathway P, Goodman D
Repository: https://github.com/pamelahathway/ReScience-submission
PDF: https://github.com/pamelahathway/ReScience-submission/article
Keywords: STDP, spatio-temporal spike pattern
Language: Python
Domain: Computational Neuroscience
Results
Potential reviewers
Christoph Metzner
Pierre Yger
## EDITOR
- 1 June, 2018
- 3 June, 2018
- 9 June, 2018
- 27 June, 2018
- 1 July, 2018
- 6 July, 2018