
Speeding up simplify_neuron#472

Merged
jefferis merged 11 commits into natverse:master from dokato:opt-simp-neuron
Jul 16, 2021

Conversation

@dokato
Contributor

@dokato dokato commented Jun 30, 2021

Apart from the suggestions @jefferis listed in #471, I noticed (after profiling) that calling apply over the columns takes much longer than just running Dijkstra from the start node again; compare:

> system.time(which.max(apply(dd, 2, robust_max)) )
   user  system elapsed 
  3.784   0.744   4.622 

> system.time(igraph::distances(ng, v=start, to=leaves, mode = 'out'))
   user  system elapsed 
  0.204   0.004   0.209 

This already reduces time by half:

# before this change
> system.time(simplify_neuron(large_neuron[[1]]))
   user  system elapsed 
  5.489   1.780   7.399 
# after
> system.time(simplify_neuron(large_neuron[[1]]))
   user  system elapsed 
  2.730   0.834   3.565 

The neuron used for the comparison is attached below:
large_neuron.rds.zip
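For reference, the shape of that comparison can be reproduced on a small synthetic tree. Everything here is an assumption-laden toy: `robust_max` is a stand-in for the internal helper in simplify_neuron, the generated tree replaces the attached neuron, and the graph is kept small because the dense leaf-distance matrix grows quickly in memory.

```r
library(igraph)

# Toy stand-in for the neuron graph, mirroring the names ng/start/leaves above
ng <- make_tree(5000, children = 2, mode = "out")
start <- 1
leaves <- which(degree(ng, mode = "out") == 0)

# Assumed helper: max over the reachable (finite) distances in a column
robust_max <- function(x) max(x[is.finite(x)])

# Slow pattern: build a dense leaf-distance matrix, then apply per column
dd <- distances(ng, v = leaves, mode = "in")
system.time(which.max(apply(dd, 2, robust_max)))

# Fast pattern: one extra Dijkstra from the root instead
system.time(which.max(distances(ng, v = start, to = leaves, mode = "out")))
```

On a real neuron of this size the apply pattern also pays for materialising the full leaves-by-nodes matrix, which is where most of the system time in the profile above goes.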

@dokato dokato changed the title [upt] speeding up simplify_neuron part 1 [WIP] speeding up simplify_neuron Jun 30, 2021
jefferis added 6 commits July 1, 2021 15:04
* may want to think about making pruned_edges return what it receives
* but actually it turns out that it does not do the same as simplify_neuron because
  the first computed path does not use the root as an origin
* seems to match simplify_neuron now
* still slower
* this does mean that some distances will be double computed but
  perhaps that is for the best when the memory requirements get
  so large.
* I think we could get some further efficiencies by dropping leaves.
* when finding a simple path between two points across a neuron it is much faster *not* to use weights
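The last commit message above notes that unweighted path finding is much faster. A hedged sketch of why, on a synthetic tree standing in for a neuron graph: the simple path between two points in a tree is unique, so edge weights cannot change which path is found, and passing `weights = NA` lets igraph fall back from Dijkstra to plain BFS.

```r
library(igraph)

g <- make_tree(100000, children = 2, mode = "undirected")
E(g)$weight <- runif(ecount(g))   # attach arbitrary weights
from <- 1
to <- vcount(g)

# With a weight attribute present, shortest_paths runs Dijkstra
system.time(shortest_paths(g, from, to))

# weights = NA tells igraph to ignore the weights and use BFS;
# on a tree the resulting path is identical, just found faster
system.time(shortest_paths(g, from, to, weights = NA))
```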
@dokato
Contributor Author

dokato commented Jul 2, 2021

I benchmarked your latest commits:

# large neuron
benchmark("simplify_neuron_old" = {
  simplify_neuron_old(nrn[[1]], n=2)
},"simplify_neuron" = {
  simplify_neuron(nrn[[1]], n=2)
},"simplify_neuron2" = {
  simplify_neuron2(nrn[[1]], n=2)
}, replications = 25)

                 test replications elapsed relative user.self sys.self
2     simplify_neuron           25  52.759    1.000    49.262    3.017
1 simplify_neuron_old           25 202.476    3.838   124.206   63.886
3    simplify_neuron2           25 113.827    2.157   112.653    0.965

# small neuron
benchmark("simplify_neuron_old" = {
  simplify_neuron_old(Cell07PNs[[11]], n=2)
},"simplify_neuron" = {
  simplify_neuron(Cell07PNs[[11]], n=2)
},"simplify_neuron2" = {
  simplify_neuron2(Cell07PNs[[11]], n=2)
}, replications = 25)

                 test replications elapsed relative user.self sys.self
2     simplify_neuron           25   0.209    1.035     0.202    0.007
1 simplify_neuron_old           25   0.202    1.000     0.184    0.018
3    simplify_neuron2           25   0.232    1.149     0.228    0.004

Looks like we get an almost 4x gain over simplify_neuron_old on the large neuron, and simplify_neuron is now about twice as fast as simplify_neuron2.

Some outstanding comments:

  • # FIXME check if dist was 0 => no valid path - should we stop and throw error here?
  • can simplify_neuron2 be removed?
  • why does CI get stuck?

@jefferis
Collaborator

jefferis commented Jul 3, 2021

Thanks @dokato. I added another commit, which changes things quite a bit more (still for the better I hope in most cases).

@jefferis
Collaborator

jefferis commented Jul 3, 2021

  • # FIXME check if dist was 0 => no valid path - should we stop and throw error here? Redundant as removed by f1a686e
  • can simplify_neuron2 be removed? Yes I think so.
  • why does CI get stuck? Probably the travis.org deactivation. I have some complex GitHub Actions setups working, e.g. for fafbseg, but I never finished the nat PR.

@dokato
Contributor Author

dokato commented Jul 4, 2021

Great, that looks really good - almost 8x gain on complex cases. See updated benchmarks below:

# large neuron
                 test replications elapsed relative user.self sys.self
2     simplify_neuron           25  37.768    1.000    31.178    2.541
1 simplify_neuron_old           25 296.742    7.857   131.747   86.796
3    simplify_neuron2           25 126.646    3.353   115.958    3.775

# small neuron
                 test replications elapsed relative user.self sys.self
2     simplify_neuron           25   0.252    1.000     0.228    0.011
1 simplify_neuron_old           25   0.275    1.091     0.217    0.028
3    simplify_neuron2           25   0.319    1.266     0.297    0.014

@dokato dokato changed the title [WIP] speeding up simplify_neuron Speeding up simplify_neuron Jul 9, 2021
@codecov

codecov bot commented Jul 11, 2021

Codecov Report

Merging #472 (234a8ba) into master (0fe1b29) will decrease coverage by 0.04%.
The diff coverage is 84.61%.


@@            Coverage Diff             @@
##           master     #472      +/-   ##
==========================================
- Coverage   76.89%   76.85%   -0.04%     
==========================================
  Files          47       47              
  Lines        5856     5825      -31     
==========================================
- Hits         4503     4477      -26     
+ Misses       1353     1348       -5     
Impacted Files Coverage Δ
R/neuron.R 83.49% <83.33%> (-0.06%) ⬇️
R/ngraph.R 86.63% <100.00%> (+0.07%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@jefferis
Collaborator

@dokato I'm going to merge this now. Thanks for getting it going. The codecov/project diff failure is a false positive (the missing lines are warnings that are never triggered because they shouldn't be!). Is there a way to adjust the config for codecov to be a bit more forgiving?

@jefferis jefferis merged commit 8d8b28c into natverse:master Jul 16, 2021
@dokato
Contributor Author

dokato commented Jul 16, 2021

Not sure, but I'll take a look into that. I'm still confused about why coveralls hasn't worked for us.

jefferis added a commit that referenced this pull request Jul 17, 2021
* was initially part of #472 but later superseded
@dokato dokato deleted the opt-simp-neuron branch August 25, 2021 15:19