[MRG+1] Improve benchmark on NMF by TomDLT · Pull Request #5779 · scikit-learn/scikit-learn

TomDLT · 2015-11-10T10:29:44Z

The previous benchmark used simulated data, did not use the new coordinate descent solver (#4852), and produced a plot that I found very uninformative.

This new benchmark tests NMF on two datasets:

20 Newsgroups dataset: sparse, shape (11314, 39116)
Olivetti faces dataset: dense, shape (400, 4096)

It uses three different solvers:

Projected Gradient (deprecated)
Coordinate Descent
Multiplicative Update (#5295, "[MRG+1] Add multiplicative-update solver in NMF, with all beta-divergence", not merged yet)

and three different initialization schemes:

random
NNDSVD
NNDSVDAR

The total running time is about 2 minutes for the 20 Newsgroups dataset, and 1 minute for the Olivetti faces dataset.
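For readers who want a feel for the grid being compared, here is a minimal sketch, not the benchmark script itself. It assumes a recent scikit-learn in which `NMF` accepts `solver="cd"` or `solver="mu"`, and it uses a small random non-negative matrix as a stand-in for the real datasets. Projected gradient is left out because it is no longer available in `NMF`. The sizes, `n_components` and `max_iter` are arbitrary choices for illustration.

```python
import time
from itertools import product

import numpy as np
from sklearn.decomposition import NMF

# Assumption: a small random non-negative matrix replaces the real datasets.
rng = np.random.RandomState(0)
X = np.abs(rng.randn(60, 40))

solvers = ["cd", "mu"]  # coordinate descent, multiplicative update
inits = ["random", "nndsvd", "nndsvdar"]

results = {}
for solver, init in product(solvers, inits):
    model = NMF(n_components=5, solver=solver, init=init,
                max_iter=200, random_state=0)
    start = time.perf_counter()
    model.fit(X)
    elapsed = time.perf_counter() - start
    # Store final reconstruction error, iterations used, and wall time.
    results[(solver, init)] = (model.reconstruction_err_, model.n_iter_, elapsed)

for (solver, init), (err, n_iter, elapsed) in sorted(results.items()):
    print(f"{solver:>2} {init:<9} err={err:.4f} iters={n_iter} time={elapsed:.3f}s")
```

Some combinations may emit warnings (for example about convergence); the sketch does not silence them.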

On the plots, each point corresponds to one more iteration.
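To illustrate what "one point per iteration" looks like, the sketch below runs a hand-written multiplicative-update NMF (Frobenius loss) on a small random matrix and records the reconstruction error after every iteration. This is a simplified stand-in, not the code used in the benchmark; the function name `mu_nmf_errors`, the data, and the parameters are all hypothetical.

```python
import numpy as np


def mu_nmf_errors(X, n_components, n_iter, seed=0, eps=1e-10):
    """Return the Frobenius error before the first update and after each iteration."""
    rng = np.random.RandomState(seed)
    n_samples, n_features = X.shape
    W = rng.rand(n_samples, n_components)
    H = rng.rand(n_components, n_features)
    errors = [np.linalg.norm(X - W @ H)]
    for _ in range(n_iter):
        # Multiplicative updates keep W and H non-negative.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ (H @ H.T) + eps)
        errors.append(np.linalg.norm(X - W @ H))
    return errors


rng = np.random.RandomState(42)
X = rng.rand(30, 20)
errors = mu_nmf_errors(X, n_components=4, n_iter=25)
for i, err in enumerate(errors):
    print(i, round(float(err), 4))
```

Plotting `errors` against cumulative time instead of the iteration index would give a curve of the same kind as the ones described above.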


raghavrv · 2015-11-20T11:41:09Z

benchmarks/bench_plot_nmf.py

m --> mem maybe?

ogrisel

This needs a rebase; see also my comments. Other than that, +1 for merge.

ogrisel · 2016-09-19T12:27:44Z

benchmarks/bench_plot_nmf.py

-            report(norm(X - np.dot(W, H)), tend)
-
-    return timeset, err
+@ignore_warnings


Please add a comment explaining which kind of warnings you expect to ignore here.

It was a pandas indexing warning. I fixed it and removed the ignore_warnings decorator.
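When a warning cannot be fixed at the source, one option is to silence only the category in question rather than everything. The sketch below uses only the standard `warnings` module; `noisy_step` and the warning messages are made up for the example and do not come from the benchmark.

```python
import warnings


def noisy_step():
    # Hypothetical function that emits two different kinds of warnings.
    warnings.warn("deprecated indexing", FutureWarning)
    warnings.warn("did not converge", UserWarning)
    return 42


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Ignore only FutureWarning; any other category is still recorded.
    warnings.filterwarnings("ignore", category=FutureWarning)
    result = noisy_step()

kinds = [w.category.__name__ for w in caught]
print(result, kinds)
```

Naming the category in the filter also documents which warning is expected, which addresses the review question directly.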

ogrisel · 2016-09-19T12:28:32Z

benchmarks/bench_plot_nmf.py

+
+
+# use joblib to cache results.
+# X_shape is specified in arguments for avoiding hashing X


Out of curiosity, how much overhead does the hashing of X add?

A few seconds on each dataset, i.e. for ~200 calls to bench_one.
If I recall correctly, I added it mostly for larger datasets like MNIST or RCV1.
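The idea of keying the cache on `X_shape` instead of `X` can be sketched with a small in-memory memoizer. This is only a stand-in written with the standard library: the benchmark uses joblib's on-disk cache, whose `Memory.cache` accepts an `ignore` list of argument names. The decorator `cache_ignoring` and the function `bench_one_sketch` are hypothetical names.

```python
import functools

import numpy as np


def cache_ignoring(*ignored):
    """Memoize on keyword arguments, leaving the `ignored` ones out of the key."""
    def decorator(func):
        store = {}

        @functools.wraps(func)
        def wrapper(**kwargs):
            # The large array never enters the key, so it is never hashed.
            key = tuple(sorted((k, v) for k, v in kwargs.items()
                               if k not in ignored))
            if key not in store:
                store[key] = func(**kwargs)
            return store[key]

        return wrapper
    return decorator


calls = []


@cache_ignoring("X")
def bench_one_sketch(name, X, X_shape, n_components):
    calls.append(name)  # track how often the body really runs
    return float(X.sum()) / n_components


X = np.ones((400, 4096))
first = bench_one_sketch(name="cd", X=X, X_shape=X.shape, n_components=10)
second = bench_one_sketch(name="cd", X=X, X_shape=X.shape, n_components=10)
print(first == second, len(calls))
```

The trade-off is that two different arrays with the same shape and the same other arguments share a cache entry, so the cheap arguments must identify the dataset well enough.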

TomDLT · 2016-09-29T17:01:27Z

Rebased and comments addressed.
Another review?

TomDLT · 2016-10-05T13:51:43Z

This will need an update when #5295 ("[MRG+1] Add multiplicative-update solver in NMF, with all beta-divergence") is merged.

Question: should we keep the projected-gradient solver in this benchmark once it is removed from the package (in 0.19)?

amueller · 2016-10-05T16:45:27Z

Depends on how much trouble it is to keep them around, I'd say. It's informative to have it in the benchmark.

amueller · 2016-10-05T16:46:25Z

lgtm

jnothman · 2016-11-24T02:24:49Z

Well, I think you should drop ProjectedGradient now that we're in 0.19; and ideally we'd get #5295 merged, but otherwise this does have +2 already.

tguillemot · 2016-12-16T13:57:44Z

@TomDLT Can you rebase and drop ProjectedGradient?

TomDLT · 2016-12-16T14:38:33Z

Soon 😸

TomDLT · 2016-12-19T09:59:03Z

Update:

The benchmark is now up to date with #5295 ("[MRG+1] Add multiplicative-update solver in NMF, with all beta-divergence").
I have added a class _PGNMF inside the benchmark, since this solver has been removed from nmf.py. I am not sure we want to keep it; I can easily revert it anyway. Tell me what you think.

tguillemot

I can't tell you about _PGNMF.
LGTM anyway

tguillemot · 2016-12-19T10:16:08Z

benchmarks/bench_plot_nmf.py


+###################
+# Start of _PGNMF #
+###################


If you keep it, maybe you can add a comment saying where that code comes from and when it was removed from sklearn?

amueller · 2016-12-19T15:27:51Z

Keep it there I think.

jnothman · 2016-12-20T13:47:54Z

Thanks for the clarity @TomDLT !

* ENH improve benchmark on nmf
* add projected gradient solver inside the benchmark file
* add comments and authors for _PGNMF

TomDLT mentioned this pull request Nov 10, 2015

[MRG+1] refactor NMF and add CD solver #4852

Merged

amueller mentioned this pull request Nov 16, 2015

bench_plot_nmf not python3 compatible #5768

Closed

raghavrv reviewed Nov 20, 2015

TomDLT force-pushed the bench_nmf branch from 5729f65 to 26b0b59 Compare November 20, 2015 16:35

amueller added the Waiting for Reviewer label Dec 10, 2015

ogrisel approved these changes Sep 19, 2016


TomDLT force-pushed the bench_nmf branch 2 times, most recently from 57cdf03 to 38561e1 Compare September 19, 2016 14:47

TomDLT changed the title from "[MRG] Improve benchmark on NMF" to "[MRG+1] Improve benchmark on NMF" on Oct 4, 2016

ogrisel mentioned this pull request Oct 5, 2016

[MRG+1] Add multiplicative-update solver in NMF, with all beta-divergence #5295

Merged

TomDLT added 2 commits December 19, 2016 10:44

ENH improve benchmark on nmf

d9d65a6

add projected gradient solver inside the benchmark file

c543ac2

TomDLT force-pushed the bench_nmf branch from 38561e1 to c543ac2 Compare December 19, 2016 09:52

tguillemot approved these changes Dec 19, 2016


add comments and authors for _PGNMF

54f2223

jnothman merged commit 3f0af16 into scikit-learn:master Dec 20, 2016

TomDLT deleted the bench_nmf branch December 20, 2016 14:19

sergeyf pushed a commit to sergeyf/scikit-learn that referenced this pull request Feb 28, 2017

DOC Improve benchmark on NMF (scikit-learn#5779)

621c308

* ENH improve benchmark on nmf
* add projected gradient solver inside the benchmark file
* add comments and authors for _PGNMF

Przemo10 mentioned this pull request Mar 17, 2017

update fork (#1) #8606

Closed



