Skip to content

MNT speed-up example plot_scalable_poly_kernels.py#21731

Closed
ghost wants to merge 3 commits intomainfrom
unknown repository
Closed

MNT speed-up example plot_scalable_poly_kernels.py#21731
ghost wants to merge 3 commits intomainfrom
unknown repository

Conversation

@ghost
Copy link
Copy Markdown

@ghost ghost commented Nov 21, 2021

#21598 @adrinjalali

Adapted the number of components (N_COMPONENTS) to make the example quicker while maintaining the same characteristic output graph and message of the example.

Before:
poly_bef

After
poly_aft
:

sveneschlbeck and others added 2 commits November 21, 2021 15:01
Corrected a typo in the original file. It read ``featrues`` instead of ``features``
@glemaitre glemaitre self-requested a review November 22, 2021 17:39
@glemaitre glemaitre changed the title Changed number of components to make example execution faster MNT speed-up example plot_scalable_poly_kernels.py Nov 22, 2021
# much faster if the number of training samples increases.

N_COMPONENTS = [250, 500, 1000, 2000]
N_COMPONENTS = [250, 450, 850, 1750]
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually you should not have to redefine this variable. It could be reused from the previous cells.

@@ -99,7 +99,7 @@
# compensate for the stochastic nature of :class:`PolynomialCountSketch`.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The previous line mentions x5 runs. Indeed we do only x3

@@ -99,7 +99,7 @@
# compensate for the stochastic nature of :class:`PolynomialCountSketch`.

n_runs = 3
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am thinking that we could maybe avoid several runs and just mention that one should repeat it to have a more stable estimate in practice. I think that this is the only way that we can speed up, even more, the experiment.
@ogrisel WDYT?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@glemaitre Alright, sounds good :)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, it sounds unnecessary to me to run this 3 times. We should either do it at least 5 times, or only once, and just say it can be run a few times to get a more reliable result there.

Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
@adrinjalali adrinjalali changed the title MNT speed-up example plot_scalable_poly_kernels.py MNT speed-up example plot_scalable_poly_kernels.py Nov 22, 2021
@adrinjalali adrinjalali mentioned this pull request Nov 22, 2021
41 tasks
@siavrez
Copy link
Copy Markdown
Contributor

siavrez commented Nov 22, 2021

Removing multiple run of Linearsvc with PolynomialCountSketch. results are similar but runtime improves 23 +/- 1.56 seconds.

with 3 iterations
3run

with single iteration
singlerun

@adrinjalali
Copy link
Copy Markdown
Member

@sveneschlbeck would you have a chance to apply the suggested changes?

@adrinjalali adrinjalali added Stalled good first issue Easy with clear instructions to resolve help wanted labels Mar 18, 2022
@adrinjalali
Copy link
Copy Markdown
Member

Putting this available for any contributor who wants to finish the work.

@jsilke
Copy link
Copy Markdown
Contributor

jsilke commented Mar 19, 2022

I am happy to complete the work.

@jeremiedbb
Copy link
Copy Markdown
Member

Finished in #22903. Thanks @sveneschlbeck

@jeremiedbb jeremiedbb closed this Mar 29, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

good first issue Easy with clear instructions to resolve help wanted Stalled

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants