Changed n_jobs parameter to increase speed in plot_validation_curve.py · Pull Request #21638 · scikit-learn/scikit-learn

ghost · 2021-11-11T20:54:31Z

Changed the n_jobs parameter from 1 to -1 (auto-detect mode), which halved the time needed to run the module.

ogrisel · 2021-11-12T08:34:37Z

Running with -1 by default is problematic: on machines with a large number of CPUs (e.g. 64 or more), spawning the workers can dominate the runtime, because of concurrent access to the hard disk just to start the Python interpreters and import the modules. Furthermore, it can use too much memory and cause crashes.

This is why we would rather use a small number of workers (e.g. 2 instead of -1) when we want to use parallelism in examples or tests in scikit-learn.
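To make the difference between the two settings concrete, here is a minimal sketch using joblib's `effective_n_jobs` helper, which reports how many workers a given `n_jobs` value resolves to on the current machine. It is only an illustration of the point above, not code from the PR, and the variable names are made up for this example.

```python
from joblib import effective_n_jobs

# n_jobs=-1 asks joblib for one worker per available CPU, so the number of
# workers (and the cost of starting them) grows with the size of the machine.
workers_auto = effective_n_jobs(-1)

# n_jobs=2 caps the request at two workers, whatever the machine looks like.
workers_fixed = effective_n_jobs(2)

print(f"n_jobs=-1 resolves to {workers_auto} worker(s)")
print(f"n_jobs=2 resolves to {workers_fixed} worker(s)")
```

On a laptop the two values may be close; on a 64-core CI machine the first one would be much larger, which is the situation described above.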

adrinjalali · 2021-11-12T11:19:00Z

I agree with @ogrisel, and I think the alternative is to find other ways to speed up the example. You can set n_jobs to 2 and look for other ways to make the example faster.

ghost · 2021-11-12T12:12:27Z

@ogrisel @adrinjalali Okay, that makes sense, thanks for the explanation :) I will set n_jobs to 2 as a first step and then look for further possible changes.

ogrisel

LGTM, let's hope it runs faster on CircleCI :)

adrinjalali · 2021-11-12T15:28:03Z

This example uses the digits dataset, and I think that's the main source of it being slow. It'd be nice if you could try either iris or a synthetic dataset to see if you can get similar plots while making it significantly faster (I've seen a 100x speedup in some examples by getting rid of the digits dataset)

ogrisel · 2021-11-12T15:43:52Z

I believe the combo of Gaussian RBF + digits is important to get such characteristic validation curves for gamma.

But maybe it would be possible to get similar results with a random sub-sample, or by considering a binary classification subproblem such as 1 vs 2 (to make it non-trivial):

import numpy as np
from sklearn.datasets import load_digits

X, y = load_digits(return_X_y=True)
subset_mask = np.isin(y, [1, 2])  # binary classification: 1 vs 2
X, y = X[subset_mask], y[subset_mask]

Since SVC is a One vs Rest classifier, that should greatly help ;)

Edit: changed to 1 vs 2 which is slightly harder than 1 vs 7
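For readers who want to try the suggestion end to end, the following is a rough sketch of how the 1 vs 2 subset could be fed to `validation_curve` with an RBF `SVC` and `n_jobs=2`. The gamma grid and the accuracy scoring are assumptions chosen for illustration and are not taken from this thread. As a side note, scikit-learn's `SVC` documentation describes its multiclass handling as one-vs-one, so going from ten classes to two reduces the number of internal binary problems as well as the number of samples.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import validation_curve
from sklearn.svm import SVC

# Keep only digits 1 and 2 to get a small, non-trivial binary problem.
X, y = load_digits(return_X_y=True)
subset_mask = np.isin(y, [1, 2])
X, y = X[subset_mask], y[subset_mask]

# Assumed gamma grid for illustration (5 values on a log scale).
param_range = np.logspace(-6, -1, 5)

# SVC uses the RBF kernel by default; n_jobs=2 follows the advice above.
train_scores, test_scores = validation_curve(
    SVC(),
    X,
    y,
    param_name="gamma",
    param_range=param_range,
    scoring="accuracy",
    n_jobs=2,
)

# One row per gamma value, one column per cross-validation fold.
print("mean train accuracy:", train_scores.mean(axis=1))
print("mean test accuracy: ", test_scores.mean(axis=1))
```

Plotting the two mean curves against `param_range` on a log axis is then enough to check whether the validation curve keeps its characteristic shape on the smaller problem.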

adrinjalali · 2021-11-16T14:34:07Z

@sveneschlbeck could you please apply Olivier's suggestion?

ghost · 2021-11-16T14:54:46Z

@adrinjalali Yes, am on it!

ghost · 2021-11-16T14:59:56Z

@adrinjalali @ogrisel The change makes a big difference in execution time (18 sec vs. 3 sec), but the "C" isn't as big and clearly shaped as before. What do you think? Should I change the code given this result?

adrinjalali · 2021-11-16T15:29:39Z

To me it still shows the effect the same way, I'd be happy with it.
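A comparison like the one reported above (18 sec vs. 3 sec) can be reproduced locally with a small timing helper. This is only a sketch: `time_validation_curve` is a hypothetical helper written for this example, the gamma grid is assumed, and `cv=3` is an assumption to keep the run short, so the absolute numbers will not match the ones quoted in the thread.

```python
import time

import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import validation_curve
from sklearn.svm import SVC


def time_validation_curve(X, y):
    """Return the wall-clock seconds needed to compute one validation curve."""
    start = time.perf_counter()
    validation_curve(
        SVC(),
        X,
        y,
        param_name="gamma",
        param_range=np.logspace(-6, -1, 5),  # assumed gamma grid
        cv=3,  # assumption: fewer folds than the default to keep this quick
        n_jobs=2,
    )
    return time.perf_counter() - start


# Full ten-class digits problem.
X_full, y_full = load_digits(return_X_y=True)

# Binary 1 vs 2 subproblem suggested earlier in the thread.
mask = np.isin(y_full, [1, 2])
X_sub, y_sub = X_full[mask], y_full[mask]

t_full = time_validation_curve(X_full, y_full)
t_sub = time_validation_curve(X_sub, y_sub)

print(f"full digits: {t_full:.1f} s on {X_full.shape[0]} samples")
print(f"1 vs 2 only: {t_sub:.1f} s on {X_sub.shape[0]} samples")
```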

* Changed n_jobs parameter to increase speed * Update plot_validation_curve.py * Update plot_validation_curve.py

Changed n_jobs parameter to increase speed

ec01697

ogrisel mentioned this pull request Nov 12, 2021

Increased speed by adding cv and n_jobs params plot_multi_metric_evaluation.py #21626

Merged

adrinjalali changed the title ~~Changed n_jobs parameter to increase speed~~ Changed n_jobs parameter to increase speed in plot_validation_curve.py Nov 12, 2021

Update plot_validation_curve.py

8303d51

adrinjalali mentioned this pull request Nov 12, 2021

Accelerate slow examples #21598

Closed

41 tasks

ogrisel approved these changes Nov 12, 2021

Update plot_validation_curve.py

d315798

adrinjalali approved these changes Nov 16, 2021

adrinjalali merged commit 7bfa9cc into scikit-learn:main Nov 16, 2021

ghost deleted the speed_increased_example_valcurve branch November 16, 2021 17:53

glemaitre pushed a commit to glemaitre/scikit-learn that referenced this pull request Nov 22, 2021

DOC increase speed in plot_validation_curve.py (scikit-learn#21638)

8a4e63d

* Changed n_jobs parameter to increase speed * Update plot_validation_curve.py * Update plot_validation_curve.py

thomasjpfan mentioned this pull request Nov 24, 2021

MNT Adjust n_jobs=2 for examples and tests #21780

Merged

glemaitre pushed a commit to glemaitre/scikit-learn that referenced this pull request Nov 29, 2021

DOC increase speed in plot_validation_curve.py (scikit-learn#21638)

09976a1

* Changed n_jobs parameter to increase speed * Update plot_validation_curve.py * Update plot_validation_curve.py

samronsin pushed a commit to samronsin/scikit-learn that referenced this pull request Nov 30, 2021

DOC increase speed in plot_validation_curve.py (scikit-learn#21638)

e112230

* Changed n_jobs parameter to increase speed * Update plot_validation_curve.py * Update plot_validation_curve.py

glemaitre pushed a commit to glemaitre/scikit-learn that referenced this pull request Dec 24, 2021

DOC increase speed in plot_validation_curve.py (scikit-learn#21638)

3cc0402

* Changed n_jobs parameter to increase speed * Update plot_validation_curve.py * Update plot_validation_curve.py

glemaitre pushed a commit that referenced this pull request Dec 25, 2021

DOC increase speed in plot_validation_curve.py (#21638)

be935bd

* Changed n_jobs parameter to increase speed * Update plot_validation_curve.py * Update plot_validation_curve.py

adrinjalali merged 3 commits into main from unknown repository
