[MRG] accelerate plot_gradient_boosting_regularization.py example #21598 by sply88 · Pull Request #21611 · scikit-learn/scikit-learn

sply88 · 2021-11-09T21:01:53Z

Speeds up ../examples/ensemble/plot_gradient_boosting_regularization.py (Issue #21598) by:

* reducing the number of samples in the train and test datasets from 2000 to 1500
* reducing n_estimators from 1000 to 600

The reduction of n_estimators is compensated by increasing the learning rate from 0.1 to 0.2 (for models with shrinkage).

For me, the example now runs in 13 seconds (previously more than 30).
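A timing like this can be checked roughly by timing a single fit at the reduced settings. The snippet below is only a sketch under assumptions: it uses the 3000-sample dataset and the 1500-sample training slice that appear in the diff further down this thread, leaves every other `GradientBoostingClassifier` parameter at its default (the example script may set more), and picks `random_state=2` arbitrarily. Absolute timings depend on the machine.

```python
import time

from sklearn import datasets
from sklearn.ensemble import GradientBoostingClassifier

# Dataset size and training slice follow the diff shown in this PR thread.
X, y = datasets.make_hastie_10_2(n_samples=3000, random_state=1)
X_train, y_train = X[:1500], y[:1500]

# Reduced settings from the PR description: 600 trees, learning rate 0.2.
# All other parameters are left at their defaults (an assumption for brevity).
clf = GradientBoostingClassifier(
    n_estimators=600, learning_rate=0.2, random_state=2
)

# Time only the fit, which is where boosting spends most of its time.
start = time.perf_counter()
clf.fit(X_train, y_train)
elapsed = time.perf_counter() - start

print(f"fit {len(clf.estimators_)} trees in {elapsed:.2f} s")
```

This times one model only; the example script fits several configurations, so its total runtime would be a multiple of this.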

The main message of the final figure does not change.

adrinjalali · 2021-11-10T13:33:55Z

If the final output hasn't changed, we may be able to push further and speed up the example even more. Thanks for the work @sply88

sply88 · 2021-11-11T06:36:14Z

The original figure in the example looks like this:

I could speed it up a bit more, to around 9 s, by using only 400 boosting iterations. The x-axis of the figure in my original PR comment would then end at 400, and the yellow and blue lines would no longer cross. I don't think this would be a big issue, because it would still be obvious that shrinkage is good and no shrinkage (e.g. the blue and yellow lines) is bad.
What do you think @adrinjalali?
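The curves discussed here are test-set deviance as a function of boosting iteration. As a hedged sketch of how such curves can be computed for 400 iterations, the snippet below compares one model with shrinkage and one without, using `staged_predict_proba` and `log_loss`. The two-model setup, the labels, the default tree parameters, and `random_state=2` are assumptions for illustration; the actual example compares more configurations.

```python
import numpy as np
from sklearn import datasets
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import log_loss

# Data and split as in the diff shown in this PR thread.
X, y = datasets.make_hastie_10_2(n_samples=3000, random_state=1)
X_train, X_test = X[:1500], X[1500:]
y_train, y_test = y[:1500], y[1500:]

n_estimators = 400
# Hypothetical labels; learning_rate=1.0 means no shrinkage.
settings = {"no shrinkage": 1.0, "shrinkage": 0.2}

test_deviance = {}
for label, learning_rate in settings.items():
    clf = GradientBoostingClassifier(
        n_estimators=n_estimators, learning_rate=learning_rate, random_state=2
    )
    clf.fit(X_train, y_train)
    # staged_predict_proba yields test-set probabilities after each iteration;
    # the binomial deviance is twice the log loss.
    test_deviance[label] = np.array(
        [
            2 * log_loss(y_test, proba, labels=clf.classes_)
            for proba in clf.staged_predict_proba(X_test)
        ]
    )

for label, curve in test_deviance.items():
    print(
        f"{label}: final test deviance {curve[-1]:.3f}, "
        f"minimum at iteration {curve.argmin() + 1}"
    )
```

Plotting each array in `test_deviance` against the iteration index gives a figure of the kind described, with the x-axis ending at 400.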

adrinjalali · 2021-11-12T15:13:05Z

examples/ensemble/plot_gradient_boosting_regularization.py

+X_train, X_test = X[:1500], X[1500:]
+y_train, y_test = y[:1500], y[1500:]


You could also try reducing the number of samples in the make_hastie_10_2 line above.

sply88 · Nov 12, 2021

I have reduced both the number of samples and the number of estimators to get down to 5 s.
Output below. The main message is still obvious, I think.

thomasjpfan

Thank you for the PR @sply88 !

thomasjpfan · 2021-11-24T02:57:40Z

examples/ensemble/plot_gradient_boosting_regularization.py



-X, y = datasets.make_hastie_10_2(n_samples=12000, random_state=1)
+X, y = datasets.make_hastie_10_2(n_samples=3000, random_state=1)


What do you think of using the following settings?

from sklearn.model_selection import train_test_split

X, y = datasets.make_hastie_10_2(n_samples=4000, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.8, random_state=0)

original_params = {
    "n_estimators": 400,
    ...
}

It looks like this keeps a message very similar to the original.
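For reference, here is a small self-contained check of what the suggested split produces. With `n_samples=4000` and `test_size=0.8`, `train_test_split` keeps 800 samples for training and 3200 for testing, and unlike slicing it shuffles the rows by default. This only illustrates the split sizes; it does not reproduce the full example.

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split

# Same calls as in the suggestion above.
X, y = datasets.make_hastie_10_2(n_samples=4000, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.8, random_state=0
)

# 20% of 4000 rows go to training, 80% to testing; the dataset has 10 features.
print(X_train.shape, X_test.shape)  # (800, 10) (3200, 10)
```

The training set is therefore smaller than the 1500-sample slice used earlier in this PR, which is part of where the speed-up comes from.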

thomasjpfan

Thank you for the update @sply88 !

LGTM

…t-learn#21598 (scikit-learn#21611) * accelerate plot_gradient_boosting_regularization.py example scikit-learn#21598 * speed up by less samples and less trees * use train_test_split instead of slicing

accelerate plot_gradient_boosting_regularization.py example scikit-learn#21598

4b6af2f

sply88 changed the title ~~accelerate plot_gradient_boosting_regularization.py example #21598~~ [MRG] accelerate plot_gradient_boosting_regularization.py example #21598 Nov 9, 2021

adrinjalali mentioned this pull request Nov 10, 2021

Accelerate slow examples #21598

Closed

41 tasks

adrinjalali reviewed Nov 12, 2021

speed up by less samples and less trees

3c9d342

thomasjpfan reviewed Nov 24, 2021

sply88 and others added 2 commits November 27, 2021 20:43

Merge branch 'scikit-learn:main' into accelerate-plot_gradient_boosting_regularization.py

b4e2adc

use train_test_split instead of slicing

0f0fb3d

thomasjpfan approved these changes Nov 27, 2021

adrinjalali approved these changes Nov 29, 2021

adrinjalali merged commit f19bf4c into scikit-learn:main Nov 29, 2021

adrinjalali merged 4 commits into scikit-learn:main from sply88:accelerate-plot_gradient_boosting_regularization.py

3 participants
