[MRG] accelerate plot_gradient_boosting_regularization.py example #21598 (#21611)
Conversation
If the final output hasn't changed, we may be able to push further and speed up the example even more. Thanks for the work @sply88
Original figure in example looks like this: We could speed it up a bit more, to around 9 s, by using only 400 boosting iterations. The x-axis of the figure in my original PR comment would then end at 400, and the yellow and blue lines would no longer cross. I don't think this would be a big issue, because it would still be obvious that shrinkage is good and no shrinkage (the blue and yellow lines) is bad.
```python
X_train, X_test = X[:1500], X[1500:]
y_train, y_test = y[:1500], y[1500:]
```
you could also try reducing the number of samples in the make_hastie_10_2 line above
thomasjpfan left a comment
Thank you for the PR @sply88!
```diff
-X, y = datasets.make_hastie_10_2(n_samples=12000, random_state=1)
+X, y = datasets.make_hastie_10_2(n_samples=3000, random_state=1)
```
What do you think of the following settings?

```python
from sklearn.model_selection import train_test_split

X, y = datasets.make_hastie_10_2(n_samples=4000, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.8, random_state=0)

original_params = {
    "n_estimators": 400,
    ...
}
```

It looks like the figure keeps a very similar message to the original:
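The suggested settings can be sketched as a self-contained snippet; this only checks the resulting data sizes (note that `test_size=0.8` deliberately puts most of the data in the test set, leaving a small training set):

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split

# Suggested settings: 4000 samples, of which 20% are used for training.
X, y = datasets.make_hastie_10_2(n_samples=4000, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.8, random_state=0
)

print(X_train.shape, X_test.shape)  # (800, 10) (3200, 10)
```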
thomasjpfan left a comment
Thank you for the update @sply88!
LGTM
…t-learn#21598 (scikit-learn#21611)
* accelerate plot_gradient_boosting_regularization.py example scikit-learn#21598
* speed up by less samples and less trees
* use train_test_split instead of slicing



Speeds up ../examples/ensemble/plot_gradient_boosting_regularization.py (Issue #21598) by reducing n_estimators from 1000 to 600. The reduction of n_estimators is compensated by increasing the learning rate from 0.1 to 0.2 (for models with shrinkage). For me the example now runs in 13 s (previously over 30 s).
The main message of the final figure does not change.
