[NOMRG] new warmstart API for GBDTs#15105

Closed
NicolasHug wants to merge 12 commits into scikit-learn:master from NicolasHug:warmstart_gbdt_gridsearch

Conversation

@NicolasHug
Member

@NicolasHug NicolasHug commented Sep 27, 2019

Ping @adrinjalali is this what was decided during the sprint?

(CI will break because I'm not catching the deprecation warnings)

@NicolasHug NicolasHug changed the title [NOMRG] new warmstart API + grid search for GBDTs [NOMRG] new warmstart API for GBDTs Oct 3, 2019
Member

@adrinjalali adrinjalali left a comment


Left a few notes as I was reading. Yeah this is what we had in mind IIRC (from the estimator's perspective).

The implementation in the pipeline would be trickier.


# For backward compat (for now)
if self.warm_start:
    warnings.warn("warm_start parameter is deprecated", DeprecationWarning)
Member


Also mention how the user can fix it, i.e. what they should do instead.
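A minimal sketch of what such an actionable message could look like. The `warm_start_with` fit parameter is the name proposed later in this thread; `ToyEstimator` and the exact wording are invented for illustration, this is not the actual estimator code:

```python
import warnings


class ToyEstimator:
    """Minimal stand-in for a GBDT estimator; only illustrates the warning."""

    def __init__(self, warm_start=False):
        self.warm_start = warm_start

    def fit(self, X=None, y=None):
        # For backward compat (for now): state the deprecation *and* tell
        # the user what to do instead.
        if self.warm_start:
            warnings.warn(
                "The 'warm_start' parameter is deprecated; pass "
                "warm_start_with={'max_iter': ...} to fit() instead.",
                DeprecationWarning,
            )
        return self
```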

Member Author


Sure, this is just a proof of concept.

In the specific case of the hist GBDT, we don't need to deprecate anything since the warm-start hasn't even been released yet :)

@NicolasHug
Member Author

The implementation in the pipeline would be trickier.

Are you sure? For the fit_param, we could just do e.g. pipe.fit(X, y, warm_start_with={'hist_gbdt__max_iter': 100})

and the _warmstartable_parameters attribute could just be a property that we would set depending on the last step? (This is assuming that only the last step of the pipeline can be warm-started, which I guess is reasonable.)
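A minimal sketch of that property, assuming each estimator advertises its warm-startable parameters in a `_warmstartable_parameters` class attribute (as in this PR); `ToyPipeline` and `ToyGBDT` are invented stand-ins, and the `+` prefix seen later in the diff is omitted for brevity:

```python
class ToyPipeline:
    """Toy pipeline: its warm-startable params are those of the last step,
    prefixed with that step's name (hypothetical, not the real Pipeline)."""

    def __init__(self, steps):
        self.steps = steps  # list of (name, estimator) pairs

    @property
    def _warmstartable_parameters(self):
        # Assume only the last step (the predictor) can be warm-started.
        name, last = self.steps[-1]
        params = getattr(last, "_warmstartable_parameters", [])
        return ["%s__%s" % (name, p) for p in params]


class ToyGBDT:
    _warmstartable_parameters = ["max_iter"]
```

With `ToyPipeline([("scale", object()), ("hist_gbdt", ToyGBDT())])`, the property would yield `["hist_gbdt__max_iter"]`, matching the `step_name__param` convention used by `fit` params.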

@adrinjalali
Member

I was thinking of allowing warm start by refitting the pipeline after changing a parameter anywhere, and only refitting the steps after the changed one. So in a sense, the pipeline can warm start with almost any of its parameters.
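As an illustration of that idea (purely a sketch with invented names; the real Pipeline keeps no such state), changing a parameter of step i would invalidate steps i onward, and the next fit would refit only those:

```python
class PartialRefitSketch:
    """Sketch of a pipeline that refits only the steps after a changed one."""

    def __init__(self, step_names):
        self.step_names = step_names
        self._fitted = [False] * len(step_names)

    def change_param(self, step_index):
        # Changing step i invalidates step i and every later step,
        # because their inputs may have changed.
        for i in range(step_index, len(self.step_names)):
            self._fitted[i] = False

    def fit(self):
        refitted = []
        for i, name in enumerate(self.step_names):
            if not self._fitted[i]:
                refitted.append(name)  # a real pipeline would call step.fit here
                self._fitted[i] = True
        return refitted
```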

@NicolasHug
Member Author

You mean, when doing pipe.set_params({'second_step__something': 12}), then not fitting the first step again when calling pipe.fit()?

I feel like this is yet another kind of warm-starting, mostly orthogonal to the kind of warm-starting that this API is concerned with?

@adrinjalali
Member

So for now, we forget about warm start on the pipeline for this PR, and we should test meta-estimators such as Pipeline and the Voting estimators.

@NicolasHug
Member Author

Are you happy with how it looks so far @adrinjalali ? (test failure is unrelated)

I think we should write a SLEP for this.

Member

@adrinjalali adrinjalali left a comment


Once we decide on the API, I think we should put the warm-startable params higher in the meta-estimator hierarchy, but otherwise looks good.

And yeah, we should write a SLEP for the new API.

def _warmstartable_parameters(self):
    # This property exposes the _warmstartable_parameters attribute, e.g.
    # ['+last_step_name__param']
    # We consider that only the last step can be warm-started. The first
Member


Do we need to make this design choice? I may have asked this question before, but I don't remember your logic behind it.

Member Author


A pipeline is either:

  • a sequence of transformers
  • a sequence of transformers + a predictor as the last step

It's safe to assume that transformers are not warm-startable. Maybe in the future one of them will be? We can worry about that when it happens.

Member


You could have a word embedding transformer as a step, which very often is warm started. We may not have that inside sklearn, but the pipeline's API should support it, I think.

Member Author


I agree we should leave the possibility open, and make the code future-proof. This is one of the reasons why _warmstartable_parameters is a list.

But I don't think we should implement support for that, we have no use-case ATM.

Concretely, supporting warm-start for transformers right now would mean writing code that isn't used (it would require updating the _fit method that fits all the transformers).

Member


I see. Yeah fair.

@adrinjalali
Member

@jnothman I'd appreciate your thoughts on this one.

@NicolasHug
Member Author

The benefits of this approach aren't clear to me, and we have enough SLEPs to deal with ATM. Closing; we might re-open it one day.

