Skip to content

minor bugfix - setting _is_fitted to False before input checks in forecasters#941

Merged
fkiraly merged 8 commits intomainfrom
bugfix-reducer-isfitted
Jun 12, 2021
Merged

minor bugfix - setting _is_fitted to False before input checks in forecasters#941
fkiraly merged 8 commits intomainfrom
bugfix-reducer-isfitted

Conversation

@fkiraly
Copy link
Collaborator

@fkiraly fkiraly commented Jun 10, 2021

This is a minuscule change which is a preemptive fix.

The change sets self._is_fitted = False in the first line of fit, of a number of forecasters, before the input checks.

This prevents some unexpected behaviour when fit, of the estimator, is called multiple times in sequence, e.g., in grid search (as in the tutorial notebook).

It also prevents exceptions caused by this unexpected behaviour in any future refactor (e.g., #912) where _set_fh uses _is_fitted to check whether an fh is expected.

This should be done in all forecasters.

@fkiraly fkiraly requested a review from mloning as a code owner June 10, 2021 18:52
@fkiraly fkiraly changed the title minor bugfix - reducer sets _is_fitted to False before input checks minor bugfix - setting _is_fitted to False before input checks in some composers Jun 10, 2021
@fkiraly fkiraly requested a review from aiwalter June 10, 2021 19:14
@fkiraly fkiraly changed the title minor bugfix - setting _is_fitted to False before input checks in some composers minor bugfix - setting _is_fitted to False before input checks in forecasters Jun 10, 2021
@mloning
Copy link
Contributor

mloning commented Jun 11, 2021

Could you explain why this is needed for forecasters but not other estimator types? Is this related to the setting of the forecasting horizon? If that's the case, an alternative may be to pass the name of the caller method to the set_fh function, e.g. set_fh(fh, "fit").

@fkiraly
Copy link
Collaborator Author

fkiraly commented Jun 11, 2021

Is this related to the setting of the forecasting horizon?

Correct. It is to allow _set_fh, which gets self and fh as arguments, to distinguish a situation in which fit has been called twice from a situation in which fit and then predict has been called; in both situations, the method calling _set_fh internally.

If that's the case, an alternative may be to pass the name of the caller method to the set_fh function, e.g. set_fh(fh, "fit").

Yes, but it feels less python zen, and less consistent with the checkxyz functions that also exhibit behaviour based on the object state.

@fkiraly
Copy link
Collaborator Author

fkiraly commented Jun 12, 2021

do you want to look at this separately and/or accept this in prep for #912, @mloning, @aiwalter?

Copy link
Contributor

@mloning mloning left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've thought about this a bit more and I prefer the option where we pass the name of the calling method to _set_fh, e.g. _set_fh(fh, method="fit"). For me, the main reasons are:

  • it keeps the functionality for checking/setting the fh contained to that function and does not assume state changes elsewhere in the code, like at the beginning of fit,
  • it keeps it closer to the implementation of other estimator types where we do not re-set the state at the beginning of fit,
  • it avoids the redundancy of having to set _is_fitted=False in the constructor and at the beginning of fit.

What do you think, @fkiraly?

@fkiraly
Copy link
Collaborator Author

fkiraly commented Jun 12, 2021

What do you think, @fkiraly?

Ok, happy to consider this as an endstate.

But, practically, going there would be much more more invasive than the current proposed change, since it deviates from the current signature (which you introduced).
Changing it would require a lot of manipulations in descendants - in every instance where _set_fh is called, it would have to change. Where, under the current change, only a line has to be added, and no logic needs (additional) change.

So, I would strongly recommend first carrying out this minimally invasive change, then the inheritance refactor, and then any well-isolated changes to the setters and checkers.
Note that after completing the interface refactor, this will have to be done in only a single location, namely the BaseForecaster.

To reiterate my suggestion: let's get with minimally invasive effort to the position where we can just refactor BaseForecaster and don't need to change every descendant class. Then, I'm absolutely happy to work on _set_fh and/or with other, additional interface changes.

Copy link
Contributor

@mloning mloning left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change in the logic of _set_fh seem to be separate from the interface refactor, no? This PR was still easy to review for me, so adding a logic change to _set_fh could be included. In terms of descendants, we would still only have to change one line. Doing the change here would avoid having intermediate code on main.

But happy to follow your sequence of changes.

@fkiraly
Copy link
Collaborator Author

fkiraly commented Jun 12, 2021

This PR was still easy to review for me, so adding a logic change to _set_fh could be included. In terms of descendants, we would still only have to change one line.

That's not entirely true sine _set_fh appears in multiple places in descendants. I'm only adding a line to fit, and not changing existing lines or existing interfaces. But agreed with the rationale.

@fkiraly
Copy link
Collaborator Author

fkiraly commented Jun 12, 2021

so... @mloning - this needs a second review? Or are we fine to merge?

@mloning
Copy link
Contributor

mloning commented Jun 12, 2021

I think it's fine to merge this as an intermediate solution.

@fkiraly fkiraly merged commit 90b1ccf into main Jun 12, 2021
@fkiraly fkiraly deleted the bugfix-reducer-isfitted branch June 12, 2021 17:22
@mloning
Copy link
Contributor

mloning commented Jun 12, 2021

@fkiraly FYI I tend to use "squash and merge" for merging into main, that keeps main a bit more organised I think.

fkiraly added a commit that referenced this pull request Jun 12, 2021
This PR is addressing the forecaster base class proliferation discussed in #510, by simplifying the base class inheritance tree and the streamlining the private method logic.

Specifically, I've moved the logic from `_SktimeForecaster` and the forecasting horizin mixins into `BaseForecaster`, and adopted uniformly the principle of having a public method (`fit`, `predict`, etc) with checks/plumbing, dispatching to a "core logic" version (`_fit`, `_predict`, etc) where validated arguments can be assumed.

The burden on extenders becomes much lighter, since it is now possible to only focus on the "core logic" implementation, instead of having to keep in mind a myriad of inconsistent and constantly shifting conventions in checks and other plumbing.

As side effects, if we get this right, this should make a few things on the roadmap easier:
* the extension guidelines #464. Right now, the aforementioned implicit conventions are too many and intricate to write useful extension guidelines, in my opinion.
* extending to the multivariate case
* input/output checks and the eternal data container discussion, this can go in the plumbing

As a proof-of-concept regarding ease of extension, this PR also contains a highly annotated extension template in the `extension_templates` folder.

In terms of review, the key file is `forecasting/base/_base`, with corresponding changes (contraction and deletion) in `base/_sktime`.

I've tried to keep the interface consistent as much as possible (only changing internal logic).

Interface contracts with all the earlier estimators are still honoured, via loopthroughs and default behaviour that ensures that everything still works if `fit`, `predict` etc are overridden by the current descendants, as opposed to `_fit` and `_predict`.
The only change I had to made to descendants is set `self._is_fitted=False` at the start of `fit`, which is a minimally invasive change that's also separately reviewable as PR #941.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants