Skip to content

Fix naive model with last strategy for cases with trailing NaN values#1130

Merged
mloning merged 21 commits intosktime:mainfrom
SeitaBV:fix-GH918
Sep 9, 2021
Merged

Fix naive model with last strategy for cases with trailing NaN values#1130
mloning merged 21 commits intosktime:mainfrom
SeitaBV:fix-GH918

Conversation

@Flix6x
Copy link
Copy Markdown
Contributor

@Flix6x Flix6x commented Jul 12, 2021

Reference Issues/PRs

Fixes #918.

What does this implement/fix? Explain your changes.

The logic for the "last" strategy is now much more similar to the logic for the "mean" strategy. I refactored some of the logic to internal util functions so that it's clear that both strategies transform the last_window according to the seasonal periodicity in the same way, and also tile the results in the same way. The only difference is now that the "last" strategy selects the last non-NaN value from each seasonally periodic row, while the "mean" strategy computes the mean of said row.

Just to note, to me it would make sense if the "drift" strategy would use the same util functions to:

  • make it robust against NaN values, both at the end of the time series and at the start, and
  • let it account for seasonal periodicity.
    I think this would be a separate issue, but just to be clear: I am not considering working on it, unless someone points me to a use case for this particular strategy.

Does your contribution introduce a new dependency? If yes, which one?

No.

What should a reviewer concentrate their feedback on?

Any other comments?

PR checklist

For all contributions
  • I've added myself to the list of contributors.
  • Optionally, I've updated sktime's CODEOWNERS to receive notifications about future changes to these files.
  • I've added unit tests and made sure they pass locally.
For new estimators
  • I've added the estimator to the online documentation.
  • I've updated the existing example notebooks or provided a new one to showcase how my estimator works.

@Flix6x Flix6x requested a review from mloning as a code owner July 12, 2021 20:39
@fkiraly fkiraly requested review from aiwalter and fkiraly July 27, 2021 20:45
@fkiraly fkiraly added the module:forecasting forecasting module: forecasting, incl probabilistic and hierarchical forecasting label Jul 31, 2021
Copy link
Copy Markdown
Collaborator

@fkiraly fkiraly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To me, this looks very sensible! Thanks for the change.
I think this makes the method a lot more robust.

Now, I have to admit, I wanted to review this earlier, but I had to take some time for this since the docstrings aren't too great - otherwise I would have been done with this much earlier.

May I hence kindly ask:

  • in the class docstring, explain better what is meant with "last window"
  • in the class docstring, explain how nans are now handled
  • add docstrings to the new and old utility functions, you will do future developers a big favour...
  • add docstrings to the new tests - what does this tests for, under which conditions are what errors raised?

Another point: should the tag "handles-missing-values" now be switched to True? The default is False if it's not set.

@fkiraly
Copy link
Copy Markdown
Collaborator

fkiraly commented Aug 21, 2021

@Flix6x, are you still working on this?

@Flix6x
Copy link
Copy Markdown
Contributor Author

Flix6x commented Aug 23, 2021

@Flix6x, are you still working on this?

Little busy, but I plan to get back to this on Wednesday evening or Monday evening. Your documentation requests seem reasonable. I'd need to find out how I could set that tag, though. Is that on a class level or strategy level? If it's on a class level, I don't think the drift strategy is robust against missing values, so then it should still be False, but maybe made explicit and with a corresponding # todo: switch to True if GH<number of issue that addresses robustness against nan values for the "drift" strategy> is fixed?

@fkiraly
Copy link
Copy Markdown
Collaborator

fkiraly commented Aug 31, 2021

If it's on a class level, I don't think the drift strategy is robust against missing values, so then it should still be False

on the class level, it should be the "worst possible value". If you want, you can do it on the object/strategy level by using set_tags.

@Flix6x
Copy link
Copy Markdown
Contributor Author

Flix6x commented Sep 1, 2021

If you want, you can do it on the object/strategy level by using set_tags.

Thanks for the hint. I did just that.

add docstrings to the new tests - what does this tests for, under which conditions are what errors raised?

I added slightly more detail to the docstring and an inline comment. It's already pretty detailed compared to other tests.

@Flix6x Flix6x requested a review from fkiraly September 1, 2021 12:22
fkiraly
fkiraly previously approved these changes Sep 1, 2021
Copy link
Copy Markdown
Collaborator

@fkiraly fkiraly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good from my side, I now understand what this is doing - I added minor comments, neither a blocker.

@fkiraly
Copy link
Copy Markdown
Collaborator

fkiraly commented Sep 1, 2021

but, obviously, you also need to fix:

  • doc quality checks
  • should be up-to-date with main

(can't be merged without this)

@Flix6x Flix6x requested a review from fkiraly September 6, 2021 10:23
Copy link
Copy Markdown
Contributor

@mloning mloning left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks all good to me - can we merge @fkiraly?

Copy link
Copy Markdown
Collaborator

@fkiraly fkiraly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, all fine - doc checks pass now.

@mloning
Copy link
Copy Markdown
Contributor

mloning commented Sep 9, 2021

Thanks @Flix6x and sorry again for the very long delay on this one - now it's merged 🎉

@Flix6x Flix6x deleted the fix-GH918 branch September 10, 2021 09:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

module:forecasting forecasting module: forecasting, incl probabilistic and hierarchical forecasting

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] Naive model with last strategy is sensitive to trailing NaN values

3 participants