Conversation
|
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
There was a problem hiding this comment.
@fkiraly I think its a good idea. My first impression:
- Would be good to see also the outputs of the cells
- We could add a table of contents as described here: https://moonbooks.org/Articles/How-to-create-a-table-of-contents-in-a-jupyter-notebook-/
- I think we need also a section for the transformers, this is not covered so far. But also related to forecasting I think
- The file
01a_forecasting_sklearn.ipynbis the old one, right?
Martin Walter martin_friedrich.walter@daimler.com, Mercedes-Benz AG on behalf of Daimler TSS GmbH.
https://github.com/Daimler/daimler-foss/blob/master/LEGAL_IMPRINT.md
Don't the precommits eat the outputs away?
Yes, I was wondering how you do that - will have a look, thanks!
My feeling is, that should perhaps be an entirely separate notebook, since there's some complexity, e.g., the taxonomy on series, panel, etc? Maybe sth for the dev days next week. Might overload the forecasting notebook and break the narrative arc if there's too much in there.
Yes, I've copied it for easy copy/paste. Ultimately, I'd like to turn it into an appendix on the topic "what happens if you naively use scikit-learn, and why sktime is the right way to do lagging plus supervised prediction", which currently occupies much of the introduction. I found it an interesting topic, but it prevents you from getting to usage early on, which is what most users might be looking fore in a tutorial. |
|
while I have your attention, @aiwalter: is the code in the "advanced benchmarking" section still up-to-date? Or is there newer material that you would like to see there? |
|
which section do you mean? I would like to see also a more sophisticated tuning part including Martin Walter martin_friedrich.walter@daimler.com, Mercedes-Benz AG on behalf of Daimler TSS GmbH. |
I think that would be really nice in the "tuning" Section 3.3, as the "expert" sub-section which really shows (or brags about) all the great things we can do with |
mloning
left a comment
There was a problem hiding this comment.
This looks good!
A few comments:
- I wouldn't add set up instructions on top of each notebook, they may change and keeping them up-to-date will be hard, so instead I'd either automatically add more info in the header that's added to all notebooks automatically on the sphinx-built online documentation or simply link to the installation guide, same applies to notes on development status of particular functionality
- automated table of contents would be good as suggested by @aiwalter
- for exogenous data, there is the
load_longleydata if that's helpful - instead of "querying predictions" I'd say "generating predictions"
- overview of forecasters in sktime, I'd use `all_estimators("forecaster") to generate this dynamically
- typo "obatins"
- we now also have "dirrec" reduction strategy
The other more general point is that Jupyter notebooks are very hard to maintain in general, due to their incompatibility with git. An alternative may be to convert them into Python files and then use sphinx gallery for docs and an automated conversion for binder as done by scikit-learn.
Yes, but I still think we should use jupyter notebooks. Binder is error prone (it still is not fixed! see #662), and python raises the barrier to run the code. |
Agreed, but sphinx gallery allows you to download Python scripts as Jupyter notebooks (see e.g. https://scikit-learn.org/stable/auto_examples/bicluster/plot_spectral_coclustering.html#sphx-glr-auto-examples-bicluster-plot-spectral-coclustering-py) |
Hm, does this properly support markdown and cells? |
they are not incompatible with git? Or what do you mean? |
|
I meant that changes to Jupyter notebooks are not line based so it's hard to see what's going on with git to the point where tracking line changes is basically useless (hence tools like ReviewNB) |
but we're using reviewnb, no? |
|
Thanks, @MarcoGorelli, for helpful comments!
Argh - so there's no way around trying to find the exact cell manually? Any advice on a better workflow is appreciated...
Well, I'm not sure whether i'ts a pre-commit problem, a vs code problem, a GitHub desktop problem, a Windows admin rights problem, or a combination thereof. I wouldn't be able to produce a helpful bug report. I could compile the strange error messages though. |
The numbers from the nbqa output are of the code cells - the ones that are displayed in the cell are the execution count, which unfortunately can't be used in the nbqa output because they might contain duplicates or even be However, if you restart a notebook and execute it top-to-bottom, then the execution count should match the cell number in nbqa's output |
That's what I thought I tried (obviously, if there are duplicats/skips there's no way it could match) and the numbers didn't match, but maybe I did sth wrong Let me try it again - because if it wouldn't match it would be a bug, right? |
|
hm, it fits now, so I must have looked at the wrong log or notebook then, or made some other silly mistake. Seems to work as intended. |
|
no worries 😄
yes, exactly - and if you do encounter that, a bug report to nbqa would be appreciated 🙏 |
|
well, it's a bug, but it looks like it was in my head - so, all fine for now |
|
View / edit / reply to this conversation on ReviewNB mloning commented on 2021-06-22T15:23:46Z I would include this here, I listed alternatives in my review above.
fkiraly commented on 2021-06-22T16:42:43Z sorry, what do you want to include in the instructions? Not sure what you are referring to |
|
sorry, what do you want to include in the instructions? Not sure what you are referring to View entire conversation on ReviewNB |
I've added |
|
I noticed I forgot section 4 - I now added that section, which is basically a step-by-step extension guide |
|
would you mind reviewing again with intention to merge? I think suggestions are addressed, suggestions that would be major changes not just refactors are spun out in issues. |
aiwalter
left a comment
There was a problem hiding this comment.
@fkiraly few comments:
- should we maybe add an example anywhere about forecasting with exog data?
- what do you think about renaming the notebooks to
01a_forecasting.ipynband01b_forecasting_sklearn.ipynbor sth? As there might be other notebooks to follow? Just a minor thing. - Could we display the cell outputs also before merging?
Yes, but I would have to make it up from scratch - so it would be in the "more than just a refactor" category. Would say, out of scope.
Well, I thought there should be one "main" notebook, and the others are appendices a, b, and so on. |
Sure, I'll do that precisely when you think it's ready to, to avoid repo clutter. |
This is the forecasting tutorial with all cell contents, see #972. No other changes except cell content in the main forecasting tutorial added.
This pull request contains a re-work of the forecasting tutorial which should make it more structured, complete, and accessible - from what right now looks like a patchwork piece that has organically proliferated...
Main changes:
sktimeprimarily