[ENH] Adapter from multiindex to ListDataset #2893
[ENH] Adapter from multiindex to ListDataset #2893TNTran92 wants to merge 23 commits intosktime:mainfrom
Conversation
There was a problem hiding this comment.
Thanks!
May I suggest some changes:
- remove dependency on
convert_toandcheck_is. I'll integrate it into the conversion framework later. Assume the input ispd-multiindex. - do you need to input frequency? Can that not be inferred from the
pandasinput time index? This should be inferred and appropriately translated, instead of being overwritten, in the vanilla call. - could you move this to
datatypes.adapters? - a back-conversion function would be nice, we need that for forecasts, do we not? Maybe a separate PR
- can you kindly add tests please, with example input and outputs?
I don't see this in the repo. Can you help me locate it? |
Yes, this will be in a separate PR. That one is in progress. |
It does not exist, sorry for being unclear - I suggest creating a new submodule, so we can isolate soft dependencies there. |
|
I can help with the testing, there is a conversion testing framework already in |
I added 2 examples: 1 for univariate and 1 for multivariate. Also added listdataset into registry. I'm still trying to figure out pytest |
|
This is for my information only. After adding datatypes example, encounter
E TypeError: cannot unpack non-iterable NoneType object fixture_index = 1 sktime/datatypes/tests/test_check.py:153: TypeError |
|
for the record, this is as we discussed today in person: the A secondary question I have is how we deal with mtypes that have soft dependencies, I need to think about that more carefully. |
There was a problem hiding this comment.
@TNTran92, I´ve thought about this.
Given the complexity of some design questions, see #2957, I would suggest:
Let´s separate this PR from the datatypes testing framework., i.e., split off the _examples and possibly _registry.
I would recommend to:
- put the examples and tests in a sub-folder of
_adapters. Don´t try to integrate with the current test system, just make new tests with expected input/output pairs. - then, focus next on the back-conversion, while we think about #2957 and integration with the current testing framework (the soft dep question needs to be addressed first).
That would prevent us from getting stuck, and allows @AurumnPegasus to use the conversions already for gluonts neural networks.
(this is nothing related to the quality of your work - great work! This is only about removing a conditionality on integration to give @AurumnPegasus access to the functionality, and delaying the integration to a later piece of work)
Understood. I will try to get the unit test and the back-conversion done as soon as possible. |
|
I have merge conflict that can't be resolved on this branch, so I will be closing this pull request and add the unit test plus everything done so far to the PR #2976 |
Reference Issues/PRs
Partial solution to Issue #2860
What does this implement/fix? Explain your changes.
Implement an adapter to convert multiindex format to ListDataset in gluon-ts
The method takes as input a pd-multiindex DataFrame, (optional) a list of categorical feature, (optional) the startdate and (optional) frequency of the dataset.
Does your contribution introduce a new dependency? If yes, which one?
Dependency: gluon-ts
What should a reviewer concentrate their feedback on?
Testing with different dataset in sktime to weed out the bugs. So far, Arrowhead, ItalyPoowerDemand and StandWalkJump have been tested.
Any other comments?
PR checklist
For all contributions
For new estimators