
[ENH] draft design for refactoring datatypes module - classes#6033

Closed
fkiraly wants to merge 18 commits into main from datatype-class-refactor

Conversation

@fkiraly
Collaborator

@fkiraly fkiraly commented Feb 29, 2024

Implements #3512, related to #2957

This adopts the design proposed in the issue, with a class for mtypes, and a class for conversions, both being BaseObject.
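
A rough sketch of the shape of this design, for readers who have not followed the issue. It is not the code in this PR: `StandInBaseObject` is a simplified stand-in for a tag-carrying `BaseObject`, and all class names, tag names, and the two toy mtypes are hypothetical, chosen so that the example runs without any dependencies.

```python
class StandInBaseObject:
    """Simplified stand-in for a tag-carrying base object (hypothetical)."""

    _tags = {}

    @classmethod
    def get_class_tag(cls, name, default=None):
        # Walk the MRO so that tags set on a subclass override parent tags.
        for klass in cls.__mro__:
            tags = vars(klass).get("_tags", {})
            if name in tags:
                return tags[name]
        return default


class MtypeSketch(StandInBaseObject):
    """One class per mtype: tags carry metadata, check carries the spec."""

    _tags = {"name": None, "scitype": None}

    def check(self, obj):
        raise NotImplementedError


class ListOfFloat(MtypeSketch):
    _tags = {"name": "list_of_float", "scitype": "Series"}

    def check(self, obj):
        return isinstance(obj, list) and all(isinstance(x, float) for x in obj)


class ConverterSketch(StandInBaseObject):
    """One class per conversion: tags say from which mtype to which mtype."""

    _tags = {"mtype_from": None, "mtype_to": None}

    def convert(self, obj):
        raise NotImplementedError


class ListToTuple(ConverterSketch):
    _tags = {"mtype_from": "list_of_float", "mtype_to": "tuple_of_float"}

    def convert(self, obj):
        return tuple(obj)


def find_converter(mtype_from, mtype_to, converters):
    """Look up a converter by its tags instead of by a dictionary key."""
    for conv in converters:
        if (
            conv.get_class_tag("mtype_from") == mtype_from
            and conv.get_class_tag("mtype_to") == mtype_to
        ):
            return conv()
    raise KeyError(f"no converter from {mtype_from!r} to {mtype_to!r}")


converter = find_converter("list_of_float", "tuple_of_float", [ListToTuple])
converted = converter.convert([1.0, 2.0])
```

The point of the sketch is that adding a new data type means adding a class, rather than editing several module-level dictionaries.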

The design is meant to make the datatypes module more extensible, and to make it easier to manage soft dependencies in data types such as xarray, polars, or temporian EventSet (FYI @ianspektor, @achoum).

Currently in draft state for review; contains only the proposed class structure and a refactor of a single mtype (pd.DataFrame).
FYI @sktime/core-developers

@fkiraly fkiraly added the module:datatypes (datatypes module: data containers, checkers & converters) and enhancement (Adding new functionality) labels Feb 29, 2024
@fkiraly fkiraly linked an issue Feb 29, 2024 that may be closed by this pull request
@fkiraly
Collaborator Author

fkiraly commented Mar 1, 2024

Note to self: this would also solve the issue of documenting the mtype specifications, if we (a) do it in the class docstring, and (b) list them in the API reference. A minor issue is displaying the naming of mtypes, though this can be taken care of in the first line of the class docstring.
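
A minimal sketch of that idea, assuming the specification lives in the class docstring and the mtype name is stated in its first line. The class `MyTableSpec`, the mtype name `my_table`, and both helper functions are hypothetical and only illustrate how an API listing could be generated from the classes.

```python
import inspect


class MyTableSpec:
    """Mtype "my_table": a table stored as a list of row dictionaries.

    Specification (illustrative only):

    * each list element is one row
    * each row is a dict mapping column name to value
    * all rows have the same keys
    """


def docstring_summary(cls):
    """Return the first line of the class docstring, or "" if there is none."""
    doc = inspect.getdoc(cls) or ""
    return doc.splitlines()[0] if doc else ""


def api_listing(classes):
    """Map class name to docstring summary, e.g. for an API reference table."""
    return {cls.__name__: docstring_summary(cls) for cls in classes}


listing = api_listing([MyTableSpec])
```

Because the first docstring line carries the mtype string, a generated reference table would show the name users pass to conversion functions, not just the class name.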

@julian-fong
Contributor

Since skpro is a 'playground' of sorts, maybe we can incorporate this in skpro and see how we can improve it for sktime? I am not sure if there is a similar PR for this in skpro yet.

@fkiraly
Collaborator Author

fkiraly commented Jun 15, 2024

hmm - good idea!

@fkiraly
Collaborator Author

fkiraly commented Jun 15, 2024

do you want to try copying this, or should I do that?
Just concerned about merge conflicts.

@julian-fong
Contributor

You can copy it. Is it possible afterwards for me to work on the branch that you made?

@fkiraly
Collaborator Author

fkiraly commented Jun 16, 2024

sure. Let me adapt it, and then you try to merge with your polars PR. I would recommend a separate PR though, after you have added it via the "old" system.

fkiraly added a commit to sktime/skpro that referenced this pull request Sep 7, 2024
This PR refactors the data type specifications and converters to
classes.

Implements sktime/sktime#3512, related to
sktime/sktime#2957.

Contains:

* a base class for mtypes, `BaseDatatype`, to replace the more ad-hoc
dictionary design
* a complete refactor of the `Table` mtype submodule to this interface
* a base class for converters, `BaseConverter`, also replacing a
dictionary based design
* a partial refactor of the `Table` related converters to this interface
* a full refactor of the public framework module with `check` and
`convert` logic, in `datatypes`, to allow extensibility with this design

Partial mirror in `skpro` of sktime/sktime#6033
fkiraly added a commit to sktime/skpro that referenced this pull request Sep 8, 2024
This PR refactors the data type specifications and converters to
classes.

Related: sktime/sktime#3512, related to
sktime/sktime#2957.

Contains:

* a base class for datatype examples, `BaseExample`, to replace the more
ad-hoc dictionary design
* a complete refactor of the `Table` and `Proba` mtype submodules to
this interface
* a full refactor of the public framework module with `get_example`
logic, in `datatypes`, to allow extensibility with this design

Partial mirror in `skpro` of sktime/sktime#6033
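
For orientation, the `get_example` idea in the commit message above can be pictured roughly as follows. This is a dependency-free sketch, not the `skpro` code: the class `ExampleSketch`, its attributes, the method `build`, and the function `get_example_sketch` are hypothetical names, and the fixture content is made up.

```python
class ExampleSketch:
    """Hypothetical base class: one subclass per example fixture."""

    scitype = None  # abstract data type, e.g. "Table"
    mtype = None    # concrete container format
    index = None    # which example of that kind this is

    def build(self):
        raise NotImplementedError


class TableListOfDictExample0(ExampleSketch):
    scitype = "Table"
    mtype = "list_of_dict"
    index = 0

    def build(self):
        # Made-up fixture content, for illustration only.
        return [{"a": 1.0, "b": 2.0}, {"a": 3.0, "b": 4.0}]


def get_example_sketch(scitype, mtype, index, registry):
    """Find the fixture class matching the key and build its example."""
    for cls in registry:
        if (cls.scitype, cls.mtype, cls.index) == (scitype, mtype, index):
            return cls().build()
    raise KeyError((scitype, mtype, index))


example = get_example_sketch("Table", "list_of_dict", 0, [TableListOfDictExample0])
```

Compared with a dictionary keyed by tuples, each fixture is a class that can be discovered, documented, and skipped individually.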
@fkiraly
Collaborator Author

fkiraly commented Sep 25, 2024

superseded by #7161

@fkiraly fkiraly closed this Sep 25, 2024
fkiraly added a commit that referenced this pull request Oct 14, 2024
…ecords (#7161)

This PR carries out a refactor of the `datatypes` module to
`scikit-base` classes and data records, with the following benefits:

* modularity and extensibility
* the classes can be used as records in the documentation to store
information about the container specification
* programmatic soft dependency isolation, e.g., of data containers
requiring soft dependencies such as `polars`, `dask`, `xarray`,
`gluonts`, etc.

The refactor parallels that in `skpro`, and replaces the earlier attempt
in #6033. Towards #3512.
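
One way to picture the "programmatic soft dependency isolation" named above, as a sketch under stated assumptions rather than the merged implementation: each container class declares the packages it needs, and the framework filters out classes whose packages are not installed. The attribute `python_dependencies`, the classes, and the helper `available_specs` are hypothetical here; only `importlib.util.find_spec` is a real standard library call.

```python
from importlib.util import find_spec


class ContainerSpecSketch:
    """Hypothetical base: subclasses list the top-level packages they need."""

    python_dependencies = ()


class JsonBackedSpec(ContainerSpecSketch):
    # "json" is in the standard library, so this spec is always available.
    python_dependencies = ("json",)


class MissingBackendSpec(ContainerSpecSketch):
    # Deliberately nonexistent package name, to simulate a missing soft dependency.
    python_dependencies = ("surely_not_installed_pkg_xyz",)


def available_specs(specs):
    """Keep only specs whose declared packages can be found, without importing them."""
    return [
        spec
        for spec in specs
        if all(find_spec(pkg) is not None for pkg in spec.python_dependencies)
    ]


usable = available_specs([ContainerSpecSketch, JsonBackedSpec, MissingBackendSpec])
```

With this pattern, a container that needs, say, `polars` is simply absent from checks and conversions when `polars` is not installed, instead of raising an import error at module load.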
benHeid pushed a commit that referenced this pull request Feb 15, 2025
…ecords (#7161)

This PR carries out a refactor of the `datatypes` module to
`scikit-base` classes and data records, with the following benefits:

* modularity and extensibility
* the classes can be used as records in the documentation to store
information about the container specification
* programmatic soft dependency isolation, e.g., of data containers
requiring soft dependencies such as `polars`, `dask`, `xarray`,
`gluonts`, etc.

The refactor parallels that in `skpro`, and replaces the earlier attempt
in #6033. Towards #3512.

Labels

enhancement Adding new functionality module:datatypes datatypes module: data containers, checkers & converters

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[ENH] refactor datatypes mtype related functionality into classes

2 participants