Skip to content

Add BaseObject to sktime.base#891

Closed
RNKuhns wants to merge 2 commits intosktime:mainfrom
RNKuhns:base_object
Closed

Add BaseObject to sktime.base#891
RNKuhns wants to merge 2 commits intosktime:mainfrom
RNKuhns:base_object

Conversation

@RNKuhns
Copy link
Copy Markdown
Contributor

@RNKuhns RNKuhns commented May 20, 2021

Reference Issues/PRs

#877

What does this implement/fix? Explain your changes.

Creates Sktime BaseObject class that inherits from scikit-learn's BaseEstimator and only adds the method for returning estimator tags.

Does your contribution introduce a new dependency? If yes, which one?

No

What should a reviewer concentrate their feedback on?

Changes in _base.py. The changes in _meta.py are just to comply with pydocstyle checks (so all of base is compliant).

Any other comments?

Still need to set up the unit testing (want to apply applicable the unit tests that are currently applied to BaseEstimator except those dealing with fit functionality).

PR checklist

For all contributions
  • I've added myself to the list of contributors.
  • Optionally, I've updated sktime's CODEOWNERS to receive notifications about future changes to these files.
  • I've added unit tests and made sure they pass locally.
For new estimators
  • I've added the estimator to the online documentation.
  • I've updated the existing example notebooks or provided a new one to showcase how my estimator works.

@RNKuhns RNKuhns requested a review from mloning May 20, 2021 19:50
Copy link
Copy Markdown
Contributor

@mloning mloning left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @RNKuhns, I think it makes sense to have this conceptually, so happy to merge this. @fkiraly what's your opinion?

@property
def is_fitted(self):
"""Has `fit` been called?"""
"""Wheter `fit` been called."""
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"""Wheter `fit` been called."""
"""Whether `fit` has been called."""

@RNKuhns RNKuhns linked an issue Jun 13, 2021 that may be closed by this pull request
@RNKuhns
Copy link
Copy Markdown
Contributor Author

RNKuhns commented Jun 13, 2021

@mloning do you want me to add any unit tests for BaseObject?

I started to look through the tests in test_all_estimators and could work up tests that make sense for BaseObject based on those.

@fkiraly
Copy link
Copy Markdown
Collaborator

fkiraly commented Jun 17, 2021

Yes, quite happy with this - great idea, and it gives us a place to control functionality that all objects get.

This is out of date with the main branch, but in-principle I'm happy to approve.

@mloning
Copy link
Copy Markdown
Contributor

mloning commented Jun 18, 2021

Hi @RNKuhns, yes adding unit tests would be a good idea. Do we want to test all public objects (not just estimators) inherit from the BaseObject? This would require writing a new functions to get all classes from the repo, similar to all_estimators but more general.

@RNKuhns
Copy link
Copy Markdown
Contributor Author

RNKuhns commented Jun 18, 2021

@mlonging, I had been thinking about something along those lines. Beyond that is there anything else I should look to create a unit test for?

I'll try to get this knocked out this weekend.

@mloning
Copy link
Copy Markdown
Contributor

mloning commented Jun 18, 2021

In that case, I suggest to refactor all_estimators so that it can internally call all_classes and then does some extra filtering on the classes, not sure how much work this involves. Note that the automated unit testing depends on this, so we need to be extra careful! 🙂

If we enforce these tests, we also need to modify a few existing classes which currently do not inherit from the BaseObject such as the cross-validators.

I think we would want to take another look at the tags again at some point in the future, but that seems to be a separate PR.

scikit-learn to make sure it does not interfere with scikit-learn's tag
interface when inheriting from scikit-learn classes. Sktime's
estimator tags are class rather than object attribute as in scikit-learn.
"""
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should properly describe that this implements a kind of inheritance, no?
As far as I understand, if multiple estimators in the inheritance tree have the same tag, the descendant tag overrides the parent tag.
This is important to document, perhaps with an example of what tags are returned.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd also like to rename the _all_tags method to get_tags and make it public if that doesn't interfere with scikit-learn's tags. But perhaps changes to the tag system are best addressed in a separate PR.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor comment: I was wondering, upon a vague feeling, whether this could be just solved by using inheritance smartly? Namely, if we were just looking at the dictionary, and its keys as attribute names, this is precisely inheritance of attributes.

Inside the list, the name for this pattern is inheritance of nested attributes - it doesn't seem python supports this natively. A quick google search indicates it's an occasional topic in various other programming languages as well (mostly ending in the conclusion that the language doesn't support it).

@fkiraly
Copy link
Copy Markdown
Collaborator

fkiraly commented Jul 3, 2021

FYI, @mloning, @RNKuhns, I made some suggestions in #1099 related to tag handling, as an extension to this PR.

The gist of this is:

  • public get_tag and get_tags interface points, as @mloning suggested
  • dynamic tag setting via setters to mirror the getters.

Please review.

@fkiraly
Copy link
Copy Markdown
Collaborator

fkiraly commented Jul 3, 2021

In the next step, I would like to add a get_summary interface point to BaseObject, as discussed in #971. But that should be in the next PR. FYI @ltsaprounis, for checking/commenting whether BaseObject would be the right place to locate get_summary

@fkiraly fkiraly mentioned this pull request Jul 3, 2021
5 tasks
@ltsaprounis
Copy link
Copy Markdown
Contributor

Thanks for pinging me @fkiraly.
BaseObject should be a good place for the get_summary method.
Is the idea to define it as an abstract method of BaseObject and then overwrite?

I had a look at statsmodels to see how this is handled and although I got a bit lost, it looks like they define it as an abstract method of their base class Results and then they implement it for each model (or family of models) e.g. StateSpaceMLEResults

@fkiraly
Copy link
Copy Markdown
Collaborator

fkiraly commented Jul 5, 2021

Is the idea to define it as an abstract method of BaseObject and then overwrite?

Well, in this case one would probably say "implement" rather than overwrite/override since it's abstract.

But yes, exactly!

I had a look at statsmodels to see how this is handled and although I got a bit lost, it looks like they define it as an abstract method of their base class Results

Hm, this looks really weird - the model is wrapped in the Results, rather than the other way round? Do you understand the rationale behind that? I'm sure there must be a good and well-thought out reason?

@ltsaprounis
Copy link
Copy Markdown
Contributor

Hm, this looks really weird - the model is wrapped in the Results, rather than the other way round? Do you understand the rationale behind that? I'm sure there must be a good and well-thought out reason?

Nope - I share your confusion here. Further investigation is required to get to the bottom of this.

fkiraly added a commit that referenced this pull request Jul 17, 2021
Rework of PR #1099 which is in turn a rework of PR #891 (by @RNKuhns), based on discussion recorded in #981 on tags.

Adds a `BaseObject` class which adds dynamic tag functionality. `BaseObject` contains `BaseEstimator` functionality sans `fit` state change related interfaces.
The new functionality replaces the inconsistent mixture of old interface points (`all_tags`, `has_tag`) in all downstream dependencies.

post-PR, tags work like this:
* each class will have "class tags", stored in the content of the class variable `_tags: dict`. Individual class tags are considered to override each other as per normal nested inheritance. E.g., if a parent class and the class have different values for a class tag, the younger class' class tag value counts.
* each object can, in addition, have dynamic tags, in the object variable `_tags_dynamic: dict`. These are expected to be initialized in the constructor, for instance in cases where behaviour described by tags depends on components or hyper-parameters. Dynamic tags are always considered to override the class tags - e.g., if the object's class or a parent class have a static tag of the same name, the dynamic tag's value overrides any class tag's value.

The new tag related methods make use of this inheritance logic are as follows.

class methods:
* `get_class_tags()` collects all class tags and their values and returns them as a `dict` (keys = tag names, values = tag values). It overrides values precisely as described above, in the return dictionary. This replaces the old `_all_tags`.
* `get_class_tag(tag_name, tag_value_default=None)` retrieves the value of an individual tag with name `tag_name`, and optionally allows to specify a default value if the tag is not found. The inheritance/override logic is the same as in `get_class_tags`.

object methods:
* `get_tags()` collects all class tags and dynamic tags and their values and returns them as a `dict` (keys = tag names, values = tag values). It overrides values precisely as described above, in the return dictionary.
* `get_tag(tag_name, tag_value_default=None)` retrieves the value of an individual tag with name `tag_name`, and optionally allows to specify a default value if the tag is not found. The inheritance/override logic is the same as in `get_tags`.
* `set_tags(**tag_dict)` sets dynamic tags according to keys (=tag names) and values (= tag values) in the argument dictionary. It is expected to be used in constructors typically.
* `mirror_tags(estimator, tag_set=None)` is a shorthand version of `set_tags`, where it sets tags according to their `get_tag` values in `estimator`. The tags that are mirrored/copied in this way can be restricted by specifying a list of tag names in `tag_set`, otherwise all tags are copied over. The mirroring is by-reference and not by-value (if the type allows this), but tags are not expected to be mutated.
@fkiraly
Copy link
Copy Markdown
Collaborator

fkiraly commented Jul 17, 2021

superseded by the now merged #1134

@fkiraly fkiraly closed this Jul 17, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Create BaseObject

4 participants