Skip to content

Add requires_positive_y estimator tag#14095

Merged
jnothman merged 5 commits intoscikit-learn:masterfrom
rth:estimator-tag-positive-y
Jun 24, 2019
Merged

Add requires_positive_y estimator tag#14095
jnothman merged 5 commits intoscikit-learn:masterfrom
rth:estimator-tag-positive-y

Conversation

@rth
Copy link
Copy Markdown
Member

@rth rth commented Jun 14, 2019

This adds a requires_positive_y estimator tag, for estimators that only work with a positive y for regression, such as Poisson regression in the GLM PR #9405.

This tag only makes sense for regressors, but this is not enforced. Estimator tag validation in general could be done in some other PR.

The GLM PR distinguishes estimators, that work with strictly positive y as well as positive or zero. Here I only consider the strictly positive case, as I think for common tests that is enough.

whether the estimator requires positive X.

requires_positive_y
whether the estimator requires a positive y (only applicable for regression).
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should call the one above requires_positive_X to be consistent, not sure. In any case, I am not so keen on requires_positive_target if we try to reach naming consistency in the other direction.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you not like target? I'm fine either way.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wanna actually add the one above and do the same thing you did for y? The tag is not used in the estimator or tags yet, I think.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you not like target? I'm fine either way.

Mostly to be consistent with fit(X, y), but it's not too critical I agree.

Wanna actually add the one above and do the same thing you did for y? The tag is not used in the estimator or tags yet, I think.

Sure will make another PR.


# doesn't error on actual estimator
LogisticRegression,
LogisticRegression(),
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously this run on AdaBoostClassifier which took significant time (around 3 sec) for two checks. Since it looks like this only intends to check that initialized/non initialized estimators work, replaced it by a faster estimator.

AdaBoostClassifier is picked up by common tests anyway.

@rth
Copy link
Copy Markdown
Member Author

rth commented Jun 15, 2019

(The test failure in one of the jobs is unrelated).

@rth
Copy link
Copy Markdown
Member Author

rth commented Jun 20, 2019

In case you are able to have a quick look (should be easy to review) @thomasjpfan @glemaitre. Already has a +1 )

@rth
Copy link
Copy Markdown
Member Author

rth commented Jun 21, 2019

I think I have addressed all comments. If there aren't new ones, could someone please merge this then, as this PR has a +2? Thanks!

Copy link
Copy Markdown
Member

@jnothman jnothman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should just get some consensus on naming: requires_positive_y vs requires_positive_target. @rth prefers the former, I think I prefer the latter for consistency with requires_positive_data. Other opinions? Does it matter if tags are still experimental?

@thomasjpfan
Copy link
Copy Markdown
Member

We have X_types instead of data_types, while also using requires_positive_data. I am +0.5 on using X and y since it used in all our function signatures.

TLDR: +0.5 on requires_positive_y and changing requires_positive_data to requires_positive_X.

@rth
Copy link
Copy Markdown
Member Author

rth commented Jun 24, 2019

OK, changed requires_positive_data to requires_positive_X for consistency with X_types and requires_positive_y. Though no strong objections on this, can change those to _data and _target if necessary.

Copy link
Copy Markdown
Member

@jnothman jnothman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's also more consistent with X_types.

Co-Authored-By: Joel Nothman <joel.nothman@gmail.com>
@rth
Copy link
Copy Markdown
Member Author

rth commented Jun 24, 2019

Thanks, @jnothman -- I added your suggestion! This should be good to merge then (CI is green)?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants