[MRG] Support for infinite values in GBDTs#14406

Merged
ogrisel merged 3 commits into scikit-learn:master from NicolasHug:gbdt_nan
Jul 19, 2019

Conversation

@NicolasHug
Member

ping @ogrisel @adrinjalali

I think we need this merged before the missing values support :)

# This is not strictly True, but it's needed since
# force_all_finite=False means accept both nans and infinite values.
# Without the tag, common checks would fail.
# This comment must be removed once we merge PR 13911
Member

Maybe add a "TODO", we sometimes go through them and it'll be easier to find it then. But if you're gonna fix it yourself, then no big deal.

@adrinjalali
Member

ping when tests pass?

@NicolasHug
Member Author

ping @adrinjalali They pass ^^ it's a docker issue

Member

@ogrisel ogrisel left a comment

LGTM. Just a quick comment to make the atol in a test easier to understand, but no big deal. Feel free to merge without addressing it if you don't like my suggestion :)


gbdt = HistGradientBoostingRegressor(min_samples_leaf=1)
gbdt.fit(X, y)
np.testing.assert_allclose(gbdt.predict(X), y, atol=1e-4)
Member

Why such a high value for atol? Maybe max_iter is too small for the default value of the learning rate? Maybe you could set the learning rate to 1.0, and a single split in a single tree (max_iter=1, max_leaf_nodes=2) would be enough to perfectly fit the data?
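
To make the reasoning behind this suggestion concrete, here is a toy pure-Python sketch of least-squares boosting with decision stumps. It is not scikit-learn's implementation, and the data (`x`, `y`) and the helper names `fit_stump` and `boost` are assumptions made up for illustration. It only shows that with a learning rate of 1.0 a single split can fit a two-level target exactly, even with infinite feature values, whereas with a shrinkage of 0.1 the residual only decays geometrically, so a small error remains after 100 iterations.

```python
import math


def fit_stump(x, residuals):
    """Find the split `x <= thr` that minimizes the squared error.

    Thresholds are taken from the observed values themselves, so no
    arithmetic is ever done on +/-inf.
    Returns (thr, left_value, right_value).
    """
    best = None
    values = sorted(set(x))
    for thr in values[:-1]:  # the largest value cannot be a threshold
        left = [r for xi, r in zip(x, residuals) if xi <= thr]
        right = [r for xi, r in zip(x, residuals) if xi > thr]
        left_value = sum(left) / len(left)
        right_value = sum(right) / len(right)
        sse = sum((r - left_value) ** 2 for r in left)
        sse += sum((r - right_value) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, thr, left_value, right_value)
    return best[1:]


def boost(x, y, learning_rate, n_iter):
    """Least-squares boosting with stumps, starting from the mean of y."""
    baseline = sum(y) / len(y)
    pred = [baseline] * len(y)
    for _ in range(n_iter):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        thr, left_value, right_value = fit_stump(x, residuals)
        pred = [
            pi + learning_rate * (left_value if xi <= thr else right_value)
            for xi, pi in zip(x, pred)
        ]
    return pred


# Hypothetical data: one feature containing infinite values.
x = [-math.inf, 0.0, 1.0, math.inf]
y = [0.0, 0.0, 1.0, 1.0]

one_step = boost(x, y, learning_rate=1.0, n_iter=1)
shrunk = boost(x, y, learning_rate=0.1, n_iter=100)
max_err = max(abs(p - t) for p, t in zip(shrunk, y))

print(one_step)  # exact fit after a single split
print(max_err)   # small but non-zero leftover residual
```

In this sketch each iteration with a learning rate of 0.1 removes 10% of the remaining residual, so after 100 iterations the initial residual of 0.5 is scaled by 0.9**100, leaving an error on the order of 1e-5. That is one plausible reading of why a tolerance like 1e-4 would be needed when the learning rate and iteration count are left at shrinkage-style values.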

@ogrisel
Member

ogrisel commented Jul 19, 2019

I launched a rebuild of azure and circle as the failures did not look related to this PR.

@ogrisel
Member

ogrisel commented Jul 19, 2019

The tests pass. Let's merge, we can always improve the test later :)

@ogrisel ogrisel merged commit dd78658 into scikit-learn:master Jul 19, 2019
3 participants