Skip to content

[MRG+1] FIX Isotonic Regression for duplicate minimal value#3995

Merged
jnothman merged 3 commits intoscikit-learn:masterfrom
jmetzen:fix_isotonic
Dec 24, 2014
Merged

[MRG+1] FIX Isotonic Regression for duplicate minimal value#3995
jnothman merged 3 commits intoscikit-learn:masterfrom
jmetzen:fix_isotonic

Conversation

@jmetzen
Copy link
Copy Markdown
Member

@jmetzen jmetzen commented Dec 23, 2014

Fixes an issue of Isotonic Regression when the minimal value of fitting is duplicated, e.g.:

ir = IsotonicRegression(increasing=True, out_of_bounds="clip")
ir.fit([0, 0, 1], [0, 0, 1])
ir.predict([0])

would return nan (see test_isotonic_duplicate_min_entry). The deeper reason for this seems to be in interpolate.interp1d. This issue is fixed by clipping not to minimal value observed in fitting but to minimal value + np.finfo(float).resolution.

@jmetzen jmetzen mentioned this pull request Dec 23, 2014
5 tasks
@ogrisel
Copy link
Copy Markdown
Member

ogrisel commented Dec 23, 2014

LGTM.

@ogrisel ogrisel changed the title FIX Isotonic Regression for duplicate minimal value [MRG+1] FIX Isotonic Regression for duplicate minimal value Dec 23, 2014
@ogrisel
Copy link
Copy Markdown
Member

ogrisel commented Dec 23, 2014

Actually it would be great to try to understand what is the issue with interp1d.

@jmetzen
Copy link
Copy Markdown
Member Author

jmetzen commented Dec 23, 2014

The following happens when the values are passed directly to interp1d:

In [7]: f = interpolate.interp1d([0, 0, 1], [0, 0, 1], kind='linear')
In [8]: f(0)
 /home/jmetzen/.anaconda/lib/python2.7/site-packages/scipy/interpolate/interpolate.py:445: RuntimeWarning: invalid value encountered in true_divide
 slope = (y_hi - y_lo) / (x_hi - x_lo)[:, None]
Out[8]: array(nan)

Everything works fine in the case without duplicates:

In [13]: f = interpolate.interp1d([0, 1], [0,  1], kind='linear')
In [14]: f(0)
Out[14]: array(0.0)

I will open a related bug report in scipy. For the moment, the workaround of this PR should fix the issue.

@ogrisel
Copy link
Copy Markdown
Member

ogrisel commented Dec 23, 2014

Ok once you have the issue reported to scipy, please add an inline comment with the scipy issue number next to the line with the workaround in the scikit-learn source code.

@agramfort
Copy link
Copy Markdown
Member

LGTM too

@ogrisel
Copy link
Copy Markdown
Member

ogrisel commented Dec 24, 2014

@jmetzen have you opened the issue at scipy? I would like to reference it before we merge this workaround to sklearn master to ensure traceability.

@jmetzen
Copy link
Copy Markdown
Member Author

jmetzen commented Dec 24, 2014

The scipy issue is 4304 (scipy/scipy#4304) and I've added an inline comment in isotonic_regression.

@coveralls
Copy link
Copy Markdown

Coverage Status

Coverage increased (+0.0%) when pulling d255866 on jmetzen:fix_isotonic into d5c57a7 on scikit-learn:master.

@jnothman
Copy link
Copy Markdown
Member

Then this looks good to me too. Merging. Thanks!

jnothman added a commit that referenced this pull request Dec 24, 2014
[MRG] FIX Isotonic Regression for duplicate minimal value
@jnothman jnothman merged commit 636502f into scikit-learn:master Dec 24, 2014
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no newline :P

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, my bad. this PR is already merged but I will fix it in the related PR #1176

@amueller
Copy link
Copy Markdown
Member

Should we not rather use one of the recommended interpolation routines such as
>>> fs = interp1d([0, 0, 1], [0, 0, 1], kind='slinear')?

@jmetzen
Copy link
Copy Markdown
Member Author

jmetzen commented Jan 14, 2015

I tried to avoid any changes that would cause differences in the learned model. But you are right, kind="linear" and kind="slinear" give identical results as far as I can see it (besides the nan issue, which does not occur in slinear). See also here: scipy/scipy#4304
We should use slinear in isotonic as it is a less hacky .

raghavrv added a commit to raghavrv/scikit-learn that referenced this pull request Jan 17, 2015
agramfort added a commit that referenced this pull request Jan 17, 2015
[MRG+1] MAINT Remove temporary fix #3995 in view of the change to slinear.
@ev-br
Copy link
Copy Markdown
Contributor

ev-br commented Oct 22, 2016

Just a note that the workaround from 2014, scipy/scipy#4304, no longer works.

@ogrisel
Copy link
Copy Markdown
Member

ogrisel commented Oct 24, 2016

@ev-br thanks for the heads up but as of 0.18 we don't use kind='slinear' anymore.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants