MRG FIX: order of values of self.quantiles_ in QuantileTransformer by tirthasheshpatel · Pull Request #15751 · scikit-learn/scikit-learn

tirthasheshpatel · 2019-12-01T18:02:11Z

When the value of n_quantiles is too large, quantization is too dense which leads to possibility of misordering of values in self.quantiles_.
A quick fix as suggested in #15733 is implemented in this pr.

sklearn/preprocessing/_data.py

NicolasHug

Thanks for the PR @tirthasheshpatel, looks good .

Ideally we would need a non-regression test. Any chance we can have an easier reproducible example of the original issue?

sklearn/preprocessing/_data.py

tirthasheshpatel · 2019-12-02T13:37:53Z

@NicolasHug
I am not sure if the issue can be reproduced easily but I will try with some toy examples...

glemaitre · 2019-12-02T15:28:37Z

I think that we should create a nanpercentile and percentile in fixes.py and implement this fix for all algorithm calling using nanpercentile and percentile. I assume that other algorithms could be silently affected by this issue (e.g. KBinsDiscretizer)

@NicolasHug WDYT?

NicolasHug · 2019-12-02T16:08:03Z

I agree it'd be nice to have a uniform fix.

I'd suggest that we open an issue on numpy, and see what their answer/solution is. They might come with a simple or better fix than np.max.accumulate, which we could then backport. This way we're sure that we stay compatible with numpy. If their fix is too complex then we can go with the current solution.

ogrisel · 2019-12-03T17:02:40Z

I agree but I would also note that I find the solution based on np.maximum.accumulate quite elegant :)

NicolasHug · 2019-12-03T17:05:13Z

@glemaitre can you open the issue on numpy and cc me please ;)

else I'll do it soonish

glemaitre · 2019-12-03T18:10:19Z

@NicolasHug I will do it in the coming hours :)

glemaitre · 2019-12-03T21:54:34Z

I am thinking a bit more about it. np.percentile does not expect q to be an ordered sequence meaning that the np.maximum.accumulate is working for us but it will not necessarily work for numpy.

glemaitre · 2019-12-03T21:58:14Z

FYI The current issue was already reported in NumPy: numpy/numpy#14685

glemaitre

So I would be in favour to merge the fix. @tirthasheshpatel could you add an entry in what's new.
I think that we can add this bug fix in 0.22.1 as well. Could you add an entry in what's new for the 0.22.1 upcoming release.

sklearn/preprocessing/_data.py

Add suggested docstring... Co-Authored-By: Guillaume Lemaitre <g.lemaitre58@gmail.com>

Add suggested docstring Co-Authored-By: Guillaume Lemaitre <g.lemaitre58@gmail.com>

doc/whats_new/v0.22.rst

Co-Authored-By: fcharras <franck@sancare.fr>

jnothman

It would still be nice to add a test... But this lgtm

NicolasHug

LGTM

The what's new in 0.22, so that means we need to add this to the bugfix release
CC @adrinjalali

adrinjalali · 2019-12-09T18:56:01Z

The reproducing example in the issue itself can be a test. I think we should have it here as a test.

import numpy as np
from sklearn.preprocessing import QuantileTransformer
X = np.loadtxt('gistfile1.txt', delimiter=',')
quantile_transformer = QuantileTransformer(n_quantiles=150).fit(X)
print(np.all(np.diff(quantile_transformer.quantiles_, axis=0) >= 0))

NicolasHug · 2019-12-09T19:04:53Z

That would require copying a 300 lines data file though. We usually avoid that

glemaitre · 2019-12-10T17:05:39Z

We might be able to have such an example by ensuring that we have a quantile array as the miniumum example in numpy/numpy#14685

sklearn/preprocessing/_data.py

ogrisel · 2019-12-21T14:37:57Z

I addressed my own comment and added the missing non-regression test. If CI is green, I will merge.

ogrisel · 2019-12-21T14:41:28Z

I should have resolved the merge conflict with master. BTW I made sure that the test was failing on master.

glemaitre · 2019-12-21T14:42:16Z

sklearn/preprocessing/tests/test_data.py

+    # https://github.com/scikit-learn/scikit-learn/issues/15733
+    # Taken from upstream bug report:
+    # https://github.com/numpy/numpy/issues/14685
+    X = np.array([0, 1, 1, 2, 2, 3, 3, 4, 5, 5, 1, 1, 9, 9, 9, 8, 8, 7] * 10)


Shoot, I try to produce the failure running differnent size and way of generating unsuccessfully.

Nice that you got one

The trick was to make that dataset larger than 100 samples, otherwise quantiles_ was limited by X.shape[0].

So I just duplicated the samples 10 times and the monotonicity issue was fortunately still present :)

BTW, I am not sure if quantile_transformer.quantiles_.shape[0] being smaller than quantile_transformer.n_quantiles (when the training set size is too small) is a bug or not.

But that's unrelated to the topic of this PR.

We raise warning in this case so I think that this is fine (at least we expected it)

glemaitre

LGTM

sklearn/preprocessing/tests/test_data.py

…cikit-learn#15751)

@nkish

* DOC fixed default values in dbscan (#15753) * DOC fix incorrect branch reference in contributing doc (#15779) * DOC relabel Feature -> Efficiency in change log (#15770) * DOC fixed Birch default value (#15780) * STY Minior change on code padding in website theme (#15768) * DOC Fix yticklabels order in permutation importances example (#15799) * Fix yticklabels order in permutation importances example * STY Update wrapper width (#15793) * DOC Long sentence was hard to parse and ambiguous in _classification.py (#15769) * DOC Removed duplicate 'classes_' attribute in Naive Bayes classifiers (#15811) * BUG Fixes pandas dataframe bug with boolean dtypes (#15797) * BUG Returns only public estimators in all_estimators (#15380) * DOC improve doc for multiclass and types_of_target (#15333) * TST Increases tol for check_pca_float_dtype_preservation assertion (#15775) * update _alpha_grid class in _coordinate_descent.py (#15835) * FIX Explicit conversion of ndarray to object dtype. (#15832) * BLD Parallelize sphinx builds on circle ci (#15745) * DOC correct url for preprocessing (#15853) * MNT avoid generating too many cross links in examples (#15844) * DOC Correct wrong doc in precision_recall_fscore_support (#15833) * DOC add comment in check_pca_float_dtype_preservation (#15819) Documenting the changes in #15775 * DOC correct indents in docstring _split.py (#15843) * DOC fix docstring of KMeans based on sklearn guideline (#15754) * DOC fix docstring of AgglomerativeClustering based on sklearn guideline (#15764) * DOC fix docstring of AffinityPropagation based on sklearn guideline (#15777) * DOC fixed SpectralCoclustering and SpectralBiclustering docstrings following sklearn guideline (#15778) * DOC fix FeatureAgglomeration and MiniBatchKMeans docstring following sklearn guideline (#15809) * TST Specify random_state in test_cv_iterable_wrapper (#15829) * DOC Include LinearSV{C, R} in models that support sample_weights (#15871) * DOC correct some indents (#15875) * DOC Fix documentation of default values in tree classes (#15870) * DOC fix typo in docstring (#15887) * DOC FIX default value for xticks_rotation in plot_confusion_matrix (#15890) * Fix imports in pip3 ubuntu by suffixing affected files (#15891) * MNT Raise erorr when normalize is invalid in confusion_matrix (#15888) * [MRG] DOC Increases search results for API object results (#15574) * MNT Ignores warning in pyamg for deprecated scipy.random (#15914) * DOC Instructions to troubleshoot Windows path length limit (#15916) * DOC add versionadded directive to some estimators (#15849) * DOC clarify doc-string of roc_auc_score and add references (#15293) * MNT Adds skip lint to azure pipeline CI (#15904) * BLD Fixes bug when building with NO_MATHJAX=1 (#15892) * [MRG] BUG Checks to number of axes in passed in ax more generically (#15760) * EXA Minor fixes in plot_sparse_logistic_regression_20newsgroups.py (#15925) * BUG Do not shadow public functions with deprecated modules (#15846) * Import sklearn._distributor_init first (#15929) * DOC Fix typos, via a Levenshtein-style corrector (#15923) * DOC in canned comment, mention that PR title becomes commit me… (#15935) * DOC/EXA Correct spelling of "Classification" (#15938) * BUG fix pip3 ubuntu update by suffixing file (#15928) * [MRG] Ways to compute center_shift_total were different in "full" and "elkan" algorithms. (#15930) * TST Fixes integer test for train and test indices (#15941) * BUG ensure that parallel/sequential give the same permutation importances (#15933) * Formatting fixes in changelog (#15944) * MRG FIX: order of values of self.quantiles_ in QuantileTransformer (#15751) * [MRG] BUG Fixes constrast in plot_confusion_matrix (#15936) * BUG use zero_division argument in classification_report (#15879) * DOC change logreg solver in plot_logistic_path (#15927) * DOC fix whats new ordering (#15961) * COSMIT use np.iinfo to define the max int32 (#15960) * DOC Apply numpydoc validation to VotingRegressor methods (#15969) Co-authored-by: Tiffany R. Williams <Tiffany8@users.noreply.github.com> * DOC improve naive_bayes.py documentation (#15943) Co-authored-by: Jigna Panchal <40188288+jigna-panchal@users.noreply.github.com> * DOC Fix default values in Perceptron documentation (#15965) * DOC Improve default values in logistic documentation (#15966) Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com> * DOC Improve documentation of default values for imputers (#15964) * EXA/MAINT Simplify code in manifold learning example (#15949) * DOC Improve default values in SGD documentation (#15967) * DOC Improve defaults in neural network documentation (#15968) * FIX use safe_sparse_dot for callable kernel in LabelSpreading (#15868) * BUG Adds attributes back to check_is_fitted (#15947) Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org> * DOC update check_is_fitted what's new * DOC change python-devel to python3-devel for yum. (#15986) * DOC Correct the default value of values_format in plot_confusion_matrix (#15981) * [MRG] MNT Updates pypy to use 7.2.0 (#15954) * FIX Add missing 'values_format' param to disp.plot() in plot_confusion_matrix (#15937) * FIX support scalar values in fit_params in SearchCV (#15863) * support a scalar fit param * pep8 * TST add test for desired behavior * FIX introduce _check_fit_params to validate parameters * DOC update whats new * TST tests both grid-search and randomize-search * PEP8 * DOC revert unecessary change * TST add test for _check_fit_params * olivier comments * TST fixes * DOC whats new * DOC whats new * TST revert type of error * add olivier suggestions * address olivier comments * address thomas comments * PEP8 * comments olivier * TST fix test by passing X * avoid to call twice tocsr * add case column/row sparse in check_fit_param * provide optional indices * TST check content when indexing params * PEP8 * TST update tests to check identity * stupid fix * use a distribution in RandomizedSearchCV * MNT add lightgbm to one of the CI build * move to another build * do not install dependencies lightgbm * MNT comments on the CI setup * address some comments * Test fit_params compat without dependency on lightgbm Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com> Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org> * Remove abstractmethod that silently brake downstream packages (#15996) * FIX restore BaseNB._check_X without abstractmethod decoration (#15997) * Update v0.22 changelog for 0.22.1 (#16002) - set the date - move entry for quantile transformer to the 0.22.1 section - fix alphabetical ordering of modules * STY Removes hidden scroll bar (#15999) * Flake8 fixes * Fix: remove left-over lines that should have been deleted during conflict resolution when rebasing * Fix missing imports * Update version * Fix test_check_is_fitted * Make test_sag_regressor_computed_correctly deterministic (#16003) Fix #15818. Co-authored-by: cgsavard <claire.savard@colorado.edu> Co-authored-by: Joel Nothman <joel.nothman@gmail.com> Co-authored-by: Thomas J Fan <thomasjpfan@gmail.com> Co-authored-by: Matt Hall <matt@agilegeoscience.com> Co-authored-by: Kathryn Poole <kathryn.poole2@gmail.com> Co-authored-by: lucyleeow <jliu176@gmail.com> Co-authored-by: JJmistry <jayminm22@gmail.com> Co-authored-by: Juan Carlos Alfaro Jiménez <JuanCarlos.Alfaro@uclm.es> Co-authored-by: SylvainLan <sylvain.s.lannuzel@gmail.com> Co-authored-by: Nicolas Hug <contact@nicolas-hug.com> Co-authored-by: Hanmin Qin <qinhanmin2005@sina.com> Co-authored-by: Adrin Jalali <adrin.jalali@gmail.com> Co-authored-by: Vachan D A <vachanda@users.noreply.github.com> Co-authored-by: Sambhav Kothari <sambhavs.email@gmail.com> Co-authored-by: wenliwyan <12013376+wenliwyan@users.noreply.github.com> Co-authored-by: shivamgargsya <shivam.gargshya@gmail.com> Co-authored-by: Reshama Shaikh <rs2715@stern.nyu.edu> Co-authored-by: Oliver Urs Lenz <oulenz@users.noreply.github.com> Co-authored-by: Loïc Estève <loic.esteve@ymail.com> Co-authored-by: Brian Wignall <BrianWignall@gmail.com> Co-authored-by: Ritchie Ng <ritchieng@u.nus.edu> Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com> Co-authored-by: inderjeet <43402782+inder128@users.noreply.github.com> Co-authored-by: scibol <scibol@users.noreply.github.com> Co-authored-by: Tirth Patel <tirthasheshpatel@gmail.com> Co-authored-by: Bibhash Chandra Mitra <bibhashm220896@gmail.com> Co-authored-by: Alexandre Gramfort <alexandre.gramfort@m4x.org> Co-authored-by: Tiffany R. Williams <Tiffany8@users.noreply.github.com> Co-authored-by: Jigna Panchal <40188288+jigna-panchal@users.noreply.github.com> Co-authored-by: @nkish <19225359+ankishb@users.noreply.github.com> Co-authored-by: Pulkit Mehta <pulkit_mehta_work@yahoo.com> Co-authored-by: David Breuer <DavidBreuer@users.noreply.github.com> Co-authored-by: Niklas <niklas.sm+github@gmail.com> Co-authored-by: Windber <guolipengyeah@126.com> Co-authored-by: Stephen Blystone <29995339+blynotes@users.noreply.github.com> Co-authored-by: Brigitta Sipőcz <b.sipocz@gmail.com>

…cikit-learn#15751)

@nkish

* DOC fixed default values in dbscan (scikit-learn#15753) * DOC fix incorrect branch reference in contributing doc (scikit-learn#15779) * DOC relabel Feature -> Efficiency in change log (scikit-learn#15770) * DOC fixed Birch default value (scikit-learn#15780) * STY Minior change on code padding in website theme (scikit-learn#15768) * DOC Fix yticklabels order in permutation importances example (scikit-learn#15799) * Fix yticklabels order in permutation importances example * STY Update wrapper width (scikit-learn#15793) * DOC Long sentence was hard to parse and ambiguous in _classification.py (scikit-learn#15769) * DOC Removed duplicate 'classes_' attribute in Naive Bayes classifiers (scikit-learn#15811) * BUG Fixes pandas dataframe bug with boolean dtypes (scikit-learn#15797) * BUG Returns only public estimators in all_estimators (scikit-learn#15380) * DOC improve doc for multiclass and types_of_target (scikit-learn#15333) * TST Increases tol for check_pca_float_dtype_preservation assertion (scikit-learn#15775) * update _alpha_grid class in _coordinate_descent.py (scikit-learn#15835) * FIX Explicit conversion of ndarray to object dtype. (scikit-learn#15832) * BLD Parallelize sphinx builds on circle ci (scikit-learn#15745) * DOC correct url for preprocessing (scikit-learn#15853) * MNT avoid generating too many cross links in examples (scikit-learn#15844) * DOC Correct wrong doc in precision_recall_fscore_support (scikit-learn#15833) * DOC add comment in check_pca_float_dtype_preservation (scikit-learn#15819) Documenting the changes in scikit-learn#15775 * DOC correct indents in docstring _split.py (scikit-learn#15843) * DOC fix docstring of KMeans based on sklearn guideline (scikit-learn#15754) * DOC fix docstring of AgglomerativeClustering based on sklearn guideline (scikit-learn#15764) * DOC fix docstring of AffinityPropagation based on sklearn guideline (scikit-learn#15777) * DOC fixed SpectralCoclustering and SpectralBiclustering docstrings following sklearn guideline (scikit-learn#15778) * DOC fix FeatureAgglomeration and MiniBatchKMeans docstring following sklearn guideline (scikit-learn#15809) * TST Specify random_state in test_cv_iterable_wrapper (scikit-learn#15829) * DOC Include LinearSV{C, R} in models that support sample_weights (scikit-learn#15871) * DOC correct some indents (scikit-learn#15875) * DOC Fix documentation of default values in tree classes (scikit-learn#15870) * DOC fix typo in docstring (scikit-learn#15887) * DOC FIX default value for xticks_rotation in plot_confusion_matrix (scikit-learn#15890) * Fix imports in pip3 ubuntu by suffixing affected files (scikit-learn#15891) * MNT Raise erorr when normalize is invalid in confusion_matrix (scikit-learn#15888) * [MRG] DOC Increases search results for API object results (scikit-learn#15574) * MNT Ignores warning in pyamg for deprecated scipy.random (scikit-learn#15914) * DOC Instructions to troubleshoot Windows path length limit (scikit-learn#15916) * DOC add versionadded directive to some estimators (scikit-learn#15849) * DOC clarify doc-string of roc_auc_score and add references (scikit-learn#15293) * MNT Adds skip lint to azure pipeline CI (scikit-learn#15904) * BLD Fixes bug when building with NO_MATHJAX=1 (scikit-learn#15892) * [MRG] BUG Checks to number of axes in passed in ax more generically (scikit-learn#15760) * EXA Minor fixes in plot_sparse_logistic_regression_20newsgroups.py (scikit-learn#15925) * BUG Do not shadow public functions with deprecated modules (scikit-learn#15846) * Import sklearn._distributor_init first (scikit-learn#15929) * DOC Fix typos, via a Levenshtein-style corrector (scikit-learn#15923) * DOC in canned comment, mention that PR title becomes commit me… (scikit-learn#15935) * DOC/EXA Correct spelling of "Classification" (scikit-learn#15938) * BUG fix pip3 ubuntu update by suffixing file (scikit-learn#15928) * [MRG] Ways to compute center_shift_total were different in "full" and "elkan" algorithms. (scikit-learn#15930) * TST Fixes integer test for train and test indices (scikit-learn#15941) * BUG ensure that parallel/sequential give the same permutation importances (scikit-learn#15933) * Formatting fixes in changelog (scikit-learn#15944) * MRG FIX: order of values of self.quantiles_ in QuantileTransformer (scikit-learn#15751) * [MRG] BUG Fixes constrast in plot_confusion_matrix (scikit-learn#15936) * BUG use zero_division argument in classification_report (scikit-learn#15879) * DOC change logreg solver in plot_logistic_path (scikit-learn#15927) * DOC fix whats new ordering (scikit-learn#15961) * COSMIT use np.iinfo to define the max int32 (scikit-learn#15960) * DOC Apply numpydoc validation to VotingRegressor methods (scikit-learn#15969) Co-authored-by: Tiffany R. Williams <Tiffany8@users.noreply.github.com> * DOC improve naive_bayes.py documentation (scikit-learn#15943) Co-authored-by: Jigna Panchal <40188288+jigna-panchal@users.noreply.github.com> * DOC Fix default values in Perceptron documentation (scikit-learn#15965) * DOC Improve default values in logistic documentation (scikit-learn#15966) Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com> * DOC Improve documentation of default values for imputers (scikit-learn#15964) * EXA/MAINT Simplify code in manifold learning example (scikit-learn#15949) * DOC Improve default values in SGD documentation (scikit-learn#15967) * DOC Improve defaults in neural network documentation (scikit-learn#15968) * FIX use safe_sparse_dot for callable kernel in LabelSpreading (scikit-learn#15868) * BUG Adds attributes back to check_is_fitted (scikit-learn#15947) Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org> * DOC update check_is_fitted what's new * DOC change python-devel to python3-devel for yum. (scikit-learn#15986) * DOC Correct the default value of values_format in plot_confusion_matrix (scikit-learn#15981) * [MRG] MNT Updates pypy to use 7.2.0 (scikit-learn#15954) * FIX Add missing 'values_format' param to disp.plot() in plot_confusion_matrix (scikit-learn#15937) * FIX support scalar values in fit_params in SearchCV (scikit-learn#15863) * support a scalar fit param * pep8 * TST add test for desired behavior * FIX introduce _check_fit_params to validate parameters * DOC update whats new * TST tests both grid-search and randomize-search * PEP8 * DOC revert unecessary change * TST add test for _check_fit_params * olivier comments * TST fixes * DOC whats new * DOC whats new * TST revert type of error * add olivier suggestions * address olivier comments * address thomas comments * PEP8 * comments olivier * TST fix test by passing X * avoid to call twice tocsr * add case column/row sparse in check_fit_param * provide optional indices * TST check content when indexing params * PEP8 * TST update tests to check identity * stupid fix * use a distribution in RandomizedSearchCV * MNT add lightgbm to one of the CI build * move to another build * do not install dependencies lightgbm * MNT comments on the CI setup * address some comments * Test fit_params compat without dependency on lightgbm Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com> Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org> * Remove abstractmethod that silently brake downstream packages (scikit-learn#15996) * FIX restore BaseNB._check_X without abstractmethod decoration (scikit-learn#15997) * Update v0.22 changelog for 0.22.1 (scikit-learn#16002) - set the date - move entry for quantile transformer to the 0.22.1 section - fix alphabetical ordering of modules * STY Removes hidden scroll bar (scikit-learn#15999) * Flake8 fixes * Fix: remove left-over lines that should have been deleted during conflict resolution when rebasing * Fix missing imports * Update version * Fix test_check_is_fitted * Make test_sag_regressor_computed_correctly deterministic (scikit-learn#16003) Fix scikit-learn#15818. Co-authored-by: cgsavard <claire.savard@colorado.edu> Co-authored-by: Joel Nothman <joel.nothman@gmail.com> Co-authored-by: Thomas J Fan <thomasjpfan@gmail.com> Co-authored-by: Matt Hall <matt@agilegeoscience.com> Co-authored-by: Kathryn Poole <kathryn.poole2@gmail.com> Co-authored-by: lucyleeow <jliu176@gmail.com> Co-authored-by: JJmistry <jayminm22@gmail.com> Co-authored-by: Juan Carlos Alfaro Jiménez <JuanCarlos.Alfaro@uclm.es> Co-authored-by: SylvainLan <sylvain.s.lannuzel@gmail.com> Co-authored-by: Nicolas Hug <contact@nicolas-hug.com> Co-authored-by: Hanmin Qin <qinhanmin2005@sina.com> Co-authored-by: Adrin Jalali <adrin.jalali@gmail.com> Co-authored-by: Vachan D A <vachanda@users.noreply.github.com> Co-authored-by: Sambhav Kothari <sambhavs.email@gmail.com> Co-authored-by: wenliwyan <12013376+wenliwyan@users.noreply.github.com> Co-authored-by: shivamgargsya <shivam.gargshya@gmail.com> Co-authored-by: Reshama Shaikh <rs2715@stern.nyu.edu> Co-authored-by: Oliver Urs Lenz <oulenz@users.noreply.github.com> Co-authored-by: Loïc Estève <loic.esteve@ymail.com> Co-authored-by: Brian Wignall <BrianWignall@gmail.com> Co-authored-by: Ritchie Ng <ritchieng@u.nus.edu> Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com> Co-authored-by: inderjeet <43402782+inder128@users.noreply.github.com> Co-authored-by: scibol <scibol@users.noreply.github.com> Co-authored-by: Tirth Patel <tirthasheshpatel@gmail.com> Co-authored-by: Bibhash Chandra Mitra <bibhashm220896@gmail.com> Co-authored-by: Alexandre Gramfort <alexandre.gramfort@m4x.org> Co-authored-by: Tiffany R. Williams <Tiffany8@users.noreply.github.com> Co-authored-by: Jigna Panchal <40188288+jigna-panchal@users.noreply.github.com> Co-authored-by: @nkish <19225359+ankishb@users.noreply.github.com> Co-authored-by: Pulkit Mehta <pulkit_mehta_work@yahoo.com> Co-authored-by: David Breuer <DavidBreuer@users.noreply.github.com> Co-authored-by: Niklas <niklas.sm+github@gmail.com> Co-authored-by: Windber <guolipengyeah@126.com> Co-authored-by: Stephen Blystone <29995339+blynotes@users.noreply.github.com> Co-authored-by: Brigitta Sipőcz <b.sipocz@gmail.com>

FIX: order of values of self.quantiles_ in QuantileTransformer

b13fcf5

fcharras reviewed Dec 2, 2019

View reviewed changes

sklearn/preprocessing/_data.py Outdated Show resolved Hide resolved

FIX: order of values in self.quantiles_ in QuantileTransformer

7bd7846

NicolasHug mentioned this pull request Dec 2, 2019

QuantileTransformer quantiles can be unordered because of rounding errors which cause np.interp to return nonsense results #15733

Closed

NicolasHug reviewed Dec 2, 2019

View reviewed changes

sklearn/preprocessing/_data.py Outdated Show resolved Hide resolved

add comment explaining changes

1196430

tirthasheshpatel added 2 commits December 2, 2019 21:23

Update _data.py

3816224

Update _data.py

b47a786

glemaitre reviewed Dec 6, 2019

View reviewed changes

sklearn/preprocessing/_data.py Outdated Show resolved Hide resolved

sklearn/preprocessing/_data.py Show resolved Hide resolved

tirthasheshpatel and others added 3 commits December 6, 2019 22:33

Update sklearn/preprocessing/_data.py

3eb3aa9

Add suggested docstring... Co-Authored-By: Guillaume Lemaitre <g.lemaitre58@gmail.com>

Update sklearn/preprocessing/_data.py

ebe7590

Add suggested docstring Co-Authored-By: Guillaume Lemaitre <g.lemaitre58@gmail.com>

add fix in whats_new

ec222ea

tirthasheshpatel requested a review from glemaitre December 7, 2019 07:02

fcharras reviewed Dec 7, 2019

View reviewed changes

doc/whats_new/v0.22.rst Outdated Show resolved Hide resolved

Update doc/whats_new/v0.22.rst

c9cf96e

Co-Authored-By: fcharras <franck@sancare.fr>

jnothman approved these changes Dec 8, 2019

View reviewed changes

NicolasHug approved these changes Dec 9, 2019

View reviewed changes

adrinjalali added this to the 0.22.1 milestone Dec 9, 2019

ogrisel reviewed Dec 20, 2019

View reviewed changes

sklearn/preprocessing/_data.py Show resolved Hide resolved

Add non-regression test + reference to upstream numpy issue

911859d

Merge remote-tracking branch 'origin/master' into patch-1

29bce0c

glemaitre reviewed Dec 21, 2019

View reviewed changes

glemaitre approved these changes Dec 21, 2019

View reviewed changes

sklearn/preprocessing/tests/test_data.py Outdated Show resolved Hide resolved

sklearn/preprocessing/tests/test_data.py Outdated Show resolved Hide resolved

Use _convert_container

f81b17c

ogrisel approved these changes Dec 21, 2019

View reviewed changes

ogrisel merged commit 0c0b6da into scikit-learn:master Dec 21, 2019

ogrisel pushed a commit to ogrisel/scikit-learn that referenced this pull request Dec 31, 2019

MRG FIX: order of values of self.quantiles_ in QuantileTransformer (s…

1303834

…cikit-learn#15751)

ogrisel pushed a commit to ogrisel/scikit-learn that referenced this pull request Jan 2, 2020

MRG FIX: order of values of self.quantiles_ in QuantileTransformer (s…

1739eb9

…cikit-learn#15751)

panpiort8 pushed a commit to panpiort8/scikit-learn that referenced this pull request Mar 3, 2020

MRG FIX: order of values of self.quantiles_ in QuantileTransformer (s…

9f8efdb

…cikit-learn#15751)

Uh oh!

Conversation

tirthasheshpatel commented Dec 1, 2019

Uh oh!

Uh oh!

NicolasHug left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

tirthasheshpatel commented Dec 2, 2019

Uh oh!

glemaitre commented Dec 2, 2019

Uh oh!

NicolasHug commented Dec 2, 2019

Uh oh!

ogrisel commented Dec 3, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

NicolasHug commented Dec 3, 2019

Uh oh!

glemaitre commented Dec 3, 2019

Uh oh!

glemaitre commented Dec 3, 2019

Uh oh!

glemaitre commented Dec 3, 2019

Uh oh!

glemaitre left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

jnothman left a comment

Choose a reason for hiding this comment

Uh oh!

NicolasHug left a comment

Choose a reason for hiding this comment

Uh oh!

adrinjalali commented Dec 9, 2019

Uh oh!

NicolasHug commented Dec 9, 2019

Uh oh!

glemaitre commented Dec 10, 2019

Uh oh!

Uh oh!

ogrisel commented Dec 21, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ogrisel commented Dec 21, 2019

Uh oh!

glemaitre Dec 21, 2019

Choose a reason for hiding this comment

Uh oh!

glemaitre Dec 21, 2019

Choose a reason for hiding this comment

Uh oh!

ogrisel Dec 21, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ogrisel Dec 21, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

glemaitre Dec 21, 2019

Choose a reason for hiding this comment

Uh oh!

glemaitre left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

7 participants

ogrisel commented Dec 3, 2019 •

edited

Loading

ogrisel commented Dec 21, 2019 •

edited

Loading

ogrisel Dec 21, 2019 •

edited

Loading

ogrisel Dec 21, 2019 •

edited

Loading