[MRG] BUG ensure object array are properly casted when dtype=object by alexshacked · Pull Request #16076 · scikit-learn/scikit-learn

alexshacked · 2020-01-09T16:22:20Z

Fix a bug where calling np.array(..., dtype=object) creates an N-D array while the algorithms expect a 1-D array with objects inside (similar to a list).
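
To make the failure mode concrete, here is a minimal sketch using NumPy only. The neighbor lists are made up for illustration and are not taken from the PR. When every inner array has the same length, np.array(..., dtype=object) stacks them into a 2-D array, whereas preallocating a 1-D object array and assigning into it keeps one object per entry.

```python
import numpy as np

# Hypothetical neighbor lists: every query point happens to have 2 neighbors.
neighbors = [np.array([0, 1]), np.array([1, 2]), np.array([0, 2])]

# Direct construction stacks the equal-length arrays into a 2-D array.
stacked = np.array(neighbors, dtype=object)
print(stacked.shape)  # (3, 2)

# Allocating a 1-D object array first, then assigning, keeps one array per slot.
as_objects = np.empty(len(neighbors), dtype=object)
as_objects[:] = neighbors
print(as_objects.shape)  # (3,)
```

Code that iterates over the result and expects each element to be a whole neighbor array only behaves as intended in the second case.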

…X, but not in 0.21.3 (scikit-learn#16036)

thomasjpfan

Thank you for the PR @alexshacked !

sklearn/neighbors/tests/test_neighbors.py

glemaitre

Looks good. A couple of changes.

Please add an entry to the change log at doc/whats_new/v0.20.rst under bug fixes. Like the other entries there, please reference this pull request with :issue: and credit yourself (and other contributors if applicable) with :user:

sklearn/neighbors/_base.py

sklearn/neighbors/tests/test_neighbors.py

sklearn/neighbors/_base.py

jnothman · 2020-01-09T21:42:51Z

I see a similar idiom three times in the radius_neighbors method in this file

glemaitre · 2020-01-09T22:05:27Z

Oh I see, I was searching for np.array(..., dtype=object).
Then I agree to move it, either as it is now (I would rename it _to_object_array) or even into utils if we have something similar in other files. I will look into it.

glemaitre · 2020-01-09T22:14:40Z

So we need to call _to_object_array for sklearn.neighbors._base: l.945-948; l.952-953

NB: I searched for the pattern [:] = and filtered for cases where it was preceded by the creation of a numpy object array.

sklearn/neighbors/_base.py

jnothman · 2020-01-09T22:37:13Z

So we need to call _to_object_array for sklearn.neighbors._base: l.945-948; l.952-953

I see similar here:

sklearn/preprocessing/tests/test_label.py=435=def test_multilabel_binarizer_non_integer_labels():
sklearn/preprocessing/tests/test_label.py:436:    tuple_classes = np.empty(3, dtype=object)
sklearn/preprocessing/tests/test_label.py-437-    tuple_classes[:] = [(1,), (2,), (3,)]
--
sklearn/neighbors/_classification.py:541:            pred_labels = np.zeros(len(neigh_ind), dtype=object)
sklearn/neighbors/_classification.py-542-            pred_labels[:] = [_y[ind, k] for ind in neigh_ind]

but otherwise agree it's all in radius_neighbors
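
The tuple case from the grep output above can be sketched as follows. This is an illustration of why the test preallocates with np.empty before assigning, not code from the PR; the variable name direct is introduced here only for comparison.

```python
import numpy as np

labels = [(1,), (2,), (3,)]

# Direct construction unpacks the tuples into a second axis.
direct = np.array(labels, dtype=object)
print(direct.shape)  # (3, 1)

# Preallocating a 1-D object array and assigning keeps each tuple whole.
tuple_classes = np.empty(3, dtype=object)
tuple_classes[:] = labels
print(tuple_classes[0])  # (1,)
```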

alexshacked · 2020-01-09T22:40:31Z

@glemaitre change log is in v0.20.rst? I thought v0.23.rst

glemaitre · 2020-01-09T22:41:09Z

v0.23.rst

glemaitre · 2020-01-09T22:41:35Z

Ups my automatic answering is broken :)

alexshacked · 2020-01-09T22:52:18Z

ok. v0.23 then. Thanks @glemaitre

glemaitre

We will need to apply the _to_object_array function to the following lines:

sklearn/preprocessing/tests/test_label.py=435=def test_multilabel_binarizer_non_integer_labels():
sklearn/preprocessing/tests/test_label.py:436:    tuple_classes = np.empty(3, dtype=object)
sklearn/preprocessing/tests/test_label.py-437-    tuple_classes[:] = [(1,), (2,), (3,)]

I propose to add a docstring (as you did earlier) and move the _to_object_array function into sklearn/utils/__init__.py. Then, we can import it in neighbors and preprocessing.
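
A minimal sketch of what such a helper could look like, assuming it simply wraps the [:] = idiom discussed in this thread. The function name comes from the review comments; the body and docstring here are an illustration and may differ from what was finally merged.

```python
import numpy as np


def _to_object_array(sequence):
    """Convert a sequence to a 1-D NumPy array of dtype object.

    Sketch only: np.array(sequence, dtype=object) may return an N-D array
    when the items have equal length, so allocate first and assign after.
    """
    out = np.empty(len(sequence), dtype=object)
    out[:] = sequence
    return out


# Equal-length and ragged inputs both give a 1-D array of length 2.
equal = _to_object_array([np.array([0]), np.array([1])])
ragged = _to_object_array([np.array([0]), np.array([1, 2])])
print(equal.shape, ragged.shape)  # (2,) (2,)
```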

We just need to add a small test in sklearn/utils/tests/test_utils.py to check the expected behavior:

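import numpy as np
import pytest

from sklearn.utils import _to_object_array


@pytest.mark.parametrize(
    "sequence",
    [[np.array(1), np.array(2)], [[1, 2], [3, 4]]]
)
def test_to_object_array(sequence):
    # The helper should always return a 1-D array of dtype object.
    out = _to_object_array(sequence)
    assert isinstance(out, np.ndarray)
    assert out.dtype.kind == 'O'
    assert out.ndim == 1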

doc/whats_new/v0.23.rst

sklearn/neighbors/_base.py

alexshacked · 2020-01-10T16:27:04Z

Hi @glemaitre. Moved function to_object_array() to sklearn.utils and changed the message in the change log of v0.23.rst

glemaitre

Apart from making the function private, LGTM. @alexshacked you can accept my suggestion and this would be enough.

sklearn/utils/__init__.py

sklearn/preprocessing/tests/test_label.py

sklearn/utils/__init__.py

sklearn/neighbors/_base.py

alexshacked · 2020-01-10T18:17:44Z

Sorry about this, @glemaitre. I thought one underscore meant private inside the class, not private inside the package. I will restore the underscore.

glemaitre · 2020-01-13T10:32:07Z

LGTM. @jnothman @thomasjpfan Could you have a look? I added the Regression label and tagged it as a candidate for 0.22.2.

TomDLT

LGTM

sklearn/utils/__init__.py

doc/whats_new/v0.23.rst

sklearn/utils/__init__.py

alexshacked · 2020-01-13T18:34:10Z

Thanks for your comments @TomDLT. Will apply them all.

TomDLT · 2020-01-15T02:07:13Z

Thanks @alexshacked !

…-learn#16076)

* FIX ensure object array are properly casted when dtype=object (#16076)
* DOC Docstring example of classifier should import classifier (#16430)
* MNT Update nightly build URL and release staging config (#16435)
* BUG ensure that estimator_name is properly stored in the ROC display (#16500)
* BUG ensure that name is properly stored in the precision/recall display (#16505)
* ENH Perform KNN imputation without O(n^2) memory cost (#16397)
* bump scikit-learn version for binder
* bump version to 0.22.2
* MNT Skips failing SpectralCoclustering doctest (#16232)
* TST Updates test for deprecation in pandas.SparseArray (#16040)
* move 0.22.2 what's new entries (#16586)
* add 0.22.2 in the news of the web site frontpage
* skip test_ard_accuracy_on_easy_problem

Co-authored-by: alexshacked <al.shacked@gmail.com>
Co-authored-by: Oleksandr Pavlyk <oleksandr-pavlyk@users.noreply.github.com>
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Co-authored-by: Joel Nothman <joel.nothman@gmail.com>
Co-authored-by: Thomas J Fan <thomasjpfan@gmail.com>

…-learn#16076)

[MRG] Using dbscan with precomputed neighbors gives an error in 0.22.…

c2abc89

…X, but not in 0.21.3 (scikit-learn#16036)

thomasjpfan reviewed Jan 9, 2020

sklearn/neighbors/tests/test_neighbors.py (outdated)

sklearn/neighbors/tests/test_neighbors.py (outdated)

glemaitre changed the title ~~[MRG] Using dbscan with precomputed neighbors gives an error in 0.22.…~~ [MRG] BUG ensure object array are properly casted when dtype=object Jan 9, 2020

glemaitre requested changes Jan 9, 2020

jnothman reviewed Jan 9, 2020

sklearn/neighbors/_base.py (outdated)

sklearn/neighbors/_base.py (outdated)

glemaitre reviewed Jan 9, 2020

alexshacked added 2 commits January 10, 2020 00:44

[MRG] Using dbscan with precomputed neighbours. (scikit-learn#16036) …

06f01d4

…in regression testing also distances

[MRG] Using dbscan with precomputed neighbours. (scikit-learn#16036) …

0fca6f0

…PEP fixes

alexshacked and others added 11 commits January 10, 2020 01:12

[MRG] Using dbscan with precomputed neighbours. (scikit-learn#16036) …

887f154

…change log

Update sklearn/neighbors/_base.py

e582a40

Co-Authored-By: Guillaume Lemaitre <g.lemaitre58@gmail.com>

Update sklearn/neighbors/_base.py

eda4787

Co-Authored-By: Guillaume Lemaitre <g.lemaitre58@gmail.com>

Update sklearn/neighbors/tests/test_neighbors.py

eedad66

Co-Authored-By: Guillaume Lemaitre <g.lemaitre58@gmail.com>

Update sklearn/neighbors/tests/test_neighbors.py

bfeee92

Co-Authored-By: Guillaume Lemaitre <g.lemaitre58@gmail.com>

Update sklearn/neighbors/_base.py

e13297c

Co-Authored-By: Guillaume Lemaitre <g.lemaitre58@gmail.com>

Update sklearn/neighbors/_base.py

a512168

Co-Authored-By: Guillaume Lemaitre <g.lemaitre58@gmail.com>

Update sklearn/neighbors/_base.py

a538fac

Co-Authored-By: Guillaume Lemaitre <g.lemaitre58@gmail.com>

Update sklearn/neighbors/_base.py

19b23b2

Co-Authored-By: Guillaume Lemaitre <g.lemaitre58@gmail.com>

Update sklearn/neighbors/_base.py

af6b74d

Co-Authored-By: Guillaume Lemaitre <g.lemaitre58@gmail.com>

(scikit-learn#16036) refactoring with new function _to_object_array()

e2d9f8b

glemaitre reviewed Jan 10, 2020

doc/whats_new/v0.23.rst (outdated)

doc/whats_new/v0.23.rst (outdated)

doc/whats_new/v0.23.rst (outdated)

sklearn/neighbors/_base.py (outdated)

alexshacked and others added 2 commits January 10, 2020 15:44

Update doc/whats_new/v0.23.rst

93d87d5

Co-Authored-By: Guillaume Lemaitre <g.lemaitre58@gmail.com>

Update doc/whats_new/v0.23.rst

71f9c7f

Co-Authored-By: Guillaume Lemaitre <g.lemaitre58@gmail.com>

alexshacked and others added 3 commits January 10, 2020 15:48

Update sklearn/neighbors/_base.py

33b68e5

Co-Authored-By: Guillaume Lemaitre <g.lemaitre58@gmail.com>

[MRG] Using dbscan with precomputed neighbours. (scikit-learn#16036) …

5edfeb6

…moved to_object_array() to sklearn.utils

[MRG] Using dbscan with precomputed neighbours. (scikit-learn#16036)…

d3e05dd

… fix PEP8 errors

glemaitre approved these changes Jan 10, 2020

alexshacked and others added 11 commits January 10, 2020 20:18

Update sklearn/preprocessing/tests/test_label.py

a992048

Co-Authored-By: Guillaume Lemaitre <g.lemaitre58@gmail.com>

Update sklearn/utils/__init__.py

ece10a1

Co-Authored-By: Guillaume Lemaitre <g.lemaitre58@gmail.com>

Update sklearn/neighbors/_base.py

6a75af4

Co-Authored-By: Guillaume Lemaitre <g.lemaitre58@gmail.com>

Update sklearn/neighbors/_base.py

83df8c6

Co-Authored-By: Guillaume Lemaitre <g.lemaitre58@gmail.com>

Update sklearn/neighbors/_base.py

3508571

Co-Authored-By: Guillaume Lemaitre <g.lemaitre58@gmail.com>

Update sklearn/neighbors/_base.py

8ddd3e5

Co-Authored-By: Guillaume Lemaitre <g.lemaitre58@gmail.com>

Update sklearn/neighbors/_base.py

c302eb9

Co-Authored-By: Guillaume Lemaitre <g.lemaitre58@gmail.com>

Update sklearn/neighbors/_base.py

cf8ada9

Co-Authored-By: Guillaume Lemaitre <g.lemaitre58@gmail.com>

[MRG] Using dbscan with precomputed neighbours. (scikit-learn#16036) …

8df3f1c

…removed unnecessary comment

fixed import statement

aaf778e

[MRG] Using dbscan with precomputed neighbours. (scikit-learn#16036) …

e5dcb28

…restored underscore

glemaitre added the Regression label Jan 13, 2020

glemaitre added this to the 0.22.2 milestone Jan 13, 2020

TomDLT approved these changes Jan 13, 2020

sklearn/utils/__init__.py

doc/whats_new/v0.23.rst (outdated)

sklearn/utils/__init__.py (outdated)

alexshacked and others added 2 commits January 13, 2020 23:19

[MRG] Using dbscan with precomputed neighbours. (scikit-learn#16036) …

cf31a9e

…improved _to_object_array() documentation

add empty line

6415d4b

TomDLT merged commit c4ea377 into scikit-learn:master Jan 15, 2020

thomasjpfan pushed a commit to thomasjpfan/scikit-learn that referenced this pull request Feb 22, 2020

FIX ensure object array are properly casted when dtype=object (scikit…

0791946

…-learn#16076)

jeremiedbb pushed a commit to jeremiedbb/scikit-learn that referenced this pull request Feb 28, 2020

FIX ensure object array are properly casted when dtype=object (scikit…

9c85f44

…-learn#16076)

panpiort8 pushed a commit to panpiort8/scikit-learn that referenced this pull request Mar 3, 2020

FIX ensure object array are properly casted when dtype=object (scikit…

7afde81

…-learn#16076)
