Unable to reproduce OneHotEncoder example from the docs #685

@SultanOrazbayev

This example from the OneHotEncoder API reference raises a ValueError when transform is called:

from dask_ml.preprocessing import OneHotEncoder
import numpy as np
import dask.array as da
enc = OneHotEncoder()
X = da.from_array(np.array([['A'], ['B'], ['A'], ['C']]), chunks=2)
enc.fit(X)
enc.categories_
enc.transform(X)

This is the traceback:

ValueError                                Traceback (most recent call last)
<ipython-input-1-f54891b18539> in <module>
      6 enc.fit(X)
      7 enc.categories_
----> 8 enc.transform(X)

~/myenv/lib/python3.7/site-packages/dask_ml/preprocessing/_encoders.py in transform(self, X)
    211         self, X: Union[ArrayLike, DataFrameType]
    212     ) -> Union[ArrayLike, DataFrameType]:
--> 213         return self._transform(X)
    214 
    215     def _transform_new(

~/myenv/lib/python3.7/site-packages/dask_ml/preprocessing/_encoders.py in _transform(self, X, handle_unknown)
    243                 for i in range(n_features)
    244             ]
--> 245             X = da.concatenate(Xs, axis=1)
    246 
    247             if not self.sparse:

~/myenv/lib/python3.7/site-packages/dask/array/core.py in concatenate(seq, axis, allow_unknown_chunksizes)
   3480         raise ValueError("Need array(s) to concatenate")
   3481 
-> 3482     meta = np.concatenate([meta_from_array(s) for s in seq], axis=axis)
   3483 
   3484     # Promote types to match meta

<__array_function__ internals> in concatenate(*args, **kwargs)

ValueError: zero-dimensional arrays cannot be concatenated
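
For context, the final error message can be reproduced with NumPy alone. The sketch below only shows what np.concatenate does when it is handed zero-dimensional arrays; it assumes (unconfirmed) that the meta arrays built inside da.concatenate end up zero-dimensional for this input. The helper name concat_error_message is hypothetical and used only for illustration.

```python
import numpy as np


def concat_error_message(arrays, axis=0):
    """Return the ValueError message raised by np.concatenate, or None on success."""
    try:
        np.concatenate(arrays, axis=axis)
    except ValueError as exc:
        return str(exc)
    return None


# Zero-dimensional (scalar-like) arrays: np.concatenate refuses these.
zero_dim = [np.array("A"), np.array("B")]
print(concat_error_message(zero_dim))

# Two-dimensional column arrays concatenate along axis=1 without error.
two_dim = [np.array([["A"]]), np.array([["B"]])]
print(concat_error_message(two_dim, axis=1))
```

If that assumption holds, the problem would be in how the per-column arrays (or their meta) are built before the concatenate call, not in the concatenate call itself.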

Environment:

  • Dask version: 2.19.0
  • numpy version: 1.18.5
  • Python version: 3.7.6
  • Install method (conda, pip, source): conda
