[BUG] ColumnEnsembleClassifier fails fitting KNN classifiers #578

@isma3ilsamir

Description

Describe the bug
I am trying to classify a multivariate dataset using a ColumnEnsembleClassifier. I use the same classifier for all dimensions.
It fails to fit the dataset when I use KNN classifiers, but works when I use TimeSeriesForestClassifier.

I tried KNN with different metrics ('msm', 'dtw', and 'euclidean', the last one imported from scipy); all of them fail.
I also tried it with the AtrialFibrillation and BasicMotions datasets; it fails for both.

To Reproduce

from sktime.utils.data_io import load_from_arff_to_dataframe as load_arff
X_train, y_train = load_arff('path_to_dataset_TRAIN.arff')

from sktime.classification.compose import ColumnEnsembleClassifier
from sktime.classification.distance_based._time_series_neighbors import KNeighborsTimeSeriesClassifier
from sktime.classification.compose import TimeSeriesForestClassifier

########## This one works fine ##########
clf = ColumnEnsembleClassifier(estimators=[
    ('TSF_0', TimeSeriesForestClassifier(verbose=0, n_jobs=-1), [0]),
    ('TSF_1', TimeSeriesForestClassifier(verbose=0, n_jobs=-1), [1]),
])
clf.fit(X_train, y_train)
ColumnEnsembleClassifier(estimators=[
                                       ('TSF_0',TimeSeriesForestClassifier(n_jobs=-1),[0]),
                                       ('TSF_1',TimeSeriesForestClassifier(n_jobs=-1),[1])
                                     ])


########## This one fails ########## 
clf = ColumnEnsembleClassifier(estimators=[
    ('1NN-MSM_0', KNeighborsTimeSeriesClassifier(metric='msm'), [0]),
    ('1NN-MSM_1', KNeighborsTimeSeriesClassifier(metric='msm'), [1]),
])
clf.fit(X_train, y_train)

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
~\Desktop\OVGU\DKE Subjects\Master Thesis\Code\run.py in <module>
----> 1 clf.fit(X_train, y_train)

~\.virtualenvs\Code-OH4xsw-D\lib\site-packages\sktime\classification\compose\_column_ensemble.py in fit(self, X, y)
    157         for name, estimator, column in self._iter(replace_strings=True):
    158             estimator = clone(estimator)
--> 159             estimator.fit(_get_column(X, column), transformed_y)
    160             estimators_.append((name, estimator, column))
    161

~\.virtualenvs\Code-OH4xsw-D\lib\site-packages\sktime\classification\distance_based\_time_series_neighbors.py in fit(self, X, y)
    243             check_array.__code__ = _check_array_ts.__code__
    244
--> 245         fx = self._fit(X)
    246
    247         if hasattr(check_array, "__wrapped__"):

~\.virtualenvs\Code-OH4xsw-D\lib\site-packages\sklearn\neighbors\_base.py in _fit(self, X, y)
    362             if not isinstance(X, (KDTree, BallTree, NeighborsBase)):
    363                 X, y = self._validate_data(X, y, accept_sparse="csr",
--> 364                                            multi_output=True)
    365
    366             if is_classifier(self):

~\.virtualenvs\Code-OH4xsw-D\lib\site-packages\sklearn\base.py in _validate_data(self, X, y, reset, validate_separately, **check_params)
    413             if self._get_tags()['requires_y']:
    414                 raise ValueError(
--> 415                     f"This {self.__class__.__name__} estimator "
    416                     f"requires y to be passed, but the target y is None."
    417                 )

ValueError: This KNeighborsTimeSeriesClassifier estimator requires y to be passed, but the target y is None.
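
Reading only the traceback above, the failing call chain appears to be: `KNeighborsTimeSeriesClassifier.fit` calls `self._fit(X)` without `y`, and the sklearn `_fit(self, X, y)` shown in `_base.py` then reaches a check that raises when `y` is None. The sketch below mimics that pattern with hypothetical stand-in classes (`NeighborsBaseSketch`, `KNNSketch`, `KNNSketchForwardingY`, and `try_fit` are invented for illustration and are not sklearn or sktime code). It is meant only to illustrate the suspected mechanism; whether forwarding `y` is the right fix inside sktime is an assumption.

```python
# Hypothetical stand-ins that mimic the call pattern visible in the traceback.
# These are NOT the real sklearn/sktime classes.


class NeighborsBaseSketch:
    """Mimics a base class whose _fit validates that y was passed."""

    requires_y = True

    def _fit(self, X, y=None):
        # Mirrors the check shown in sklearn/base.py in the traceback
        if y is None and self.requires_y:
            raise ValueError(
                f"This {self.__class__.__name__} estimator "
                f"requires y to be passed, but the target y is None."
            )
        self.X_, self.y_ = X, y
        return self


class KNNSketch(NeighborsBaseSketch):
    """Mimics the fit() frame in the traceback: y is not forwarded."""

    def fit(self, X, y):
        return self._fit(X)  # y is dropped here, so the base-class check fails


class KNNSketchForwardingY(NeighborsBaseSketch):
    """Same pattern, but y is forwarded to _fit."""

    def fit(self, X, y):
        return self._fit(X, y)


def try_fit(estimator, X, y):
    """Return 'ok' on success, or the error message on failure."""
    try:
        estimator.fit(X, y)
        return "ok"
    except ValueError as err:
        return str(err)
```

If this reading is correct, the error would not depend on ColumnEnsembleClassifier at all, so fitting a single KNeighborsTimeSeriesClassifier directly on one dimension may be a quicker way to confirm it.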

Expected behavior
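ColumnEnsembleClassifier should fit the training data when its estimators are KNeighborsTimeSeriesClassifier instances, as it does with TimeSeriesForestClassifier.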

Versions

System:
python: 3.7.4 (tags/v3.7.4:e09359112e, Jul 8 2019, 20:34:20) [MSC v.1916 64 bit (AMD64)]
executable: C: ...\....\.virtualenvs\Code-OH4xsw-D\Scripts\python.exe
machine: Windows-10-10.0.18362-SP0

Python dependencies:
pip: 20.3.1
setuptools: 50.3.2
sklearn: 0.24.0
sktime: 0.5.0
statsmodels: 0.12.1
numpy: 1.19.4
scipy: 1.5.4
Cython: 0.29.21
pandas: 1.1.5
matplotlib: 3.3.3
joblib: 1.0.0
numba: 0.52.0
pmdarima: None
tsfresh: 0.17.0

Metadata

Labels: bug (Something isn't working)
