[BUG] Distance functions and distance based classifiers bugs and features #596

@TonyBagnall

The distance-based classifiers are not working properly, and there are several open issues related to them. I am collating them here and will comment on progress as they are fixed.

  1. [BUG] KNN with msm distance give exact same scores with different hyper-parameters #589

  2. [BUG] ColumnEnsembleClassifier fails fitting KNN classifiers #578

  3. [BUG] Cannot dump proximity forest model using joblib or pickle #600

  4. [BUG] AttributeError: 'ProximityStump' object has no attribute 'get_exemplars' #598

More issues rolled into this one for clarity:

  5. Proximity Forest always using 2 processors #663

  6. [BUG] Derivative based distance measures error out #654

  7. Distance measure unit testing #646

  8. [BUG] Variable length time series throw an error in DTW in elastic.py #645

  9. [BUG] Cross validation has no effect on KNeighborsTimeSeriesClassifier distance measure parameters #1057

  10. Strange Proximity Forest bug in orchestration: [BUG] unexpected behaviour in Proximity Forest (unspecified) #1114
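For the pickling report (#600), a round-trip smoke test is one way to catch regressions early. The sketch below is only an illustration: `StumpWithLambda`, `StumpWithPartial`, `scaled_abs_diff` and `survives_pickle` are hypothetical stand-ins, not sktime classes, and a lambda stored on the estimator is just one common reason an object fails to pickle. It is not confirmed as the cause in #600.

```python
import pickle
from functools import partial


def scaled_abs_diff(a, b, scale=1.0):
    """Module-level function, so pickle can store it by reference."""
    return scale * abs(a - b)


class StumpWithLambda:
    """Hypothetical estimator that keeps its distance as a lambda."""

    def __init__(self, scale):
        # Lambdas defined inside a method cannot be pickled by the stdlib pickler.
        self.distance = lambda a, b: scale * abs(a - b)


class StumpWithPartial:
    """Same behaviour, but the distance is a partial of a module-level function."""

    def __init__(self, scale):
        self.distance = partial(scaled_abs_diff, scale=scale)


def survives_pickle(obj):
    """Return True if obj round-trips through pickle and still behaves the same."""
    try:
        clone = pickle.loads(pickle.dumps(obj))
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return clone.distance(1.0, 3.0) == obj.distance(1.0, 3.0)
```

When these classes live in an importable module, `survives_pickle(StumpWithLambda(2.0))` should report a failure while the `partial` variant should round-trip. The same helper could be pointed at a fitted estimator in a unit test.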

There are several related issues which I am investigating; this is a work in progress.

  1. the sktime knn classifier has changed and no longer has a _fit(X) method, so we use fit(X, y)
  2. the DTW window sizes are, we think, not set correctly
  3. there may be an issue with data transposition
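To make the window-size concern in item 2 concrete, here is a minimal pure-Python DTW with a Sakoe-Chiba band. It is a sketch under stated assumptions, not the sktime implementation: `dtw_distance` is a hypothetical name, `window` is assumed to be a fraction of the longer series length, and the pointwise cost is assumed to be squared error. If the window is wired through correctly, changing it should change the distance on shifted series.

```python
import math


def dtw_distance(x, y, window=1.0):
    """Squared-error DTW between two univariate series with a Sakoe-Chiba band.

    `window` is a fraction of the longer series length (an assumption of this
    sketch): 0.0 allows no warping, 1.0 allows full warping.
    """
    if not 0.0 <= window <= 1.0:
        raise ValueError("window must be in [0, 1]")
    n, m = len(x), len(y)
    # Band half-width in time steps, widened to |n - m| so the final cell
    # stays reachable when the two series have different lengths.
    band = max(int(math.floor(window * max(n, m))), abs(n - m))
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        lo = max(1, i - band)
        hi = min(m, i + band)
        for j in range(lo, hi + 1):
            d = (x[i - 1] - y[j - 1]) ** 2
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# A series and a copy shifted by one step.
x = [0, 0, 1, 2, 1, 0, 0]
y = [0, 1, 2, 1, 0, 0, 0]
no_warp = dtw_distance(x, y, window=0.0)    # band of 0: squared Euclidean
full_warp = dtw_distance(x, y, window=1.0)  # unconstrained warping
```

Here `no_warp` and `full_warp` differ, so a classifier that returns identical scores for both settings on data like this is probably not passing the window through to the distance.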

Actions are:

  1. get the classifiers working
  2. check black-box correctness against the tsml toolkit
  3. write unit tests for distance functions and classifiers
  4. make sure all tuning works correctly
  5. fix the above issues
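For action 3, one possible shape for the distance unit tests is a generic property check that any distance callable can be run through. This is a sketch only: `check_distance_properties` is a hypothetical helper, `squared_euclidean` is a stand-in used to exercise it, and the properties checked (zero self-distance, non-negativity, symmetry) are ones the elastic distances here are expected to satisfy rather than anything taken from the existing test suite.

```python
import itertools


def squared_euclidean(x, y):
    """Stand-in distance used only to exercise the checks below."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def check_distance_properties(dist, series, tol=1e-12):
    """Return a list of failure messages; an empty list means all checks passed."""
    failures = []
    # Identity: a series should be at distance zero from itself.
    for idx, x in enumerate(series):
        if abs(dist(x, x)) > tol:
            failures.append(f"d(x, x) != 0 for series {idx}")
    for (i, x), (j, y) in itertools.combinations(enumerate(series), 2):
        dxy, dyx = dist(x, y), dist(y, x)
        # Non-negativity in both argument orders.
        if dxy < 0 or dyx < 0:
            failures.append(f"negative distance for series {i} and {j}")
        # Symmetry: swapping the arguments should not change the result.
        if abs(dxy - dyx) > tol:
            failures.append(f"asymmetric distance for series {i} and {j}")
    return failures
```

Returning messages instead of asserting inside the helper lets a test report every violated property at once, and the same helper can be parametrised over each distance function and each parameter setting.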

Fixed:

KNeighboursTimeSeriesClassifier throws errors with multivariate time series data when algorithm is set to either 'kd_tree' or 'ball_tree' #328

Euclidean distance for KNN #551

Bug with KNN #413

TypeError from KNeighborsTimeSeriesClassifier #608

Labels

bug (Something isn't working), module:classification (classification module: time series classification), module:distances&kernels (dists_kernels and distances modules: time series distances, kernels, pairwise transforms)
