
Knn transpose fix#689

Merged
TonyBagnall merged 12 commits into master from knn_transpose_fix
Feb 18, 2021

Conversation

@TonyBagnall
Contributor

Reference Issues/PRs

This goes some way towards addressing the issues raised in #596

What does this implement/fix? Explain your changes.

  1. Fix the data representation

A common framework for check_X was introduced that includes a mechanism for coercing a pandas DataFrame into a numpy array. This superseded the previous conversion used in KNeighborsTimeSeriesClassifier. However, check_X converts to data[n][d][m] (n = num cases, d = num dimensions, m = series length), whereas this classifier is set up to use data[n][m][d] to optimise vectorisation. The first fix corrects this with a simple transpose.

  2. Fix the default setting for MSM

The MSM distance was defaulting the cost parameter c to 0. The tsml version uses 1, which is preferable.

  3. Remove the option for algorithm; only use "brute"

The two alternative algorithms in the sklearn KNN are not usable with time series and are hard coded that way. It is best to remove the option and reduce the code bloat somewhat.

  4. Add first basic unit testing

Check accuracy on ArrowHead; the data is verified in the sister package tsml.
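The transpose in point 1 can be sketched as follows. This is a minimal numpy illustration of the axis swap described above, not the actual sktime code; the array shapes follow the conventions stated in the description:

```python
import numpy as np

# check_X yields arrays shaped (n_cases, n_dimensions, series_length),
# i.e. data[n][d][m]
n, d, m = 4, 2, 50
X = np.random.default_rng(0).normal(size=(n, d, m))

# The classifier expects data[n][m][d], i.e.
# (n_cases, series_length, n_dimensions), so swap the last two axes.
X_t = np.transpose(X, (0, 2, 1))

assert X_t.shape == (n, m, d)
```

np.transpose with an explicit axis permutation returns a view, so the fix costs no copy of the data.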

@Lovkush-A
Collaborator

Lovkush-A commented Feb 20, 2021

@TonyBagnall, @mloning This merge has caused one (and only one) of the tests to fail locally.

_____________________________________________________ test_knn_on_arrowhead ______________________________________________________

    def test_knn_on_arrowhead():
        # load arrow head data
        X_train, y_train = load_arrow_head(split="train", return_X_y=True)
        X_test, y_test = load_arrow_head(split="test", return_X_y=True)
        for i in range(0, len(distance_functions)):
            knn = KNeighborsTimeSeriesClassifier(
                distance=distance_functions[i],
            )
            knn.fit(X_train, y_train)
            pred = knn.predict(X_test)
            correct = 0
            for j in range(0, len(pred)):
                if pred[j] == y_test[j]:
                    correct = correct + 1
>           assert correct == expected_correct[distance_functions[i]]
E           assert 97 == 139

@TonyBagnall
Contributor Author

Thanks, I'll sort that; I left something in that I didn't mean to. I guess because it was a new test, the PR did not run it?

@TonyBagnall
Contributor Author

Hmm, curious; not sure what happened, as it worked locally. I've removed a test for now as a quick fix so I don't hold others up, and will have more complete testing next week.

