While investigating #579, @mulhod found inconsistencies between manually running the code from that test in SKLL and running the equivalent experiment directly in scikit-learn with as little SKLL code as possible. Upon further investigation with a more minimal example, I found that our decision to return 0 instead of NaN from our correlation functions is the issue here.
As this analysis shows, by returning 0s instead of NaNs, SKLL can incorrectly choose the least desirable hyperparameter configuration as the best option in edge cases where Pearson is not mathematically well-defined.
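Here is a minimal sketch of the failure mode (not SKLL's actual scorer code, just an illustration): a degenerate configuration that produces constant predictions gets an undefined Pearson value, and replacing that NaN with 0 makes it score higher than a configuration with a genuinely negative correlation.

```python
import numpy as np
from scipy.stats import pearsonr

# True labels shared by both hypothetical configurations.
y_true = np.array([1.0, 2.0, 3.0, 4.0])

# Config A: constant predictions, so Pearson is undefined (scipy returns NaN).
preds_constant = np.array([2.5, 2.5, 2.5, 2.5])

# Config B: informative but negatively correlated predictions.
preds_negative = np.array([4.0, 3.0, 2.0, 1.0])


def pearson_with_nan_as_zero(y, y_pred):
    """Mimic the current behavior: replace an undefined correlation with 0."""
    r = pearsonr(y, y_pred)[0]
    return 0.0 if np.isnan(r) else r


# With NaN -> 0, the degenerate constant predictor scores 0.0 and "beats"
# the negatively correlated predictor's -1.0, so the tuner would pick the
# least desirable configuration.
print(pearson_with_nan_as_zero(y_true, preds_constant))  # 0.0
print(pearson_with_nan_as_zero(y_true, preds_negative))  # -1.0
```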
To fix this, we should just get rid of the line that converts NaNs to 0s. However, this might break some existing tests, so they will need to be updated.