Summary
Replace the belongingness ratio score with a relative match score value that is based on log odds comparisons of different evaluation fields.
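Acceptance Criteria
- possible_match_window on each pass
- belongingness_ratio parameter on the algorithm configuration
- link.link_record method
- [ ] Merge the link.LinkResult and schemas.LinkResult classes into one, keeping the latter (moved to follow up)
- schemas.LinkResult class to include the relative_match_score value, pass_number and context of feature evaluations (context moved to follow up)
- schemas.Prediction class
- schemas.MatchGrade enum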
Details / Tasks
The details of how the new relative match score (RMS) is calculated can be found in this document. Use that as a guide for implementing the new link_record function, which will accept the same input and produce more or less the same output.
The schemas.LinkResult class needs to change to capture the differences between a belongingness ratio score and RMS.
- Remove the existing belongingness_ratio value
- Add the relative_match_score value
- Add a pass number to indicate which pass produced the best score for the cluster (1-indexed)
- Add a context list: a list of all the feature evaluations for the pass that generated the highest score. Each item in the list will have the feature name and the weighted score associated with that feature.
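A rough sketch of how the reshaped result could look, to make the field list above concrete. It uses stdlib dataclasses only to stay dependency-free; the base class the real schemas.LinkResult uses is not shown in this ticket. The FeatureEvaluation name, its field names, and the sample values are assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class FeatureEvaluation:
    """One entry of the context list (hypothetical class name)."""

    feature: str           # name of the evaluated feature
    weighted_score: float  # weighted score associated with that feature


@dataclass
class LinkResult:
    """Sketch only: belongingness_ratio is removed, RMS fields are added."""

    relative_match_score: float
    pass_number: int  # 1-indexed pass that produced the best score
    context: list[FeatureEvaluation] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Guard the 1-indexed convention stated in the ticket.
        if self.pass_number < 1:
            raise ValueError("pass_number is 1-indexed and must be >= 1")


# Illustrative values, not real algorithm output.
result = LinkResult(
    relative_match_score=0.92,
    pass_number=1,
    context=[
        FeatureEvaluation("FIRST_NAME", 4.5),
        FeatureEvaluation("BIRTHDATE", 7.1),
    ],
)
```

Keeping the context as a list of small typed items, rather than a bare dict, is one option that makes the feature name and weighted score explicit for each evaluation.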
The MatchGrade enum should support ordering; for example, MatchGrade.CERTAIN > MatchGrade.POSSIBLE should evaluate to true. Consider using the functools.total_ordering decorator for the implementation. The hierarchy is CERTAIN > POSSIBLE > CERTAINLY_NOT.
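One possible way to get that ordering with functools.total_ordering is sketched below. The member values (the strings on the right) and the _GRADE_RANK helper are assumptions; only the member names and their hierarchy come from this ticket.

```python
import enum
from functools import total_ordering

# Hypothetical helper: rank by member name so the ordering does not
# depend on whatever values the real enum assigns.
_GRADE_RANK = {"CERTAINLY_NOT": 0, "POSSIBLE": 1, "CERTAIN": 2}


@total_ordering
class MatchGrade(enum.Enum):
    CERTAIN = "certain"              # assumed value
    POSSIBLE = "possible"            # assumed value
    CERTAINLY_NOT = "certainly-not"  # assumed value

    def __lt__(self, other):
        # Only compare against other MatchGrade members.
        if not isinstance(other, MatchGrade):
            return NotImplemented
        return _GRADE_RANK[self.name] < _GRADE_RANK[other.name]


grades = [MatchGrade.POSSIBLE, MatchGrade.CERTAINLY_NOT, MatchGrade.CERTAIN]
best_grade = max(grades)
```

total_ordering derives the other comparison operators (>, <=, >=) from __lt__ together with the enum's existing equality, so only one comparison method has to be written.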
When selecting the best result for each cluster from the evaluation table, compare on MatchGrade first. If two passes for the same cluster have the same MatchGrade, select the result with the highest RMS.
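The selection rule can be sketched as a tuple comparison: grade first, then RMS. The PassEvaluation row type and best_per_cluster function are hypothetical names, and the MatchGrade here is an IntEnum stand-in used only so the snippet runs on its own. The ticket does not say how to break a tie on both grade and RMS; this sketch keeps the first row seen.

```python
from dataclasses import dataclass
from enum import IntEnum


class MatchGrade(IntEnum):
    # Stand-in with integer values so members compare directly;
    # the real enum's values are not specified in this ticket.
    CERTAINLY_NOT = 0
    POSSIBLE = 1
    CERTAIN = 2


@dataclass(frozen=True)
class PassEvaluation:
    """Hypothetical row of the evaluation table."""

    cluster_id: str
    pass_number: int
    grade: MatchGrade
    relative_match_score: float


def best_per_cluster(rows):
    """Return the best row per cluster: highest grade, then highest RMS."""
    best = {}
    for row in rows:
        current = best.get(row.cluster_id)
        # Tuples compare element by element, so RMS only matters
        # when the grades are equal.
        if current is None or (row.grade, row.relative_match_score) > (
            current.grade,
            current.relative_match_score,
        ):
            best[row.cluster_id] = row
    return best


# Illustrative rows, not real algorithm output.
rows = [
    PassEvaluation("a", 1, MatchGrade.POSSIBLE, 0.95),
    PassEvaluation("a", 2, MatchGrade.CERTAIN, 0.80),
    PassEvaluation("b", 1, MatchGrade.POSSIBLE, 0.60),
    PassEvaluation("b", 2, MatchGrade.POSSIBLE, 0.70),
]
best = best_per_cluster(rows)
```

For cluster "a" the CERTAIN pass wins despite its lower RMS; for cluster "b" both passes are POSSIBLE, so the higher RMS decides.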
Background / Context
RFC-002
Testing Considerations
Please include algorithm test results of the 84 test cases NBS shared with us, using both the existing codebase and the RMS implementation.