Implementation of linkage algorithm against new schema #52

Merged
ericbuckley merged 16 commits into main from feature/23-new-link-module on Oct 2, 2024

Conversation

@ericbuckley ericbuckley commented Sep 27, 2024

Collaborator

Description

This is the big one: a reimplementation of the linkage algorithm using the new schema. The new algorithm has been added to recordlinker.linking.link.

Related Issues

closes #23

Additional Notes

  • Replaced the FEATURE literal type with a Feature enum for better type safety and readability. (src/recordlinker/models/pii.py)
  • Separated the matchers into two files based on the signature each one uses: old schema (src/recordlinker/linkage/matchers.py) and new schema (src/recordlinker/linking/matchers.py).
  • Changed FEATURE_COMPARE_FUNC and MATCH_RULE_FUNC to return floats instead of booleans for more granular matching. (src/recordlinker/linking/matchers.py)
  • Added caching to the Patient model for generating its record. (src/recordlinker/models/mpi.py)
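For readers skimming the notes above, here is a minimal sketch of the three ideas together: an enum in place of a string literal type, comparison and rule functions that return floats, and a cached record on the patient. It is not the code in this PR. The enum members, the function names (compare_exact, compare_fuzzy, rule_average), the averaging rule, and the stand-in Patient class are all assumptions made for illustration.

```python
import difflib
import enum
import functools
import typing


class Feature(enum.Enum):
    # Hypothetical subset; the actual members are defined in models/pii.py.
    FIRST_NAME = "FIRST_NAME"
    LAST_NAME = "LAST_NAME"
    BIRTHDATE = "BIRTHDATE"


Record = typing.Dict[Feature, str]


def compare_exact(record: Record, other: Record, key: Feature) -> float:
    """Score 1.0 when both records share a non-empty value for key, else 0.0."""
    left, right = record.get(key), other.get(key)
    return 1.0 if left and left == right else 0.0


def compare_fuzzy(record: Record, other: Record, key: Feature) -> float:
    """Score string similarity in [0.0, 1.0]; a missing value scores 0.0."""
    left, right = record.get(key), other.get(key)
    if not left or not right:
        return 0.0
    return difflib.SequenceMatcher(None, left.lower(), right.lower()).ratio()


def rule_average(scores: typing.Sequence[float]) -> float:
    """Combine per-feature scores into one float; an empty list scores 0.0."""
    return sum(scores) / len(scores) if scores else 0.0


class Patient:
    """Stand-in for the MPI Patient model, holding raw string data."""

    def __init__(self, data: typing.Dict[str, str]) -> None:
        self.data = data
        self.parse_count = 0  # only here to make the caching observable

    @functools.cached_property
    def record(self) -> Record:
        # Computed on first access, then stored on the instance.
        self.parse_count += 1
        return {Feature(name): value for name, value in self.data.items()}
```

With float scores, a caller can compare the combined value against a threshold rather than requiring every feature to match exactly, and a typo in a feature name fails at Feature(...) lookup instead of silently never matching.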

<--------------------- REMOVE THE LINES BELOW BEFORE MERGING --------------------->

Checklist

Please review and complete the following checklist before submitting your pull request:

  • I have ensured that the pull request is of a manageable size, allowing it to be reviewed within a single session.
  • I have reviewed my changes to ensure they are clear, concise, and well-documented.
  • I have updated the documentation, if applicable.
  • I have added or updated test cases to cover my changes, if applicable.
  • I have minimized the number of reviewers to include only those essential for the review.
  • I have notified teammates in the review thread to build awareness.

Checklist for Reviewers

Please review and complete the following checklist during the review process:

  • The code follows best practices and conventions.
  • The changes implement the desired functionality or fix the reported issue.
  • The tests cover the new changes and pass successfully.
  • Any potential edge cases or error scenarios have been considered.

This pull request includes significant changes to the recordlinker module to improve feature matching and update the codebase to use more descriptive and consistent identifiers. The most important changes involve modifying the feature comparison functions to return float values instead of booleans, updating the Feature model to use an enum, and adjusting related unit tests.

@ericbuckley ericbuckley added the api New API feature label Sep 27, 2024
@ericbuckley ericbuckley self-assigned this Sep 27, 2024
@ericbuckley ericbuckley marked this pull request as ready for review September 27, 2024 00:48
Comment thread src/recordlinker/models/pii.py
Comment thread src/recordlinker/linking/link.py
Comment thread src/recordlinker/linkage/matchers.py Outdated
Comment thread assets/linking/basic_algorithm.json Outdated
Comment thread src/recordlinker/linkage/matchers.py Outdated
Comment thread src/recordlinker/linking/link.py
Comment thread src/recordlinker/linking/link.py
Comment thread src/recordlinker/linking/link.py
Comment thread tests/unit/test_matchers.py
@ericbuckley ericbuckley marked this pull request as draft October 1, 2024 04:03
@ericbuckley ericbuckley marked this pull request as ready for review October 1, 2024 04:03
@ericbuckley ericbuckley merged commit d105454 into main Oct 2, 2024

Labels

api New API feature

Projects

None yet

Development

Successfully merging this pull request may close these issues.

simple_link module

4 participants