Implement FULL_FIRST_NAME FeatureAttribute #299

@m-goggins

Summary

Implement a new feature attribute called FULL_FIRST_NAME that prepends the normalized suffix (where applicable) to the first name during evaluation, so that suffixes are taken into account when first names are compared.

Acceptance Criteria

  • FULL_FIRST_NAME compares first names, including normalized suffixes (where applicable); see task details about why we should only include normalized suffixes
  • The implementation in feature_iter includes a check that the suffix has been normalized before prepending it to the first name
  • When a record does not have a suffix, FULL_FIRST_NAME includes just the first name
  • Tests are added to cover multiple edge cases, such as: both names have normalized suffixes; at least one name has a suffix that could not be normalized (and is therefore ignored); at least one name has no suffix; etc.
  • Documentation is updated to reflect the new feature attribute
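
The criteria above could be sketched roughly as follows. This is only an illustration, not the project's implementation: the `full_first_name` helper, the `NORMALIZED_SUFFIXES` set, and the upper-casing and whitespace stripping are all assumptions, and the real check in feature_iter may look quite different.

```python
from typing import Optional

# Hypothetical set of canonical suffix values. The values the project's
# suffix normalization actually produces may differ.
NORMALIZED_SUFFIXES = {"JR", "SR", "II", "III", "IV", "V"}


def full_first_name(first_name: Optional[str], suffix: Optional[str]) -> str:
    """Return the first name, with the suffix prepended only if it is normalized."""
    # Assumed cleanup: drop all whitespace and upper-case the first name.
    name = "".join((first_name or "").split()).upper()
    if not name:
        return ""
    # Only an exact, already-normalized suffix is prepended; anything else
    # (missing, empty, or free text such as "the 2nd") is ignored.
    if suffix in NORMALIZED_SUFFIXES:
        return suffix + name
    return name
```

Under these assumptions, `full_first_name("John", "JR")` gives `"JRJOHN"`, while a missing suffix or an unnormalized one such as `"the 2nd"` gives `"JOHN"`.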

Details / Tasks

We should only prepend normalized suffixes. If there is text in the suffix field but it could not be normalized, we should not prepend it. For example, if the suffix is "the 2nd" and the first name is "John", FULL_FIRST_NAME for this record should be "JOHN", not "THE2NDJOHN". Prepending an unnormalized suffix would likely produce very low first-name similarity scores that reflect bad suffix normalization rather than a real difference in the names, which we do not want.
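
To make the effect on similarity concrete, here is a small sketch that compares the "John" / "the 2nd" record against a second, hypothetical record whose suffix normalized to "II". It uses `difflib.SequenceMatcher` from the Python standard library purely as a stand-in; the project's actual string comparison function is not specified in this issue and may behave differently.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Stand-in similarity score in [0, 1]; not the project's real comparator."""
    return SequenceMatcher(None, a, b).ratio()


# Hypothetical second record: first name "John", suffix normalized to "II".
other = "IIJOHN"

# If the unnormalized suffix "the 2nd" were prepended anyway.
with_bad_suffix = similarity("THE2NDJOHN", other)

# If the unnormalized suffix is ignored, as proposed above.
suffix_ignored = similarity("JOHN", other)

print(f"prepended: {with_bad_suffix:.2f}, ignored: {suffix_ignored:.2f}")
```

With this stand-in metric, ignoring the unnormalized suffix scores higher than prepending it, which matches the motivation given above.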
