Fix: Context boosting in YAML recognizers (#1696) #1705

Closed
MRADULTRIPATHI wants to merge 1 commit into microsoft:main from MRADULTRIPATHI:fix-context-boost

@MRADULTRIPATHI
Contributor

MRADULTRIPATHI commented Aug 31, 2025

Change Description

This PR fixes an issue where custom recognizers defined in YAML with a context field did not properly boost the detection score.
Previously, the match score stayed at the base pattern score (e.g., 0.8) even when the context word (e.g., "DOB") was present.
Now, the recognizer correctly applies context boosting so that the score is raised (e.g., from 0.8 to 1.0) when context terms are detected near the entity.

Issue reference

Fixes #1696

Checklist

  • I have reviewed the contribution guidelines
  • I have signed the CLA (if required)
  • My code includes unit tests verifying that context boosting works as expected
  • All unit tests and lint checks pass locally
  • My PR contains documentation updates / additions if required

Example (Before → After)

Before:
Entity: DATE_TIME, Score: 0.8

After:
Entity: DATE_TIME, Score: 1.0

@MRADULTRIPATHI
Contributor Author

@microsoft-github-policy-service agree

@MRADULTRIPATHI
Contributor Author

Hi @maintainers,
I’ve created this PR to fix issue #1696 (Context in YAML recognizers was not boosting scores as expected).
The fix ensures context words properly increase the match score.

I’ve signed the CLA ✅ and all checks have passed.
Kindly review this PR when you get a chance. Thanks!

@omri374
Collaborator

omri374 commented Sep 1, 2025

Thanks for this PR. The context awareness flow takes place after each recognizer analyzes the text, and is decoupled from the recognition logic. Closing this PR as this creates a dependency between the Pattern Recognizer and the context enhancer. I believe that the problem in issue 1696 is a configuration problem and not a problem with Presidio's logic, as this flow works if defined in code.
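
To illustrate the decoupling described in this comment, here is a toy sketch in plain Python. It is not Presidio's code: `Match`, `recognize_dates`, and `enhance_with_context` are hypothetical names, the regex is a stand-in, and the default values (boost of 0.35, floor of 0.4, five preceding words) are assumptions modeled on what the context enhancer's defaults are believed to be. The point is only that the recognizer emits a base score and a separate, later pass raises it when a context word appears nearby.

```python
import re
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Match:
    entity_type: str
    start: int
    end: int
    score: float


def recognize_dates(text: str, base_score: float = 0.8) -> List[Match]:
    """Recognition step: regex only, with no knowledge of context words."""
    pattern = re.compile(r"\b\d{2}/\d{2}/\d{4}\b")
    return [
        Match("DATE_TIME", m.start(), m.end(), base_score)
        for m in pattern.finditer(text)
    ]


def enhance_with_context(
    text: str,
    matches: Iterable[Match],
    context_words: Iterable[str],
    similarity_factor: float = 0.35,  # assumed boost size
    min_score: float = 0.4,           # assumed floor for boosted matches
    prefix_words: int = 5,            # assumed look-behind window
) -> List[Match]:
    """Separate pass, run after recognition: boost matches preceded by a context word."""
    wanted = {w.lower() for w in context_words}
    enhanced = []
    for match in matches:
        # Only the few words just before the match are inspected.
        preceding = re.findall(r"\w+", text[: match.start].lower())[-prefix_words:]
        score = match.score
        if wanted.intersection(preceding):
            # Add the boost, apply the floor, and cap at 1.0.
            score = min(max(score + similarity_factor, min_score), 1.0)
        enhanced.append(Match(match.entity_type, match.start, match.end, score))
    return enhanced


text = "DOB: 01/02/1990"
raw = recognize_dates(text)
boosted = enhance_with_context(text, raw, ["dob", "birth"])
print(raw[0].score, boosted[0].score)
```

Under these assumed defaults, a base score of 0.8 plus a boost of 0.35 is capped at 1.0, which matches the Before/After scores in the PR description. Because the boost lives in its own pass, the recognizer never needs to know about context words, which is the dependency this comment objects to.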

omri374 closed this Sep 1, 2025
@MRADULTRIPATHI
Contributor Author

Thanks for the clarification!
I understand that the context enhancer is decoupled from the PatternRecognizer.

Could you please guide me on how to properly configure context boosting when using a custom recognizer defined in YAML?
In code-based configuration, it works fine, but YAML seems to ignore the context field.
An example would be very helpful to avoid misconfiguration.

Happy to adjust my PR or contribute to docs if needed.

Development

Successfully merging this pull request may close these issues.

Context in recognizer YAML is ignored when scoring matches
