
Expand and scramble test cases #364

Merged: bamader merged 2 commits into main from expanded-test-cases on May 14, 2025

@bamader bamader commented May 8, 2025

Collaborator

Description

This PR adds a script that lets us combinatorially expand and scramble the test cases NBS provided. Each original test case is left unchanged, but we generate a random number of duplicates of each case to alter. In those duplicates we drop out some fields to simulate missingness and, with a configurable likelihood, apply random edits (measured by edit distance) to string values in algorithm-relevant fields. All of this is controlled by parameters at the top of the script. The script only needs to be run once to build the dataset, not every time we run the algorithm tests.

Of note, the script currently copies the match decisions of the original test case into each duplicate generated from it, no matter how heavily the duplicate is then scrambled. This is why performance on this set is poor, so that result is expected. Before sharing this out, we should consider "re-grading" the expanded set so its labels better reflect which cases should and shouldn't match.

Because the script is customizable, we can also generate different expansions to simulate "pretty close" data (low-randomness scrambling) versus "really bad" data (lots of scrambling). We'd need to grade each set separately, but this could be useful for showing performance in different contexts.
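For readers who have not opened the script, the expand-and-scramble flow described above can be sketched roughly as follows. This is an illustrative sketch only, not the contents of `expand_test_data.py`: the parameter names, the field names, and the `record` / `should_match` case layout are all assumptions made for the example.

```python
import copy
import random
import string

# Hypothetical tuning parameters; the real script's names and defaults may differ.
SEED = 42
MAX_DUPLICATES = 3   # each original case gets 1..MAX_DUPLICATES scrambled copies
DROPOUT_PROB = 0.2   # chance that a scramble-eligible field is blanked
EDIT_PROB = 0.3      # chance that a surviving string field is edited
MAX_EDITS = 2        # upper bound on single-character edits per field
SCRAMBLE_FIELDS = ("first_name", "last_name", "address")  # assumed field names


def apply_random_edit(value: str, rng: random.Random) -> str:
    """Apply one random single-character insert, delete, or substitute."""
    ops = ("insert", "delete", "substitute") if value else ("insert",)
    op = rng.choice(ops)
    letter = rng.choice(string.ascii_lowercase)
    if op == "insert":
        pos = rng.randrange(len(value) + 1)
        return value[:pos] + letter + value[pos:]
    pos = rng.randrange(len(value))
    if op == "delete":
        return value[:pos] + value[pos + 1:]
    return value[:pos] + letter + value[pos + 1:]


def scramble(record: dict, rng: random.Random) -> dict:
    """Return a copy of the record with some fields dropped or edited."""
    scrambled = copy.deepcopy(record)
    for field in SCRAMBLE_FIELDS:
        value = scrambled.get(field)
        if not isinstance(value, str):
            continue
        if rng.random() < DROPOUT_PROB:
            scrambled[field] = None  # simulate missingness
            continue
        if rng.random() < EDIT_PROB:
            for _ in range(rng.randint(1, MAX_EDITS)):
                value = apply_random_edit(value, rng)
            scrambled[field] = value
    return scrambled


def expand(cases: list, rng: random.Random) -> list:
    """Keep each original case and append scrambled duplicates of it."""
    expanded = []
    for case in cases:
        expanded.append(copy.deepcopy(case))  # original stays unchanged
        for _ in range(rng.randint(1, MAX_DUPLICATES)):
            expanded.append({
                "record": scramble(case["record"], rng),
                # Label is copied verbatim, which is why heavily scrambled
                # duplicates can carry a match decision that no longer fits.
                "should_match": case["should_match"],
            })
    return expanded
```

Under these assumptions, a "pretty close" set would correspond to low values of `DROPOUT_PROB`, `EDIT_PROB`, and `MAX_EDITS`, and a "really bad" set to high ones. Seeding the generator, for example `expand(cases, random.Random(SEED))`, keeps a generated set reproducible so it only has to be graded once.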

Related Issues

Closes #354

bamader commented May 8, 2025

Collaborator Author

Performance of the algorithm tests on this initial run

[Image: Screenshot 2025-05-08 at 2 48 25 PM]

codecov Bot commented May 8, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.51%. Comparing base (1b961c8) to head (b1523f1).
Report is 4 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #364   +/-   ##
=======================================
  Coverage   98.51%   98.51%           
=======================================
  Files          33       33           
  Lines        1947     1947           
=======================================
  Hits         1918     1918           
  Misses         29       29           

Comment thread tests/algorithm/scripts/expand_test_data.py
Comment thread tests/algorithm/scripts/expand_test_data.py
@bamader bamader merged commit 6402da9 into main May 14, 2025
15 checks passed
@bamader bamader deleted the expanded-test-cases branch May 14, 2025 12:30

Development

Successfully merging this pull request may close these issues.

Expand NBS test set

2 participants