Upgraded to MEDS v0.4 #323

Merged
mmcdermott merged 2 commits into main from 322_upgrade_MEDS_04
May 6, 2025

@mmcdermott
Owner

Closes #322

@mmcdermott mmcdermott requested a review from Copilot May 6, 2025 15:41
Contributor

Copilot AI left a comment

Pull Request Overview

This PR upgrades the project to MEDS v0.4 and updates all references to legacy fields and schema objects (e.g. subject_id_field, time_field) to use the new DataSchema, SubjectSplitSchema, and CodeMetadataSchema. It also updates configuration examples and dependency versions accordingly.

  • Updated imports and field references throughout stages (reshard_to_split, extract_values, bin_numeric_values, etc.) to use the new MEDS v0.4 schema.
  • Adjusted examples and configuration in tests and documentation to reflect dependency and API changes.
  • Upgraded dependency versions in pyproject.toml for meds and meds_testing_helpers.
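
The migration pattern summarized above can be sketched roughly as follows. This is a minimal illustration, not code from the PR: `DataSchemaStandIn` is a hypothetical stand-in for the `DataSchema` class from `meds`, so that the snippet runs without the package installed. Only the `subject_id_name` attribute is named in this review; `time_name` and `code_name` are assumed here by analogy, and `sort_events` is an invented helper.

```python
from datetime import datetime


class DataSchemaStandIn:
    """Hypothetical stand-in for the MEDS v0.4 DataSchema; not the real class."""

    subject_id_name = "subject_id"  # attribute name mentioned in the review summary
    time_name = "time"  # assumed by analogy with subject_id_name
    code_name = "code"  # assumed by analogy with subject_id_name


def sort_events(rows, schema=DataSchemaStandIn):
    """Sort event rows by (subject, time), reading column names from the schema."""
    return sorted(
        rows,
        key=lambda row: (row[schema.subject_id_name], row[schema.time_name]),
    )


rows = [
    {"subject_id": 2, "time": datetime(2021, 1, 1), "code": "B"},
    {"subject_id": 1, "time": datetime(2021, 3, 1), "code": "C"},
    {"subject_id": 1, "time": datetime(2021, 1, 1), "code": "A"},
]

sorted_rows = sort_events(rows)
print([row[DataSchemaStandIn.code_name] for row in sorted_rows])
```

Reading column names from one schema object, rather than from scattered module-level constants, means a renamed column only has to change in one place.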

Reviewed Changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| src/MEDS_transforms/stages/reshard_to_split/reshard_to_split.py | Replaced legacy field names with new DataSchema/SubjectSplitSchema references and updated sorting keys. |
| src/MEDS_transforms/stages/extract_values/extract_values.py | Updated mandatory types and casting expressions to use DataSchema fields. |
| src/MEDS_transforms/stages/examples.py | Revised example usage to instantiate DatasetMetadataSchema instead of a plain dict. |
| src/MEDS_transforms/stages/bin_numeric_values/bin_numeric_values.py | Changed field references and join column names to their new schema counterparts. |
| src/MEDS_transforms/stages/aggregate_code_metadata/aggregate_code_metadata.py | Switched subject id references to DataSchema.subject_id_name for aggregation consistency. |
| src/MEDS_transforms/stages/add_time_derived_measurements/* | Updated multiple modules to consistently refer to MEDS v0.4 fields for time, subject id, and code. |
| src/MEDS_transforms/mapreduce/stage.py | Adjusted filtering and merge functions to use the new schema definitions. |
| src/MEDS_transforms/configs/dataset.py | Updated dataset metadata handling to use DatasetMetadataSchema and its API. |
| src/MEDS_transforms/compute_modes/match_revise.py | Updated sorting and field references in match-revise functions to use DataSchema. |
| pyproject.toml | Upgraded dependency versions to use meds~=0.4.0 and meds_testing_helpers~=0.3.0. |
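
The `~=` pins in the pyproject.toml entry are compatible-release specifiers: under PEP 440, `~=0.4.0` is equivalent to `>=0.4.0, ==0.4.*`. The sketch below illustrates that rule for plain numeric versions only. `parse` and `is_compatible` are hypothetical helpers written for this example; real tooling should rely on the `packaging` library, which also handles pre-releases, post-releases, and epochs.

```python
def parse(version):
    """Split a plain numeric version such as '0.4.1' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def is_compatible(candidate, spec):
    """Return True if `candidate` satisfies `~=spec` (numeric releases only)."""
    spec_parts = parse(spec)
    cand_parts = parse(candidate)
    if len(spec_parts) < 2:
        raise ValueError("~= requires at least two release segments")

    # Pad with zeros so that '0.4' and '0.4.0' compare as equal.
    width = max(len(spec_parts), len(cand_parts))
    spec_padded = spec_parts + (0,) * (width - len(spec_parts))
    cand_padded = cand_parts + (0,) * (width - len(cand_parts))

    # Every segment except the last one in the spec must match exactly,
    # and the candidate must not be older than the spec.
    prefix = spec_parts[:-1]
    return cand_padded[: len(prefix)] == prefix and cand_padded >= spec_padded


for candidate in ("0.3.9", "0.4.0", "0.4.7", "0.5.0"):
    print(candidate, is_compatible(candidate, "0.4.0"))
```

Under this rule, patch releases in the 0.4 series are accepted, while a 0.5.0 release would need another explicit upgrade.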

@codecov

codecov bot commented May 6, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 100.00%. Comparing base (a86aa16) to head (113c239).
⚠️ Report is 53 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff            @@
##              main      #323   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           54        54           
  Lines         2397      2401    +4     
=========================================
+ Hits          2397      2401    +4     

@mmcdermott mmcdermott merged commit 95ce8ea into main May 6, 2025
9 checks passed
@mmcdermott mmcdermott deleted the 322_upgrade_MEDS_04 branch May 6, 2025 16:13