This package provides an evaluation API for models produced in the MEDS ecosystem. If predictions are produced
in accordance with the provided pyarrow schema, this package can be used to evaluate a model's performance
in a consistent, Health-AI focused manner.
To use, simply:

- Install: `pip install meds-evaluation`
- Produce predictions that satisfy the included schema.
- Run the `meds-evaluation-cli` tool:
  `meds-evaluation-cli predictions_path="$PREDICTIONS_FP_GLOB" output_dir="$OUTPUT_DIR"`

A JSON file with the output evaluations will be produced in the given directory!
Note: this is a work-in-progress package and currently supports only the evaluation of binary classification tasks.
Inputs to MEDS Evaluation must follow the prediction schema, which by default has five fields:

- `subject_id`: ID of the subject (patient) associated with the event
- `prediction_time`: time at which the prediction was made
- `boolean_value`: ground-truth boolean label for the prediction task
- `predicted_boolean_value` (optional): predicted boolean label generated by the model
- `predicted_boolean_probability` (optional): predicted probability generated by the model
This is equivalent to the following polars schema:

```python
Schema(
    [
        ("subject_id", Int64),
        ("prediction_time", Datetime(time_unit="us")),
        ("boolean_value", Boolean),
        ("predicted_boolean_value", Boolean),
        ("predicted_boolean_probability", Float64),
    ]
)
```

Note that while `predicted_boolean_value` and `predicted_boolean_probability` are optional, at least one of them must be present and contain non-null values in order to generate results. A schema may also contain additional fields, but these are currently not used by MEDS Evaluation.
The MEDS Evaluation pipeline is intended to be used together with MEDS-DEV, but it can also be used as a standalone package.
Please refer to the MEDS-DEV tutorial to learn how to extract and prepare data in the MEDS format and obtain model predictions ready to be evaluated.