We need a codified way to indicate what datasets are suitable for what tasks (#60)

This is also a necessary component of ensuring viable testing, so that tests do not fail because they examine invalid task and dataset combinations.

There are a few considerations here:
1. Some datasets are unsuitable for certain tasks because of how they are constructed. For example, eICU is organized at the hospital-stay level, not the patient level, so it should not be used for 30d hospital readmission. MIMIC-IV is a cohort of patients who were at some point admitted to either the ED or ICU, so it is not suitable for 30d general hospital readmission (though we invalidate that now).
2. Some datasets are suitable for tasks but are not currently configured for them, because local data owners have not yet set up the predicates. From a testing perspective, this means we likely need a way to indicate "general viability" for dataset × task combinations, and separately which task × dataset combinations are actually set up and can be used in testing.
3. Some datasets are suitable for tasks but do not permit appropriate predicate definitions without changes elsewhere, such as updates to ACES. This is technically a variant of 2, but it requires different operationalization.

We should also aim for a series of minimal improvements rather than perfection.
