Detection of Chagas Disease from the ECG: The George B. Moody PhysioNet Challenge 2025
💀💀💀 BIG MISTAKE: Forgot to add (z-score) normalization in the preprocessing pipeline in the config!!! 💀💀💀
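For reference, the missing step can be sketched as follows. This is a minimal illustration of per-lead z-score normalization, not the preprocessing code of this repository: the function name `zscore_normalize`, the `(n_leads, n_samples)` layout, and the `eps` guard are assumptions made for the example.

```python
import numpy as np


def zscore_normalize(signal: np.ndarray, eps: float = 1e-7) -> np.ndarray:
    """Scale each lead to zero mean and (approximately) unit variance.

    `signal` is assumed to have shape (n_leads, n_samples).
    """
    mean = signal.mean(axis=-1, keepdims=True)
    std = signal.std(axis=-1, keepdims=True)
    # eps (an assumed value) avoids division by zero on flat leads
    return (signal - mean) / (std + eps)


# Synthetic 12-lead record, used only to exercise the function
rng = np.random.default_rng(0)
ecg = rng.normal(loc=0.3, scale=2.0, size=(12, 4000))
normalized = zscore_normalize(ecg)
```

Normalizing per lead and per record keeps amplitude differences between devices and datasets from dominating the input; whether that is the right granularity for this data is a design choice the sketch does not settle.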
- The Conference
- Description of the files/folders(modules)
- Key problem to solve
- Background knowledge
- Post-Conference Thoughts
Conference Website | Unofficial Phase Leaderboard [1] | Official Phase Leaderboard | Final Test Results
- README.md: this file, serves as the documentation of the project.
- cfg.py: the configuration file for the whole project.
- const.py: constant definitions.
- Dockerfile: docker file for building the docker image for submissions.
- requirements.txt, requirements-docker.txt, requirements-no-torch.txt: requirements files for different purposes.
- evaluate_model.py, helper_code.py, prepare_code15_data.py, run_model.py, train_model.py: scripts inherited from the official baseline and official scoring code. Modifications to these files have no effect: they are overwritten as soon as the organizers (or the submission system) pull the repository.
- sync_official.py: script for synchronizing data from the official baseline and official scoring code.
- team_code.py: entry file for the submissions.
- submissions: log file for the submissions, including the key hyperparameters, the scores received, commit hash, etc. The log file is updated after each submission and organized as a YAML file.
- official_baseline: the official baseline code, included as a submodule.
- official_scoring_metric: the official scoring code, included as a submodule.
- models: folder for model definitions; we mainly used a CRNN model. Some custom loss functions are also defined in this module.
- utils: various utility functions, including custom scoring functions, and some training-validation split files.
- results: folder containing some typical experiment log files, for reproducibility.
The data is highly imbalanced, with only approximately 2% of the data being positive. Dealing with the imbalanced data is the key problem to solve in this challenge. Possible solutions include:
- Upsampling the positive data
- Downsampling the negative data
- Using Focal Loss, Asymmetric Loss, etc.
- Using class weights
- Using data augmentation, including Mixup, Cutmix, etc.
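Two of the options above, class weights and Focal Loss, can be sketched in plain Python. This is an illustrative sketch, not the loss code in the `models` folder: the function names are hypothetical, both classes are assumed to be present, and the defaults `gamma=2.0` and `alpha=0.25` are the commonly cited values from the original Focal Loss paper, which would need tuning for this data.

```python
import math


def inverse_frequency_weights(labels):
    """Return (w_neg, w_pos) so that both classes carry equal total weight."""
    n = len(labels)
    n_pos = sum(labels)
    n_neg = n - n_pos
    return n / (2 * n_neg), n / (2 * n_pos)


def binary_focal_loss(prob, label, gamma=2.0, alpha=0.25):
    """Focal loss for a single binary prediction.

    prob  : predicted probability of the positive class
    label : 1 for positive, 0 for negative
    gamma : focusing parameter; larger values down-weight easy examples
    alpha : weight of the positive class (1 - alpha for the negative class)
    """
    p_t = prob if label == 1 else 1.0 - prob
    alpha_t = alpha if label == 1 else 1.0 - alpha
    p_t = min(max(p_t, 1e-12), 1.0)  # avoid log(0)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# Toy label set with 2% positives, mirroring the imbalance described above
labels = [1] * 2 + [0] * 98
w_neg, w_pos = inverse_frequency_weights(labels)

easy = binary_focal_loss(0.05, 0)  # confidently correct negative
hard = binary_focal_loss(0.05, 1)  # missed positive
```

With these toy labels a positive sample is weighted about 49 times more than a negative one, and the focal term makes the missed positive contribute far more to the loss than the easy negative.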
According to a review paper about ECG abnormalities in Chagas Disease, the most common ECG abnormalities are:
- Prevalence of overall ECG abnormalities was higher in participants with CD (40.1%; 95%CIs=39.2-41.0) compared to non-CD (24.1%; 95%CIs=23.5-24.7) (OR=2.78; 95%CIs=2.37-3.26).
- Among specific ECG abnormalities, prevalence of
  - complete right bundle branch block (RBBB) (OR=4.60; 95%CIs=2.97-7.11),
  - left anterior fascicular block (LAFB) (OR=1.60; 95%CIs=1.21-2.13),
  - combination of complete RBBB/LAFB (OR=3.34; 95%CIs=1.76-6.35),
  - first-degree atrioventricular block (A-V B) (OR=1.71; 95%CIs=1.25-2.33),
  - atrial fibrillation (AF) or flutter (OR=2.11; 95%CIs=1.40-3.19),
  - ventricular extrasystoles (VE) (OR=1.62; 95%CIs=1.14-2.30)

  was higher in CD compared to non-CD participants.
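As a reading aid for the figures quoted above, the sketch below shows how an odds ratio and its Wald 95% confidence interval are computed from a single 2x2 table. The counts are hypothetical and are not taken from the review; pooled estimates in a review usually combine several studies, so they should not be expected to match a crude calculation like this one.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: CD with the finding        b: CD without the finding
    c: non-CD with the finding    d: non-CD without the finding
    z: normal quantile (1.96 for a 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts for illustration only
or_, lo, hi = odds_ratio_with_ci(a=40, b=60, c=20, d=80)
```

An interval that lies entirely above 1, as for every abnormality listed above, indicates that the finding is more frequent in CD than in non-CD participants.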
- High-performing teams often accepted very low accuracy in exchange for better recall/risk ranking.
- Foundation/self-supervised ECG encoders (ViT/Transformer backbones, distilled/foundation models pretrained on large ECG corpora) are widely used.
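The accuracy-for-recall trade-off mentioned in the first point can be illustrated with a toy example. The labels and scores below are synthetic, and the helper `accuracy_and_recall` is a hypothetical function written for this sketch; it is not the official scoring code.

```python
def accuracy_and_recall(labels, scores, threshold):
    """Accuracy and recall of thresholded scores against binary labels."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    accuracy = (tp + tn) / len(labels)
    recall = tp / sum(labels)  # assumes at least one positive label
    return accuracy, recall


# Synthetic data: 2 positives among 100 records (about 2% prevalence)
labels = [1, 1] + [0] * 98
scores = [0.40, 0.30] + [0.35] * 20 + [0.05] * 78

strict = accuracy_and_recall(labels, scores, threshold=0.5)   # predicts all negative
loose = accuracy_and_recall(labels, scores, threshold=0.25)   # flags 22 records
```

In this toy setting the strict threshold reaches 98% accuracy while finding no positives, whereas the loose threshold drops to 80% accuracy but recovers both. With such low prevalence, accuracy says little about how well positives are ranked.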
CinC2020 | CinC2021 | CinC2022 | CinC2023 | CinC2024
Footnotes
[1] As clarified by the organizers, the validation set was updated for the official phase, so the unofficial and official phase leaderboards are not comparable.


