Digitization and Classification of ECG Images: The George B. Moody PhysioNet Challenge 2024
The figure below shows the framework of the method proposed in this project (see the conference paper for more details).
💀💀💀 SUPER BIG MISTAKE: The loss function for the classification head in the official phase (multi-label) was NOT changed from the cross-entropy loss used in the unofficial phase (single-label) to the asymmetric loss. See the commit for details. 💀💀💀
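To illustrate why the choice of loss matters here, the following is a minimal pure-Python sketch of an asymmetric loss next to softmax cross-entropy. It is not the implementation in this repository: the function names and the default values of `gamma_pos`, `gamma_neg` and `margin` are assumptions chosen for illustration, and a real training loop would use a vectorized PyTorch version.

```python
import math


def sigmoid(z):
    """Map a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def asymmetric_loss(logits, targets, gamma_pos=0.0, gamma_neg=4.0,
                    margin=0.05, eps=1e-8):
    """Asymmetric loss for ONE sample with independent binary labels.

    logits  : raw scores, one per class
    targets : 0/1 flags; several may be 1 at once (multi-label)
    The hyperparameter defaults are illustrative assumptions.
    """
    total = 0.0
    for z, y in zip(logits, targets):
        p = sigmoid(z)
        if y == 1:
            # Positive label: (1 - p)^gamma_pos * log(p)
            total -= (1.0 - p) ** gamma_pos * math.log(max(p, eps))
        else:
            # Negative label: shift the probability down by the margin,
            # so that easy negatives contribute exactly zero loss.
            p_m = max(p - margin, 0.0)
            total -= p_m ** gamma_neg * math.log(max(1.0 - p_m, eps))
    return total


def softmax_cross_entropy(logits, target_index):
    """Single-label cross-entropy: exactly one class can be the target."""
    m = max(logits)  # subtract the maximum for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target_index]


if __name__ == "__main__":
    logits = [2.0, 1.5, -3.0]
    # Two labels are active at once, which a single target index cannot express.
    print(asymmetric_loss(logits, [1, 1, 0]))
    print(softmax_cross_entropy(logits, 0))
```

Softmax cross-entropy makes the classes compete for a single unit of probability, so it cannot reward a model for predicting two diagnoses at the same time. The asymmetric loss treats each class as its own binary problem and down-weights easy negatives. With both gammas and the margin set to zero it reduces to plain binary cross-entropy.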
- The Conference
- Description of the files/folders(modules)
- Performance comparison of classification backbones
- Final results table
- Possible solutions for the digitization task
Conference Website | Unofficial Phase Leaderboard | Official Phase Leaderboard
- README.md: this file, serves as the documentation of the project.
- cfg.py: the configuration file for the whole project.
- const.py: constant definitions, mostly the URLs for downloading the model weights.
- data_reader.py: data reader, including data downloading, file listing, data loading, etc.
- dataset.py: dataset class, which feeds data to the models.
- Dockerfile: docker file for building the docker image for submissions.
- evaluate_model.py, helper_code.py, remove_hidden_data.py, run_model.py, train_model.py: scripts inherited from the official baseline and the official scoring code. Modifications to these files have no effect, because they are overwritten as soon as the repository is pulled by the organizers (or the submission system).
- outputs.py: container (dataclass) for the outputs of the models.
- sync_official.py: script for synchronizing data from the official baseline and official scoring code.
- requirements.txt, requirements-docker.txt, requirements-no-torch.txt: requirements files for different purposes.
- team_code.py: entry file for the submissions.
- test_local.py, test_docker.py, test_run_challenge.sh: scripts for testing the docker image and the local environment. The latter two files, together with the docker-test action, are used for CI. Passing the CI almost guarantees that the submission will run successfully in the official environment, except for potential GPU-related issues (e.g. the model weights and the data end up on different devices, one on the CPU and the other on the GPU, in which case torch raises an error).
- trainer.py: trainer class, which trains the models.
- train_models.ipynb: notebook for training the models.
- submissions: log file for the submissions, including the key hyperparameters, the scores received, commit hash, etc. The log file is updated after each submission and organized as a YAML file.
- official_baseline: the official baseline code, included as a submodule.
- official_scoring_metric: the official scoring code, included as a submodule.
- ecg-image-kit: a submodule for the ECG image processing and generating toolkit, provided by the organizers.
- models: folder for model definitions, including image backbones, Dx head, digitization head, custom losses, waveform detector, etc.
- utils: various utility functions, including an ECG simulator for generating synthetic ECG signals, an ECG image generator that is an enhanced version of ecg-image-kit, etc.
Curves of F1 score using different backbone sizes (all ConvNeXt architecture) are collected in the following image.
Details of the final results can be found in the official results page.
| | F-measure | SNR |
|---|---|---|
| Rank | 9/16 | 13/16 |
| Leaderboard | 0.33 | -0.733 |
| Color scans of clean papers | 0.332 | -0.148 |
| Black-and-white scans of clean papers | 0.327 | -1.267 |
| Mobile phone photos of clean papers | 0.306 | -9.019 |
| Mobile phone photos of stained papers | 0.316 | -8.545 |
| Mobile phone photos of deteriorated papers | 0.306 | -6.398 |
| Color scans of deteriorated papers | 0.331 | -1.636 |
| Black-and-white scans of deteriorated papers | 0.319 | -3.559 |
| Screenshots of computer monitor | 0.288 | -6.532 |
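The SNR column is easier to read with the usual decibel definition in mind: a negative value means that the reconstruction error carries more power than the reference signal itself. The following is a minimal sketch of that generic definition, assuming two equally long, already aligned sample sequences. It is not the official scoring code, which may treat alignment and missing samples differently, and the function name `snr_db` is hypothetical.

```python
import math


def snr_db(reference, reconstruction):
    """Signal-to-noise ratio in decibels between two equally long sequences.

    The "noise" is the sample-wise difference between the reference signal
    and the digitized reconstruction.
    """
    if len(reference) != len(reconstruction):
        raise ValueError("signals must have the same length")
    signal_power = sum(x * x for x in reference)
    noise_power = sum((x - y) ** 2 for x, y in zip(reference, reconstruction))
    if noise_power == 0:
        return math.inf   # perfect reconstruction
    if signal_power == 0:
        return -math.inf  # all-zero reference, any error dominates
    return 10.0 * math.log10(signal_power / noise_power)


if __name__ == "__main__":
    reference = [1.0, -1.0, 1.0, -1.0]
    print(snr_db(reference, [0.9, -0.9, 0.9, -0.9]))  # small error, positive SNR
    print(snr_db(reference, [0.0, 0.0, 0.0, 0.0]))    # error power equals signal power
    print(snr_db(reference, [-1.0, 1.0, -1.0, 1.0]))  # inverted signal, negative SNR
```

Under this definition, predicting a flat line scores 0 dB, so a negative SNR indicates a reconstruction that is further from the reference than an all-zero output would be.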
- End-to-end model (NOT adopted): A single model that takes the input image and produces the digitized ECG signal directly.
- Several-stage solution (adopted): A multi-stage solution that consists of several models, possibly including:
  - OCR model: Recognizes the ECG signal names and their locations in the input image, as well as other metadata, for example using EasyOCR, Tesseract, or TrOCR.
  - Object detection model: Detects the area (bounding box) of the ECG signal in the input image. This bounding box, together with the locations of the ECG signal names, can be used to crop each channel of the ECG signal.
  - Edge sharpening algorithm: Enhances and extracts the grid lines and the ECG signal from the cropped patches of the input image.
  - Segmentation model: Segments the ECG signal from the cropped patches of the input image. This model can be a U-Net, a DeepLabV3, a Mask R-CNN, etc.
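As an illustration of how the waveform bounding box and the lead-name locations could be combined to crop each channel, here is a minimal sketch. It assumes a grid layout in which every lead name sits at the left end of its trace. The function `crop_boxes`, its `row_tol` parameter and the example coordinates are all hypothetical and are not taken from this repository.

```python
def crop_boxes(bbox, name_positions, row_tol=20):
    """Derive one crop rectangle per lead.

    bbox           : (x0, y0, x1, y1) of the whole waveform area
    name_positions : {lead_name: (x, y)} as located by an OCR model
    row_tol        : names whose y differs by at most this many pixels
                     from the previous name are placed in the same row
    Returns {lead_name: (left, top, right, bottom)}.
    """
    x0, y0, x1, y1 = bbox

    # 1. Group the lead names into rows by their vertical position.
    rows = []
    for name, (x, y) in sorted(name_positions.items(), key=lambda kv: kv[1][1]):
        if rows and abs(rows[-1][-1][2] - y) <= row_tol:
            rows[-1].append((name, x, y))
        else:
            rows.append([(name, x, y)])
    row_ys = [sum(item[2] for item in row) / len(row) for row in rows]

    boxes = {}
    for i, row in enumerate(rows):
        # 2. Row boundaries lie halfway between neighbouring rows and are
        #    clamped to the bounding box at the top and bottom.
        top = y0 if i == 0 else (row_ys[i - 1] + row_ys[i]) / 2
        bottom = y1 if i == len(rows) - 1 else (row_ys[i] + row_ys[i + 1]) / 2
        # 3. Within a row, a lead runs from its own name to the next name,
        #    or to the right edge of the bounding box for the last lead.
        row.sort(key=lambda item: item[1])
        for j, (name, x, _) in enumerate(row):
            right = row[j + 1][1] if j + 1 < len(row) else x1
            boxes[name] = (x, top, right, bottom)
    return boxes


if __name__ == "__main__":
    names = {"I": (10, 50), "aVR": (210, 52), "II": (10, 150), "aVL": (210, 148)}
    for lead, box in crop_boxes((0, 0, 400, 200), names).items():
        print(lead, box)
```

Real printouts often add a full-width rhythm strip and slightly skewed rows, so a practical version would need a more robust row grouping than this fixed pixel tolerance.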
The end-to-end model is simpler to implement, but it may be harder to train and optimize, and its effectiveness cannot be guaranteed.
The several-stage solution may be easier to train and optimize. However, it takes more effort to design and implement, since it is in fact a whole system of models and algorithms.
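The several-stage solution still needs a final step that turns the output of the segmentation model into a waveform. The following is a minimal sketch of one way to do this, assuming a binary mask for a single cropped lead, a known baseline row and a known vertical scale in pixels per millivolt. The function `mask_to_signal` and its parameters are hypothetical and do not describe the code in this repository.

```python
def mask_to_signal(mask, baseline_row, pixels_per_mv):
    """Convert a binary segmentation mask of ONE lead into amplitudes.

    mask          : list of rows, each a list of 0/1 values (row 0 is the top)
    baseline_row  : row index that corresponds to 0 mV
    pixels_per_mv : vertical scale, recoverable from the grid spacing
    Returns one amplitude in mV per image column, or None for a column
    in which the mask contains no trace pixel.
    """
    n_rows = len(mask)
    n_cols = len(mask[0]) if mask else 0
    signal = []
    for col in range(n_cols):
        rows = [r for r in range(n_rows) if mask[r][col]]
        if not rows:
            signal.append(None)  # gap in the trace, left for later interpolation
            continue
        # Use the centre of the trace pixels in this column as the trace position.
        centre = sum(rows) / len(rows)
        # Image rows grow downwards, so a smaller row index is a higher voltage.
        signal.append((baseline_row - centre) / pixels_per_mv)
    return signal


if __name__ == "__main__":
    mask = [
        [0, 1, 0],
        [0, 1, 0],
        [1, 0, 0],
        [0, 0, 0],
        [0, 0, 0],
    ]
    print(mask_to_signal(mask, baseline_row=2, pixels_per_mv=2.0))
```

Taking the column-wise centre is a deliberate simplification: it blurs steep segments such as the QRS complex, where the trace covers many rows of a single column, and the result still has to be resampled from pixel columns to the target sampling frequency.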
If you find this repository useful for your research, please consider citing the following paper:
@inproceedings{Kang_cinc2024,
title = {{A Multi-Stage Framework for Simultaneous Digitization and Classification of Electrocardiogram Images}},
author = {Kang, Jingsu and WEN, Hao},
booktitle = {{2024 Computing in Cardiology Conference (CinC)}},
series = {{CinC2024}},
volume = {51},
issn = {2325-887X},
doi = {10.22489/cinc.2024.128},
publisher = {{Computing in Cardiology}},
year = {2024},
month = {12},
collection = {{CinC2024}}
}

