Hyunseung Chung1, Jungwoo Oh1, Daeun Kyung1, Jiho Kim1, Yeonsu Kwon1, Min-Gyu Kim2, Edward Choi1
1KAIST 2Ajou University School of Medicine
Recent advances in Multimodal Large Language Models have rapidly expanded to electrocardiograms, focusing on classification, report generation, and single-turn QA tasks. However, these models fall short in real-world scenarios, lacking multi-turn conversational ability, on-device efficiency, and precise understanding of ECG measurements such as the PQRST intervals. To address these limitations, we introduce ECG-Agent, the first LLM-based tool-calling agent for multi-turn ECG dialogue. To facilitate its development and evaluation, we also present the ECG-Multi-Turn-Dialogue (ECG-MTD) dataset, a collection of realistic user-assistant multi-turn dialogues for diverse ECG lead configurations. We develop ECG-Agents in various sizes, from on-device capable (1B, 3B) to larger agents (8B, 32B). Experimental results show that ECG-Agents outperform baseline ECG-LLMs in response accuracy. Furthermore, on-device agents achieve comparable performance to larger agents in various evaluations that assess response accuracy, tool-calling ability, and hallucinations, demonstrating their viability for real-world applications.
| Type | 12-lead | Lead I | Lead II |
|---|---|---|---|
| Dataset | ECG-MTD 12-lead | ECG-MTD Lead I | ECG-MTD Lead II |
| Ground-Truth | ECG-MTD 12-lead Ground-Truth | ECG-MTD Lead I Ground-Truth | ECG-MTD Lead II Ground-Truth |
For pre-processing PhysioNet2021 and pretraining W2V+CMSC+RLM, follow the instructions in:
- Fairseq-signals repository (uni-modal tasks section)
- W2V+CMSC+RLM guide
Step 2a: Preprocess the PTB-XL dataset:
python src/preprocess/preprocess_ptbxl.py \
/path/to/ptbxl/records500/ \
--dest /path/to/outputStep 2b: Generate manifest files for fine-tuning:
python src/preprocess/manifest_ptbxl_10s.py /path/to/outputStep 2c: Fine-tune on the Cardiac Arrhythmia Classification task:
fairseq-hydra-train \
model.model_path=/path/to/pretrained_model/checkpoints/checkpoint_last.pt \
+task.data=/path/to/output/from/2b/ptbxl_10s_manifest \
--config-dir examples/w2v_cmsc/config/finetuning/ecg_transformer \
--config-name diagnosisFor explanation tool outputs, follow the instructions in: SpectralX repository
After fine-tuning, use the model to produce tool output CSV files.
Classification (requires model path):
python extract_tool_outputs.py \
--tool classification \
--ecg_dir /path/to/ecg/files \
--model_path /path/to/checkpoint_best.pt \
--output_dir ./resultsMeasurements (no model needed):
python extract_tool_outputs.py \
--tool measurements \
--ecg_dir /path/to/ecg/files \
--output_dir ./resultspython finetune_ecg_dialogue_unsloth.pypython inference_ecg_dialogue.pyIf you find this work useful, please cite our paper:
@misc{chung2026ecgagentondevicetoolcallingagent,
title={ECG-Agent: On-Device Tool-Calling Agent for ECG Multi-Turn Dialogue},
author={Hyunseung Chung and Jungwoo Oh and Daeun Kyung and Jiho Kim and Yeonsu Kwon and Min-Gyu Kim and Edward Choi},
year={2026},
eprint={2601.20323},
archivePrefix={arXiv},
primaryClass={cs.AI},
url={https://arxiv.org/abs/2601.20323},
}