Text2Vis: A Challenging and Diverse Benchmark for Generating Multimodal Visualizations from Text
Text2Vis is a benchmark dataset designed to evaluate the ability of large language models (LLMs) to generate accurate and high-quality data visualizations from natural language queries. It supports evaluation across multiple dimensions including answer correctness, chart quality, reasoning difficulty, and question answering over structured data.
Accepted to EMNLP 2025 (Main Conference)
You can access the dataset through the following links:
- Hugging Face Dataset
- Google Drive Download
- arXiv Version
If you have any questions about this work, please contact **Mizanur Rahman** at mizanur.york@gmail.com.
If you use Text2Vis in your research, please cite:
@misc{rahman2025text2vischallengingdiversebenchmark,
title={Text2Vis: A Challenging and Diverse Benchmark for Generating Multimodal Visualizations from Text},
author={Mizanur Rahman and Md Tahmid Rahman Laskar and Shafiq Joty and Enamul Hoque},
year={2025},
eprint={2507.19969},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2507.19969},
}