ChartMaster: Advancing Chart-to-Code Generation with Real-World Charts and Chart Similarity Reinforcement Learning
[Model] • [Dataset] • [Paper] • [GitHub]
- [2025/10/16] ChartMaster-7B released.
- [2025/08/25] ChartMaster paper, repo and website released.
The first challenge is data. Previous datasets mainly rely on synthetic seeds to prompt GPT models, which yields homogeneous samples and limits generalization. To address this, we propose ReChartPrompt, a large-scale, automatically constructed dataset built from 30,071 real arXiv papers, yielding 240K diverse chart-image/code/instruction triplets. These real-world charts cover a wide range of design styles and research fields, which improves model robustness.
The second challenge is that traditional supervised fine-tuning (SFT) improves code understanding but cannot guarantee visual fidelity. We introduce ChartSimRL, a reinforcement learning algorithm based on Group Relative Policy Optimization (GRPO), guided by a novel chart similarity reward. This reward combines attribute similarity (matching layout, color, text, and values) and visual similarity (comparing CNN-extracted features), ensuring generated charts align closely with originals in both semantics and appearance.
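As a rough illustration of how such a reward could feed a GRPO-style update, the sketch below combines an attribute score and a visual score into one reward, then normalizes rewards within a group of sampled generations. It is not the ChartSimRL implementation: the Jaccard overlap for attributes, the cosine similarity over precomputed feature vectors, the unweighted sum, and all function names are assumptions made for this example.

```python
import math
from statistics import mean, pstdev


def attribute_similarity(pred_attrs: set, ref_attrs: set) -> float:
    """Jaccard overlap between two sets of chart attributes.

    Assumption: attributes (layout, color, text, values) are represented
    as hashable items; the real reward may score them differently.
    """
    if not pred_attrs and not ref_attrs:
        return 1.0
    return len(pred_attrs & ref_attrs) / len(pred_attrs | ref_attrs)


def visual_similarity(pred_feat, ref_feat) -> float:
    """Cosine similarity between two feature vectors.

    Assumption: the vectors stand in for CNN-extracted image features,
    which are computed elsewhere.
    """
    dot = sum(p * r for p, r in zip(pred_feat, ref_feat))
    norm = math.sqrt(sum(p * p for p in pred_feat)) * math.sqrt(
        sum(r * r for r in ref_feat)
    )
    return dot / norm if norm else 0.0


def chart_similarity_reward(pred_attrs, ref_attrs, pred_feat, ref_feat) -> float:
    """Unweighted sum of both terms (the weighting is a hypothetical choice)."""
    return attribute_similarity(pred_attrs, ref_attrs) + visual_similarity(
        pred_feat, ref_feat
    )


def group_relative_advantages(rewards, eps: float = 1e-6):
    """GRPO-style advantages: standardize each reward within its group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]
```

In this sketch, each candidate program sampled for the same chart would be rendered and scored with `chart_similarity_reward`, and `group_relative_advantages` would turn the group's rewards into the relative signal used for the policy update.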
By integrating ReChartPrompt and ChartSimRL, our ChartMaster model achieves state-of-the-art results among open-source 7B-parameter models, rivaling GPT-4o performance across multiple benchmarks. All code, datasets, and models are released to facilitate further research.
Figure 1: The overall framework of ChartMaster: (a) Construction of ReChartPrompt-240K with real chart images; (b) Optimization via ChartSimRL; (c) Definition of Chart Similarity reward.
ChartMaster-7B sets a new state of the art among open-source 7B models on the ChartMimic, Plot2Code, and ChartX benchmarks, and approaches GPT-4o.
Figure 2: Performance comparison on three benchmarks. Our method outperforms ChartCoder-7B and matches or exceeds GPT-4o on certain metrics. For easier visual comparison, the "Rating" metric of the Plot2Code benchmark is multiplied by 10.
Figure 3: Test results of various models on the ChartMimic benchmark. "Base.+ReCha." refers to the baseline model fine-tuned on the ReChartPrompt-240K dataset. Incorporating ReChartPrompt significantly enhances the chart-to-code generation capability of the base model, while ChartSimRL further improves the handling of fine details.
For questions, suggestions, or collaboration, please feel free to:
- Open an issue.
- Reach out via email: