Conference paper, Year: 2025

LLM Code Customization with Visual Results: A Benchmark on TikZ

Customisation de code avec une sortie visuelle: un benchmark sur TikZ

Abstract

With the rise of AI-based code generation, customizing existing code from natural language instructions to modify visual results (such as figures or images) has become possible, promising to reduce the need for deep programming expertise. However, even experienced developers can struggle with this task, as it requires identifying relevant code regions (feature location), generating valid code variants, and ensuring the modifications reliably align with user intent. In this paper, we introduce vTikZ, the first benchmark designed to evaluate the ability of Large Language Models (LLMs) to customize code while preserving coherent visual outcomes. Our benchmark consists of carefully curated vTikZ editing scenarios, parameterized ground truths, and a reviewing tool that leverages visual feedback to assess correctness. Empirical evaluation with state-of-the-art LLMs shows that existing solutions struggle to reliably modify code in alignment with visual intent, highlighting a gap in current AI-assisted code editing approaches. We argue that vTikZ opens new research directions for integrating LLMs with visual feedback mechanisms to improve code customization tasks in various domains beyond TikZ, including image processing, art creation, Web design, and 3D modeling.

Origin: Files produced by the author(s)

Dates and versions

hal-05049250, version 1 (29-04-2025)
hal-05049250, version 2 (04-06-2025)

Cite

Charly Reux, Mathieu Acher, Djamel Eddine Khelladi, Olivier Barais, Clément Quinton. LLM Code Customization with Visual Results: A Benchmark on TikZ. EASE'25 - Evaluation and Assessment in Software Engineering, Jun 2025, Istanbul, Turkey. pp.1-10, ⟨10.1145/3756681.3757003⟩. ⟨hal-05049250v2⟩