Ph.D. Student @ University of Texas at Dallas
Prev. RA @ Singapore Management University
Prev. Research Intern @ THUNLP
Excited about the future of AI and eager to be a part of it!
Email: kelvin.yangzhiyu@outlook.com
Hometown: Chengdu, China
GitHub | Google Scholar | LinkedIn
Ph.D. in Computer Science and Technology
University of Texas at Dallas
August 2025 - Present
M.Eng. in Computer Science and Technology
Beijing Language and Culture University
September 2021 - July 2024
B.Eng. in Computer Science and Technology
Sichuan University
September 2017 - July 2021
Singapore Management University
September 2024 - April 2025
- Conducted LLM research under the supervision of Professor Yang Deng.
- Explored LLMs' capabilities to identify and explain multi-hop and multiple logical errors in data analysis code.
Modelbest Co. Ltd. & OpenBMB
April 2024 - July 2024
- Contributed to the initial phase of LLM×MapReduce, an agent framework that adapts regular LLMs to process long-context inputs.
- Served as a team leader: devised research plans, mentored interns joining our research group, and collaborated with fellow senior interns.
THUNLP, Tsinghua University
April 2023 - July 2024
- Conducted NLP research under the supervision of Shuo Wang, Ph.D.
- Distilled table reasoning skills from LLMs to small PLMs.
- Developed LLM agents for scientific data visualization.
- Curated multilingual SFT data.
- Participated in devising an agent framework for processing long-context inputs.
Why Stop at One Error? Benchmarking LLMs as Data Science Code Debuggers for Multi-Hop and Multi-Bug Errors - EMNLP 2025 Oral (First Author)
- Authors: Zhiyu Yang, Shuo Wang, Yukun Yan, Yang Deng.
- Summary: Introduced DSDBench, a challenging benchmark built via an automated framework to test LLMs on realistic data science code with multiple, multi-hop bugs. Our findings reveal that even top models struggle to trace error origins and achieve complete bug detection, exposing a critical gap in their reasoning and debugging capabilities.
- Contribution: I designed the DSDBench dataset construction pipeline, implemented the automated error injection framework, and conducted experiments to evaluate state-of-the-art LLMs, revealing critical performance gaps in dynamic debugging.
MatPlotAgent - ACL 2024 Findings (First Author)
- Authors: Zhiyu Yang, Zihan Zhou, Shuo Wang, Xin Cong, Xu Han, Yukun Yan, Zhenghao Liu, Zhixing Tan, Pengyuan Liu, Dong Yu, Zhiyuan Liu, Xiaodong Shi, Maosong Sun.
- Summary: Introduced MatPlotBench for automatic evaluation of AI methods for scientific data visualization. Proposed MatPlotAgent, a framework that uses visual feedback to enhance LLM performance on visualization tasks.
- Contribution: Designed the agent framework and evaluation method, conducted experiments, and curated data for MatPlotBench.
UltraLink - ACL 2024 (Fifth Author)
- Authors: Haoyu Wang, Shuo Wang, Yukun Yan, Xujia Wang, Zhiyu Yang, Yuzhuang Xu, Zhenghao Liu, Liner Yang, Ning Ding, Xu Han, Zhiyuan Liu, Maosong Sun.
- Summary: Developed a multilingual SFT dataset with language-specific and language-agnostic subsets, using knowledge-enhanced data augmentation methods with Wikipedia as the knowledge source.
- Contribution: Helped concretize the paper's core idea, designed the initial prompt templates for data synthesis, and revised the paper.
Enhancing Free-Form Table Question Answering Models by Distilling Relevant-Cell-Based Rationales - CCL 2024 (First Author)
- Authors: Zhiyu Yang, Shuo Wang, Yukun Yan, Pengyuan Liu, Dong Yu.
- Summary: Proposed a knowledge distillation method for table QA tasks using relevant-cell-based rationales, achieving SOTA results on the FeTaQA benchmark.
- Contribution: Developed the distillation method, conducted experiments, and authored the paper.
- Converted MatPlotAgent into an interactive online demo.
- Demonstrated its workflow and performance to scholars attending CCL 2024.
- Explored various pre-trained language models for understanding the plausibility of implicit and underspecified texts.
- Fine-tuned Facebook AI’s MUPPET model for optimal performance.
- Proposed a novel garbage classification deep neural network architecture.
- Outperformed mainstream models on a Huawei Cloud Garbage Classification Competition dataset.
- Mandarin: Native
- English: Fluent (IELTS Overall Band 8.0, Reading: 9, Listening: 9, Writing: 7.5, Speaking: 7)
- Python, C++, Java
- PyTorch, Hugging Face, PyG, Keras, TensorFlow 2, Linux, Android Studio, vLLM, Matplotlib, NumPy, Pandas
- Convolutional Neural Networks, Pre-trained NLU and NLG models, LLMs
