Chengru Wu

I'm an undergraduate student at Beihang University (北京航空航天大学), majoring in Computer Science and Technology. My research interests include code language models, multi-agent systems for code generation, and embodied intelligence.

Email / CV / GitHub

Research

Representative papers are listed below, spanning code language models, multi-agent systems for code generation, and embodied intelligence.

Galaxea G0.5 Technical Report: One Autoregressive Stream for Reasoning and Action
Galaxea Team
Technical Report, May 2026
project page / code / pdf
Introduces G0.5, a pretrained autoregressive Vision-Language-Action model with a unified transformer decoder that emits reasoning and action tokens in a single stream. Features a Cross-Embodiment Action Codec, Native Chain-of-Thought, and Visual Memory. Surpasses state-of-the-art models across seven benchmarks, including real-world robot fine-tuning (76.7% vs. 53.3% for π0.5), BEHAVIOR-1K (31.4% task score), DROID (82.5%), LIBERO (98.9%), and RoboTwin 2.0 (93.3%).

Towards Realistic Project-Level Code Generation via Multi-Agent Collaboration and Semantic Architecture Modeling
Qianhui Zhao, Li Zhang, Fang Liu, Junhang Cheng, Chengru Wu, Junchen Ai, Qiaoyuanhe Meng, Lichen Zhang, Xiaoli Lian, Shubin Song, Yuanping Guo
ACM Transactions on Software Engineering and Methodology (TOSEM), 2026
arXiv
Tackles project-level code generation by introducing CodeProjectEval (a dataset of 18 real-world repositories averaging 12.7 files and ~2,389 lines per task) and ProjectGen, a multi-agent framework that decomposes generation into architecture design, skeleton generation, and code filling. Introduces the Semantic Software Architecture Tree (SSAT) to bridge user requirements and code. Achieves a 57% improvement on DevBench and a ~10x improvement on CodeProjectEval over baselines.

CangjieBench: Benchmarking LLMs on a Low-Resource General-Purpose Programming Language
Junhang Cheng, Fang Liu, Jia Li, Chengru Wu, Nanxiang Jiang, Li Zhang
arXiv, 2026
arXiv / code
Introduces CangjieBench, a contamination-free benchmark of 248 manually translated samples from HumanEval and ClassEval for Cangjie, a low-resource general-purpose language by Huawei. Evaluates LLMs under Direct Generation, Syntax-Constrained Generation, RAG, and Agent settings. Finds that Syntax-Constrained Generation offers the best accuracy-cost trade-off, while Agents achieve SOTA accuracy at high token cost. Reveals negative transfer in Code-to-Code translation, where models overfit to source language patterns.

On the Applicability of Code Language Models to Scientific Computing Programs
Qianhui Zhao, Fang Liu, Xiao Long, Chengru Wu, Li Zhang
IEEE Transactions on Software Engineering (TSE), 2025
IEEE
Evaluates whether pre-trained code language models (CodeBERT, CodeT5, Codex, StarCoder, CodeLlama) can generalize to scientific computing programming languages (SCPLs). Finds that while SCPLs are more challenging than general-purpose languages, CLMs are nevertheless applicable, and knowledge from general-purpose languages transfers effectively to SCPL analysis.

CCUP: A Controllable Synthetic Data Generation Pipeline for Pretraining Cloth-Changing Person Re-Identification Models
Yujian Zhao, Chengru Wu, Yinong Xu, Xuanzheng Du, Ruiyu Li, Guanglin Niu
IEEE International Conference on Multimedia and Expo (ICME), 2025
IEEE / arXiv / code
Proposes a low-cost pipeline for generating controllable synthetic data for cloth-changing person re-identification (CC-ReID). Introduces the CCUP dataset with 6,000 IDs, ~1.18M images, 100 cameras, and 26.5 outfits per individual. A pretrain-finetune framework using CCUP significantly improves CC-ReID models, outperforming state-of-the-art methods on the PRCC, VC-Clothes, and NKUP benchmarks.

AdaptiveLLM: A Framework for Selecting Optimal Cost-Efficient LLM for Code-Generation Based on CoT Length
Junhang Cheng, Fang Liu, Chengru Wu, Li Zhang
Internetware, 2025
arXiv / code
Introduces AdaptiveLLM, a framework that dynamically selects the optimal cost-efficient LLM for code generation based on task difficulty, which is assessed automatically from Chain-of-Thought length. Clusters tasks into three difficulty levels and uses XGBoost for model selection. Achieves a 7.86% improvement in pass@1 while reducing resource consumption by 88.9% compared to ComplexityNet.

Honors & Awards

National Scholarship, 2025
National Scholarship, 2024
Beihang University Academic Excellence Scholarship (Special Prize), 2025
Beihang University Academic Excellence Scholarship (Special Prize), 2024
Beihang University Merit Student, 2025
Mathematical Contest in Modeling (MCM) — Meritorious Winner (M Award), 2025
Mathematical Contest in Modeling (MCM) — Honorable Mention (H Award), 2024

Miscellanea

Hobbies: Volleyball, Gaming

Template adapted from Jon Barron's website.