I am currently a Ph.D. student at the Tianjin Key Laboratory of Visual Computing and Intelligent Perception (VCIP), College of Computer Science, Nankai University, advised by Prof. Xiang Li.
My research interests include:
- Vision-Language Models
- Diffusion Models
- Video Understanding
News
- 2026.06: Released arxiclaw, an agent-native academic archive, and arxiclaw-agent, a local API client that lets external AI agents access and use academic resources efficiently.
- 2025.10: Champion of the 2025 Jittor AI competition, 50,000 RMB bonus (1st of 100+ teams).
- 2025.09: Completed full project delivery for the NKU-ENN Zhixin video understanding project (worth 9.6 million RMB) and initial development of a film editing project, resulting in one patent and one software copyright.
- 2024.10: Completed joint development with Insta360 on AI video editing, resulting in one co-first-author paper and three patents (including one PCT).
- 2023.11: Second prize in the 6th Open Source Innovation Competition (Task of Open Source Challenge Circuit).
- 2023.09: Second prize in the 2023 Jittor AI competition, 20,000 RMB bonus (3rd of 49 teams).
Publications
Coming soon.
Projects
arxiClaw: Academic Paper Retrieval and Multi-Agent Analysis Platform
Role: Project Lead / Full-Stack Core Developer
Summary: An open-source research platform that integrates academic paper aggregation, retrieval, and LLM-agent-based analysis to support literature tracking, paper triage, and in-depth reading.
Contribution: Initiated and built the system from scratch, including the web interface, distributed crawling and cleaning pipeline, and multi-agent service engine for complex academic workflows.
SmartVideoClip: Intent-Driven Automated Video Editing
Role: Overall Algorithm Architect / End-to-End Pipeline Lead
Summary: An open-source automated video editing system that connects user intent understanding, multimodal asset analysis, script planning, and audio-video synthesis in a unified generation pipeline.
Contribution: Designed and led the core algorithm architecture, covering cross-modal media understanding, editing/directing strategy scheduling, and multi-track audio-video alignment.
Competitions
- 2025.10 Champion of the 2025 Jittor AI competition, 50,000 RMB bonus (1st of 100+ teams). [solutions][code][certificate]
- 2023.09 Second prize in the 2023 Jittor AI competition, 20,000 RMB bonus (3rd of 100+ teams). [solutions][code][certificate]
- 2023.11 Second prize in the 6th Open Source Innovation Competition (Task of Unsupervised Semantic Segmentation). [code][certificate]
- 2021.08 Champion of the CVPR 2021 LID challenge (Track of Weakly-supervised Object Localization). [solutions][certificate]
- 2021.08 3rd place in the CVPR 2021 LID challenge (Track of Weakly-supervised Semantic Segmentation). [solutions][certificate]
- 2020.08 3rd place in the CVPR 2020 LID challenge (Track of Weakly-supervised Object Localization). [solutions][certificate]
Honors and Awards
- National Scholarship (Annual Selection Rate < 1%), Ministry of Education, China, 2021.
- Outstanding Graduate of Anhui Province, Ministry of Education, China, 2020.
Professional Services
- Journal Reviewer:
  - IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
  - IEEE Transactions on Circuits and Systems for Video Technology (TCSVT)
  - Pattern Recognition (PR)
  - Visual Intelligence (VI)
  - Image and Vision Computing Journal (IMAVIS)
- Conference Reviewer:
  - European Conference on Computer Vision (ECCV 2026)
Education
- 2024.09 - Present, Ph.D., Computer Science and Technology, Nankai University. Advised by Prof. Xiang Li.
- 2020.09 - 2023.06, M.S., Computer Science and Technology, Nanjing University of Science and Technology. Advised by Prof. Chen Gong.
Internships
- 2026.03 - Present, arxiclaw, Remote. Project Lead, spearheading development and overall maintenance of the entire project.
- 2024.10 - 2025.05, ENNGroup, Langfang, China. Head of Algorithm Architecture at NKU-ENN Lab.
- 2024.04 - 2024.10, Insta360, Shenzhen, China. Joint research and development.
- 2023.03 - 2024.04, MEGVII Research, Nanjing, China. Research Intern. Led by Jiajun Liang.
- 2022.05 - 2023.03, JD EXPLORE ACADEMY, Nanjing Online, China. Visiting student. Led by Yibing Zhan.
Invited Talks
- 2024.07, 2023.10, and 2023.11, Talks at the GrokCV group on LLMs, CLIP, and weakly supervised semantic segmentation, invited by Prof. Yimian Dai.
- 2020.06, Achieved 3rd place in Track 3, "Weakly-supervised Object Localization" [video], of the 2nd Learning from Imperfect Data (LID) Workshop, held in conjunction with CVPR 2020, invited by Yunchao Wei.