I go by `SiriusNEO` on most websites, and I usually use `Chaos`
as a nickname (a variant of my real name). (GitHub | Zhihu |
Scholar)
Research
My research interest lies in building practical, scalable, and efficient systems for
machine learning (MLSys), such as serving systems and deep learning compilers.
My long-term research goal is to make powerful AI accessible to everyone.
Currently, I'm working on the following topics. If you share these interests, please don't hesitate to email me so we can schedule a discussion.
Researcher-friendly Programming Abstraction. Recently, I have been interested in and actively contributing to TileLang, through
which we can swiftly implement efficient co-designed algorithms such as sparse/linear attention.
Efficient Attention/System for LLM Agents, especially memory (context) management problems in a single ReAct agent with a sub-agent architecture.
Some of my previous projects are listed below:
ParrotServe: Serving system for LLM
applications with Semantic Variable abstraction (Published in OSDI'24).
Accelerating Large-Scale Reasoning Model Inference with Sparse Self-Speculative Decoding
Yilong Zhao*, Jiaming Tang*, Kan Zhu, Zihao Ye, Chi-Chih Chang, Chaofan Lin, Jongseok Park, Guangxuan Xiao, Mohamed S. Abdelfattah, Mingyu Gao, Baris Kasikci, Song Han, Ion Stoica
[2024/6] Graduated from SJTU with the Zhiyuan Outstanding Student Scholarship! My bachelor's thesis
`Large Language Model Inference Serving System for Requests with Long System Prompts`
was awarded Best Bachelor Thesis (Top 1%) at SJTU!
[2024/3] Parrot was accepted to OSDI'24! Code and paper are both released.
[2023/9] I will be joining IIIS, Tsinghua as a PhD student starting in Fall 2024.