I am Zhanghan Wang, a PhD student at New York University.
I am broadly interested in building high-performance and reliable systems.
Currently, I am working on the correctness of distributed systems, advised by
Prof. Aurojit Panda.
Hi, I'm Zhanghan Wang
Research & Projects
My research experience and coursework projects. (* means equal contribution)
- Research
- Project
It Takes Two to Entangle
Zhanghan Wang*, Ding Ding*, Hang Zhu, Haibin Lin, Aurojit Panda.
Understanding Stragglers in Large Model Training Using What-if Analysis
Jinkun Lin, Ziheng Jiang, Zuquan Song, Sida Zhao, Menghan Yu, Zhanghan Wang, Chenyuan Wang, Zuocheng Shi, Xiang Shi, Wei Jia, Zherui Liu, Shuguang Wang, Haibin Lin, Xin Liu, Aurojit Panda, and Jinyang Li.
Runtime Protocol Refinement Checking for Distributed Protocol Implementations
Ding Ding, Zhanghan Wang, Jinyang Li, Aurojit Panda.
Incremental Specialization of Network Programs
Fabian Ruffy, Zhanghan Wang, Gianni Antichi, Aurojit Panda, Anirudh Sivaraman
Improve Load Balance for DLRM with Programmable Switch
Advisor: Jialin Li (NUS) and Liang Luo (Meta, US)
Deep Learning Recommendation Model (DLRM) is a widely used recommendation model developed by Meta. DLRM uses embedding tables to encode sparse features such as movie genres. There are many such tables, and both the number of embeddings and the embedding sizes vary widely, so the tables are usually partitioned across different machines. However, the access pattern is skewed because some data is far more popular than the rest. We therefore tried to use a programmable switch to improve load balance by caching several embedding entries on the switch and routing lookup requests based on caching status. The work was not completed, and the problem might already be obsolete.
Database Deadlock Diagnosis for Large-scale ORM-based Web Application
Advisor: Jinyang Li (NYU) and Zhaoguo Wang (SJTU)
Database-backed web applications usually rely on the database to handle deadlocks. However, the commonly used detect-and-recover strategy can be costly. Although developers can sometimes reorganize their application to remove deadlocks, the large amount of code and the third-party ORM frameworks they use make this much more difficult. In this project, we use symbolic execution to extract the APIs' statement templates, with symbolic inputs and path conditions for the issued statements. Based on this information, we then analyze and report potential deadlocks. I am listed only in the acknowledgements of the paper because I later left the team.
RocksDB with Disaggregated Block Cache
Advisor: Cheng Li (USTC)
This is my undergraduate thesis. In this work, I explored using RDMA to disaggregate the block cache of RocksDB to improve the overall throughput of LSM-tree-based KV stores. This project helps alleviate the memory burden of the block cache in RocksDB and disaggregates more than 75% of the memory with only a 15% performance drop.
SQL Query Plan Optimization
Advisor: Cheng Li (USTC)
In this project, we developed a greedy algorithm to reorder left-deep join trees in SQL query plans and achieved better performance.
Graph DataBase File System
Zhanghan Wang, Chuqing Gao, Zhiyuan Huang, Xingmei Wang, Jiacheng Wan (in no particular order)
We used Neo4j (one of the best graph databases) to build a FUSE-based
file system with a fantastic web UI. The files are connected in the graph based on their
contents.
We used some machine learning techniques to extract keywords and descriptions. Due to the
limitations of AI techniques at the time, GDBFS can only process some simple files.
A note in 2024: we noticed that, with the emergence of LLMs, some new works follow
a design philosophy similar to that of GDBFS. This again shows that new techniques always enable
some old ideas...
Experience
Education and Work Experience
Education
New York University
PhD Student
I am now a PhD student at NYU, researching the correctness of distributed systems, advised by Prof. Aurojit Panda.
University of Science and Technology of China
Bachelor of Computer Science
USTC is where I started studying computer science in 2018. There I met Prof. Cheng Li, who offered me a lot of help in getting into systems work.
Work
Roblox
Research Intern
2025 Summer
Developed a prototype for script-based LOD with just-in-time compilation and an instance cache.
Bytedance
Research Intern
2024 Summer
Developed a tool to verify the correctness of distributed model training and inference implementations.
Projects for Fun
IssueClear
This is a tool that scrapes GitHub or JIRA issues and lets users apply an LLM to filter for
the issues they are interested in. I hope it can help researchers in program testing, debugging,
and verification find more interesting bugs and facilitate their work.
This is also my first project built with (nearly) pure vibe coding.
The tool is still under development.
P4 language extension for VSCode
A VSCode extension for the P4 language. It supports simple but incomplete syntax and semantic highlighting. I built it mainly to learn the P4 language specification and to review the visitor pattern. It is no longer maintained.