pandengyao/README.md

Hi there 👋

I am currently a Research & Development Engineer in the Reinforcement Learning Group of the AI Computing Department at Baidu.
I received my Master of Engineering from Beihang University (BUAA), and my Bachelor of Engineering from Nanjing University of Science and Technology (NJUST).

I have worked across areas including software development, ROS-based system integration, model quantization and deployment, and MLSys optimization.
My current research focuses on Agentic Reinforcement Learning (Agentic RL): how autonomous agents can leverage reinforcement learning to enhance large-scale intelligent systems.

Research Interests 🔭

My research primarily focuses on:

  • ML Systems: SGLang, veRL, AI infrastructure, and high-performance computing.
  • RL Systems for Agents: coding agents and their pipelines, and RLHF for multi-agent systems.

Pinned

  1. sgl-project/sglang

    SGLang is a high-performance serving framework for large language models and multimodal models.

    Python · 27.6k stars · 5.8k forks

  2. verl-project/verl

    verl/HybridFlow: A Flexible and Efficient RL Post-Training Framework

    Python · 21.2k stars · 3.8k forks

  3. THUDM/slime

    slime is an LLM post-training framework for RL Scaling.

    Python · 5.6k stars · 784 forks

  4. NVIDIA/Megatron-LM

    Ongoing research training transformer models at scale

    Python · 16.3k stars · 3.9k forks

  5. my-claude-skills

    Claude Code skills collection

    Python

  6. model-visualization