yuyu5333/README.md

Hi there 👋 I'm Jay

LLM Infrastructure Engineer @ ByteDance Volcano Engine. Previously interned at Huawei. M.S. in Computer Science from Northwestern Polytechnical University.

🔭 Currently

Working on LLM Infrastructure — inference, training, INT4 quantization, and system-level optimization for large-scale models.

🌱 Open Source

Contributor to:

🛠️ Tech Stack

Python · CUDA · PyTorch · Triton

📫 Contact

wangyuzhan@bytedance.com · wyz_yy@mail.nwpu.edu.cn · 1812107659@qq.com

Pinned

  1. sglang-bytedance Public

    Forked from bytedance-iaas/sglang

    SGLang is a fast serving framework for large language models and vision language models.

    Python

  2. VeOmni Public

    Forked from ByteDance-Seed/VeOmni

    VeOmni: Scaling any Modality Model Training to any Accelerators with PyTorch native Training Framework

    Python 1

  3. flash-attention Public

    Forked from Dao-AILab/flash-attention

    Fast and memory-efficient exact attention

    Python

  4. minimind Public

    Forked from jingyaogong/minimind

    🚀🚀 Train a small 26M-parameter GPT (a "large model") entirely from scratch in just 2 hours! 🌏

    Python 3