Jason Phang (@zhansheng) / X

Jason Phang

@zhansheng

Foundations at @OpenAI. PhD @NYUDataScience, @AiEleuther, 🇸🇬. Prev: @Google, @Microsoft

San Francisco, CA

Joined May 2009

Pinned
Jason Phang
@zhansheng
Mar 21, 2025
🧵I’m excited to share not one but two research papers, written jointly by researchers from OpenAI and the @medialab at MIT. We try to answer the following question: How do interactions with AI chatbots affect people’s social and emotional well-being?
Jason Phang
@zhansheng
Jan 5, 2023
Nothing much, ChatGPT.
Jason Phang
@zhansheng
Oct 6, 2021
The field of AI moves very fast
Jason Phang
@zhansheng
Jun 23, 2020
I wrote a Colab notebook that showcases how to do *multi-task training* with the @huggingface Transformers and NLP libraries:
colab.research.google.com
Multi-task Training with Transformers+NLP
Jason Phang
@zhansheng
Apr 5, 2022
I wrote a minimal-ish implementation of GPT-NeoX-20B. It runs on a single GPU with 41-44GB of memory. You can use it as a reference or for easy hacking of the model. github.com/zphang/minimal… Next up: porting to Hugging Face Transformers!
Jason Phang
@zhansheng
Nov 23, 2022
Introducing HyperTuning: Using a hypermodel to generate parameters for frozen downstream models. This allows us to adapt models to new tasks *without* back-prop! Paper: arxiv.org/abs/2211.12485 1/10
Jason Phang
@zhansheng
Mar 2, 2023
In the last 24 hours:
- EleutherAI announced it's forming a non-profit
- LLaMA (7 - 65B) weights have been mailed out
- Flan-UL2 (20B) weights have been released
A good day for open science!
Jason Phang
@zhansheng
Jul 19, 2025
I'd like to take this chance to remind everyone that it hasn't even been a full year since o1 was announced (Sept 2024).
Alexander Wei
@alexwei_
Jul 19, 2025
1/N I’m excited to share that our latest @OpenAI experimental reasoning LLM has achieved a longstanding grand challenge in AI: gold medal-level performance on the world’s most prestigious math competition—the International Math Olympiad (IMO).
Jason Phang
@zhansheng
Jul 9, 2022
I wrote a minimal implementation of OPT. I've tested up to 66B with pipeline parallelism, should work up to 175B if you have enough GPUs. github.com/zphang/minimal…
Jason Phang
@zhansheng
Aug 13, 2023
Had to do it.
Jason Phang
@zhansheng
Aug 10, 2022
"Investigating Efficiently Extending Transformers for Long Input Summarization" from my time at @GoogleAI
- We investigate how to adapt models to perform long input summarization
- We introduce PEGASUS-X, a long-context extension of PEGASUS
arxiv.org/abs/2208.04347 [1/8]
Jason Phang
@zhansheng
Mar 9, 2023
I very quickly threw together some code for fine-tuning LLaMA. One version using PEFT+8bit, and another using (simple) pipeline parallelism for full fine-tuning.
GitHub - zphang/minimal-llama
Jason Phang
@zhansheng
Jul 22, 2022
Don’t ask “what do you do?” at parties
Instead ask: "Are the experiments you kicked off before coming here still running? Are you sure you configured all your jobs correctly? Did you remove *all* the PDB breakpoints?"