Vahid Kazemi (@VahidK) / X

Vahid Kazemi

@VahidK

PhD in machine learning. KTH 14. Ex @xAI, @OpenAI, @Apple, @Google.

San Francisco Bay Area, CA

Joined April 2008

Pinned
Vahid Kazemi
@VahidK
Jan 12, 2025
Finally finished editing my video. Episode 02: Build your own language model: youtube.com/watch?v=7ny8Sr….
Vahid Kazemi
@VahidK
Dec 6, 2024
In my opinion we have already achieved AGI and it’s even more clear with O1. We have not achieved “better than any human at any task” but what we have is “better than most humans at most tasks”. Some say LLMs only know how to follow a recipe. Firstly, no one can really explain
Vahid Kazemi
@VahidK
Apr 2, 2020
Effective PyTorch. github.com/vahidk/Effecti… First 6 lessons are committed. More to come.
GitHub - vahidk/EffectivePyTorch: PyTorch tutorials and best practices.
Vahid Kazemi
@VahidK
Jun 12, 2019
I was working on optimizing some PyTorch code today and was amazed at how fast PyTorch ran some pretty non-optimal code. So I made some test cases to compare with TensorFlow. PyTorch handily beat TensorFlow running vectorized and non-vectorized code in my test cases. Here's one:
Vahid Kazemi
@VahidK
Apr 14, 2019
Since I left Alphabet, I came to realize that computational resources can be limited! Gone are the days of using hundreds of TPUs without anyone raising an eyebrow. Now I spend a lot of time optimizing neural nets to train and run faster. The experience has been very rewarding.
Vahid Kazemi
@VahidK
Feb 11, 2023
We barely understand how a 3-layer MLP with ReLU optimized with SGD works, let alone a 200B-parameter Transformer model optimized on the entirety of the internet. I take with a grain of salt any expert opinion on what an LLM can or cannot do.
Vahid Kazemi
@VahidK
Sep 22, 2021
I've been looking for literature around efficient data collection for ML models and realized for every 1000 papers about NN architecture you can maybe find one paper about data. In practice data engineering is equally or more important, but it's completely neglected by academia.
Vahid Kazemi
@VahidK
Jan 3, 2019
ML engineer productivity tip: Don't stare at TensorBoard.
Vahid Kazemi
@VahidK
Feb 19, 2019
I wonder how many millions of hours of engineering time would have been saved if C++ had a built-in standard linear algebra library like Eigen.
Vahid Kazemi
@VahidK
Nov 30, 2019
A practical lesson I learned from doing research in deep learning is to spend a considerable amount of time at the beginning of the process optimizing data loading and common operations, making sure 100% of my GPU resources are utilized. It pays off massively in the long run.
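One common way to keep an accelerator busy is to load the next batch while the current one is being processed. The sketch below is only an illustration of that idea using the Python standard library, not the setup described in the post. `slow_loader` and `train_step` are hypothetical stand-ins that simulate I/O and compute with `time.sleep`.

```python
import queue
import threading
import time


def prefetch(batches, buffer_size=4):
    """Yield items from `batches`, loading ahead in a background thread."""
    q = queue.Queue(maxsize=buffer_size)
    done = object()  # sentinel marking the end of the stream

    def worker():
        for batch in batches:
            q.put(batch)  # blocks while the buffer is full
        q.put(done)

    threading.Thread(target=worker, daemon=True).start()
    while True:
        batch = q.get()
        if batch is done:
            return
        yield batch


def slow_loader(n, delay=0.01):
    """Hypothetical loader: the sleep stands in for disk reads or decoding."""
    for i in range(n):
        time.sleep(delay)
        yield [i] * 4


def train_step(batch, delay=0.01):
    """Hypothetical training step: the sleep stands in for GPU work."""
    time.sleep(delay)
    return sum(batch)


def run(loader):
    """Consume a loader and return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = [train_step(batch) for batch in loader]
    return results, time.perf_counter() - start


if __name__ == "__main__":
    _, t_serial = run(slow_loader(20))
    _, t_overlap = run(prefetch(slow_loader(20)))
    print(f"serial: {t_serial:.2f}s, prefetched: {t_overlap:.2f}s")
```

In a real PyTorch project, `torch.utils.data.DataLoader` with `num_workers` greater than zero serves a similar purpose. The hand-rolled thread here just makes the overlap visible.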
Vahid Kazemi
@VahidK
Nov 26, 2019
I made a small package that allows reading TFRecord files in PyTorch with no TensorFlow dependency:
GitHub - vahidk/tfrecord: Standalone TFRecord reader/writer with PyTorch data loaders
Vahid Kazemi
@VahidK
May 26, 2019
Python is so inefficient that Python coders think twice before implementing any new algorithm; they prefer a ready-made library (usually written in C++). Paradoxically, this has made Python coders much more productive. C++ coders are still writing their own string classes in 2019.
Vahid Kazemi
@VahidK
Sep 5, 2017
Executing each op in TensorFlow has a massive overhead. One way to optimize your code is to use as few ops as possible.
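To make the point about op count concrete, here is a toy sketch in plain Python. It does not use TensorFlow: `OpCounter` is a hypothetical stand-in that only counts how many times an op is dispatched, under the assumption that each dispatch carries a fixed cost. Both functions compute the same sum of squares, but one issues two ops per element and the other two ops in total.

```python
class OpCounter:
    """Toy stand-in for a framework that pays a fixed cost per op dispatch."""

    def __init__(self):
        self.dispatches = 0

    def mul(self, a, b):
        """Elementwise product of two equal-length lists (one dispatch)."""
        self.dispatches += 1
        return [x * y for x, y in zip(a, b)]

    def reduce_sum(self, a):
        """Sum of a list (one dispatch)."""
        self.dispatches += 1
        return sum(a)


def sum_of_squares_many_ops(ops, values):
    """Issue one mul and one reduce_sum per element."""
    total = 0
    for v in values:
        squared = ops.mul([v], [v])
        total += ops.reduce_sum(squared)
    return total


def sum_of_squares_few_ops(ops, values):
    """Issue one mul and one reduce_sum for the whole input."""
    return ops.reduce_sum(ops.mul(values, values))


if __name__ == "__main__":
    values = list(range(1000))
    for fn in (sum_of_squares_many_ops, sum_of_squares_few_ops):
        ops = OpCounter()
        result = fn(ops, values)
        print(f"{fn.__name__}: result={result}, dispatches={ops.dispatches}")
```

The dispatch count grows with the input size in the first version and stays constant in the second, which is the effect that batching work into fewer, larger ops is meant to exploit.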
Vahid Kazemi
@VahidK
Mar 5, 2019
My paper on real-time face landmark estimation (used by Snapchat and several other companies) just passed 1000 citations according to Google Scholar. Quite a milestone! scholar.google.com/scholar?cluste…