In my opinion, we have already achieved AGI, and it’s even clearer with o1. We have not achieved “better than any human at any task”, but what we have is “better than most humans at most tasks”. Some say LLMs only know how to follow a recipe. Firstly, no one can really explain
I was working on optimizing some PyTorch code today and was amazed at how fast PyTorch ran some pretty non-optimal code. So I made some test cases to compare with TensorFlow. PyTorch handily beat TensorFlow running vectorized and non-vectorized code in my test cases. Here's one:
Since I left Alphabet, I came to realize that computational resources can be limited! Gone are the days of using hundreds of TPUs without anyone raising an eyebrow. Now I spend a lot of time optimizing neural nets to train and run faster. The experience has been very rewarding.
We barely understand how a 3-layer MLP with ReLU optimized with SGD works, let alone a 200B-parameter Transformer model optimized on the entirety of the internet. I take with a grain of salt any expert opinion on what an LLM can or cannot do.
I've been looking for literature on efficient data collection for ML models and realized that for every 1000 papers about NN architecture you can maybe find one paper about data. In practice, data engineering is equally or more important, but it's completely neglected by academia.
A practical lesson I learned from doing research in deep learning is to spend a considerable amount of time at the beginning of the process on optimizing data loading and common operations, making sure 100% of my GPU resources are utilized. It pays off massively in the long run.
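One way to picture the data-loading point is a background prefetcher that keeps batches ready while the previous one is being processed. This is only a minimal sketch using the Python standard library, not the setup described above: `prefetch`, `slow_loader`, `train_step`, and `run` are hypothetical names, and the `time.sleep` calls are stand-ins for real disk reads and GPU work.

```python
import queue
import threading
import time


def prefetch(batches, buffer_size=4):
    """Yield items from `batches`, loading up to `buffer_size` ahead in a background thread."""
    q = queue.Queue(maxsize=buffer_size)
    sentinel = object()  # marks the end of the stream

    def worker():
        try:
            for batch in batches:
                q.put(batch)  # blocks when the buffer is full
        finally:
            # Always signal completion so the consumer never hangs.
            # Simplification: a loader error ends the stream early
            # instead of being re-raised in the consumer.
            q.put(sentinel)

    threading.Thread(target=worker, daemon=True).start()
    while True:
        item = q.get()
        if item is sentinel:
            return
        yield item


def slow_loader(n, delay=0.01):
    """Hypothetical stand-in for disk reads and preprocessing."""
    for i in range(n):
        time.sleep(delay)
        yield i


def train_step(batch, delay=0.01):
    """Hypothetical stand-in for one training step on the accelerator."""
    time.sleep(delay)
    return batch * 2


def run(loader):
    """Consume a loader and return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = [train_step(b) for b in loader]
    return results, time.perf_counter() - start


if __name__ == "__main__":
    _, t_plain = run(slow_loader(50))
    _, t_prefetch = run(prefetch(slow_loader(50)))
    print(f"sequential: {t_plain:.2f}s, prefetched: {t_prefetch:.2f}s")
```

In the sequential run the "GPU" sits idle during every load; with prefetching, loading and compute overlap, so the loop is bounded by the slower of the two rather than their sum. Real frameworks ship their own multi-worker loaders, so this is meant to show the idea rather than replace them.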
Python is so inefficient that Python coders think twice before implementing any new algorithm; they prefer a ready-made library (usually written in C++). Paradoxically, this has made Python coders much more productive. C++ coders are still writing their own string classes in 2019.
My paper on real-time face landmark estimation (used by Snapchat and several other companies) just passed 1000 citations according to Google Scholar. Quite a milestone! scholar.google.com/scholar?cluste…