Micah Goldblum
1,047 posts
Micah Goldblum
@micahgoldblum
🤖Prof at Columbia University 🏙️. All things machine learning.🤖
goldblum.github.io
Joined December 2014
760 Following · 8,899 Followers

  • Micah Goldblum
    @micahgoldblum
    Nov 13, 2025
    An LLM-generated paper is in the top 17% of ICLR submissions in terms of average reviewer score, having received two 8's. The paper has tons of BS jargon and hallucinated references. Fortunately, one reviewer actually looked at the paper and gave it a zero. 1/3
    517K
  • Micah Goldblum
    @micahgoldblum
    Aug 23, 2022
    TLDR: Diffusion models (like DALLE or Imagen) generate pretty pictures from Gaussian noise, but the same training and generation update rules generalize easily to other degradations, including completely deterministic ones. 1/7
  • Micah Goldblum
    @micahgoldblum
    Aug 2, 2022
    A common point raised by ML reviewers is that a method is too simple or is made of existing parts. But simplicity is a strength, not a weakness. People are much more likely to adopt simple methods, and simple ones are also typically more interpretable and intuitive. 1/2
  • Micah Goldblum
    @micahgoldblum
    Jul 10, 2025
    🚨 Did you know that small-batch vanilla SGD without momentum (i.e. the first optimizer you learn about in intro ML) is virtually as fast as AdamW for LLM pretraining on a per-FLOP basis? 📜 1/n
    397K
  • Micah Goldblum
    @micahgoldblum
    Oct 29, 2024
    📢I’ll be admitting multiple PhD students this winter to Columbia University 🏙️ in the most exciting city in the world! If you are interested in dissecting modern deep learning systems to probe how they work, advancing AI safety, or automating data science, apply to my group.
    68K
  • Micah Goldblum
    @micahgoldblum
    Feb 6, 2025
    Here’s an easy trick for improving the performance of gradient-boosted decision trees like XGBoost, allowing them to read text column headers and benefit from massive pretraining: replace the first tree with an LLM or TabPFN! 🧵 1/9
    105K
  • Micah Goldblum
    @micahgoldblum
    Apr 25, 2023
    Self-Supervised Learning (SSL) is quickly becoming a de facto way of training neural networks, but if you have ever tried it yourself, you know that getting high performance is tricky! Check out our new thorough guide to all things SSL. arxiv.org/abs/2304.12210
    77K
  • Micah Goldblum
    @micahgoldblum
    May 23, 2024
    I’m excited to announce that I’ll start as an assistant professor at Columbia University this summer! Interview season was fun and I met so many amazing people, but I’m happy to finally close the loop.
    50K
  • Micah Goldblum
    @micahgoldblum
    Nov 1, 2023
    🚨Excited to announce a large-scale comparison of pretrained vision backbones including SSL, vision-language models, and CNNs vs ViTs across diverse downstream tasks ranging from classification to detection to OOD generalization and more! NeurIPS 2023🚨🧵 arxiv.org/abs/2310.19909
    200K
  • Micah Goldblum
    @micahgoldblum
    Oct 13, 2022
    How much data are augmentations worth? We show that augmentations can actually be worth more than extra data and invariance! They increase variance across batches, and this extra stochasticity finds flatter minima. arxiv.org/abs/2210.06441 1/8
    arxiv.org
    How Much Data Are Augmentations Worth? An Investigation into...
    Despite the clear performance benefits of data augmentations, little is known about why they are so effective. In this paper, we disentangle several key mechanisms through which data augmentations...
  • Micah Goldblum
    @micahgoldblum
    Jul 5, 2022
    Gradient-boosted decision trees are still thought to be competitive with neural networks on tabular data. But NNs have a massive advantage: they learn representations, and this ability can be leveraged for transfer learning. arxiv.org/abs/2206.15306 1/4
  • Micah Goldblum
    @micahgoldblum
    Jun 12, 2024
    🚨 Announcing LiveBench, a challenging new general-purpose live LLM benchmark! 🚨 Thanks @crwhite_ml and @spamueldooley for leading the charge! Link: livebench.ai Existing LLM benchmarks have serious limitations: 🧵
    155K
  • Micah Goldblum
    @micahgoldblum
    Apr 20, 2023
    🚨Here’s an intuitive explanation for why training on lots and lots of data creates emergent properties, for instance math and reasoning, in large language models like #GPT-4 and #ChatGPT 🚨 1/17
    111K
  • Micah Goldblum
    @micahgoldblum
    Jun 18, 2024
    We often determine whether a neural network is over- or underparameterized by counting parameters. In practice, how much data we can fit depends on many factors: architecture, optimizer, etc. So just how flexible are neural networks in practice? 🧵 Paper:
    arxiv.org
    Just How Flexible are Neural Networks in Practice?
    It is widely believed that a neural network can fit a training set containing at least as many samples as it has parameters, underpinning notions of overparameterized and underparameterized...
    43K