Taco Cohen
@TacoCohen
Slop janitor & post-trainologer at Meta / FAIR. Into codegen, RL, equivariance. Spent time at Qualcomm, Scyfer (acquired), UvA, DeepMind, OpenAI.
Joined March 2013
Posts
  • user avatar
    Surprisingly little AI progress in 2023 so far. What’s going on??
  • user avatar
    An interesting aspect of this discussion is the fact that LLMs will soon start affecting our thoughts, beliefs, mental & linguistic habits, and culture. The idea that we could select a handful of "trustworthy" institutions with the "correct" set of values and beliefs to shape LLM
    Thinking a lot about centralization and decentralization these few days.
  • user avatar
    An easy guide to Gauge Equivariant Convolutional Networks. (I finally get it!) medium.com/@kayzaks/an-ea…
  • user avatar
    Exactly. I learned a ton of math during my PhD, and it was fun and easy *because I had a goal* to use it in my research. Coding it up is also a great way to detect gaps in your understanding. Totally different from learning in class. Another common fallacy is that you need to
    This is empirically incorrect. Hundreds of thousands of fast.ai students have learned the required math for ML as they go. By *far* the biggest problem we've seen is from people who try to learn the math first. They learn the wrong stuff & have no context.
  • user avatar
    Nobody wants to hear it, but working on data is more impactful than working on methods or architectures.
    1. We often observe power laws between loss and compute: loss = a * flops ^ b + c
    2. Models are rapidly becoming more efficient, i.e. use less compute to reach the same loss
    But: which innovations actually change the exponent in the power law (b) vs change only the constant (a)?
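    A minimal sketch of why the (a) vs (b) distinction in the quoted post matters, assuming the stated form loss = a * flops ^ b + c with a negative exponent b. The function names and all parameter values here are hypothetical, chosen only for illustration, and are not taken from any measured scaling law. Inverting the formula gives the compute needed to hit a target loss. Shrinking only the constant a saves the same compute factor at every target, while a steeper exponent b saves more the lower the target gets.

```python
def loss(flops: float, a: float, b: float, c: float) -> float:
    """Power law from the post: loss = a * flops ** b + c (b is negative)."""
    return a * flops ** b + c


def flops_to_reach(target_loss: float, a: float, b: float, c: float) -> float:
    """Invert the power law: compute needed to reach target_loss."""
    if target_loss <= c:
        # c is the irreducible loss; no finite compute reaches it.
        raise ValueError("target_loss must be above the irreducible loss c")
    return ((target_loss - c) / a) ** (1.0 / b)


def compute_savings(target_loss: float, baseline: tuple, improved: tuple) -> float:
    """Factor by which `improved` needs less compute than `baseline`."""
    return flops_to_reach(target_loss, *baseline) / flops_to_reach(target_loss, *improved)


# Hypothetical (a, b, c) triples, made up for illustration only.
BASELINE = (10.0, -0.10, 1.5)
BETTER_CONSTANT = (5.0, -0.10, 1.5)   # only a changes
BETTER_EXPONENT = (10.0, -0.12, 1.5)  # only b changes

if __name__ == "__main__":
    for target in (2.5, 2.0, 1.8):
        print(
            f"target loss {target}: "
            f"constant-only savings x{compute_savings(target, BASELINE, BETTER_CONSTANT):.1f}, "
            f"exponent savings x{compute_savings(target, BASELINE, BETTER_EXPONENT):.1f}"
        )
```

    Under these assumptions, the constant-only improvement yields a fixed multiplier (a ratio of k in a becomes a compute factor of k ** (-1 / b)), whereas the exponent improvement compounds as the target loss approaches c.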
  • user avatar
    Rumor has it that I don't even have a PhD yet. This is in fact true... 😏 BUT! I am happy to report that I will be graduating before any of the PhD students I'm advising. The thesis is now online and I will be defending Jun 9th, 16.00 CET! Check it out: dare.uva.nl/search?identif…
  • user avatar
    8 years of progress in generative modelling. What a time to be alive
  • user avatar
    Two weeks ago I joined Meta / FAIR, and I couldn't be more excited about this new chapter. Meta is indeed the only place left that supports highly ambitious long-term oriented & fundamental research projects and has a strong commitment to open science and open source. (and has
    There is literally no other company doing this today: - open research towards human-level AI - open source AI platform enabling a huge AI ecosystem - wearable device to interact with always-on AI assistants
  • user avatar
    🚨 Attention aspiring PhD students: Meta / FAIR is looking for candidates for a joint academic/industry PhD! 🚨 Among others, the CodeGen team is looking for candidates to work on world models for code, discrete search & continuous optimization methods for long-term planning,
  • user avatar
    Best paper award for our ICLR paper, "Spherical CNNs"! Read it while it's hot 🔥 arxiv.org/abs/1801.10130 🔥
  • user avatar
    Fascinating paper, showing that transformers are energy-based models in disguise .. And this insight leads to an efficient decoding algorithm
    Ever looked at the attention operation and said "hang on, that's a one-point function!"?
  • user avatar
    Llama-2 is coming to your phone:
  • user avatar
    So these "Multi-Headed Vision Transformers", are they in the room with us right now?
  • user avatar
    Interested in geometric and equivariant deep learning? Check out our latest paper on Gauge Equivariant CNNs, where we show how gauge theory makes it possible to build CNNs on general manifolds: arxiv.org/abs/1902.04615