Jürgen Schmidhuber (@SchmidhuberAI) / X

Jürgen Schmidhuber

279 posts

@SchmidhuberAI

Introduced basics of: P & T in ChatGPT, very deep learning, meta learning, neural distillation, GANs, etc. Co-authored most-cited AI paper of 20th century

Switzerland, KSA

people.idsia.ch/~juergen/most-…

Joined August 2019

Pinned
Jürgen Schmidhuber
@SchmidhuberAI
Jul 11, 2025
Congrats to @nvidia, the first public $4T company! Today, compute is 100000x cheaper, and $NVDA 4000x more valuable than in the 1990s when we worked on unleashing the true potential of neural networks. Thanks to Jensen Huang (see image) for generously funding our research 🚀
178K
Jürgen Schmidhuber
@SchmidhuberAI
Oct 9, 2024
The #NobelPrizeinPhysics2024 for Hopfield & Hinton rewards plagiarism and incorrect attribution in computer science. It's mostly about Amari's "Hopfield network" and the "Boltzmann Machine." 1. The Lenz-Ising recurrent architecture with neuron-like elements was published in
How 3 Turing Awardees Republished Key Methods and Ideas
From people.idsia.ch
1.2M
Jürgen Schmidhuber
@SchmidhuberAI
Nov 24, 2023
Thanks @elonmusk for your generous hyperbole! Admittedly, however, I didn’t invent sliced bread, just #GenerativeAI and things like that: people.idsia.ch/~juergen/most-… And of course my team is standing on the shoulders of giants: people.idsia.ch/~juergen/deep-… Original tweet by @elonmusk:
Elon Musk
@elonmusk
Nov 23, 2023
Replying to @McaleerStephen
Schmidhuber invented everything
1.2M
Jürgen Schmidhuber
@SchmidhuberAI
Jan 31, 2025
DeepSeek [1] uses elements of the 2015 reinforcement learning prompt engineer [2] and its 2018 refinement [3] which collapses the RL machine and world model of [2] into a single net through the neural net distillation procedure of 1991 [4]: a distilled chain of thought system.
848K
Jürgen Schmidhuber
@SchmidhuberAI
Dec 7, 2024
The #NobelPrize in Physics 2024 for Hopfield & Hinton turns out to be a Nobel Prize for plagiarism. They republished methodologies developed in #Ukraine and #Japan by Ivakhnenko and Amari in the 1960s & 1970s, as well as other techniques, without citing the original inventors.
895K
Jürgen Schmidhuber
@SchmidhuberAI
Jan 11, 2024
The GOAT of tennis @DjokerNole said: "35 is the new 25." I say: "60 is the new 35." AI research has kept me strong and healthy. AI could work wonders for you, too!
710K
Jürgen Schmidhuber
@SchmidhuberAI
Aug 3, 2025
Who invented convolutional neural networks (CNNs)? 1969: Fukushima had CNN-relevant ReLUs [2]. 1979: Fukushima had the basic CNN architecture with convolution layers and downsampling layers [1]. Compute was 100 x more costly than in 1989, and a billion x more costly than
617K
Jürgen Schmidhuber
@SchmidhuberAI
Nov 22, 2022
LeCun's "5 best ideas 2012-22" are mostly from my lab, and older: 1 Self-supervised 1991 RNN stack; 2 ResNet = open-gated 2015 Highway Net; 3&4 Key/Value-based fast weights 1991; 5 Transformers with linearized self-attention 1991. (Also GAN 1990.) Details: people.idsia.ch/~juergen/lecun…
Jürgen Schmidhuber
@SchmidhuberAI
Oct 27, 2020
Quarter-century anniversary: 25 years ago we received a message from N(eur)IPS 1995 informing us that our submission on LSTM got rejected. (Don’t worry about rejections. They mean little.) #NeurIPS2020 people.idsia.ch/~juergen/deep-…
Jürgen Schmidhuber
@SchmidhuberAI
Dec 5, 2024
Re: The (true) story of the "attention" operator ... that introduced the Transformer ... by @karpathy. Not quite! The nomenclature has changed, but in 1991, there was already what is now called an unnormalized linear Transformer with "linearized self-attention" [TR5-6]. See (Eq.
Andrej Karpathy
@karpathy
Dec 3, 2024
The (true) story of development and inspiration behind the "attention" operator, the one in "Attention is All you Need" that introduced the Transformer. From personal email correspondence with the author @DBahdanau ~2 years ago, published here and now (with permission) following
346K
Jürgen Schmidhuber
@SchmidhuberAI
Oct 4, 2019
In 2020, we will celebrate that many of the basic ideas behind the Deep Learning Revolution were published three decades ago within fewer than 12 months in our "Annus Mirabilis" 1990-1991:
Deep Learning: Our Miraculous Year 1990-1991
From people.idsia.ch
Jürgen Schmidhuber
@SchmidhuberAI
Nov 30, 2023
Q*? 2015: reinforcement learning prompt engineer in Sec. 5.3 of “Learning to Think...” arxiv.org/abs/1511.09249. A controller neural network C learns to send prompt sequences into a world model M (e.g., a foundation model) trained on, say, videos of actors. C also learns to
Jürgen Schmidhuber
@SchmidhuberAI
Dec 30, 2022
30 years ago in a journal: "distilling" a recurrent neural network (RNN) into another RNN. I called it “collapsing” in Neural Computation 4(2):234-242 (1992), Sec. 4. Greatly facilitated deep learning with 20+ virtual layers. The concept has become popular people.idsia.ch/~juergen/deep-…
469K
Jürgen Schmidhuber
@SchmidhuberAI
Mar 5, 2025
Congratulations to @RichardSSutton and Andy Barto on their Turing award!
128K
Jürgen Schmidhuber
@SchmidhuberAI
Jul 25, 2023
Meta used my 1991 ideas to train LLaMA 2, but made it insinuate that I “have been involved in harmful activities” and have not made “positive contributions to society, such as pioneers in their field.” @Meta & LLaMA promoter @ylecun should correct this ASAP. See