Vivek’s Substack
Multi-Head Attention: One Sentence, Many Perspectives
Breaking down how attention heads specialize and collaborate in transformers
Oct 4, 2025 • Vivek Nayyar
No Peeking! How LLMs Learn With Causal Attention
From masks to dropout: making LLMs learn step by step
Sep 27, 2025 • Vivek Nayyar
Attention, Please! How LLMs Learn Relationships Between Words
How Neural Networks Learn to Connect Words
Jul 13, 2025 • Vivek Nayyar
How Does a LLM Read Your Sentence? Let's Break It Down
This article is meant to give you a simple and intuitive explanation of how large language models (LLMs) like GPT convert a sentence into numbers …
Jul 11, 2025 • Vivek Nayyar
Coming soon
This is Vivek’s Substack.
May 13, 2025 • Vivek Nayyar
My personal Substack