A blog that makes software development and testing interesting and exciting!
Workbook questions and notes on the attention mechanism from Build a LLM from Scratch.
A practical walkthrough of the attention mechanism in Elixir, from simple self-attention to causal and multi-head attention, based on Chapter 3 of Build a LLM from Scratch.
Workbook answers and notes for Chapter 2 of Build a LLM from Scratch: Working with Text Data.
An Elixir/Nx walkthrough of preparing text for LLM training: tokenization, token IDs, BPE, sliding windows, token embeddings, and positional embeddings.
My plan for studying the foundations of Taking Testing Seriously, so the lessons from James Bach and Michael Bolton become habits instead of inspirational quotes.
Re-read my Chapter 1 study posts and Giles Thomas’s companion article to reinforce Sebastian Raschka’s Build LLMs from Scratch takeaways and capture the new insights I missed the first time.
Exploring the three main stages of building an LLM — data preparation, pretraining, and fine-tuning — along with key concepts like transformer architecture, emergent properties, and self-supervised learning.
Exploring GPT architecture through study questions — understanding next-word prediction, self-supervised learning, decoder-only design, autoregressive generation, and how model size impacts capabilities.
Exploring the role of large datasets in LLMs — tokenization, pretraining, and fine-tuning. Study questions from Build LLM from Scratch by Sebastian Raschka.
Diving into the Transformer architecture — encoder vs decoder, self-attention, BERT vs GPT, and zero-shot/few-shot learning. Study questions from Build LLM from Scratch.
Continuing with study questions from Build LLM from Scratch by Sebastian Raschka — covering pretraining, fine-tuning, and the two-stage process of building LLMs.
Exploring LLM applications — from chatbots and virtual assistants to knowledge retrieval and machine translation. Study questions from Build LLM from Scratch.
Answering study questions from Build LLM from Scratch by Sebastian Raschka — testing my understanding of what an LLM is, how it works, and how it relates to generative AI.
TL;DR In the previous post, we gave a high-level overview of the LLM Transformer architecture with examples. Today, we explain what ChatGPT did in “primary school”: the pre-training of an LLM. The q...
TL;DR In the last post, I wrote about custom LLM models and where they outperform general-purpose LLMs. Today we will present the Transformer architecture, the heart of an LLM. The Transformer N...
TL;DR I am reading the book Build LLM from Scratch by Sebastian Raschka. In the previous post, we covered what the primary function of an LLM is. Today’s topic is the next question from Chapter 1: What...
TL;DR I am reading Build a Large Language Model from Scratch and, to deepen my understanding of what I read, I am writing blog posts about the questions that accompany the book. This post covers th...
TL;DR Currently, I am reading two books: Taking Testing Seriously by Michael Bolton and James Bach, and Build a Large Language Model from Scratch by Sebastian Raschka. I learned about Build a Large...