
Using Gemma 4, Ollama, OpenAI Agents SDK, and Tavily MCP to build a lightweight research…
8 min read

Why memorizing for the exam doesn’t mean you understand the subject
10 min read
Latest

Enterprise Document Intelligence [Vol.1 #M1] – The thesis behind every architectural choice in this series
20 min read

How to smash through data / ML behavioural interviews
10 min read

I benchmarked raw chat history, vector-only RAG, and a context graph on the same multi-agent…
19 min read

A reproducible benchmark on latency, cost, and reproducibility, and where agents actually earn their keep.
17 min read

Beyond the Straight Line: Choosing Between OLS, Interaction Terms, and Tweedie Regression
Data Science
Whether you should stick to a classic Ordinary Least Squares regression, introduce interaction terms, or…
14 min read

Beat the 8GB VRAM limit. Learn how to run three different LLMs on a single…
21 min read

Enterprise Document Intelligence [Vol.1 #7C] – One LLM call ranks the candidates with reasons. The…
31 min read

One Month Into Learning Data Engineering in Public: Here’s What I Didn’t Write About
Data Engineering
A reflection on the first month of learning data engineering in public, and what actually…
8 min read

Turning model coefficients into a 0–1000 score, with risk classes and stability checks
7 min read
Editor’s Picks

Your First Task as a Data Engineer in a New Company? Make the ETL Pipeline Testable
Data Engineering
A practical data engineering onboarding workflow for environment setup, automated testing, and AI-assisted development.
9 min read

A practical walkthrough using text-to-SQL as the example
13 min read

How Gemini solved my Pandas problem in seconds, and why data science fundamentals still matter…
7 min read

GPU-Resident Top-K for Agentic RAG: I Built a CUDA Kernel So My Retrieval Step Would Stop Bouncing Off the GPU
Agentic AI
The PCIe transfer latency is silently bottlenecking your agentic inference. Here is how building a…
31 min read

Structured Outputs with LLMs: JSON Mode, Function Calling, and When to Use Each
Large Language Models
Getting reliable, readable responses out of your LLM, and knowing which tool to reach for
13 min read

How unit economics should set your classification cutoff, and why they rarely do.
15 min read

Most LLM applications need a clear workflow, not an autonomous agent. Here’s how to build…
19 min read

Budgets for AI tokens can’t be infinite, no matter how much hyperscalers wish they were
8 min read

A single model hands you a single answer and no sense of how much it…
11 min read
The Variable Newsletter

Sorting through the good, bad, and ambiguous aspects of vibe coding
4 min read
Deep Dives

Finding the right anchors for RAG: keyword, embedding, and TOC signals in parallel
Large Language Models
Enterprise Document Intelligence [Vol.1 #7B] – Retrieval is filtering on structured tables: keywords first, TOC…
33 min read

Enterprise Document Intelligence [Vol.1 #7A] – Stop searching strings. Filter line_df and toc_df. Pick anchors…
21 min read

Why one-hot encoding isn’t always the best approach, and alternative encodings
21 min read

The intuition behind neural networks and why they need activation functions.
20 min read

Understanding how LLMs interact with the world around them, from returning data to taking action
12 min read

For decades, the existence of the hydrophobic core, a region in the 3D structure of…
12 min read
