
4 YAML Files Instead of PySpark: How We Let Analysts Build Data Pipelines Without Engineers
Data Engineering
How we replaced Python pipelines with dlt, dbt, and Trino, and cut delivery time…
10 min read

The best machine learning model is not one model
9 min read
Latest

Caching, lazy-loading, routing, compaction, and more
26 min read

System Design Series: Apache Flink from 10,000 Feet, and Building a Flink-powered Recommendation Engine
Data Science
A deep dive into how Apache Flink works, why it exists, and learning it while…
17 min read

Using autoresearch to optimise marketing campaigns under budget constraints
14 min read

What does correlation tell us?
6 min read

Blast-radius control tells you how much to break. Intent tells you what breaking it will…
18 min read

PyTorch NaNs Are Silent Killers — So I Built a 3ms Hook to Catch Them at the Exact Layer
Deep Learning
NaNs don’t crash your training; they quietly destroy it. After losing hours to a…
11 min read

A simulation of how a single forecast change moves through five planning teams, and why…
14 min read

With the advent of UDFs and their combination with calculation groups, I see a lot…
6 min read
Editor’s Picks

Why learn 8 scripts when you can learn 256 bytes?
12 min read

How does decision-gravity dictate this gap?
12 min read

A local, zero-cost project that cleans, structures, and summarizes your reading automatically
13 min read

Using Causal Inference to Estimate the Impact of Tube Strikes on Cycling Usage in London
Data Science
Turning free-to-use data into a hypothesis-ready dataset
19 min read

A short intro to scientific methodology to combat “prompt in, slop out”
6 min read


Why it tickles your brain to use an LLM, and what that means for the…
8 min read

Git worktrees, parallel agentic coding sessions, and the setup tax you should be aware of
20 min read

How I turned my eight-year weekly visualization habit into a reusable AI workflow
7 min read
The Variable Newsletter

Sorting through the good, bad, and ambiguous aspects of vibe coding
4 min read
Deep Dives

It’s simpler than you think.
24 min read

Learn how Propensity Score Matching uncovers true causality in observational data. By finding “statistical twins,”…
12 min read

How you can build your own Thompson Sampling Algorithm object in Python and apply it…
17 min read

For any data scientist who works in a team, being able to undo Git actions…
24 min read

The hidden cost of probabilistic outputs in systems that demand reliability
13 min read


