A Practical Lesson in Systems Thinking – The Six Moves – Part 0

Over the last few weeks I wrote a lot. About vibe coding. About outsourcing testing to AI. About the 10x productivity myth. About ownership, lock-in, and the bathtub that nobody is watching. I didn’t plan it as a series. But looking back, all of these posts are connected. They are all attempts to explain the current state of my mental model on AI. Especially to raise awareness of certain risks.

And I want to use a small series to explain the basics of my current favorite systems thinking approach. Not just what I see when I look at the AI landscape, but which thinking processes lead to that state.

Information Is Not Understanding

We are drowning in information about AI. Every day. LinkedIn is full of it. YouTube is full of it. Everyone has an opinion, a framework, a prediction. The signal-to-noise ratio is terrible, and the noise is winning. But here’s the thing. Even if you filter perfectly and only consume the best, most relevant information, that alone will not give you understanding.

There is a formula that has stuck with me since I heard it on the Cabrera Lab podcast by Drs. Derek and Laura Cabrera.

M = I × O

Your mental model (M) equals the information you have (I) multiplied by how you organize it (O). That’s it. And the interesting part is not the information. It’s the organization.

Let me give you an example. Think of two people reading the same article about coding agents. Same information. One person reads it, nods, files it somewhere in the back of their mind under “AI is cool” or “AI is dangerous”, depending on their existing bias. The other person reads it and starts asking questions. What exactly is a coding agent? What is it not? What are its parts? What system does it live in? Who benefits? Who loses? What relationships are at play? The same information, organized differently, leads to a completely different mental model.

When you master O, the I becomes almost secondary. You can take even a small amount of information and build something useful with it. And you can take a mountain of information and build nothing, if you lack the skill to organize it. O is the skill. O is what you practice. And most people never practice it on purpose. They just consume more I.

The Love Reality Loop

There is another concept from the Cabreras that I want to introduce here. It’s the engine behind everything I wrote recently. They call it the Love Reality Loop.

All models are wrong, but some are useful.

– George Box, 1976

The idea is simple but uncomfortable. Your mental model is never a perfect reflection of reality. It can’t be. Reality is too complex, and our models are always simplified. That’s fine. The question is: what are you pointing your model at? Are you trying to match reality? Or are you trying to match what you want reality to be?

The Love Reality Loop says: love reality enough to let it correct you. Accept it as the target. You don’t have to like what reality shows you. But you have to accept it. Two mechanisms make this work. There are feedback loops, where reality sends signals back to your model. Your code breaks. Your deployment fails. The customer complains. These are signals. And there are fit-back loops, where you actively adjust your model to better match what you observe. You change your assumptions. You redraw the boundary. You update your understanding.

When both loops work, your mental model gets better over time. It fits reality more closely. You make better predictions. You see risks earlier. You ask better questions.

And here is where the AI hype becomes a problem. Many people in this space have stopped fitting back. They fell in love with their model of AI, not with reality. The model says “10x productivity.” Reality says the bugs just arrive later and more expensively. The model says “we need fewer engineers.” Reality says the remaining engineers are drowning.

When you stop fitting back, your mental model drifts. And the further it drifts from reality, the worse your decisions become. The hype machine is essentially a broken Love Reality Loop at scale. Millions of people pointing their mental models at what investors, vendors, and influencers want reality to be, rather than reality itself.

So How Do You Practice O?

This is the practical part. Systems thinking, specifically the DSRP approach from the Cabreras, gives you the O. Four patterns of organizing information: Distinctions, Systems, Relationships, and Perspectives. I have written about these before, and if you want the basics, my earlier posts on the kitchen door example and the systems thinking introduction are a good start.

But there is a layer on top of the four patterns that I haven’t written about yet. The Cabreras and their research lab have identified what they call the “6 foundational mental moves.” Think of them as exercises. Like push-ups or sit-ups, but for your thinking. Research from Cabrera Lab shows that practicing these six moves can increase cognitive complexity. The idea is a Pareto principle for thinking: 20% of the effort gets you 80% of the results.

The six moves are:

Is/Is Not List (a Distinction move): Write down what something is and what it is not. Draw the boundary. Sharpen the edges.

Zoom In (a System move): Take something and break it into its parts. What is it made of?

Zoom Out (a System move): What is this thing a part of? What larger system does it sit in?

Part Party (a System + Relationship move): You’ve listed the parts. Now draw the relationships between them. How do they interact? Where are the feedback loops?

RDS Barbell (a Relationship + Distinction + System move): Take a relationship, turn it into a thing in its own right, and then examine its parts. Don’t just note that A connects to B. Crack open the connection itself.

P-Circle (a Perspective move): Lay out all the perspectives. Who is looking? From where? What do they see? And which perspectives are missing?

These are the six moves. And in the next six posts, I will take you through them one by one. I will name them explicitly. I will explain each move. And then I will apply it to the AI landscape as I see it from my perspective. With 25 years of experience watching hype cycles come and go.

Why This Series

This is me going meta. I have become quite fond of systems thinking and of getting better at it. And I want to share what I have learned, for two reasons. To sharpen my own thinking about systems thinking by writing about it. And to give someone else the chance to take a bit or two away from it.

If you’ve read my recent posts, you’ve seen the output of my thinking. The ownership problem. The bathtub that overflows. The fence that’s back. The lock-in catastrophe. This series will show you more behind the scenes. How I arrived at the mental model I had at that point in time.

Next up: Part 1, the Is/Is Not List. What is AI? And more importantly, what is it not?

The Leverage Effect, or Why AI Is Eating Everything

What happens when the richest companies in human history all push in the same direction, at the same time, across almost every industry? We’re finding out right now. And if you look at it through a systems thinking lens, what you see is not just a trend. It’s a cascade of leverage effects reinforcing each other.

Let me explain what I mean.

Donella Meadows wrote about leverage points in systems. Places where a small push can cause big changes. She ranked them, from weak to powerful. Things like changing parameters at the bottom, changing the rules of the system near the top. What’s fascinating about the current AI wave is that it doesn’t just push one leverage point. It pushes several at once across all layers. And that changes things dramatically.

This isn’t entirely new, by the way. We’ve seen similar patterns before.
The railway mania of the 1840s looked remarkably similar. Huge investment, transformative technology, and a collective belief that this would change everything. And it did change everything. Eventually. But not before the system overshot spectacularly, fortunes were lost, and the correction was painful. The technology survived. Many of the companies and people caught in the hype did not.
The Dotcom boom of the late 1990s had massive capital flowing into internet companies. Wild narratives about how everything would move online. And a genuine breadth of application that made it all feel inevitable. Until the bubble burst.

Let’s look at the forces at play.

First, there’s capital. The companies driving AI adoption are not scrappy startups hoping for a lucky break. They are among the richest organizations in human history. Microsoft, Google, Meta, Amazon, Nvidia. And this is only looking at the US market. There is more money in China. When you have that kind of money, you don’t just experiment. You reshape markets. You acquire talent, buy infrastructure, and fund research at a scale that smaller players simply cannot match. Capital is a stock in the system, and right now, it’s flowing in one direction with enormous pressure.

Second, there’s marketing ambition. These companies are not quietly building tools and waiting for the market to notice. They are actively pushing the narrative that AI will transform everything. Every product launch, every keynote, every partnership announcement reinforces the message. AI is the future, and you’re falling behind if you’re not on board. This is a reinforcing feedback loop. The more people believe the narrative, the more they invest, which creates more success stories, which strengthens the narrative further.

Third, and this is the one that makes AI different from previous tech waves, there’s the breadth of application. AI is not like blockchain, which was a solution looking for a problem. AI genuinely touches almost everything. Healthcare, logistics, finance, education, software development, creative work, legal research, customer support. The list goes on. This breadth means the system doesn’t have one feedback loop. It has dozens, all running in parallel, all reinforcing the overall momentum.

Now here’s where it gets interesting from a systems thinking perspective. These three forces don’t just add up. They multiply. Massive capital fuels aggressive marketing. Aggressive marketing creates demand across many industries. Demand across many industries attracts more capital. And round and round it goes. Meadows would call this a reinforcing loop with multiple drivers. And she’d probably point out that such systems tend to overshoot before they correct.

But what about the brakes? Every system has balancing loops, forces that push back. And they do exist here. The EU AI Act is an attempt to introduce regulation, to change the rules of the system. There are real technical limitations, too. AI models hallucinate, they consume enormous amounts of energy, and scaling them is hitting physical and economic boundaries. Public skepticism is growing as well. People are starting to ask uncomfortable questions about reliability, about bias, about what’s actually being replaced and what’s just being automated badly. These are all balancing forces. The problem is: right now, they’re weak compared to the reinforcing loops driving adoption. Regulation moves slowly. Technical limits get framed as temporary obstacles. And skepticism struggles to compete with billion-dollar marketing budgets. The brakes exist, but the engine is far more powerful. At least for now.

And we can already see the effects. Companies are going “all in” on AI, restructuring entire departments. People are losing jobs, not because AI has proven it can replace them, but because the narrative says it will. Decisions are being made based on anticipated leverage, not demonstrated value. That’s a subtle but important distinction.

I’m not saying AI isn’t powerful. We just haven’t found out yet what it’s really good at. The underlying ideas go back to the 1940s, but the fundamental breakthroughs came in the last few years. This is a new Gold Rush moment. Everybody is trying to find new applications to make the most money with AI. This is the “radioactive toothpaste to make your teeth shinier” moment. When I look at this through a systems lens, what I see is not just a technology trend. I see multiple leverage points being pushed simultaneously by actors with enormous resources and strong incentives. That combination creates momentum that is very hard to steer, and even harder to slow down.

Many people expect the bubble to burst. Including CEOs of the key players. Meadows warned us about systems that grow too fast. She talked about the danger of optimizing for one goal while ignoring the side effects. Right now, the goal is adoption. Growth. Market share. The side effects (job displacement, dependency, loss of critical skills) are someone else’s problem. For now.

So next time you read about another company “embracing AI transformation,” zoom out. Look at the forces. Look at the loops. Ask yourself, who is pushing which lever, and why? The technology is real. But the leverage effect around it? That’s the story worth understanding.

The Hybrid Shop, or Knowing When to Use Power Tools

I’ve been woodworking for about 20 years now. And if there’s one thing that separates a hobbyist who makes okay stuff from someone who makes really satisfying work, it’s not the tools. Trust me, I have a few Festool tools, and I still suck at using them. It’s knowing which tool to reach for when and how to use it.

Let me give you an example. Dovetail joints. The kind you see on a quality drawer box. Those interlocking tails and pins that are both beautiful and incredibly strong. There are three ways to make them.

You can cut them entirely by hand. Marking gauge, dovetail saw, a sharp chisel, and patience. It’s slow. It demands skill. But when it fits, it fits perfectly. Because you understood every step, you felt the resistance of the wood, you made small adjustments as you went. The result can be stunning. But it takes a lot of practice to get there and considerable time even when you’re good.

Or you can cut them entirely by machine with a router and a dovetail jig. Fast. Repeatable. Any beginner can do it on day one. But the fit is whatever the jig gives you. The spacing is fixed. The proportions are generic. And if something goes wrong, you often don’t know why. Because you were just following a template, not understanding the joint. And honestly, they look it. You can usually tell.

The hybrid approach is where it gets interesting. You use a table saw to remove the bulk of the waste quickly. Then you come in with a chisel to pare the corners clean. You sneak up on the baseline, and get that satisfying, slightly-resistant fit that tells you it’s right. You get the speed of the machine and the precision of the hand. But only if you understand both well enough to know which part of the job belongs to which tool. The outcome depends on the craftsperson knowing the difference.

And now I keep thinking about this every time I open the AI chat window in my IDE.

Because using AI in software development feels exactly the same to me. There’s this tendency, especially when you’re new to it, to either do everything yourself (the “I don’t trust it” crowd) or to throw everything at the AI and hope for the best (the “it can do everything” crowd). Both approaches miss the point. Both lead to worse outcomes than you’d get otherwise. I’m not a fan of black/white approaches. The truth is somewhere in between.

The developers who use AI really well are the ones who understand the grain of the wood, so to speak. They know which parts of a problem are the rough, dimensioning work. Boilerplate, repetitive patterns, well-understood CRUD operations, generating test scaffolding. Feed that to the machine. Get it done in seconds. Fine. But they also know when they need to pick up the hand plane. When the problem requires nuance, context that isn’t obvious, architectural judgment, or security sensitivity. That’s when you slow down and think it through yourself, with or without the AI as a sounding board.

What I’m trying to say is: the AI doesn’t know when to use the planer and when to use the hand plane. You do. Or you should. And if you don’t yet, that’s okay, but that’s the skill to develop. Not just “how do I prompt better” but “when does this kind of task benefit from AI assistance and when does it introduce risk or noise?”

I’ve seen AI-generated code that works but that nobody on the team can actually explain. And I’ve seen people spend three hours doing by hand something that would have taken ten minutes with AI. Both are inefficiencies. Both come from not understanding the tool.

The master woodworker doesn’t obsess over hand tools. But they also don’t think a CNC machine makes them irrelevant. They know what each does well, and they move between them with confidence and intention. That’s what good craftsmanship looks like, in the shop and in a codebase.

So the next time you’re about to either dismiss AI entirely or dump a complex problem into it uncritically, stop for a second. Ask yourself: is this the dimensioning work, or is this the joinery? Reach for the right tool. And if you’re not sure yet, that’s okay. That’s what we call learning.

As long as you continue learning. I see too many folks on social media talking and writing about handing everything over to the AI. Coming up with the best workflows for using AI only, for every step. Skim through my last half dozen posts if you want to know my opinion on that.

System Seeing Adventure – The AI Lock-In Catastrophe

As long-time readers might remember, Ruth Malan has these wonderful 30-day challenges she calls “System Seeing Adventures.” It is an exercise where you pick a system and you look at it through different lenses. You look at the connections, the dependencies, and the feedback loops. You use different techniques for visualization. You stop looking at the marketing slides and start looking at what is really happening. You zoom in on a component and zoom out to the market.

With all the writing I have been doing again lately, I have had my own little system-seeing adventure. And frankly, what I see is a train wreck in slow motion.

Let me walk you through an Ishikawa (fishbone) analysis that has been rattling around in my head. In this diagram, the “head” of the fish is what I call the Total Loss of Sovereign Agency. Every bone is a cause leading us to a point where “choice” is just an illusion.

Nothing happens all at once. There is a complex web of causality at play: a series of decisions that feel right on a Tuesday but leave you paralyzed on a Friday. Nothing just happens out of nowhere. There is always a series of events. And the causes might not necessarily be related, but they can definitely amplify each other’s effects.

The Effect: Total Loss of Sovereign Agency

We all know about outages. AWS, Google, Microsoft, and Cloudflare. They all go down from time to time. But when infrastructure fails, we recover. We recover because we had the “muscles” to react. Engineers could triage, reroute, and debug. The dependency was on plumbing rather than understanding.

The “AI Lock-In Catastrophe” is different. When the AI shifts, breaks, or the vendor turns the price screw, you do not just have a technical problem. You have a “brain” problem. You have lost the ability to pivot. You are fundamentally paralyzed.

Bone 1: The Knowledge Exodus

This is the big one. Companies are “optimizing” by laying off the very people who understand why the system exists. We are replacing domain experts with AI-augmented workflows. The code is there. The Confluence pages are there. But the institutional knowledge lives in people’s heads.

When your cloud goes down, your team jumps in because they own the system. When your AI logic shifts and your team has been “optimized” away, who jumps in? You are essentially onboarding a “junior developer” every morning, one that has forgotten everything from yesterday. Except now, you have no seniors left to catch the hallucinations.

Bone 2: The “Brain Transplant” Fallacy

“We will just switch to another provider.” It sounds reasonable if you haven’t thought about it for more than two minutes.

Cloud migration is moving pipes. Switching AI models is a brain transplant. Your prompts, your RAG pipelines, and your evaluation frameworks are tuned to a specific model’s quirks. You are not just switching a service. You are switching a logic provider. And the new brain does not think like the old one.

Bone 3: The Systemic Bottleneck

Even if you decide to switch, you are not alone. If a major provider changes their terms or fails, the entire market tries to move through the same narrow exit at once. You will face capacity limits and “onboarding queues” that take weeks. You are not just locked in technically. You are locked in by the crowd.

Bone 4: The Reinforcing Loop

There’s also a classic reinforcing feedback loop at work, and Donella Meadows would have had a field day with it.

  • Fewer people leads to more AI reliance.
  • More AI reliance justifies fewer people.
  • Fewer people means less ability to verify the AI.
  • Zero verification leads to “velocity without verification.” That is just negligence with extra steps.

The longer you stay in this loop, the more the system settles into a state where recovery is impossible.
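
To make the loop a bit more tangible, here is a deliberately crude sketch in Python. Every number in it is invented; the only point is the shape of the dynamic: raw output keeps growing while the capacity to verify it shrinks, so the unverified backlog only ever climbs.

```python
# Toy model of the Bone 4 loop (all numbers invented, for illustration only).
# Fewer people -> more AI reliance -> less verification -> more unreviewed
# output piling up, which then "justifies" cutting headcount again.

engineers = 20.0          # people who can still verify output
ai_share = 0.3            # fraction of work delegated to AI
unverified_backlog = 0.0  # output shipped without meaningful review

for month in range(1, 13):
    produced = engineers * (1 + 4 * ai_share)   # AI inflates raw output
    review_capacity = engineers * 1.0           # humans still review at human speed
    unverified_backlog += max(0.0, produced - review_capacity)

    # The loop closes: "AI makes us so productive, we can cut headcount again."
    engineers *= 0.95
    ai_share = min(0.9, ai_share + 0.05)

    print(f"month {month:2d}: engineers={engineers:5.1f} "
          f"ai_share={ai_share:.2f} unverified={unverified_backlog:7.1f}")
```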

Bone 5: Non-Determinism Is Not Only a Compliance Nightmare

AI models are non-deterministic black boxes. Just because you think you can watch them think, it does not mean you actually are. That “thinking” output is mostly for show.
You send the same input twice, and you might get different outputs. Sometimes subtly different. Sometimes wildly different. My engineers tell me stories every week. Model A said it works, while B said it doesn’t. C implemented it this way, and D implemented it another way. And often you don’t need to switch models to get this kind of behavior.

In traditional software, if something breaks, you can reproduce it. You can write a test. You can trace the logic. With AI, you are debugging a black box that doesn’t even give you the same answer twice. How do you test that? How do you validate it? How do you explain to an auditor that your system “usually” gives the right answer? In regulated industries (medical, financial, automotive), this isn’t a philosophical question. It’s a compliance nightmare.
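
To make the testing problem concrete, here is a minimal sketch of the kind of repeatability check that classic testing takes for granted. The call_model callable is a hypothetical stand-in for whatever provider client you actually use; the fake_llm at the bottom only simulates the behavior my engineers keep describing.

```python
from typing import Callable

def distinct_answers(call_model: Callable[[str], str], prompt: str, runs: int = 5) -> int:
    """Send the same prompt several times and count how many distinct answers come back.

    For a deterministic function this is always 1. For an LLM it usually is not,
    which is exactly why 'assert output == expected' stops being a meaningful test.
    """
    return len({call_model(prompt) for _ in range(runs)})

if __name__ == "__main__":
    import random
    # Purely illustrative stand-in; plug in your real client here.
    fake_llm = lambda prompt: random.choice(["yes, it works", "no, it breaks", "it depends"])
    print(distinct_answers(fake_llm, "Does this function handle the edge case?"))
```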

Bone 6: The Ground Shifts Under Your Feet

In software, we pin dependencies. We control the upgrade. With AI, that control is gone. Vendors “improve” models and deprecate versions whenever they feel like it. You find out your system has changed when your outputs start looking “off” on a random Tuesday. The vendor decides when the ground shifts. You just try to keep your balance.

Bone 7: Your Intellectual Property on a Transatlantic Trip

Here is one that European companies in particular should think about very carefully. For years, organizations have fought to keep their data in EU data centers. GDPR, Schrems II, and data residency requirements took a lot of effort and a lot of investment. And a lot of compliance work.

And now? We are pumping the entire codebase, the business logic, and the intellectual property through APIs that route straight to US servers. Every prompt, every context window, and every “please review this code” request results in a data transfer. All the careful data residency work is bypassed through the context window of an AI model. Nobody planned this. It just happened because the AI tools are so convenient. And the productivity gains are so visible that nobody stopped to ask where the data is actually going.

Bone 8: The Price Trap

Let’s talk about money. Right now, AI is cheap. Suspiciously cheap. The major AI vendors are burning through billions in infrastructure costs. They are building massive GPU clusters, paying for energy, and scaling data centers. And they are not making that money back. Not even close. The current pricing is subsidized by venture capital and the desperate race for market share.

This is a pattern we’ve seen before. Get everyone on the platform at a price they can’t refuse. Wait until they’ve rebuilt their workflows around it. Then adjust the pricing to something that actually reflects the real cost.

The real cost in this case is enormous. Training and running these models at scale is one of the most expensive computing endeavors in history. When the market consolidates and venture capital money runs out, the surviving providers will need to actually turn a profit. Prices will rise. Not by 20%. Potentially by multiples.

You are locked in, and the rent just exploded. The business case that justified your AI transformation? It was built on introductory pricing. And introductory pricing never lasts.

See the System

System seeing challenges are about training yourself to notice reality. Not what the marketing deck says. Not what the hype cycle promises.
As Derek Cabrera always says, “Love Reality!” You don’t need to like what you see, but no model replaces reality. Reality is the state YOUR model should reflect, not vice versa.

If you’re going all-in on AI, I’m not saying you’re wrong. AI can be powerful. And it really depends on your context. But understand what you are betting on. Understand that you’re not just adopting a tool. You’re restructuring your organization’s knowledge, your dependencies, your risk profile, and your sovereignty. All at once. Often without realizing it.

Look at the system. Every bone is a risk that does not show up in the “velocity metrics” your leadership is celebrating.

Love reality and see the system!

More Output Is Not More Value, or Why AI Might Break Your System

Every few years, a new tool arrives with bold promises. Better, faster, cheaper. We’ve seen it with test automation frameworks, with DevOps platforms, with low-code solutions. The marketing fliers are always shiny. The demos always impressive. And the promises always the same: more output, less effort.

But this time it’s different. Not because AI is fundamentally better than previous hype cycles (the jury is still out on that), but because it’s so generic. Previous tools targeted specific segments. A test automation tool promised to speed up testing. A CI/CD platform promised faster deployments. You could evaluate them within their narrow domain. AI, though, targets everything at once. Coding, testing, reviews, documentation, analysis, communication. It promises to boost output across the entire software development lifecycle. And that’s exactly what makes it dangerous.

Let me explain why, and I’ll borrow from Donella Meadows for this.

The Bathtub You’re Not Watching

Meadows loved the bathtub analogy for stocks and flows. Imagine a bathtub. Water flows in through the faucet (inflow) and drains out through the plug (outflow). The water level in the tub is your stock. In software development, think of the stock as “work in progress”: code written but not yet tested, reviewed, integrated, deployed, and validated in production.

Now, AI turns the faucet wide open. Developers produce more code, faster. The inflow increases dramatically. Wonderful, right? The dashboards look great. Lines of code, pull requests, features “completed.” Management sees the numbers and smiles.

But here’s the thing. The drain hasn’t changed. Proper testing, code review, integration, deployment, operations, monitoring, all of these are still running at the same pace. Maybe slower, because they now have to deal with more volume and, let’s be honest, with code that nobody fully understands because a machine wrote half of it. And it will surely be slower when you lay off a good part of your people. Because AI does it all.

The water level in the bathtub rises. And rises. And nobody notices, because everyone is staring at the faucet.
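
If you prefer to see the arithmetic, here is a deliberately tiny stock-and-flow sketch in Python. Every number is invented; the point is only that when the inflow jumps and the drain stays fixed, the stock does not level off, it climbs week after week.

```python
# Minimal bathtub model (all numbers invented, for illustration only).
# Stock: work in progress that still needs review, testing, integration.
# Inflow: code produced per week. Outflow: what the team can actually absorb.

wip = 0.0                # the water level in the tub
drain_per_week = 10.0    # review/test/deploy capacity; nobody touched the drain

for week in range(1, 13):
    inflow = 10.0 if week <= 4 else 30.0   # week 5: AI turns the faucet wide open
    outflow = min(drain_per_week, wip + inflow)
    wip += inflow - outflow
    print(f"week {week:2d}: inflow={inflow:4.0f} outflow={outflow:4.0f} wip={wip:6.1f}")
```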

The Bottleneck Just Moved

This is a classic systems thinking trap. You optimize one part of the system in isolation, and the bottleneck shifts somewhere else. In Meadows’ terms, you’ve changed one flow without understanding the feedback loops it connects to.

More code means more testing needed. More testing means more bugs found (or worse, more bugs missed). More deployments mean more operational complexity. More features mean more documentation, more user support, more cognitive load on the team. The system was in a rough equilibrium before. It wasn’t perfect, but the inflows and outflows were somewhat balanced. Crank up one inflow without adjusting the rest, and you don’t get more value. You get a flood.

And here’s where the quality problem sneaks in. When teams are overwhelmed with volume, they cut corners. Reviews get superficial. Testing becomes checkbox exercises. “The AI wrote it, it’s probably fine.” That’s not a technical problem. That’s a human and organizational problem. The kind that no tool can fix.

What Managers Don’t See on the Flier

The shiny marketing flier shows you the faucet. It shows you the increased output, the productivity metrics, the impressive demos. What it doesn’t show you is the bathtub. It doesn’t show you the rising water level of untested code, of poorly understood systems, of technical debt accumulating silently.

And it certainly doesn’t show you what happens when the bathtub overflows.

If you’re a developer or tester, you probably feel this already. The pressure to absorb more, faster, without the time to do it properly. If you’re a manager or decision maker, I’d ask you to pause for a moment. Look at the whole system. Not just the part that the vendor highlighted for you.

More output is only valuable if the rest of the system can absorb it. Otherwise, you’re just filling a bathtub with the drain closed. And eventually, you’ll have water on the floor.

Before you celebrate the increased velocity, ask yourself: where is the bottleneck now? Because it didn’t disappear. It just moved. And if you can’t see it, that’s when it’s most dangerous.

You Ship It, You Own It.

My colleague James and I had a conversation the other day about this whole AI vibe-coding thing and how it affects us. And I realized that all the topics I wrote about over the last few days lead to the same thing: the lack of ownership.

A few weeks ago I wrote about what happens when you outsource testing to the AI. The core message was that skipping verification, validation, and review means you lose your mental model of the system. You become a stranger to your own code. But there’s another angle of the same problem, one that is maybe even more fundamental. It’s not just that you lose understanding. It’s that you lose the sense of responsibility. And that’s where things get really dangerous.

Let me give you an example. You prompt an AI to generate a feature, and it looks good at first glance. You click around, it seems to work, so you push it. Maybe you deploy it to production. Maybe you share it with a customer. And somewhere down the line something breaks or doesn’t behave as expected. And then, somehow, the reflex is to say: “Well, the AI wrote that.” As if that changes anything at all.

It doesn’t. You own it. All of it.

The moment you deploy code, share a document, publish a report, or put anything in front of another person, you have made a decision. That decision is yours. The AI did not press deploy. The AI did not send the email. The AI did not sign off on the release. You did. And with that action, you took ownership. Whether you read what you shipped or not. Managers are used to this. That’s why they have people they trust. When you decide to ship what the AI produced without understanding what it produced, you still own it. And when you don’t check it, it means you trust the AI. More or less blindly.

And this is exactly the problem with what is sometimes called vibe coding. Prompt, generate, prompt again, generate again, check if it roughly looks right, ship. Repeat by the dozens. Nobody is reading the code. Nobody is thinking about edge cases. Nobody is asking what could go wrong. Nobody is checking for regressions in parts that should not have been touched. And when something goes wrong, there is this strange collective amnesia. “I didn’t really write that.” But you shipped it. That’s the same thing.

Think about it from a non-software angle. If you ask a colleague to draft a report for you, and you send it to your manager without reading it, and it turns out to be full of errors, whose problem is that? Yours. You put your name on it by sending it. The fact that you didn’t write it yourself is completely irrelevant at that point.

The AI is a tool. A powerful, impressive, sometimes surprisingly capable tool. But a tool doesn’t carry responsibility. You do.

And this is not just a philosophical point. There are real consequences. Legal consequences. Professional consequences. Consequences for the people using the software, reading the document, or relying on the output you generated and shipped without looking at it twice.

I understand the temptation. The tools are fast. The outputs are impressive. The friction is low. You can generate a dozen things before lunch that would have taken days before. But speed without verification is just a faster way to create problems. And those problems land on your desk. Or your customer’s desk. Or somewhere worse.

And here is where it gets interesting. Because ownership is not just about avoiding blame. It actually leads to better quality. Alan Page wrote about exactly this in his post “Quality is a System” last December. Drawing on Robert Pirsig’s Zen and the Art of Motorcycle Maintenance, he argues that quality emerges from care, and care means being fully present with your work. When you are present with the work, you notice things. You catch the thing that feels off. You ask the question that the AI didn’t ask. You make the judgment call that no prompt can make for you.

When you skip review and thorough testing and just ship, you are not present. You are absent from your own output. And absent people don’t catch problems. They create them.

What I’m trying to say is this: solid architecture, good design, validation, verification, and reviews are not bureaucratic overhead. They are the moment where you actually take ownership. That’s where you go from “the AI generated this” to “I checked this, I understand it well enough, and I’m prepared to stand behind it.” Skipping that step doesn’t make you faster.

The rate of shitification of software, as I called it in one of my last posts, is already increasing. A big part of that is not the AI itself. AI is just a tool being used. It’s the people using it without taking responsibility for the output. Don’t be one of those people.

Review what you ship. Own what you deploy. It was always your job. It still is.

AI Increases Output by 10x. Especially Stress.

Open LinkedIn or YouTube on any given day and you’ll find someone confidently explaining that engineers now produce 10x the code thanks to AI. Ten times! Or make it five times. Doesn’t matter. And companies are drawing the logical conclusion. If one engineer does more work by herding AI agents, you need fewer engineers. The code practically writes itself. Every bottleneck along the way gets “mitigated” by removing roles and people. The layoff announcements follow, more quietly, a few weeks later.

I work in the medical device industry, as you might have heard before. That background is one influence on my systems thinking. In my world, having users find bugs is not a nuisance. It’s not a “move fast and fix it” moment. It’s an incident. It can be a patient safety issue. So when I see this particular trend, I don’t just find it frustrating. I find it genuinely alarming.

But even outside regulated industries, here’s the thing about 10x coders. One engineer cannot review this much output. So you also produce potentially 10x the untested assumptions. 10x the “we’ll clean this up later” quietly baked in. 10x the surface area for things to go wrong. Code volume is not the same as value delivered. And this is where the cascade begins.

What actually happens is this. Engineers, now supercharged by AI, push code faster. The pipeline accelerates. Releases that used to take two weeks now go out in two days. And without sufficient QA in the loop, because those people have been let go, there’s nobody whose actual job it is to ask “but have we built the right thing?” before that code reaches users. The assumption is that the engineers will catch their own mistakes, or that automated tests will cover it. But those automated tests were often written by the same AI, in the same hour, with the same incomplete understanding of the edge cases. Has the engineer taken the time to review the test cases? Are they actually useful? I doubt that these 10x engineers, using AI, thoroughly review their work.

So the bugs (failing code, functionally wrong output, or the wrong solution altogether) ship. And now they’re somebody else’s problem. Specifically, they become the problem of whoever is at the support desk or on call that night. Support teams, operations teams, SREs, platform engineers, people who were hired to keep stable systems running, suddenly find themselves triaging regressions at 2am. Doing archaeology on code generated in 40 seconds by a tool that doesn’t know your system, your users, or your domain. The defects didn’t disappear. They just arrived later, in a worse context, at a higher cost.

Shifting the burden doesn’t eliminate the burden. It just moves it somewhere more expensive.

And here’s what really gets me: the people making these decisions often genuinely believe they are speeding things up. Because the metric they’re watching is output. Lines of code. Features shipped. Velocity points. But those are stocks, in the Donella Meadows sense. The real question is about the flows. What is the rate at which risk is entering the system? What is the rate at which trust is being eroded with customers? What feedback loop tells you that your “10x productivity” is actually a delayed disaster?

Systems thinking, and specifically the DSRP framework, has a concept called part-whole. Zooming in, zooming out.
An engineer generating code faster is one tiny part of the system. The whole is the product. The code, the tests (or lack thereof), the deployment, the monitoring, the customer experience, the on-call rotation. And also the team morale when everything is always on fire. If you optimize only the code-writing part and ignore the whole, you are not actually optimizing anything. You’re just moving stress around.

I’ve been in software for over 25 years. I’ve seen this story before. It was “we’ll hire offshore teams to build it cheaper.” It was “we don’t need architects, the developers can just figure it out.” The pattern is always the same: short-term gain, long-term pain. The people who made the decision are often long gone by the time the bill arrives.

Slowing down to do things properly is not weakness. It’s leverage. A well-tested feature that works reliably is worth more than ten features that kind of work until they don’t. A defect found before it reaches a customer is worth more than any hotfix process you can build. A team that ships less but ships with confidence builds faster over time. Because they’re not constantly paying back technical debt with interest. And you don’t make the wrong people suffer all the time. I hope the 10x AI agent orchestrator is also part of the on-call shift.

So yes, use AI. It’s a remarkable tool. Let it amplify what’s already there. A team with good practices, a quality mindset, and proper testing will get genuinely faster with AI assistance. A team without those things will just produce more chaos, more quickly. You have to scale the whole, not only one part. And removing people, and even complete roles, is not scaling. That is destruction.

Put the health of the whole team and the pipeline first. And an increase in speed by using AI tools will follow. Don’t use a crowbar approach or you will do more damage than good.

Slow down and zoom out. Please.

Friction-Maxxing, or The Case for Elbow Grease

My friend Maaike Brinkhof recently wrote a post called “They will not break me.” Her point, in a nutshell: the more they push her to offload her work to an LLM, the more she grabs pen and paper, takes her time to think, and does her work slowly and thoroughly. And it finally kicked me into writing this piece that’s been bouncing around in my head for a while.

The best developers I know are lazy. Not the “I’ll do it tomorrow” kind of lazy. The “I’ll spend four hours writing a script so I never have to do this ten-minute task again” kind of lazy. That’s productive laziness. Being that kind of lazy can be extremely arduous. Still, that’s beautiful.

But there’s another kind of laziness creeping in, and it’s not beautiful at all.

You know those cleaning products on TV? The ones that promise you can just spritz a surface, give it a swift wipe, and the stain vanishes. No scrubbing, no effort, just spray and walk away. Sounds amazing, right? Except then you actually try it, and the stain is still there. Grinning at you. Because it turns out, most stains need friction. They need you to get out the pads and brushes, get on your knees, and start scrubbing. The product helps, sure. There’s something in there against almost everything. But the elbow grease does the actual work.

Nobody wants to hear that. We want the spritz. We want the magic. We want the easy route.

And now, enter AI.

AI is the ultimate “just spritz it” promise. Need to understand a complex codebase? Ask the AI. Need to write a test strategy? Let the AI draft it. Need to debug something? Paste it in and get your answer. And honestly, it works. You get results. Sometimes surprisingly good results.

But here’s the thing. When the cleaning product does the work for you, you never learn how to actually clean. You don’t understand why some stains need acid and others need alkaline. You don’t build the intuition for which surface can handle abrasion and which can’t. You just spritz and hope. You fully rely on a tool, and when it stops helping you don’t know what to do anymore.

The same is happening with AI and thinking. People are offloading the hard parts, the painful parts, the parts where you sit with a problem and struggle until something clicks. And that struggle? That friction? That’s where the learning happens.

I’m not saying AI is bad. I use it myself, regularly. It’s a good tool. A very versatile tool in my tool box. Like a really good cleaning product. But a tool works best in the hands of someone who understands the craft underneath. If you’ve spent years debugging, writing tests, understanding systems, then AI amplifies your ability. It takes away the tedious parts so you can focus on the interesting ones. That’s the good kind of lazy. That’s automation.

But if you skip the learning, if you let AI do the thinking from the start, you end up in a strange place. You can get things done. Your output looks fine. Maybe even impressive. And sometimes, getting it done is what counts. I won’t deny that. Ship the feature, fix the bug, move on.

But for the overall picture? It should not be enough.

Because sooner or later, you’ll hit a situation where the AI gives you something that looks right but isn’t. And you won’t know. You won’t have the instinct to say “wait, that doesn’t smell right.” You won’t have the scar tissue from the times you tried the wrong approach and learned from it. You’ll trust the spritz because you never learned to scrub.

Let me put it differently. Friction is not a bug. Friction is the mechanism by which we build understanding. Every shortcut you take through a problem is a piece of understanding you didn’t build. And understanding compounds. The person who struggled through debugging for years sees patterns that the person who always asked AI simply cannot see.

My friend Stu Crocker once said something that stuck with me deeply: “Quality is the absence of unnecessary friction.” I love that and made it my mantra. But the key word is “unnecessary.” Not all friction is bad. Some friction is the price of admission. It’s on us to learn to tell the difference between the friction we should eliminate and the friction we need to push through. No pain, no gain, as they say. And right now, too many people are removing all the friction, including the kind that makes them grow.

So here’s my take. Use AI. Be lazy in the smart way. Automate the repetitive garbage. But when it comes to understanding how things work, to building your mental models, to developing the instinct that separates someone who gets things done from someone who truly knows what they’re doing, embrace the friction. Get on your knees and scrub.

The stain won’t remove itself. And neither will ignorance.

Vibe Coding is Like Groundhog Day

Vibe coding reminds me of micro-managing a talented junior developer. Except it’s worse. Because the junior eventually learns. The junior starts anticipating your expectations, picks up patterns, develops judgment. The AI? You explain the same basics over and over. Every. Single. Task.

Yes, I know about context files and skills and custom commands. I use them myself. But let’s be honest here, every task you hand to an AI still needs to contain everything. The full picture. The constraints. The edge cases. The things a human colleague would just know after a few weeks on the team. The AI doesn’t carry that forward. It doesn’t grow. You’re essentially onboarding the same junior every morning, and they’ve forgotten everything from yesterday.

And that’s fine, actually. It’s a tool. Tools don’t learn. I don’t expect my lathe to remember the last bowl I turned on it. But I also don’t pretend my lathe is a master woodworker.

Here’s where it gets interesting. People on LinkedIn and YouTube proudly announce they’re managing multiple AI agents in parallel. Five agents! Ten agents! Shipping features left and right! And I sit there thinking: how? How are you reviewing all of that output?

Because managing even one agent properly is hard work. You write the prompt. You check the output. You verify the logic. You validate that it actually does what it’s supposed to do, not just what it looks like it does. You catch the subtle bugs that the AI introduced because it didn’t understand the why behind your architecture. That’s a full-time job for one agent. And you’re running five?

I can only think of two explanations. Either these people are superhuman reviewers who process code at the speed of light. Or, and I suspect this is the more common case, they’re not actually reviewing much of anything. They look at the output, it seems reasonable, ship it. Next agent. Next task. Next feature. Velocity!

But velocity without verification is just negligence with extra steps.

Yesterday’s post was on throwing things over the fence. There’s a dangerous variant: instead of catching what has been thrown at you, you just redirect it over the next fence, into the wild.

What I’m trying to say is this: I’m not against using LLMs. I use them nearly daily. For certain tasks, they’re genuinely useful. Drafting boilerplate. Exploring approaches. Rubber-ducking a problem. Translating compliance language into English. All great. But every single output needs a human brain that understands the domain to look at it, question it, and decide whether it’s actually good or at least good enough.

The moment you skip that step, you’re not engineering anymore. You’re gambling. And here’s the thing that makes it even worse: LLMs are not deterministic. A script you wrote does the same thing every time you run it. That’s the whole point of code. But give the same prompt to an LLM twice and you will get two different outputs. Different structure. Different assumptions. Different bugs. It’s like asking that junior to redo a task. Only this time the junior has amnesia. And they interpret it slightly differently each time. And you can’t even train them to be consistent.

And the frustrating part is that the consequences don’t show up immediately. The code works. The tests pass — assuming someone wrote meaningful tests, which is a whole other conversation. The feature ships. Everyone’s happy. Until a few days later, when the subtle misunderstanding baked into the AI’s output causes a cascade of problems that nobody can trace back to the original “vibe.”

Traceability is an issue. Non-deterministic behavior is an issue. Validation is an issue. This is all fine if you are developing prototypes, or apps for yourself or for friends. But it doesn’t scale in a corporate context. When you have millions of users, you can go the route that big tech goes: just deploy to a canary group, see if shit is broken, and fix it. That’s what is happening right now with LLMs themselves, and now with all the output generated by them. LLMs produce so much slop that it’s impossible to verify and validate a significant amount of the output. The LLMs don’t care. And the worst part is, many of the humans involved in producing the slop don’t care either.

I get it. The temptation is real. The AI produces something that looks impressive in seconds. And the dopamine hit of shipping fast is addictive. But looking impressive and being correct are two very different things. Ask any tester.

So before you spin up your next fleet of parallel agents, ask yourself one honest question: am I actually supervising this, or am I just watching it happen? Because if it’s the latter, you’re not managing agents. You’re just generating AI slop and technical debt at scale.

Review your outputs. Every single one. Yes, it’s slow. Yes, it’s tedious. That’s the job.

Vibe Coding and the Return of Throwing Stuff over the Fence

I touched on this topic in my last post. I want to elaborate on this phenomenon some more.

Testers, remember the old days? Developers on one side, testers on the other. A fence between them, and code flying over it like a catapult delivery. “Here, we’re done. Go find the bugs.” No context, no conversation, no shared understanding. Just a lobbed artifact and a prayer. Bug ticket ping-pong, our favorite sport.

We spent years tearing down that fence. We fought for collaboration, for testers embedded in teams, for shared ownership of quality. Having discussions about requirements together. At the same table, as equals. We came a long way.

And now? The fence is back. But it’s wearing a shiny new outfit.

Welcome to the era of vibe coding.
The human serves: you describe what you want, sometimes in a sentence, sometimes in a paragraph.
The computer returns: an AI agent produces code. Maybe it compiles. Maybe it even runs. The agent might verify that much itself. (Which is sometimes more than the humans in my past did.)
Now the human is busy again, dealing with the output. Now we see who the good engineers are.
Does it actually solve your problem? Does it handle the edge cases you forgot to mention? Does it fit into the architecture?

So what happens? You get code that looks plausible. It compiles. It might pass a basic test. And then you have to figure out if it actually does what you need. You realize that you forgot half the context you should have put into the prompt. You left out constraints you didn’t even realize were constraints. You assumed things that aren’t in the prompt because they’re obvious to you but invisible to the machine.
So you throw things back at the machine. Next round of ping-pong.
The feedback loop between you and the AI is not a conversation. It’s a negotiation with someone who has amnesia.

Sound familiar? It should. It’s the same dynamic as the old silos, just with the roles flipped. This time, the engineer is the one receiving half-baked deliverables tossed over the fence.

And here’s the crux: now the engineer has to test. Not just glance at the output and nod. Actually test it. Poke at it. Question it. Code review is a good start. Think about what could go wrong, what’s missing, whether the thing that was built is the thing that was needed. Welcome to the world testers have lived in for decades. Turns out, validating someone else’s work against unclear requirements while missing half the context is a skill. You need to think about boundaries, about edge cases, about the gap between what was asked and what was meant. Engineers are now learning that testing is a discipline, not a checkbox. They have tried to automate testing away, but without success.

Or they don’t. They trust the output. Because it looks right. And the AI sounded so confident. And the last time you asked it for some script, it also worked. Why should it not work this time? False trust!

If vibe coding is essentially about describing what you want, validating whether you got it, and iterating through feedback cycles, then who’s actually been doing this for a living? It’s not the engineers. It’s the requirements engineers and the testers. They’ve spent their careers translating vague human intent into something more precise. They know how to spot the gap between “what was said” and “what was meant.” They’ve been dealing with incomplete information and implicit assumptions since day one. Could it be that testers and requirements engineers are actually better suited for vibe coding than the developers?

You need the full package, man!

People who know BBC’s “Death in Paradise” will know that Dwayne Myers is the “full package”.

To be successful in vibe coding, you need a lot of skills.
You need solid requirements engineering to define the right context, describe precisely what you want. Specification Driven Development is the new term for stuff we called Behavior Driven Development (BDD), Example Mapping, etc.
You need some software architecture knowledge to understand how the solution is built. What goes where, how elements interact. Or even better, define it yourself. And provide more constraints. You should make the architecture decision. Not the AI. Clear the context, switch the model, and it will change the architecture, just because.
You need software engineering knowledge. You should at least somehow understand what the AI builds. You really should review the code. At least spot check it.
And you need to test the solution. Verification AND validation. You need exploration skills. And you need testing knowledge to judge whether the tests the AI hopefully implemented are useful, and to write more tests yourself.
You need operations knowledge. How to run the software. How to monitor it. How to deal with errors.
And so much more.
The full package, man! That’s why we have cross-functional development teams. You cannot replace this with one person and a flock of AI agents.

This is the irony that gets me. We spent decades learning that collaboration beats handoffs. That shared context matters. We built great cross-functional teams. And here we are, back to individuals throwing things over a fence.
The fence is back. And the silos are back. Let’s not pretend they aren’t.
