programming – Thinking with Nate

New Coding Project

I just published a new coding project! If you’ve been following along, this is the “grown up” version of a demo project I posted last year.

It’s further exploration of what I call an “epigenetic algorithm.” It’s inspired by a simple observation: in living cells, the process of evolution is actively managed by the cell, which itself is an evolved mechanism. Using evolution to optimize evolution seems like a powerful trick, so I’m trying to reproduce it in small-scale AI experiments. I hope to make evolutionary computing more open-ended, more successful in vast search spaces, and less biased by the programmer. In this case, I’m generating cool looking Game of Life simulations, but I hope to find many more practical applications in the future.
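To make "using evolution to optimize evolution" a little more concrete, here is a minimal sketch of a related, well-known technique: self-adaptive mutation. This is not the code from the project, and it is far simpler than an epigenetic algorithm. Each individual carries its own mutation step size, which is inherited and mutated along with the solution, so the search process itself is under selection. The function name `evolve`, the toy objective (get `x` close to 3), and all the parameter values are assumptions made up for illustration.

```python
import math
import random

def fitness(individual):
    """Toy objective (an assumption for this sketch): x should be close to 3."""
    x, _sigma = individual
    return -(x - 3.0) ** 2

def evolve(generations=200, pop_size=30, seed=0):
    rng = random.Random(seed)
    # Each individual is (x, sigma): a candidate solution plus its own
    # mutation step size. Both are inherited, so both can evolve.
    population = [(rng.uniform(-10, 10), 1.0) for _ in range(pop_size)]
    for _ in range(generations):
        children = []
        for x, sigma in population:
            # Mutate the step size first, then use it to mutate the solution.
            child_sigma = sigma * math.exp(0.3 * rng.gauss(0, 1))
            child_x = x + child_sigma * rng.gauss(0, 1)
            children.append((child_x, child_sigma))
        # Selection only looks at x, but a good sigma tends to ride along
        # with the good solutions it helped produce.
        population = sorted(population + children, key=fitness, reverse=True)[:pop_size]
    return population[0]

best_x, best_sigma = evolve()
print(best_x, best_sigma)
```

Nothing in the code tells the step size what it should be. Lineages whose step size suits the current stage of the search simply tend to produce better offspring, which is the small-scale version of evolution tuning its own machinery.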

I’m not sure whether or not I will publish this as a work of science. It’s complicated and weird, and there’s more work involved in making this a proper controlled experiment. As I learn more, I’m already thinking of other ways to explore this idea that might be more effective. So, for now, the plan is just to share the code, and I may or may not revisit this later, depending on where my other leads take me. 🙂

I gave a presentation that covers the motivation, results, and challenges of this project for a technical audience.

You can also peruse the source code on Codeberg.

Sorry this is less accessible than my usual blog posts! If you have any questions, just drop them in the comments, and I’ll happily answer them.

Self-Made Life

A Cell’s Eye View of Evolution, Part 2

(This month’s image is a photo I took at Glass Beach in Fort Bragg, California. The beach is covered in millions of tiny pebbles of polished sea glass in myriad colors. Any one of them would be an improbable find on a beach, but collectively they tell the story of a larger-scale process that explains their presence: many years of waste bottles being dumped nearby.)

This is part two of a three-part series. You can read it on its own, but to get the whole story, you should start from the beginning.

Every living cell is a universal automaton, capable of an infinite variety of forms and lifestyles. The cell itself determines the range of possibility—what proteins can it make, how will they integrate into the whole, and how will they shape the organism’s life? But within that space of possibility, the gene sequence is the program that determines what specific body to make, and how it should respond to a chaotic environment. That program is of the utmost importance, selecting one specialized lifestyle out of infinitely many possibilities, most of which would never work. But who wrote it?

Life did. Or, in other words, cells wrote the programming for cells, but it doesn’t seem fair to call them “programmers.” A human programmer is a top-down problem solver. To modify some working software, I start by analyzing that system’s performance: what does it do, and how well? I then try to understand how the system works. I read its programming, and consider how that leads to the behavior I observe. I consider how the system might be better, what specific improvements I might make, and what consequences I expect. I design a change to the code, then tinker with it in a sandbox environment until it does exactly what I want with no surprises or side effects. Only when I’m satisfied do I push the code out for use in the real world.

How are cells supposed to do that? They don’t have a mind like I do. They monitor their own health and situation, but they can’t imagine themselves “from the outside.” They don’t understand their lifestyle, their goals, or their relationships with the world except in terms of stimulus and response. They can execute short stretches of programming from their DNA, but they can’t examine the program as a whole, or imagine what would happen if they ran some other program instead. Mutations and cross-breeding cause the program to change, but cells can’t verify those changes in a safe testing environment. They can only make a new organism, release it into the world, and hope for the best.

This seems like a paradox, but only from the perspective of an individual cell. Life never exists as a lone individual. It always consists of diverse communities and ecosystems. So let’s take that perspective. Life isn’t a cell. Life is a vast collection of cells, an unimaginably huge number of concurrent individuals, living and dying continuously over billions of years. I like to think of it as a massively parallel computer. Life learns by accumulating and generalizing over the experiences of many lifetimes. It tries out many lifestyles simultaneously: whole different categories of lifestyles in different species and niches, and many variations on a theme among organisms of the same species. Whether any individual will be successful is hard to predict, because luck is such a huge factor. But by running the same experiments thousands of trillions of times over, life can get a relatively clear picture of what works and what doesn’t.

From this massively parallel perspective, each cell is a processor unit with two jobs. The first job is to live and reproduce. We normally think of that as “the purpose” of life, but from the collective perspective, it’s just the engine that keeps the system as a whole running. Individual lifetimes come and go rapidly and continuously like the cycles in a computer processor. The second job of each cell is to learn about the Universe and how to thrive in it, then integrate that information back into the system’s programming. That is life’s deeper purpose. Each cell has a very limited perspective and can only do this in a minimal way. But taken all together, vast numbers of cells can behave much more like a traditional programmer.

A cell has no understanding of what it does or how it fits into the big picture, but, in a sense, life as a whole does. Life consists of every cell, every lifestyle, every program that has worked so far, and all the dead ends that aren’t represented. That’s a tremendous amount of information about what the system does, which parts are performing better than others, and where there’s untapped potential. There’s no top-down designer, but this information, embodied in the population itself, still shapes the system’s evolution. Each cell acts selfishly, but by competing and collaborating with each other, life distributes resources to the parts of itself that are thriving in their niche, and encourages exploration of new lifestyles where opportunity lies.

The big difference between life and a human programmer is speculation. Humans envision the system as a whole and imagine where it might go. When I program a computer, I test ideas first in my mind and then in a virtual environment on the computer in order to avoid mistakes and dead ends. Life doesn’t do that, because its massively parallel, distributed design makes that impossible. There simply is no top-down view, no central authority to perceive and decide, no way to step out of the system for testing. Instead, life just tries everything. It tests code in production. It walks straight into failure rather than avoiding it. Death sorts what works from what doesn’t. That’s why we usually think of cells as “dumb” and humans as “smart.”

It would be better to think of humans as more efficient, because in a sense what we do isn’t so different. We both generate lots of options, test them, winnow them down to the most successful, and iterate. I do it with simulations in my brain; life does it with matter in the physical Universe. From a computational perspective, that difference doesn’t matter. A computer can be realized with cogs, transistors, neurons, or a simulation inside another computer. The materials can be anything so long as the functionality is the same. To us, as individual living animals, it seems tragic and absurd that life must test out ideas by letting things die on a massive scale. From life’s perspective, though, a single lifetime is just a brief use of materials and programming that will get recycled for future experiments. If it were conscious, life as a collective whole wouldn’t regret (or notice!) the death of an organism any more than we regret the death of one of our cells.
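The shared loop of generating options, testing them, winnowing, and iterating can be sketched in a few lines. This is only a toy under loud assumptions: the function names, the population sizes, and especially the fixed target phrase are all made up for illustration. Real evolution has no fixed target to score against, so the `score` function here stands in for "whatever happens to work in the environment."

```python
import random

ALPHABET = "abcdefghijklmnopqrstuvwxyz "

def score(candidate, target):
    """Test: count the positions where the candidate matches the target."""
    return sum(a == b for a, b in zip(candidate, target))

def mutate(candidate, rng):
    """Generate: copy a candidate with one randomly chosen character replaced."""
    i = rng.randrange(len(candidate))
    return candidate[:i] + rng.choice(ALPHABET) + candidate[i + 1:]

def generate_test_winnow(target="life finds a way", pop_size=100, keep=20,
                         max_rounds=2000, seed=1):
    rng = random.Random(seed)
    # Start from completely random strings of the right length.
    population = ["".join(rng.choice(ALPHABET) for _ in target)
                  for _ in range(pop_size)]
    for round_number in range(max_rounds):
        # Test every candidate and rank them, best first.
        population.sort(key=lambda c: score(c, target), reverse=True)
        if population[0] == target:
            return population[0], round_number
        # Winnow: only the top few survive to the next round.
        survivors = population[:keep]
        # Iterate: refill the population with variations on the survivors.
        offspring = [mutate(rng.choice(survivors), rng)
                     for _ in range(pop_size - keep)]
        population = survivors + offspring
    best = max(population, key=lambda c: score(c, target))
    return best, max_rounds

result, rounds = generate_test_winnow()
print(result, rounds)
```

No candidate is ever inspected or reasoned about. The only "intelligence" is in which candidates get copied, which is why the same loop runs equally well in a brain, a computer, or a population of cells.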

You might argue that the human programmer also has foresight. I don’t just try stuff at random, I have a sense of what might work and what probably won’t. I use trial and error, but I prioritize and focus my exploration. It’s tempting to say cells don’t do this, but that’s too dismissive. Cells are quite opinionated. DNA doesn’t contain a blueprint for an organism, but a bunch of recipes that the cell can choose between depending on context. Cells must decide how to act. Evolution has equipped them with a wide variety of strategies and tools, and guidance on when to use them, all encoded in the DNA. This is how life learns from experience, plans ahead, and avoids dead ends. Life explores new paths at random, but it mostly tries out reasonable variations of known working strategies.

Individual cells aren’t much like human programmers, but they do have one thing in common: they write code. Mostly, the cell just copies its own DNA (copy / paste is a popular strategy among human programmers, too), but it makes an important decision: what variations to try in the next generation. This happens either through mutations, or by swapping genetic material with other cells. Without a top-down view of the gene sequence, this can only be a random process. The cell has no idea what changes it’s introducing, or what the consequences might be, but it can try to shape that randomness for the better. Primarily, this means spending a ton of time and energy on error correction, so that only a small number of mutations get through. It also shows up in things like mate selection, genetic recombination, and preparing the environment for the next generation.

I like to think of Darwin’s story as how life got started. Random mutation and selection are all you need to evolve better forms. As we learn more about cells, though, we see there’s a lot more going on. Life didn’t just find better lifestyles, it invented a general purpose platform capable of an infinite variety of lifestyles. As a collective, life uses that platform to explore many lifestyles simultaneously, pruning dead ends and investing more resources into exploring evolutionary paths that seem fruitful. Life started out randomly, but it grew more opinionated over time, and it has evolved many sophisticated ways to direct and shape its own evolution. That will be the topic of my next post.

What do you think? Did reading this make you think of life, cells, or evolution any differently? Any new ideas? Does anything I said sound wrong or misleading? Do you have other ways of looking at it? This post is more speculative than usual, and represents some of the ideas I hope to pursue in my PhD research, so I’m very interested in criticism and feedback. If you have any thoughts, please let me know in the comments!

Universal Automata

A Cell’s Eye View of Evolution, Part 1

(This month’s image is a photo I took of the full-scale model of Babbage’s Difference Engine at the Computer History Museum in Mountain View. It is one of the earliest designs for an automatic digital calculator, and a forerunner of the programmable computer. It’s a completely mechanical device, operated by hand crank.)

This is part one of a three-part series. For an overview, check out the introduction.

In the traditional story of evolution, each organism lives a single lifestyle, and the forces of nature select which ones are fit enough to reproduce. From that perspective, evolution is something that happens to life. But this story fails to explain something very strange and important: cells are not single-purpose machines. Although they only live one lifestyle at a time, they have the capacity to live an infinite variety of lifestyles, depending on their DNA programming. That requires an enormous amount of complexity and effort that doesn’t directly contribute to a life well lived. In fact, being programmable doesn’t help at all in a single lifetime if the program never changes. So why does life work this way?

To make sense of this, let’s look at a parallel example in computer technology. Consider an ATM. It’s a highly specialized kind of machine, but these days if you look under the hood you’ll often find a Windows PC that’s programmed to be an ATM. That seems like an odd choice at first. ATMs do things most PCs don’t (like dispensing cash), and Windows supports things that you don’t want in an ATM (like running random programs off the internet). You could make a better, safer, more efficient ATM if you designed a custom machine for that purpose, but nobody does that, because it’s harder. Digital computers are so versatile and easy to reprogram that they show up everywhere. As they get used in new applications, their range of capabilities expands, enabling new use cases and further innovation.

Cells are very similar. Being programmable doesn’t help with any one lifestyle, but it makes it possible to explore new lifestyles relatively quickly and easily. Each individual operates in a complex, roundabout way that only uses a fraction of the cell’s potential. That seems like a bad thing, but the adaptability makes it worthwhile. The world is in constant flux, especially since organisms started actively changing things and competing with one another. Very few evolved lifestyles withstand the test of time. For this reason, nature doesn’t just select “the best lifestyles” for life. Life invested in a general purpose platform to make the search for new lifestyles more efficient.

Let’s take a closer look at how the platform works. A cell can be thought of as a kind of microscopic robot. The “programming” for that robot is stored in DNA, which is surrounded by a complex mechanism that reads that data and uses it to produce the form and behavior of the organism. Each cell has a very limited capacity for intelligence, but they’re very good at working together. Like a sort of “autonomous smart matter,” they collaborate by the trillions, which is how every form of intelligence on this planet is made. There’s no reason to think there’s an upper limit to what can be built in this way.

What makes this possible is the protein-synthesis engine at the core of every cell. The nucleus of a cell is a bit like the brain of a human, in that it’s a specialized sort of computer that’s “in control” of the cell. It’s surrounded by the cell’s body, which serves as the interface between the program in its nucleus and the outside world. This is where the similarity ends, though, because the nucleus and the brain are very different kinds of computing devices.

The nucleus works by continuously handling requests, looking up protein recipes, and sending those recipes back out to the cell for construction. A cell can make an astonishing variety of complex molecules this way. These proteins are what make up the cell, its inner workings, and outward behaviors. They serve as building material, messages, tools, or even nano-robots that move about within the cell, manipulating other molecules, and doing useful work all on their own. Sometimes a single protein can serve all of these roles, depending on context. They interact with each other in a vast complex network of activity that keeps the cell alive.

These cellular mechanisms continuously send messages back to the nucleus, reporting on the cell’s health, situation, and needs. The nucleus uses this information to figure out what proteins to make next, adapting the cell’s makeup and behavior to fit the circumstances. For instance, E. coli bacteria normally feed on glucose sugar, but they can eat lactose instead, if that’s what’s available. When that happens, the cell reports to the nucleus that it’s running low on energy and which molecules are around. The nucleus then decides to switch some genes on and off, which instructs the cell to make different enzymes, which results in different cascades of chemical reactions, in order to digest and use the lactose. By reading the DNA differently, the nucleus shifts the whole cell from one lifestyle to another, in response to a changing environment. (Strictly speaking, bacteria like E. coli keep their DNA in a region called the nucleoid rather than in a true nucleus, but the same kind of feedback loop applies.)
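The lactose switch described above is usually taught as the lac operon, and its textbook logic is simple enough to write as a truth table. The sketch below is a deliberate simplification with a made-up function name: it treats the switch as strictly on or off, while real cells show a low background level of expression and graded responses.

```python
def lac_genes_on(glucose_present: bool, lactose_present: bool) -> bool:
    """Simplified on/off model of the lac operon in E. coli."""
    # With no lactose around, a repressor protein sits on the DNA and blocks
    # the lactose-digesting genes. Lactose (via allolactose) releases it.
    repressor_bound = not lactose_present
    # When glucose is scarce, an activator (CAP with cAMP) binds and boosts
    # reading of the genes. When glucose is plentiful, it does not bind.
    activator_bound = not glucose_present
    return activator_bound and not repressor_bound

for glucose in (True, False):
    for lactose in (True, False):
        print(f"glucose={glucose}, lactose={lactose} -> genes on: "
              f"{lac_genes_on(glucose, lactose)}")
```

The DNA is identical in all four cases. Only the signals coming back from the cell differ, and that is enough to change which recipes get read.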

Another way to think of the nucleus is as the engine of the cell. The proteins it makes drive all the chemical reactions that keep the cell alive. Ultimately, everything the cell does is about collecting the energy and raw materials to feed that engine and keep it running. This is the cell’s metabolism. When the engine runs faster, the organism becomes more active, moving, “thinking,” and reacting with speed and vigor, but quickly burning through its energy stores. When it runs slowly, the cell becomes sluggish and conserves its energy. If it ever comes to a full stop, the cell dies, or, in special cases, enters suspended animation. In other words: cells live to make proteins, and making proteins is what makes cells alive.

DNA is where the cell keeps all these protein recipes, but the DNA molecule itself is completely inert. It just carries information, like a computer memory card. It can’t do anything by itself, and certainly can’t make a body from scratch. To build an organism, you need a cell to interpret the gene sequence and do the construction. This is why cells always reproduce by splitting in two. The daughter cell is basically just half of the parent cell, full of the same soup of proteins and organelles, in a fully operational state. The only parts that are really “new” are the DNA molecules in the nucleus, freshly copied from the parent(s). Any changes in that DNA program will only manifest when the daughter cell sends a message to the nucleus and gets a different response back than its parent would have seen.

That means that every cell has the crucial responsibility of reading and writing those DNA programs. They contain every useful protein recipe life has discovered, and must be actively maintained over generations or those recipes will be lost. But what does life actually record in the DNA? Geneticists say DNA is made of four nucleotide bases (A, G, T, C), which are grouped into triplets called “codons” that serve as instructions for protein synthesis. That makes it seem “natural,” as if that were the only way to do it. The truth is, the code is totally arbitrary. Life made it up. By trial and error, life invented a coding scheme. It gave meaning to those molecules and all the ways they can be combined. The programming language of life was invented by life. It wasn’t the beginning, but a tool that cells made to manage their behaviors, learn new ones, and pass knowledge to future generations.
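To see how much the triplet scheme behaves like a lookup table, here is a minimal sketch of reading a gene three letters at a time. The eight entries shown are real assignments from the standard genetic code, but the table is a small excerpt (the full code has 64 codons), the function name and example sequence are made up, and the sketch skips the step where the cell first copies DNA into RNA.

```python
# A small excerpt of the standard genetic code, written as DNA codons.
CODON_TABLE = {
    "ATG": "Met",   # also the usual "start" signal
    "TTT": "Phe",
    "GGC": "Gly",
    "AAA": "Lys",
    "TGG": "Trp",
    "TAA": "STOP",
    "TAG": "STOP",
    "TGA": "STOP",
}

def translate(dna):
    """Read a DNA string three letters at a time until a stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        amino_acid = CODON_TABLE[codon]  # KeyError for codons outside the excerpt
        if amino_acid == "STOP":
            break
        protein.append(amino_acid)
    return protein

print(translate("ATGTTTGGCAAATGGTAA"))
# → ['Met', 'Phe', 'Gly', 'Lys', 'Trp']
```

Nothing about the chemistry of "TTT" forces it to mean phenylalanine. The mapping lives in the cell's translation machinery, just as the mapping here lives in a dictionary that could have been filled in differently.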

Let’s put that all together. A cell is a programmable micro-robot (in technical jargon, a “universal automaton”), capable of making virtually any protein and living virtually any lifestyle. In a sense, a cell is not just one organism, but potentially an infinite variety of organisms, depending on the programming in its nucleus. But how does the program get written? Life had to do all the work itself, without a programmer in the traditional sense. A cell has no mind with which to analyze its DNA and understand what it means. It cannot imagine the consequences of any changes to its programming, or test them out to make sure they are safe. And yet, somehow life invented a programming language and used it to write countless programs and build the full diversity of organisms we see on Earth.

We’ll delve into the details of how this happened in the next two blog posts. At a high level, though, there are two main parts of the story:

Life is self-made. Each cell is relatively simple and mindless, but working together in huge populations over long stretches of time, they develop their own programming. How they do it is quite different from how a human programmer would, but from a collective perspective, there are also some surprising similarities.
Life influences future generations. Organisms don’t just worry about their own survival, they put an enormous amount of time and energy into influencing the next generation for the better. Science is only beginning to understand this, but it offers the tantalizing possibility that, in some limited sense, life might steer its own evolution.