Blog


Cognitive Pollution

A couple of weeks ago, I used Claude to vibe code three formative assessment widgets to use in Jupyter and Marimo notebooks. It took less than two hours to get them working, and another 15 minutes to build a fourth. Given how rusty my JavaScript is, and how little I know about the AnyWidget protocol, I believe it would have taken at least a couple of frustrating days to write them by hand. I only have a high-level understanding of how they work (mumble mumble traitlets mumble mumble), but since they are exploratory proofs of concept, I’ve told myself that doesn’t really matter.

Two weeks from now, I will fly from Toronto to London, then to Edinburgh. By doing so, I will be responsible for the emission of approximately one tonne of CO2. That won’t do any measurable harm on its own, but when combined with the emissions of several billion other people, it will make the world my daughter inherits poorer and more dangerous in countless ways. I know that, but I’m still going to get on the plane.

Here’s another analogy. Synthetic opioids have destroyed the lives of hundreds of thousands of people, and while low-level dealers are routinely incarcerated, none of the Sackler family has ever faced serious consequences. On the other hand, the only thing that made the last few months of my brother’s life bearable was a steady drip of those same drugs. I would fight hard against them being banned, but I would also fight hard against them being completely deregulated.

Here’s a third thought. My brother died of mesothelioma, a cancer that is caused by exposure to asbestos. We grew up in a logging town on Vancouver Island; he did cleanup work at the local sawmill as a teenager, but it took decades for the cancer to manifest. I expect it will similarly take years or decades for us to discover the effect chat bots tuned to maximize engagement have had on what young men believe about what women enjoy.

What ties all of this together for me is:

  1. AI is useful.

  2. It is already causing harm.

  3. Saying “just don’t use it” isn’t going to have any more effect than saying “just don’t fly” (or preaching abstinence to teenagers).

  4. The people driving the AI gold rush have proven that they don’t care about anything except adulation and profit.

We now recognize the ill effects of the cognitive pollution caused by social media. I believe current attempts to address them via age verification are naïve; I think it would be more effective to regulate or ban the use of algorithmic ranking based on personal data, but the truth is that I don’t know. I don’t know enough about how safety standards became normal for the chemical, pharmaceutical, food, and transportation industries to feel that my opinions about regulating AI are worth listening to. What I do know is that people have devoted their careers to studying these things, and would probably be willing to explain them to us if we asked.

Looking back, I’m very glad that I took the time to learn a bit about evidence-based pedagogy before telling other people how they should teach. I therefore think that before we make recommendations about what anyone ought to do about AI, we ought to find out what has and hasn’t worked elsewhere. I’ve been thinking about this for a long time, but I still don’t know how to make it happen.

104 Days

It has been 104 days since I was laid off. In that time I have written approximately 64,000 words, of which 75% has been fiction and 25% non-fiction. (These figures don’t include email or social media.) I’ve actually written on all but 30 of those 104 days; at 71%, that puts me a little short of my 75% target but slightly ahead of the 65% of days I’ve managed over the past year.

As for time, I’m averaging about 5 hours a day of trackable activity, which includes exercise, music practice, and pro bono work as well as writing, programming, teaching, and looking for a job. I don’t really know where the rest of the day goes—I don’t believe sleep, chores, Wordle, and an episode or two of Elementary fill nineteen hours out of every twenty-four—but I’m trying not to worry about it.

Ongoing projects include:

One thing I haven’t done much of is read. I used to devour a book or two a week, but these days I find it difficult to get into most fiction, and even harder to read non-fiction. I don’t know if this is because I’m distracted by personal and world events, or whether it’s a stage of life, but I miss losing myself in someone else’s prose for a few hours at a time.

Four Traditions Revisited

Tedre and Sutinen’s paper “Three Traditions of Computing: What Educators Should Know” has shaped my thinking ever since I first read it. The table below, reproduced from the paper, summarizes their analysis:

Mathematical tradition
  Assumptions: Programs (algorithms) are abstract objects; they are correct or incorrect, as well as more or less efficient. Knowledge is a priori.
  Aims: Coherent theoretical structures and systems.
  Methods: Analytic, deductive (and inductive).
  Strengths: Rigorous; results are certain; utilized in other traditions.
  Weaknesses: Limited to axiomatic systems.

Scientific tradition
  Assumptions: Programs can model information processes; models are more or less accurate. Knowledge is a posteriori.
  Aims: Investigating and explaining phenomena; solving problems.
  Methods: Empirical, inductive, and deductive.
  Strengths: Combines deduction and induction; cumulative.
  Weaknesses: Incommensurability of results; uncertainty about what counts as proper science.

Engineering tradition
  Assumptions: Programs (processes) affect the world; they are more or less effective and reliable. Knowledge is a posteriori.
  Aims: Constructing useful, efficient, and reliable systems; solving problems.
  Methods: Empirical, constructive.
  Strengths: Able to work under great uncertainty; flexible; progress is tangible.
  Weaknesses: Rarely follows rigid, preordained procedures; poor generalizability.

As I wrote three years ago, I’m struck now by what’s not there. I think there should be a fourth entry, a “Humanist tradition” that focuses on values, on how computing is used, and on how cognitive and social psychology support, shape, and limit what we can build and how we build it.

I also now think that their distinction between the engineering and scientific traditions isn’t particularly useful. In practice, they are nearly-identical attempts to turn software development into an engineering discipline on par with chemical or electrical engineering. UML, requirements engineering, the use of statistical models to predict bug rates: all are signs of “engineering envy”, and by and large, practitioners have voted with their feet and not adopted them.

Instead, the overwhelming majority of the programmers I’ve worked with fall into what I used to call a “craft” tradition, but which I now think has a lot more in common with industrial design. Using Tedre and Sutinen’s categories:

I think this analysis explains why practitioners and software engineering researchers mostly talk past one another. Most researchers subscribe to what James C. Scott’s book Seeing Like a State labelled “high modernism”: they believe comprehensibility and control will come from uniformity and formalism. Practitioners, on the other hand, are defending the local traditions in which they are personally invested. In my idle moments, I wonder where we’d be if that long-ago NATO conference had adopted industrial design as a metaphor instead of engineering.

Updating Snailz

I have updated the synthetic data generator I built last year to generate datasets I can use in my SQL tutorial. I might also use it as a running example if I ever teach a course on software design in Python to researchers.

If Not Lessons, Then What?

I used to think that when I retired, I would spend my time writing short tutorials on topics I was interested in as a way to learn more about them myself. I’ve now been unemployed for three months, and while I’ve written some odds and ends, it’s not nearly as fulfilling as I expected, because I know that most people aren’t going to read a three-thousand-word exposition of discrete event simulation: they’re going to ask an LLM and get something pseudo-personalized in return.

To be clear, I don’t think this is inherently a bad thing: ChatGPT and Claude have helped me build https://github.com/gvwilson/asimpy and fix bugs in https://github.com/gvwilson/sim, and I believe I’ve learned more, and more quickly, from interacting with them than I would on my own. But they do make me feel a bit like a typesetter who suddenly finds the world is full of laser printers and WYSIWYG authoring tools.

I believe I can write a better explanation than an LLM, but (a) I can only write one, not a dozen or a hundred with slight variations to address specific learners’ questions or desires, and (b) it takes me days to do somewhat better what an LLM can do in minutes. I believe I go off the rails less often than an LLM (though some of my former learners may disagree), but is what I produce better enough to outweigh the speed and personalization that LLMs offer? If not, what do I do instead?

First-of in asimpy

Adding a “first of” operation to asimpy required a pretty substantial redesign. The project’s home page describes what I wound up with; I think it works, but it is now so complicated that I’d be surprised if subtle bugs weren’t lurking in its corners. If you (or one of your grad students) want to try using formal verification tools on ~500 lines of Python, please give me a shout.

Trying to Understand asimpy

As a follow-on to yesterday’s post, I’m trying to figure out why the code in the tracing-sleeper branch of https://github.com/gvwilson/asimpy actually works. The files that matter for the moment are:

I’ve added lots of print statements to sleep.py and the three files in the package that it relies on. To run the code:

$ git clone git@github.com:gvwilson/asimpy
$ cd asimpy
$ uv venv
$ source .venv/bin/activate
$ uv sync
$ python examples/sleep.py

Inside src/asimpy/actions.py there’s a class called BaseAction that the framework uses as the base of all awaitable objects. When a process sleeps, tries to get something from a queue, or does anything else that requires synchronization, it creates an instance of a class derived from BaseAction (such as the _Sleep class defined in src/asimpy/environment.py).

Now, if I understand the protocol correctly, when Python encounters ‘await obj’, it does the equivalent of:

it = obj.__await__()   # get an iterator
try:
    value = next(it)   # run to the first yield
except StopIteration as e:
    value = e.value    # get the result

After stripping out docs, typing, and print statements, BaseAction’s implementation of __await__() is just:

def __await__(self):
    yield self
    return None

Looking at the printed output, both lines are always executed, and I don’t understand why. Inside Environment.run(), the awaitable is advanced by calling:

awaited = proc._coro.send(None)

(where proc is the object derived from Process and proc._coro is the iterator created by invoking the process’s async run() method). My mental model is that value should be set to self because that’s what the first line of __await__() yields; I don’t understand why execution ever proceeds after that, but my print statements show that it does.

And I know execution must proceed because (for example) BaseQueue.get() in src/asimpy/queue.py successfully returns an object from the queue. This happens in the second line of that file’s _Get.__await__(), and the more I think about this the more confused I get.
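The protocol can at least be poked at in isolation. In this stripped-down sketch (the names Pause, proc, and trace are mine, not asimpy’s), a driver calls send(None) twice, and it is the second call that resumes __await__() past its yield:

```python
# Hypothetical standalone sketch, not asimpy's actual code.
trace = []

class Pause:
    def __await__(self):
        trace.append("before yield")
        yield self               # first send(None) suspends here, handing self back
        trace.append("after yield")
        return None              # second send(None) resumes here and finishes

async def proc():
    await Pause()

coro = proc()
yielded = coro.send(None)        # advance to the yield inside __await__
assert isinstance(yielded, Pause)
try:
    coro.send(None)              # resume past the yield; the coroutine completes
except StopIteration:
    trace.append("done")

print(trace)                     # ['before yield', 'after yield', 'done']
```

If asimpy’s event loop is doing the equivalent of this—one send(None) per scheduled resumption—that would account for both lines executing, but whether that is what Environment.run() is actually doing is exactly what I’m trying to pin down.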

I created this code by imitating what’s in SimPy, reasoning through what I could, and asking ChatGPT how to fix a couple of errors late at night. It did all make sense at one point, but as I try to write the tutorial to explain it to others, I realize I’m on shaky ground. ChatGPT’s explanations aren’t helping; if I find something or someone that does, I’ll update this blog post.

Introducing asimpy

I put the tutorial on discrete event simulation on hold a couple of days ago and spent a few hours building a small discrete event simulation framework of my own using async/await instead of yield. As I hoped, I learned a few things along the way.

First, Python’s await is just a layer on top of its iterator machinery (for an admittedly large value of “just”). When Python encounters await obj it does something like this:

iterator = obj.__await__()  # get an iterator
try:
    value = next(iterator)  # run to the first yield
except StopIteration as e:
    value = e.value         # get the result

Breaking this down:

  1. Python calls the object’s __await__() method to get an iterator.
  2. It advances that iterator to its first yield to get a value.
  3. If the iterator finishes without yielding, the result is whatever it returned. (That value arrives as the .value field of the StopIteration exception.)

We can simulate these steps as follows:

# Define a class whose instances can be awaited.
class Thing:
    def __await__(self):
        print("before yield")
        yield "pause"
        print("after yield")
        return "result"

# Get the __await__ iterator directly
awaitable = Thing()
it = awaitable.__await__()  # this is an iterator

# Run to the first yield
value = next(it)
print("iterator yielded:", value)

# Resume until completion
try:
    next(it)
except StopIteration as e:
    value = e.value
    print("final result:", value)

The output is:

before yield
iterator yielded: pause
after yield
final result: result

asimpy builds on this by requiring processes to be derived from a Process class and to have an async method called run:

from abc import abstractmethod

class Process:
    def __init__(self, env):
        self.env = env

    @abstractmethod
    async def run(self):
        pass


class Sleeper(Process):
    async def run(self):
        for _ in range(3):
            await self.env.sleep(2)

We can then create an instance of Sleeper and pass it to the environment, which calls its run() method to get a coroutine and schedules that coroutine for execution:

env = Environment()
s = Sleeper(env)
env.immediate(s)
env.run()

Environment.run then pulls processes from its run queue until it hits a time limit or runs out of things to execute:

import heapq

class Environment:
    def run(self, until=None):
        while self._queue:
            pending = heapq.heappop(self._queue)
            if until is not None and pending.time > until:
                break

            self.now = pending.time
            proc = pending.proc

            try:
                awaited = proc._coro.send(None)
                awaited.act(proc)

            except StopIteration:
                continue

The two lines in the try block do three things:

  1. Send None to the process’s coroutine to resume its execution.
  2. Get something back the next time it calls await.
  3. Run that something’s act() method.

For example, here’s how sleep works:

class Environment:
    # …as above…
    def sleep(self, delay):
        return _Sleep(self, delay)

class _Sleep:
    def __init__(self, env, delay):
        self._env = env
        self._delay = delay

    def act(self, proc):
        self._env.schedule(self._env.now + self._delay, proc)

    def __await__(self):
        yield self
        return None

  1. Environment.sleep() returns an instance of _Sleep, so await self.env.sleep(t) inside a process gives the environment an object that says, “Wait for t ticks.”
  2. When the environment calls that object’s act() method, it reschedules the process that created the _Sleep object to run again in t ticks.

It is a bit convoluted—the environment asks the process for an object that in turn manipulates the environment—but so far it seems to be able to handle shared resources, job queues, and gates. Interrupts were harder to implement (interrupts are always hard), but they are in the code as well.
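To convince myself the pieces fit together, here is a self-contained toy assembly of the fragments above. The scheduling details—the tuple-based heap entries, the tie-breaking counter, and the bodies of immediate() and schedule()—are my guesses at the shape of the real code, not asimpy itself:

```python
# Toy end-to-end version of the machinery sketched above (illustrative only;
# the scheduling details are assumptions, not asimpy's actual implementation).
import heapq
import itertools

_counter = itertools.count()  # tie-breaker so the heap never compares processes

class Environment:
    def __init__(self):
        self.now = 0
        self._queue = []

    def immediate(self, proc):
        proc._coro = proc.run()      # calling the async method creates a coroutine
        self.schedule(self.now, proc)

    def schedule(self, time, proc):
        heapq.heappush(self._queue, (time, next(_counter), proc))

    def sleep(self, delay):
        return _Sleep(self, delay)

    def run(self, until=None):
        while self._queue:
            time, _, proc = heapq.heappop(self._queue)
            if until is not None and time > until:
                break
            self.now = time
            try:
                awaited = proc._coro.send(None)  # resume; get back an awaitable
                awaited.act(proc)                # let it reschedule the process
            except StopIteration:
                continue                         # the process has finished

class _Sleep:
    def __init__(self, env, delay):
        self._env = env
        self._delay = delay

    def act(self, proc):
        self._env.schedule(self._env.now + self._delay, proc)

    def __await__(self):
        yield self
        return None

class Process:
    def __init__(self, env):
        self.env = env

wakeups = []

class Sleeper(Process):
    async def run(self):
        for _ in range(3):
            await self.env.sleep(2)
            wakeups.append(self.env.now)

env = Environment()
env.immediate(Sleeper(env))
env.run()
print(wakeups)  # [2, 4, 6]
```

The counter in the heap entries matters: without it, two events scheduled for the same tick would make heapq try to compare process objects, which have no ordering.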

Was this worth building? It only has a fraction of SimPy’s features, and while I haven’t benchmarked it yet, I’m certain that asimpy is much slower. On the other hand, I learned a few things along the way, and it gave me an excuse to try out ty and taskipy for the first time. It was also a chance to practice using an LLM as a coding assistant: I wouldn’t call what I did “vibe coding”, but ChatGPT’s explanation of how async/await works was helpful, as was its diagnosis of a couple of bugs. I’ve published asimpy on PyPI, and if a full-time job doesn’t materialize some time soon, I might come back and polish it a bit more.

What I cannot create, I do not understand.

— Richard Feynman