Locklin on science

Very small engines

Posted in Design by Scott Locklin on March 25, 2026

One of the interesting things to contemplate is the scale of the internal combustion engine. It’s a very human scale device: pistons the size of fists, valves about as wide as knuckles. It’s the kind of thing a man with normal sized machine tools can make. Most internal combustion engines in the world are on this human scale. The ideas came about from the very human business of making cannons and pumps for coal mines, so no real surprise there. There are also fairly large ones driving cargo ships, with pistons about a yard in diameter. Those are about as big as they get: engines with half meter to meter diameter pistons have been around for about a century (along with the similarly sized steam engines they evolved from). Cars with fist sized pistons have a thermodynamic efficiency of around 25%, maybe 35% on a good day. The thing with manhole sized pistons hits 50% and can burn tar-like bunker fuel.

The more important prime mover is the turbine. For gas turbines, the turbine blade is on a similarly human length scale: the things that convert heat into motion are single crystals of nickel superalloy a few inches long -not real different in scale from car or marine engine pistons. Steam turbine blades are made of less exotic materials and are considerably longer; maybe a few feet long -just like the old timey big piston steam engines. If we ever switch to supercritical CO2 turbines, the blades will be much smaller -back to gas turbine size or smaller.

There are lots of reasons for this, but the primary reason is that people are people sized and tend to make things out of parts on people scales. If you start thinking about other length scales, things get very different. For the same reasons you can’t just make a lathe very small and expect it to function similarly, you can’t make an efficient heat engine very small and expect it to work the same way. For example, the surface area to volume ratio in smaller engines becomes unfavorable for standard designs. Combustion looks different on millimeter length scales than it does in fist sized objects; it’s much more unstable, and the droplet size from something like a fuel injector or carburetor isn’t so favorable to very small motors. To put a scale on it: diesel motor injectors make droplets around 5 microns. Gasoline/alcohol, maybe 25 microns. If you’re using a carburetor, which on a small engine you probably are for “it’s difficult to fit a fuel injector in here” reasons, probably 100 or 200 micron droplet sizes. Imagine you have a 5mm (aka 5000 micron) bore engine; the droplets start to look like giant beach balls bouncing around inside the cylinder. That’s going to produce very strange burn dynamics compared to the same droplets bouncing around an average motor with a 90mm bore, where it doesn’t look so bad. Going smaller than 0.1cc, this obviously gets worse. Same story but worse for stuff like steam piston engines, along with the additional hurdle of having a tiny steam bomb in your prime mover.
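The beach-ball comparison is just arithmetic. A back-of-envelope sketch (droplet sizes from the text above; the bore diameters are illustrative round numbers):

```python
# Back-of-envelope: fuel droplet diameter as a fraction of cylinder bore.
# Droplet sizes from the text; bores are illustrative round numbers.
droplets_um = {"diesel injector": 5, "gasoline injector": 25, "carburetor": 150}
bores_mm = {"car engine (90mm bore)": 90, "tiny engine (5mm bore)": 5}

for bore_name, bore_mm in bores_mm.items():
    for d_name, d_um in droplets_um.items():
        ratio = d_um / (bore_mm * 1000)  # both in microns
        print(f"{d_name} in {bore_name}: droplet/bore = {ratio:.4%}")
```

A carbureted 150 micron droplet is 3% of a 5mm bore; the same droplet is under 0.2% of a 90mm bore. That order-of-magnitude gap is the whole story.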

There’s obvious reasons why a small heat engine might be desirable. Hydrocarbons are a great way of storing energy. Much better than the present generation of battery technologies in terms of weight and volume. That’s why life uses hydrocarbons to store energy. Having a little motor and some ethanol for a laptop battery sounds pretty cool to me. I mean, a fuel cell would be more silent and futuristic, but nobody can make those work right, and people do make motors work on a regular basis and have for 150 years or more. Again, 1200kJ/kg lithium batteries versus 40,000kJ/kg kerosene. Imagine you’d like an insect sized drone (people definitely want this); you ain’t gonna power such a thing for very long with a tiny volume of lithium polymer, but you could certainly do it with some hydrocarbons.

Of course model engineers have made small heat engines for over a century now, but as far as I know, none of them have concentrated on making them efficient; just making them function, or push a model airplane around, is enough work.

Starting with the Carnot model, we can begin to see even more reasons why there might be challenges with building small, efficient heat engines:

\eta = 1 - \frac{T_C}{T_H}

Squeezing big heat differences into a small space is going to be more difficult than squeezing big heat differences into a large space. An efficient heat engine burning kerosene or whatever might have T_H of 2700 kelvin, with T_C of 300 kelvin. Maintaining a temperature gradient of 2400K over a few feet is fairly easily doable, but seems more difficult over millimeters unless you start making the things out of zirconia or other ceramics.
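Plugging the numbers from the text into the Carnot formula; a minimal sketch:

```python
def carnot_efficiency(t_hot_k, t_cold_k):
    """Carnot limit: eta = 1 - T_C / T_H, temperatures in kelvin."""
    return 1.0 - t_cold_k / t_hot_k

# Numbers from the text: kerosene flame at 2700 K, ambient at 300 K.
print(carnot_efficiency(2700, 300))  # ~0.889
```

Real engines get nowhere near that of course; it’s just the ceiling the temperature gradient buys you, and the point is that the gradient itself is hard to maintain at millimeter scales.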

Making flame on a small length scale is also inherently difficult; there is a phenomenon called “flame quenching distance.” Below a length scale of a few millimeters the flame can’t propagate well. I believe this is independent of the beach-ball sized fuel droplets in tiny motors, but it’s probably somewhat related.

Speaking of scale: stuff like piston rings assumes a piston-cylinder gap sized for pistons of a couple of inches, bored by conventional boring bars. These gaps work very well for pistons of that size. They work like shit on much smaller pistons/bores because, like, geometry: a gap that looks tiny on a 95mm piston is huge on a 5mm piston in comparison to total area.

Surface area to volume: this is why we don’t have very large insects (but did when there was more oxygen in the atmosphere). Bugs breathe through holes in their skin rather than through lungs. Works fine for small critters, falls over for anything bigger than current year bugs. Similarly the surface to volume ratio is very different for a 1cc (model airplane) or 0.1cc or 0.01cc engine than for a more ordinary 2000cc or 4000cc motor pushing your car around (7000cc for Americans I guess). There are many implications: heat transmission is one of them. It’s easy to maintain large temperature differentials in a bigger motor, and large temperature differentials mean higher efficiency. At smaller length scales, the thermal conductances scale differently from the forces as well, so something like a steam engine is going to look radically different at 0.1cc than at 50,000cc like a big old timey ship steam engine piston.
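The scaling argument can be made concrete: for geometrically similar engines, linear size goes as the cube root of displacement, so surface-to-volume goes as displacement to the minus one third. A sketch (displacements from the text; shapes assumed geometrically similar):

```python
# Surface-area-to-volume scaling for geometrically similar engines.
# Fixed proportions: linear size L ~ V^(1/3), area ~ L^2 ~ V^(2/3),
# so area/volume ~ V^(-1/3).
def relative_sa_to_v(volume_cc, reference_cc=2000.0):
    """SA/V of a similar-shaped engine, relative to a 2000cc reference."""
    return (reference_cc / volume_cc) ** (1.0 / 3.0)

for v in (2000, 1.0, 0.1, 0.01):
    print(f"{v:8.2f} cc -> SA/V is {relative_sa_to_v(v):6.1f}x the 2000cc engine")
```

A 0.1cc engine leaks heat through roughly 27 times more surface per unit of working volume than a 2000cc car engine, which is why the temperature differential is so much harder to hold.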

The other thing is a little motor necessarily has to run at a higher RPM for high power density, and that’s kind of bad for combustion efficiency: the flame front takes time to propagate, and high RPM leaves less time per cycle for it to do so.

There are wackier ideas. The thermoacoustic engine was a pretty interesting foray into strange domains. In effect, it’s a Stirling engine where the pistons are standing sound waves in a resonant chamber. These use different kinds of physics to get rid of moving parts. They are pretty good sized -something like a foot long. There’s some crazy German dude on youtube building such things in hopes of powering his house with the effect, burning self-generated biogas. It’s not so much this design that matters as being inspired by it: using new kinds of physics to make small prime movers.

Keeping with the idea of using sound, there’s an idea called the thermoacoustic ratchet. You can create microcavities which sustain standing waves at very high frequencies when there is a temperature differential; from there you can harvest the energy using some other mechanism, maybe piezoelectric. There are other material properties to exploit: people have started using pyroelectric materials to harvest such energy. Even weirder: using little vapor bubbles in liquid capillaries. Other ideas: evaporation has been looked at, squeezing liquid through weird little pores. There’s probably a lot of crazy ideas in tribology and materials science that could be put to work here. One of the cool things about all this is much of it is open to tinkerers.

Small steam engine:

https://www.mpg.de/4691201/thermodynamics_microscopic_steam_engine

https://www.sciencedaily.com/releases/2011/12/111211134002.htm

 

Post money Silicon Valley Lotharios

Posted in five minute university, fun, Locklin notebook by Scott Locklin on March 14, 2026

There are many amusing stereotypical personalities in Silly Con valley. Steve Sailer coined the phrase “Silicon Valley Adventuress” for the very obvious type of women who try various kinds of shakedowns on tech firms and their executives. There’s the more obvious “Divorce Tick” kind of woman: someone who marries a clueless but rich nerdoid and relieves him of his extra wealth. These are often hilarious to watch at work, even when it’s pointed at you: the divorce ticks often talk these nerds into open relationships. Dude always thinks it means hot threesomes with her friends; generally the way that works out, she gets sex whenever she wants from the pool boy or whatever, and he gets bupkis because he’s a ridiculous nerd with a hot wife who cheats on him. Divorce happens with mathematical certainty when she’s maximized her profits. The free market and regulatory capture can be an awesome thing to watch at work. It’s a shame nobody has written a novel on the sorts of people who populate the place: it’s filled with all manner of amusing characters. I’ve written about the blonde grifter and insecure Indian guy with a sports car before: you’ll literally meet dozens of those if you spend any time there. There needs to be a Flaubert of Silicon Valley Divorce Ticks and Adventuresses and grouchy Harpeet with Becky in the Porsche. The TV show by Mike Judge captures the vibe and some of the superficial personalities, but it’s only skin deep: the actual place, like anything made of actual people, is much more weird.

One of the varieties of Silicon Valley character is the post-money Lothario. While people have written about the Adventuresses and, to a lesser extent, Divorce Ticks, nobody to my knowledge has written about the post-money Lothario. He should be considered delicately amusing, as he’s generally even more of a boob than the garden variety sociopaths and their victims mentioned above.

Nerds are as sexually ambitious as anyone else. Nerds suck at being charismatic though; otherwise they’d get jobs talking people out of their money instead of actually trying to create value by doing tedious engineering work. At some point, usually around middle age, the successful ones end up making enough money to be competitive with chads. Some of them get married and live happily ever after. Some of them get married and some divorce tick removes half of their money, and they live less happily ever after. Others decide to become womanizers in their middle age. This isn’t a new phenomenon; it’s as old as time. It’s at least as old as the 1500s, when Cranach the Elder was making hilarious paintings of the phenomenon.

Note her hand is in his pocket

Beyond the fact that this sort of thing is ridiculous, I don’t have anything against it. Young women have to make their way in the world, and old nerds need love too (feel free to send digits, ladies).

What boggles my mind is one of two situations. One is the situation in which the man makes the woman his protege. This is ridiculous because sucking an old guy’s crusty dick is not actually good training for entrepreneurial work, no matter how smart and helpful the old dude is. Sure you can get him to invest in your startup: you still have to make it work by yourself, and if you’re the type of person to suck a dick for a venture investment, you’re probably no good at actually executing a profitable business. Women who do this are highly unlikely to actually act the role of the protege or Ganymede or whatever. Your average woman, being what she is, probably thinks her benefactor is ridiculous and a fool. He actually is, though he’s probably giving her decent advice, and she should probably listen. Men involved in this sort of relationship may be hard headed captains of industry, but when the blood flows to their nether regions, they go soft about the cerebral cortex and don’t realize the girl is probably just an ambitious sperm spittoon.

This is an annoying phenomenon for actually productive people as many startups, mid-trajectory ventures and most of all, old companies often have to deal with the fallout. It really, really sucks when you end up with some kind of lame prostitute as your boss in your management chain. I’ve never experienced this, as it’s rare I work for anybody who isn’t me, but you can see it when it happens. Anyone who has worked in such a firm has encountered this whether they realize it or not. It even happened at the national labs. Some of these “couples” go beyond this and the women attempt to start their own companies. Most famous example of this is, of course, Eric Schmidt. I feel bad picking on him as he’s a friend of several of my friends, but it’s not like nobody knows about it. That’s what you get for making too much money; plenty of other retards have behaved in this way but aren’t as rich and famous and subject to scrutiny, he will have to suffice for visual aid. Such investments are inevitably disastrous of course. Maybe it doesn’t matter if they can afford it, but I think it ends up bothering them, because these fools for love think they were doing a good deed, rather than overpaying a prostitute.

The other annoying one is the full on playboy nerd. Schmidt kind of did this too, or at least was fairly active in his romantic adventures in the past. To be clear I’m not picking on him, Larry Ellison was even more notorious for it (generally speaking, Larry is the only interesting silicon valley member of club billionaire: no h8 -also thanks in advance for fixing Star Trek muh nig nog), though since Larry is adept at buying media organizations, people don’t talk about it as much in the funny papers. Anyway both these guys have cut a wide and not entirely disreputable swathe through local adventuress vaginas. That’s just nature naturing; life in balance. It’s the dudes who go all grody on it. While I was still in grad school, I had a series of booty calls to these very fancy loft apartments. Apparently the girls I was smashing had the keys to these places from Silicon Valley Lotharios, and were part of their harems. As a poor grad student I was all in favor of free hookers who thought I was more interesting than the guy with the fancy house, and appreciated their hospitality, high thread count sheets and fine taste in alcoholic beverages. The stories I heard were that these dudes were off porking prostitutes, so it was OK to drink his booze and pork his allegedly not-prostitute (according to them) women in his house. Later on, I (more or less) aged out of being That Guy, and ended up knowing similar characters whose lives and fortunes were dedicated to playing space invaders with different kinds of women, pretty much as avocation, hobby and meaning of life. You ever meet a 50 year old who is really into coke or weed or mushrooms or whatever? Kind of pathetic, right? Same difference; hoes are also a vice.

 

Caravaggio noted this happens to younger men as well

I guess I was lucky in some ways, in that I always did reasonably well with ladies.  Even went through a chad phase where I was avidly pursued, due to absurdly modest local fame, requiring little effort on my part. Despite my advanced state of disintegration into my constituent elementary particles, I still occasionally get the hairy eyeball from  comely and fertile ones. As such, these sorts of degenerate “bro I had a threesome with a blue haired asian and a redhead” antics aren’t impressive to me at all, or my idea of the good life. It’s just lame shit I did before I could afford more interesting hobbies, like telescopes or having machine tools in my spare bedroom. Doing such things at my age, which a lot of these guys continue to do, seems downright tedious. It’s just not a good use of my time, and I’m not as wealthy or accomplished as a lot of these assholes, so from a time value of money point of view I don’t know WTF they are thinking. I don’t know how they can do it, in the same sense I don’t know how a heroin junkie can stick the spike in their arm as their meaning for life.

Look we all love the poosy, it’s great. Women fucking you because you’re a rich asshole is not the same thing as young love, or being popular because you’re an actual charismatic chad rather than some chode who gets in the newspaper for being awfully rich. Being a balding, flabby dot com deci-to-centi millionaire who spends most of his time arranging threesomes with Stanford educated sl00ts you met on a website for “arrangements” is just lame shit. Join the liftwaffe, lift weights, become freeking hyooge, learn ancient greek, become a great martial artist, start right wing death squads: you’ll probably have a better romantic life than buying stanford educated prostitutes, or spending half your time managing your polycule. End sermon; I know you faggots won’t change, and I won’t stop making fun of you.

If you want to live some kind of baller lifestyle that nobody will fault you for, look to mafiosos or wealthy Russians for your model of the good life. Having a “goomah” who was a gymnast or ballerina or something is based. Porking a bunch of money grubbing sluts who are too lazy to be a good wife or goomah to a rich guy or whatever is just lame.

Related: marriage advice from the likes of me (never married, therefore winning at life) on Man’s World.

 

Coding assistant experience

Posted in tools by Scott Locklin on February 18, 2026

I’m a modest LLM skeptic. It’s not that I don’t believe in LLMs, I am aware that they exist, I just know that they’re not doing what people do when we think, and that they’re not going to hockey stick up and replace everybody. If it helps people, they should use them: I do. ask.brave.com is my first stop for answering transient questions or software configuration issues. It produces useful results and cites its sources; a great search API. It also doesn’t remember what I asked it (Brave is privacy first), which is what you want most of the time. Grok gives OK answers too, but I don’t like the answers as much, and I have no idea what their privacy policies are. Qwen has been OK for answering coding questions and small code fragments.

I have a few jobs I’ve been putting off; fiddly and annoying translations from Python to R, updating APIs, etc. I also have a couple of challenge problems I have asked AI chatbots to gauge where we’re at for things which I care about. Qwen is by far the best free and open chatbot I’ve used, and it had gotten good enough I decided to fork out for claude-code and take it for a spin. Also inspired by asciilifeform’s comments; dude’s grouchier and more skeptical than I am, so I took his statements on the utility of claude-code very seriously. People who use LLMs at work already can probably skip to the end for this, as you already know more than I do about using these things, though maybe some of the observations are of use.

Mostly the type of work I do is numeric, and numeric coding is significantly different from what most do. I never had any doubts that an LLM could do Javascript plumbing, or even back end plumbing code. Lots of examples of this to train on, along with complicated regular expressions, SQL queries and so on. I figured they’d eventually do something with numeric stuff, though it was less clear when it would happen for my favorite programming languages.

Some claude-code notes:

0) You need to pay for the $200/month one to get anything useful done with claude-code. This is annoying as it’s difficult to burn all your tokens, but the cheap plans run out almost immediately. Jerks. I should be able to pay as I go without talking to some salesdork or signing up for a subscription.

1) Claude code has access to your hard drive, and you have to invoke lucifer and kernel modules to keep it from ruining your life. Yah, in principle you can trust the thing. Back in the 90s you could in principle have an RPC daemon on your Sun workstation which executes arbitrary code, and most of the time nothing bad would happen. Anyone who trusts this thing with sensitive code is fucking retarded. You need to run local for this.

2) My first task, and one of my unpleasant recurring ones, is translation from the lost souls who think Python is an adequate mode of scientific communication to something less insane (in my case, R, though I still hold Matlab is the best tool for scientific communication). That’s something an LLM should be great at. Mostly the chatbots haven’t been, but recently they seem to have acquired the skill. This was my most pressing reason for trying claude code, which I assumed would be better than a chatbot. Claude managed to achieve the task in maybe something like twice the time it would have taken me, in a fashion quite a bit more code complete than I would have done. Of course it forgot to add a predict method for a bunch of algorithms that people basically only use to predict things, but once I told it to do so, it did. The first go-round it reproduced every python class in the old repo and made them public, which is exactly what you’d expect from a machine that doesn’t understand anything: the actual algorithm is “fit model, predict model” so you need exactly two public functions, with the other functions being called as options inside the create function. Once I yelled at it enough, hollered at it to update the manual pages to match what’s inside the functions and so on, it did a reasonable job. Another thing I find extremely painful in R: making a vignette and festooning the source with inline documentation using rmarkdown. I’ve always found this onerous, but the LLMs don’t seem to mind. I prompted it to use a google style guide for R packages, so the style isn’t horrible. Beating it into shape was a fairly high attention process, though it was my first time using claude code. All told I put much more time into it than I would have fooling around on my own. This is because it’s low effort work, where writing it yourself is high effort work. There’s a problem here: since it’s low effort to generate a lot of code, now you have a lot of code. Code that has to get maintained if you’re actually using it.

3) Another major unpleasant task I have is turning a paper I read into code. For simple things, LLMs should be able to do this. For more complicated things, I assume there is a limit based on prompt windows. Indeed Claude code was able to turn this paper (my go-to challenge problem) into reasonable working R code; Bernoulli Naive Bayes with EM semisupervised updates. This is something I had done myself for a project, but never checked into any remote repo, so I knew there would be no cheating. I also looked fairly extensively for an example on github and didn’t find any (albeit some years ago now, but people are retarded and would rather fiddle with neural nets than this most excellent trick). Claude was considerably slower at this than the translation job, and produced code of what I consider fairly poor quality, though I didn’t prompt it with any style guides. Still, actually doing the damn thing is pretty good, and I’ll be testing this type of “read the paper, geev mee code” job further with more difficult problems. For those of you not in the know, Bernoulli Naive Bayes is basically column means, and the EM algorithm is awfully simple: maybe around the complexity of Newton’s method. Someone like me can do it in an hour if you point a gun at me and give me an espresso enema, or a couple of hours if I’m taking my time and being careful. If I can get algorithms from papers on non-trivial problems, this is a nice application for me; I have an enormous backlog of interesting looking ideas with no public code associated with them. Understanding the papers in enough detail to write code is a pain in the ass, especially if you don’t have good building blocks.
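For those who want to check the “column means” claim, here’s a pure-python sketch of the plain supervised Bernoulli Naive Bayes fit (no EM step, and not Claude’s output; just an illustration of how simple the core is):

```python
# Bernoulli Naive Bayes "fit" really is just per-class column means
# (plus class priors). Pure-python sketch with Laplace smoothing;
# X is a list of binary feature rows, y the class labels.
import math

def fit_bernoulli_nb(X, y, alpha=1.0):
    model = {}
    for c in sorted(set(y)):
        rows = [x for x, label in zip(X, y) if label == c]
        prior = len(rows) / len(X)
        # smoothed column means: P(feature_j = 1 | class c)
        theta = [(sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
                 for j in range(len(X[0]))]
        model[c] = (prior, theta)
    return model

def predict(x, model):
    def loglik(prior, theta):
        return math.log(prior) + sum(
            math.log(t) if xj else math.log(1 - t) for xj, t in zip(x, theta))
    return max(model, key=lambda c: loglik(*model[c]))
```

The semisupervised EM version just iterates this: fit on the labeled rows, predict soft labels for the unlabeled rows, refit on the soft counts, repeat until it stops moving.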

4) The final category of unpleasant “I will likely defer this job forever” task is glueing an API into R (or J, which I have ambitions of getting back to), then using that to implement an algorithm. I asked claude to fill out some of the missing functionality from mlpack. Looked OK, I didn’t test them. I also had it code up an API for mlpack for J, which it appeared to do (it’s been so long since I used J, testing it was painful; sorry about all the sub dependencies it put in the repo).

Task 2 and 3 are my most common use cases. Mostly it doesn’t matter if the results are slop. 4 is an occasional dreary task as well, though R has a decent ecosystem of people who have done this for everyone. Telling the thing how to do my daily tasks is probably also automatable to some extent, but it would mostly be a waste of time. Interactive work is interactive, and Captain Kirking it with a LLM agent is just going to piss me off. I don’t even like using R notebooks, so making an LLM R notebook is no good.

qwen3-coder-next:

I also ran qwen3-coder-next on my threadripper. It’s slow, but can be used if the threadripper isn’t chugging on any other serious tasks. The motivation isn’t to avoid the $200 a month subscription fees; it’s the fact that I don’t trust Claude with anything actually sensitive, like things which produce money for me. It was a pain in the ass to get this stood up and functioning. I did it like this:

numactl --interleave=all ./build/bin/llama-server \
  -hf unsloth/Qwen3-Coder-Next-GGUF:Q4_K_M \
  --numa distribute \
  --threads 32 \
  -c 262144 \
  --no-mmap \
  --jinja \
  --host 0.0.0.0 --port 8080

ollama basically doesn’t work. In this case, for the first round, I ended up using a python tool called aider to run it (claude-code-agent in emacs for the claude-code interactions). I think aider is a little clunky; it couldn’t figure out how to make a subdirectory from where I invoked it. Probably choking on context. Might be user error somehow; I went back to emacs (gptel-agent) later and fixed it. TPS appeared to be on the order of 20, though prompt processing was very slow. Claude is roughly twice this speed, though it feels faster because it’s running on someone else’s hardware and doesn’t choke as badly on context. I was able to reproduce the semisupervised Bernoulli Naive Bayes with EM updates example that claude-code did, as well as a simple Python translation example (a novel fast fitting method for logistic regression). Took about as long for the first round, wasn’t as smooth an interaction. Fed it exactly the same prompt. Got the algorithm right in the first shot, but the NB R package was all borked up, which is the kind of thing I noticed in the qwen chatbot. This required a fairly long context window, so I’m a bit dubious pointing qwen-code-agent at a more involved paper until I upgrade my hardware. I actually like the code qwen produces a little better. Not bad for 3 billion active parameters, thank you based Chinese frens. Oddly the python translation seemed to give it more trouble, again I think because of the slowness of parsing context windows on the threadripper.

There are a couple of reasonably cheap potential hardware solutions to run this qwen3 thing without heating up the threadripper or spending 10k on a big video card and a new power supply; Strix Halo from AMD and NVIDIA GB10 Grace Blackwell. Both are small boxes running Linux with 128G of shared memory with a medium-beefy GPU. Neither seems to have any huge performance advantages over the threadripper or each other (real world experiences welcome, supposedly NVIDIA is faster on context), but they’d allow me to do vibe coding while using the threadripper cores for other tasks. Nice airgap as well. If anyone owns such a shoebox machine and had good experiences, feel free to pipe up. I ordered the AMD gizmo so I wouldn’t have to deal with maintaining a development environment for ARM chips. I’ll probably run the claude stuff from this machine as well for the airgap benefits.

While qwen3 did an OK job, it was no fun to work with. The slow context parsing speed of the thing makes the tooling even more clunky, though with emacs (gptel-agent) it was a better experience than aider. The agentic part of the mechanism, and how it differs from something like claude-code (a NPM package), isn’t fully clear to me yet. “Thing that runs machine generated shell scripts” seems to be about the size of it. How the LLM knows when it’s hooked up to something with agency isn’t clear. I suppose I can ask an LLM for an explanation here.

random unconnected thoughts:

A fun and actually useful thing to try would be to get one of these things to make Lush 64 bit clean. If I could do that without bothering the authors, that would be amazing. Maybe I can burn up some Claude tokens on this when I’m not using it for other tasks.

The chatbot part: I don’t think Claude Opus 4.6 is anything special. Like all the other ones, it speaks authoritatively, talks in circles, contradicts itself and is generally full of shit. Makes a decent coding assistant though. Asking it for advice on buying a machine for running qwen3 locally, for example: actual search engines (including ask.brave.com) produce better results that don’t contradict each other every other line.

Fun thing I didn’t fully realize until performing this exercise: LLMs don’t have state. The tooling keeps state by feeding the prompt (in most cases the entire transcript, including the entire codebase you’re working on, all the search results, etc., every time there is an update) back to the LLM, along with the most recent results. This is, of course, insane. It is particularly insane that people think this kind of Rube Goldberg contraption is sentient somehow. LSTMs are more sentient.
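Roughly, the agent loop looks like this (a sketch, not any particular tool’s internals; call_llm is a hypothetical stand-in for whatever completion endpoint you’re pointed at):

```python
# Sketch of how agent tooling fakes state around a stateless LLM:
# the whole transcript is re-sent on every turn. Names hypothetical;
# call_llm stands in for a real completion API.
def call_llm(transcript):
    # Stand-in for a real endpoint: the model only ever sees what's
    # in `transcript` -- it remembers nothing between calls.
    return f"reply to {len(transcript)} messages"

transcript = [{"role": "system", "content": "You are a coding agent."}]

def turn(user_msg):
    transcript.append({"role": "user", "content": user_msg})
    reply = call_llm(transcript)        # entire history goes over the wire
    transcript.append({"role": "assistant", "content": reply})
    return reply

turn("translate this Python file to R")
turn("now add a predict method")        # context grows every single turn
```

Every turn re-sends everything, which is why context length dominates both cost and the sluggishness you feel on local hardware.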

Complexity: R packages implementing an algorithm are a decent sweet spot for something like this. The R packaging system is designed to insulate the REPL from shitty coders who understand things about statistics. The context window is never going to be enormous, it’s generally going to be a couple hundred to a thousand lines of code that accomplishes a well defined numeric task.

Productivity thoughts:

One thing which is for certain: Claude code isn’t replacing anyone’s job. Anthropic’s headcount isn’t getting smaller. The good thing about using a tool like this is that it has low cognitive overhead; I have to figure out how to constrain a mildly retarded computard helper and make it do the things I actually care about. Once I’ve read the paper or glanced at the original source I have a fair idea of what I want the result to look like, and I have to break the task down to something a retard could understand. This is something I do for myself already (being retarded 👍), though the degree and quality of my personal retardation is considerably different. I also have to debug the result afterwards: there will be a lot of bugs, whereas writing code interactively is kind of online debugging. But, it is useful enough and does things I find onerous and unpleasant in a relatively painless manner, so I’m gonna use it. Sort of like an employee, yes: but a bad employee. One you can’t trust with anything important, and who takes longer at accomplishing tasks than doing it yourself. People who trust vibed code with important things, well, rotsa ruck to you.

There’s a hidden cost to this sort of thing. Because you can write a bunch of code without burning up your precious brain-sugars, you will write a bunch of code. Now you have a bunch of code of dubious utility. In my case, I’ve been very careful to not engage in writing code from papers or translating from python or whatever unless I was pretty sure there was paydirt. Now I’m gonna do it more often. While it feels non-tiring to do this sort of thing, it still takes a nontrivial amount of time, and an even more nontrivial amount of time to evaluate the algorithms the LLM made for me. Maybe I should be working on something else?

For a trivial example, I just spent a couple weeks fooling around with this nonsense. I have one machine generated R package of marginal utility to my actual project to show for my troubles, as well as a much better understanding of the abilities of LLM coding assistants. This is absolutely abysmal from a productivity point of view. Lines of code generated look amazing, but I don’t get paid for lines of code. “Maybe it will pay off in future productivity,” but that sounds an awful lot like the sales bilge on the tin for vendors of these things. The real world results indicate otherwise. They’re even starting to notice the Solow paradox, aka the fact that ladies with a rolodex, telephone and filing cabinet are as economically efficient as putting everything online and in databases.

Consider my likely trajectory with this crap: I’ve already dumped $2200 into a Claude membership and a new piece of hardware to run qwen3-coder for me. I’ll have to configure and maintain that piece of hardware, burning more real world time, plus the ongoing cost of Claude if I continue the membership. I’ll also burn real world time coding up random ideas I would have ignored in the past, or only approached cautiously. Just like putting the internet on my computard, it will open up vast new avenues for wasting time, rather than keeping me focused on actually economically productive goals. Is it a win or a loss? I can’t tell. Still gonna use it, but cautiously.

https://github.com/locklin/vibe-coding-experiments

Conditional probability: an educational defect in Physics didactics

Posted in physics by Scott Locklin on January 16, 2026

Conditional probability is something physicists have a hard time with. There are a number of reasons I know this is true. Primarily I know it is true from my own experience: I had a high-middling to excellent didactics experience in physics, and was basically never exposed to the idea. When I got out into the “real world” of, say, calculating probable ad impressions this concept became of towering importance. It took me a while to grasp it, and I still occasionally struggle with the idea, but it’s actually pretty simple.

What is the probability a man is over 6′ tall? Well, in the US, you look at the normal distribution and find it’s about 14%. If you know both his parents are 6′ tall, the number is higher. If both his parents are 5′ tall, the number is lower. That’s a practical example of conditional probability. Making it super concrete, imagine you have a deck of cards. Probability of drawing an ace is 4/52. Probability of drawing an ace if (conditionally) 10 cards have been drawn with no aces is 4/42. Probability of drawing an ace if you pulled 10 cards and two of them are aces (conditionally) is 2/42.  You can do it with urns or dice or whatever; make yourself happy with your favorite example.
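If you don’t trust the card arithmetic, it’s a two-minute Monte Carlo. Here’s a sketch in Python (deck encoding and seed are arbitrary choices of mine): shuffle a deck, throw away the runs where an ace shows up in the first ten cards, and count how often card eleven is an ace. It should land near 4/42.

```python
import random
from fractions import Fraction

deck = [rank for rank in range(13) for _ in range(4)]  # 52 cards, 4 of each rank
ACE = 0

rng = random.Random(42)
hits = trials = 0
for _ in range(100_000):
    d = deck[:]
    rng.shuffle(d)
    if ACE in d[:10]:      # condition: first 10 cards drawn contained no ace
        continue
    trials += 1
    hits += (d[10] == ACE)  # is the 11th card an ace?

print(hits / trials)             # should be near 4/42 ≈ 0.0952
print(float(Fraction(4, 42)))    # the exact conditional probability
```

The unconditional 4/52 and the conditioned 4/42 are just counting what’s left in the deck; the simulation only exists to convince the skeptical.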

Statistical mechanics seems like where you should learn such things in physics, since we have no independent probability theory classes. I looked in Reif and Ma, the two books I learned statistical mechanics from. Reif doesn’t have the concept in the index, though he covers Markoff processes and Fokker-Planck (and does mention conditional probability there). Ma only mentions it to argue that he doesn’t need it to teach statistical mechanics (later bringing it back in various places in a sort of ad-hoc way: I shouldn’t have slept in so much in that class). Ma even manages to avoid mentioning conditional probability in his treatment of Fokker-Planck, a considerable intellectual achievement for a set of equations whose whole purpose is calculating a conditional probability. As such, most physicists end up thinking of probabilities as funny sorts of ratios that must add up to one, which is right for a lot of cases in physics, but which is not correct in the general sense. Most of the classical statistical physics done with canonical ensembles (aka most of it) assumes we can ignore conditional probability. Stuff like non-equilibrium thermodynamics is going to contain a lot of conditional probability, since it is dynamic and one-way in the same sense as the above card game. Our one example of a non-equilibrium thermodynamic relation which rises to the level of a law, the Onsager relations, certainly uses conditional probability, though Onsager himself never mentions it explicitly. The fact that he never uses the words, nor are they used in didactic explanations, probably keeps physicists from having a good think about the implications of conditional probability in this and other places. Out of sight, out of mind.
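To make the Fokker-Planck point explicit: the thing the equation propagates is a transition density, which is a conditional probability by construction. In the standard one-dimensional notation (drift coefficient A, diffusion coefficient B):

```latex
\frac{\partial}{\partial t} P(x,t \mid x_0, t_0)
  = -\frac{\partial}{\partial x}\!\left[ A(x)\, P(x,t \mid x_0, t_0) \right]
  + \frac{1}{2}\frac{\partial^2}{\partial x^2}\!\left[ B(x)\, P(x,t \mid x_0, t_0) \right]
```

Every term operates on \(P(x,t \mid x_0, t_0)\): the probability of finding the system at \(x\) at time \(t\), given that it started at \(x_0\) at time \(t_0\). You can’t even write the object down without the conditioning bar.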

There are more pedestrian examples of physicists missing out on conditional probability; I’ll list a couple below:

Jung/Pauli synchronicity. When I was a young pot-smoking man, I read with great interest a book on the correspondence between Jung and Wolfgang Pauli on the subject of synchronicity. If you’re unfamiliar with the topic, the clip from Repo Man explains it well; lots of weird coincidences happen, and our brains ascribe meaning to them. Feels a lot like psychic powers or something. The reality is, the otherwise incredibly meticulous Pauli didn’t know enough about conditional probability, even to the level of understanding the trivial Birthday Paradox. It’s all conditional probability: it’s only surprising because our brains don’t intuitively grasp how conditional probability works. The brain observes many things in a short period of time; if some of them happen to overlap in a conditional way over a human-consciousness-tier period of time (minutes, hours, a day or two), the brain flags it as something significant, even when it’s entirely expected, like a group of 23 people being 50% likely to contain a shared birthday. Pauli was a lot smarter than me; arguably smarter than any living current-year physicist whose name isn’t Roger Penrose, yet he missed this obvious thing. Probably because his life was a mess and he was drinking too much, but also because he was probably never exposed to the idea in school or anyplace else.
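The 23-people number is easy to check yourself. A few lines of Python: the probability at least two of n people share a birthday is one minus the probability that all n birthdays are distinct.

```python
def p_shared_birthday(n, days=365):
    """P(at least two of n people share a birthday), uniform birthdays assumed."""
    p_distinct = 1.0
    for k in range(n):
        p_distinct *= (days - k) / days   # k-th person avoids the k taken days
    return 1.0 - p_distinct

print(p_shared_birthday(23))  # ≈ 0.507, first n to cross 50%
print(p_shared_birthday(22))  # ≈ 0.476, still under 50%
```

The surprise is exactly the gap between this and intuition: the brain reasons about “someone sharing *my* birthday” (a much smaller probability) instead of conditioning on all pairs.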

The Fermi Paradox is a case where a Nobel prize winning physicist kind of left important conditional probability aspects out of a model. As we all know, it is a calculation of there being other forms of intelligent life in the universe based on approximated probabilities. The Drake Equation, which formalizes it, lists the number of stars in the galaxy, approximate probability of a planet in the habitable zone, age of solar systems, probability of life, intelligent life, civilizations, civilizations with space travel, etc. In the end it sums things up by multiplying all the numbers together, and comes to the conclusion that there must be intelligent life which we should be able to observe or which should have visited us, or there are hidden and depressing dangers which wiped out all these space-faring alien cultures. If you look carefully at what was done, you might notice there is no conditional probability in it anywhere. Probably some important conditional probability got elided. For example, most species go extinct in a way that fits a survival model; there’s no reason to think intelligent ones have any special advantages, and lots of reasons to think any sort of megafauna, intelligent or otherwise, is going to be at least as likely as any other species of megafauna to go extinct over time. This is just one of the conditional probability factors at work here. Though maybe earths are just rare, or intelligent life is unlikely in conditions where they might discover electricity (aka aquatic life). Conditional probability isn’t necessarily the right tool here for a quick look at orders of magnitude, but it is conspicuous for its continued absence in a calculation which heavily implies it might be useful.
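To see how much one survival-type conditional term can move a Drake-style estimate, here’s a toy sketch in Python. Every number in it is made up for illustration; the point is only that a “given a civilization arose at some point, is it still around *now*?” factor can swamp the rest of the arithmetic.

```python
# Toy Drake-style product. All factors below are hypothetical placeholders.
n_stars        = 1e11   # stars in the galaxy
f_planets      = 0.5    # fraction with a habitable-zone planet
f_life         = 0.1    # life arises
f_intelligence = 0.01   # intelligence arises
f_civilization = 0.1    # technological civilization

naive = n_stars * f_planets * f_life * f_intelligence * f_civilization

# Survival correction: if civilizations arise uniformly over the last T years
# and die off with mean lifetime tau (constant hazard), the chance a given one
# is still around today is roughly tau/T for tau << T -- same actuarial math
# as any other megafauna in the fossil record. Both numbers are pure guesses.
T, tau = 5e9, 1e4  # years
p_still_alive = min(1.0, tau / T)

print(f"naive count of civilizations:   {naive:.3g}")
print(f"conditioned on still existing:  {naive * p_still_alive:.3g}")
```

With these (invented) inputs the naive product says millions of civilizations; conditioning on survival knocks it down to a handful, without touching any of the astronomy.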

The thermodynamic arrow of time. The arrow of time is considered a root problem in physics. In microscopic classical physics, there is no obvious arrow of time: the equations work the same way backwards as forwards. Yet you can assemble the microscopic equations into large ensembles and get the very irreversible laws of thermodynamics. Watanabe wrote an important paper on this subject in 1965, where he noticed that we leave out the conditional probabilities when formulating the statistical mechanical ensembles we use to calculate things and derive the thermodynamic relations which make things like steam engines possible. Watanabe’s paper is influential with people with good taste, but has mostly been ignored. Certainly ignored in didactics, and often disputed for reasons which remain obscure to me. Rovelli and friends, for example (linked above), think it’s a bad argument for various fiddly reasons which make no sense to me, but the idea of using conditional probability to ascertain where the arrow of time is coming from seems obvious. Of course I don’t know how to do it; I’m a mere statistical dabbler. Physicists resist this with all their might; you can find otherwise obviously intelligent people saying, effectively, “it just isn’t, OK.”

My favorite potential example of this is E.T. Jaynes’ idea that the mysteries of quantum entanglement go away when you think about conditional probability. I like this one a lot. Mostly because it dispenses with all the psychic-powers quantum mysticism that has sprung up around the ideas of quantum mechanics. Also because it dispenses with quantum computers, which are both obviously fake and retarded. But mostly because Jaynes is the patron saint of physicists who make the jump to data science, and so was uniquely qualified to bring this sort of thing up. Data science people have to know all about conditional probability: that’s pretty much what they’re doing, all day, every day. If nothing else, the fact that the main engagement with this idea in the literature ends up agreeing with it, rather than deboonking it, kind of indicates that the conditional probability is weak among physicists. That’s not to say Jaynes was right, but the lack of informed argument against him indicates a weakness in the topic of conditional probability. If indeed the ideas of Jaynes turn out to be true (I’m in no position to adjudicate), this example will be held up by some future Thomas Kuhn type of thinker as a spectacular example of a field of very smart people deluding themselves with didactic deficiencies, mathematical ignorance and group-think. As Mencken put it:

The liberation of the human mind has never been furthered by such learned (pedant) dunderheads; it has been furthered by gay fellows who heaved dead cats into sanctuaries and then went roistering down the highways of the world, proving to all men that doubt, after all, was safe – that the god in the sanctuary was finite in his power and hence a fraud. One horse-laugh is worth ten thousand syllogisms. It is not only more effective; it is also vastly more intelligent.

As an aside, I found another contemporary researcher who seems to take the conditional probability approach to getting rid of quantum woo. I haven’t read his papers in detail, but they seem to be thoughts along the same lines as Kracklauer and others mentioned in the previous article. It’s entirely possible that entanglement is exactly what Scott Aaronson thinks it is, but given that its one application thus far has only been useful for pumping up fraudulent penny stocks, and considering the above, it wouldn’t surprise me if the big wrinkly brains got this one wrong.

I suppose statisticians also have a hard time with conditional probability, with Simpson’s “paradox” being a prime example and Berkson’s paradox a less known one. Contemporary statistical practitioners aren’t supposed to be deep thinkers though, so they get a pass.
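Simpson’s “paradox” is itself a conditional probability trap, and the classic kidney-stone treatment data (Charig et al., 1986) makes it concrete: treatment A beats treatment B within each severity subgroup, yet loses in the pooled numbers, because doctors conditioned their choice of treatment on severity.

```python
from fractions import Fraction

# (successes, total) by stone size: A = open surgery, B = percutaneous
A = {"small": (81, 87),   "large": (192, 263)}
B = {"small": (234, 270), "large": (55, 80)}

def rate(successes, total):
    return Fraction(successes, total)

# A wins in BOTH subgroups...
assert rate(*A["small"]) > rate(*B["small"])   # 93.1% vs 86.7%
assert rate(*A["large"]) > rate(*B["large"])   # 73.0% vs 68.8%

# ...but loses in the pooled data: severity confounds treatment choice,
# since the harder (large-stone) cases were steered toward treatment A.
A_all = rate(81 + 192, 87 + 263)   # 273/350 = 78.0%
B_all = rate(234 + 55, 270 + 80)   # 289/350 ≈ 82.6%
assert A_all < B_all
print(float(A_all), float(B_all))
```

The pooled comparison silently marginalizes over severity; the subgroup comparison conditions on it. Which one answers your question depends on which conditional you actually care about, which is exactly the kind of thinking the post argues physicists rarely get taught.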