Kevin Liu

Easy improvements to personal opsec

Sat, 09 May 2026 00:00:00 +0000

I’ve been thinking more about how to be a little more private. In an era where LLMs can automatically deanonymize people from their writing, find zero-days en masse, and may potentially displace jobs, it seems safe to say that the variance of the next few years will be significantly higher than the two decades pre-2025.

Threat model: A casual adversary who asks Grok-5 for “name, phone, and address of all people in [X reference group],” with the intent of causing disruption or harm. I don’t expect the strategies below to work against adversaries that are highly-competent (including but not limited to government actors) or specifically targeting you; it’s very possible they won’t even work against casual adversaries in the future.

The guidelines below are tailored to California residents (and SF in particular), but the spirit of most rules should be pretty universally applicable.

Physical security: IMO the biggest value here is making your address harder to find for casual attackers.
- Remove yourself from California voter registration, since many groups can publicly access your address on voter registration. For example, SF offers a form to remove yourself from voter registration.
- Sign up for a PO Box ($25/mo in SF). Send your mail here so random e-commerce websites don’t get your address. This doesn’t help if you’ve already given out your address, but it reduces future leakage (and if you ever move, it will help keep your new address from leaking). (Notably, a PO Box has several other advantages: they’ll sign for and hold packages for you for a month; you get the coolness factor of going into a USPS after hours; and they also accept Amazon/UPS/FedEx via Street Addressing.) Migrate:
  - DMV (mailing address only)
  - Financial accounts (mailing address only)
  - Medical accounts
  - Employer mail
  - Ecommerce (start using your PO Box for online shopping)
- If a PO Box is too much hassle for you, sending mail to your office is also a good idea if allowed.
- For property owners, make sure you buy property through a trust. I haven’t looked into this since I don’t own property.
- Lock down location access on your phone; make sure no untrusted apps are in “Always allow” for your location.
Digital hygiene
- Massively secure your email account, since compromising your email can allow compromising ~everything else. You want to secure your email login (e.g., Gmail), and if you have a custom domain, you must also secure the webhost, DNS provider, and domain name registrar you use.
  - Picking secure providers is important. In an age of LLM-assisted cyberattacks (e.g., the Vercel attack was rumored to be AI-assisted), I feel like nobody really knows how to be most secure; but I bias toward larger providers that are enterprise-grade and incentivized to immediately fix security issues. In the past I might have trusted a startup with my data; now I’m probably just going to go with Cloudflare/Google/Apple/GitHub.
  - Example: I used to use forwardemail.net for my email routing and Porkbun for domain hosting. I moved both of these to Cloudflare (Email Routing + Domains + Pages) and set up 2fa with hardware security keys.
  - For any services that, if compromised, could result in full takeover (email, domain, password manager), use hardware security keys to sign in. Buy two. IMO hardware security keys aren’t worth it for most other services because of the inconvenience (I usually just use digital passkeys in 1Password).
  - Enable Advanced Data Protection and Security Keys on macOS + iOS
  - Enable Advanced Protection Program for Google
- Enable SIM swap protection on your phone provider, to prevent people from taking over your number and stealing your SMS 2FA codes.
- Use a data deletion service to reduce the surface area from data people have already collected. TL;DR if you don’t do this, your email, phone number, and address are regularly being resold online by data brokers. I signed up for DeleteMe and DROP (for California residents; should take effect later this year).
- Ensure your Twitter/personal website don’t contain details that could let people contact you in real life
- Remove old tweets (Codex can automate this if you ask it to delete your tweets >180 days old), and in general any public statements you’re not sure if you endorse anymore
- Goes without saying, but use a password manager + 2fa everywhere you can
- Use Signal when you can, with disappearing messages set to 4 weeks by default.
- Enable Advanced Account Security on ChatGPT
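
For the old-tweets step above, the selection logic is simple enough to sketch by hand before handing anything to an agent. This is a minimal sketch, not a deletion tool: it assumes you have already loaded your tweets into a list of records with a hypothetical `id` field and an ISO 8601 `created_at` field (a real export may use different field names and date formats), and it only returns the IDs that fall outside the 180-day window.

```python
from datetime import datetime, timedelta, timezone


def tweets_older_than(tweets, days=180, now=None):
    """Return the IDs of tweets created more than `days` days before `now`.

    `tweets` is a list of dicts with hypothetical keys "id" and "created_at"
    (an ISO 8601 timestamp with a UTC offset). Nothing is deleted here.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days)
    return [
        t["id"]
        for t in tweets
        if datetime.fromisoformat(t["created_at"]) < cutoff
    ]
```

Reviewing the returned IDs before deleting anything is a cheap way to catch a wrong cutoff or a misparsed date.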

References

Karpathy’s Digital Hygiene – I’m a bit less privacy-pilled than he is and generally prefer to use very well maintained software even if it’s from a big tech company, due to risks from LLM-assisted vulnerability discovery (e.g., Google Chrome vs. Brave). But it’s a very good resource nonetheless with additional advice.

2025 year in review

Fri, 26 Dec 2025 00:00:00 +0000

Value updates

I find it helpful to keep a doc describing my values. (It’s like the Model Spec, but for humans.) The primary benefits I see:

It lets you spend less time thinking through the same tradeoffs on individual decisions. For example, I didn’t know if I wanted to spend money on DoorDash vs. investing upfront and recurring time into learning to cook. At some point I got fed up with thinking it through every time and just wrote the decision down.
You get the chance to think through tradeoffs more deeply. I think this is helpful to avoid falling into the trap of doing something because it’s the lowest-effort pathway. There are a bunch of areas (relationships, what I care about in a career) that I wouldn’t have thought through if I hadn’t tried to write a fully self-consistent, comprehensive values doc.

I’ve been doing this for a few years, but here are some updates I made this year:

The material

Aesthetics are key (aka, it’s not always worth being a utility maximizer). If I look at something multiple times a day, I want it to look nice instead of stressing me out.
1. Consider buying a Vespa for the memes, even though it’s slightly more expensive and annoying than Lyft Bikes + Muni + Waymo/Uber.
2. Get a nice apartment (though hopefully not too expensive).
3. Get a nice and fuel-efficient car, not just a fuel-efficient one (I haven’t done this one yet)
Vibes are key. Also I want to get better at dancing.
Drinking is good (to the worried: in moderation!). My friend started dragging me to bars this year, and I admire:
1. How novel bars are compared to most restaurants, for which I feel I usually know which food is best. (Not to mention how different bars are in architecture and design, whereas restaurants feel much more samey.)
2. How good of an excuse it is to hang out with friends :)
3. Slowly I should try to learn all the types of alcohol. Could be a fun side project.
Updating a bit on travel being good:
1. It exposes you to changes in low-level stimulus, which can break you out of personality basins.
2. You get to spend time with friends and pay full attention to what’s going on.
Regarding money: first spend money on things that compound, not single experiences. For example:
1. In:
  1. Co-purchasing a vespa with a friend, which naturally compels me to hang out with them and visit them biweekly to drop it off
  2. Trips with friends
2. Out for now:
  1. Expensive dinners (just have a normal dinner; unless it enables a fulfilling discussion with other people who would appreciate the price!)
  2. Business class trips (you only get the benefit once)

Relationships

(Friendship and otherwise)

I want to play life cards-up. Essentially: keep minimal secrets, mostly because it makes life more complicated. No security by obscurity; the best strategy should win even if your opponent knows it perfectly. (E.g., I want to be open about sharing personal anecdotes even if they’re a little embarrassing.)
I really like having fulfilling conversations with people, and a very small number of people I’ve met are clearly >10x (maybe 100x?) on this criterion.
1. This is the closest concept I have to a sparkly person. I want to seek out these people, and I also want to improve at conversation so I can find the thing that sparks these conversations.
2. This is probably the biggest thing I value in a romantic relationship and is also super important in friendships. (conditioning on my other romantic criteria, I think I’ve met ~4 people in the last 2 years who have satisfied this which is pretty low!!!)
I want to create friendships of virtue with people I admire, and fewer friendships of convenience and utility.
1. Is this the kind of person that, if they were old and ugly and had soiled their pants and needed your help, then you’d still say, Yes? “Yes, I would love to. I want to give this person my care so they can unfold their life with dignity.”
2. Honestly, I think life is too short for the other kinds. Also, friendships of virtue follow naturally from being a long-horizon agent; unbuffeted by short term issues, you should find people whose core tendencies bend toward progress/kindness/success and support them wholeheartedly.
3. Relatedly, I also want to learn how to give someone my full attention. I try (don’t look at my phone while talking). But this is clearly not enough.
  1. Someone who I dated briefly mentioned that they felt worried about our time spent together competing against all the other things in my life. This is what I really don’t want to happen. If I choose to spend time with someone, I would like for it to feel unconditional.
  2. I draw inspiration from the exact 1 person I’ve ever met who makes me feel seen when I talk to them, even though we only meet glancingly every few months.
  3. Potential ideas:
    1. More trips where I drop all obligations; don’t check slack, don’t think about work, etc.
    2. Drop recurring relationships that are not of virtue.
    3. Practice full presence at parties, 1:1 interactions, etc. Would be curious if anyone has better ideas.
I want to be good to people. I think this means doing both little and big things for others. I draw inspiration from the small things I’ve seen people around me do, which I confess I never learned growing up:
1. Spending time to draw a bookmark and organizing a “writing gratitude cards for friends” event (I am very sorry for not attending)
2. Packing into the back of an uber when there’s 3 people instead of sitting (more isolated) in shotgun
3. Getting me Chipotle when picking me up from the airport, and providing a concrete day’s agenda so we avoid decision fatigue
4. I think part of this involves having a little bit of slack in life. So I’ll try to do that next year.

Reader’s Digest

(In the style of kipply)

LLMs. DeepSeek R1. o3 is out. GPT-5 chart crime. How many bits are learned in posttraining? 100 parameters can recover 50% of the performance of a finetuned model. Models generalize from reward hacking to misalignment. Chains of thought are surprisingly monitorable, and it doesn’t decrease over the course of current RL training.

Real world impacts. The Stargate Project announced. Claude Code shames the rest of the industry into realizing models can code. Can models replicate AI research? What about astrophysics? 3 million PRs merged with AI agents (and this significantly undercounts Claude Code!). GDPval starts the era of optimizing AI models to produce economic value. AI for science is now strikingly real but still banal (where’s my flying car?). Epoch continues a great tradition of making cool graphs (Epoch Capabilities Index; Frontier Data Centers). Calvin on OpenAI culture. Ilya deposition. “The answer to that question will reveal itself. I think there will be lots of possible answers.”

AI timelines. The progress of AI feels both over- and underdetermined. End xAI and a dozen startups will rise from the ashes. At the same time, the number of people pushing the frontier is very small:

Everything is fractal; even inside AI research orgs, most people are not working on the frontier
A general property of operations research is that the number of value adding steps is a tiny fraction of the process
Most people in the economy are “keeping the lights on” which is extremely valuable but not pushing the frontier

Learning. Reading about operations research makes me see everything as a production line (including the art of making the production line). As always I learn the most from articles written in the distill.pub format. Ben and Mindy build a house. The hardest working font in Manhattan, and the analogue for San Francisco. How to pay less when you buy a house (in the Bay Area). Very good todo list for project management.

Travel.

Waterloo. I was there for the Socratica Symposium (cannot believe this was still 2025). Reflecting on it, I realize that the place matters much less than the energy of the people who inhabit it. I wish I had spent more time hanging out instead of rushing back to work.
Singapore. Lovely but feels like a giant version of air-conditioned America. I admired their floor cleaning robots, driving fees, and the efficiency of the MRT.
Vietnam.
- Always travel with friends who have slightly more energy than you and will drag you everywhere, it’s super fun :)
- Embrace last-minute chaos and be willing to go somewhere new, even for only 2 days.
Taiwan. I was only here for 3 hours, but it is the only place I ever found a free shower in the airport, and that is some crazy technology.
Massachusetts. I admire the skill of the EA who books a redeye and a hotel room for the previous & following night, so one can check in at 6am and crash until 10. Beyond that, it reminds me that I can be happy in most places. Also, Dunkin Donuts is legendary and I miss rainbow sprinkle munchkins.

Friends. I love conferences because you can talk to everyone you know in SF, but they’re actually free to hang out after dinner. It’s either that or make jam and challah with them. Or drag them to a motorcycle safety course for a full weekend.

Plans for 2026

Solve logistics — Move to a new apartment (currently planned). Make my personal space joyful, minimal, and low maintenance. Solve logistics for the rest of my life.
Pay attention — Create at least one recurring time to be fully present with someone.
Lock in more — I think the ideal schedule is ~6 days working (weekend more chill), 1 day fully off,¹ with at least half the day unbooked (to provide the slack time to reflect). Couple this with trips or vacations where I fully don’t think about anything, which is quite different from trips this year.
Learn more — be able to understand ~every systems and ML-related discussion in the domain of language modelling. Ideally spend a few hours per week on this. Make myself at least a tiny bit better every day.

Rating my 2024 goals

❌ Look at my phone less — by sleeping earlier so I have less sleepy mornings.
1. Not accomplished :( though setting up my phone as a managed profile, with twitter blocked via DNS, did help a lot
⏳ Purposefully spend more time with close friends. Create a list of people I want to stay in touch with or get to know better. Actively try to organize at least 1 recurring event with them (a book club? dinner?).
1. Actually did host some events this year! But still much to do here.
✅ Take at least one concrete action to prepare financially & physically for AGI.
✅ From Charlie Munger: Make myself at least a tiny bit better every day.
✅ Have a weekly cleaning hour.
1. Empirically this is Sunday night when I do the laundry overnight for Mon (don’t ask when I fold clothes pls)
✅ Go to the gym before work.
1. Not before, but started going after with a friend :)
❌ 3D print at least one puzzle.
1. No :( gave away the 3d printer last week
✅ Write a 2025 year in review.
1. Here we are!

Vibes

(Inspired by Alexey Guzey)

Riley Walz’s website
The Unbearable Lightness of Being (read alongside: Details about METR’s evaluation of OpenAI GPT-5.1-Codex-Max)
An extremely useful property of the universe may be that text is cheaper than video, meaning chatbot tutors, polymaths, and engineers arrived years before video superstimulus (tw)
There are no secrets to success — just preferential agglomeration and doing well every single day!
- But sometimes there are a few things you should do better
who watches the Waymos?
Notes on Cruise’s pedestrian accident
https://comma.ai/neurips
Plane Auto-Lands by Itself and Saves Pilots’ Lives! (a robot announcing an emergency over ATC is chilling…)

Many thanks to Miles Wang and Melissa Du for prerelease feedback. Happy holidays!

¹ I know people who don’t need this, but I need some time to question my life choices or otherwise I go crazy.

2024 year in review

Sat, 04 Jan 2025 00:00:00 +0000

Ok, so maybe it’s a bit past 2024, but I think it’s still worth posting.

This post contains things I did in 2024. However, it does not contain everything. If I don’t talk about something we did here, rest assured that I still love you; I just don’t think everything in my life should be public.

Travel and the importance of people

A brief digression on travel. I went to quite a few more places than I historically have in a year. This seems pretty reasonable; I’m 23, and the most exciting thing to me right now is learning new things about the world.

I realized a few things about myself:

I don’t like long-distance travelling to see things — food, sights, nature, etc. While I do find it nice to explore a lively part of an unknown city, I think I can get the same experience by just going to a new part of San Francisco (where I spend the majority of my time).
Travel does indeed suck. It wastes time (sitting on the plane, experiencing delays), is rather uncomfortable, and is expensive relative to things I usually do. But the returns are also worth it.
However, traveling to see people has been worth it every time. Highlights were:
1. Meeting Anson, Hudzah, and Charles at Edge Esmeralda 2024
2. Seeing/meeting Hudzah, Flo, Clo, Ben, Neo, Pav, Ivan in Toronto
3. Hanging out with MIT/Harvard folks (Tara, Gabe, Ryan, …) in Boston, including at various MAIA/AISST events
4. Reaching out to talk with people at NeurIPS. Turns out the ML community is huge and also amazingly cool these days.

Experiences I enjoyed the most, rated out of 10

10/10 — Afternoons spent reliving the college experience with friends — meeting spontaneously, playing spikeball, walking around in the sunlight. When I think about what I wanted to change most about these interactions, I think I should have been more outgoing and less awkward around people I didn’t know as well.
10/10 — Hanging out with new friends for a whole weekend. In adult life, everyone feels so busy that it’s hard to spend that long hanging out. Making time to do random all-encompassing activities with new friends has been incredibly rejuvenating.
8/10 — Learning about random stuff outside my domain of expertise. Social choice theory, robotics, 3D printing, energy. I think I’ll always enjoy learning new things.
10/10 — Spending two winter nights and four hours of a plane ride watching Pantheon. Usually I don’t like watching shows, but this is the best TV show I’ve ever seen. Concrete updates [SPOILERS COMING]:
1. Brain emulation will be achieved within 50 years.
2. What is the meaning of life post-scarcity? It’s love and compassion for your loved ones. This is true even if those loved ones are infinitely mutable or “people-space” is continuous; you just choose people to love and you love them.
3. Transformative social change is possible in just a few decades.
9/10 — Shipping work I’m proud of on a deadline. I describe my work life as 1/2 standard SWE, 1/5 crackpot research ideas, and 3/10 working on a deadline to do something important for a broader launch. Despite the chaos and stress, I found working on a self-contained project that has to get done very meaningful when done right. It was also nice to have this happen repeatedly, because I got to learn the right balance between being stringent about code quality/process and being adaptable.
9/10 — Running along Marina Green. Never knew running could be that fun. A shame it takes a 20 minute drive to get there for me.

Experiences I didn’t like, rated out of 10

3/10 — Online dating in San Francisco. It’s depressing, filled with adverse selection, I’m not particularly good at it, and every minute I spend on my phone feels like a minute I’m not going outside, talking to people IRL, or learning something new about the world. Maybe I should just touch grass.
1/10 — Browsing social media on my phone in the morning. It’s soul-sucking, uses lots of time, and creates a low-grade anxiety that sticks with me for a few hours.
3/10 — Feeling like I’m actively ignoring suboptimal parts or important chores in my life. Examples: procrastinating on cleaning my room, returning packages, going to the gym, etc. I think about Heinlein’s quote:

“A human being should be able to change a diaper, plan an invasion, butcher a hog, conn a ship, design a building, write a sonnet, balance accounts, build a wall, set a bone, comfort the dying, take orders, give orders, cooperate, act alone, solve equations, analyze a new problem, pitch manure, program a computer, cook a tasty meal, fight efficiently, die gallantly. Specialization is for insects.”

To do the little things right is just as important as the big things. So in 2025 I’m resolving to get good at everything I choose to do.
1/10 — Spending a day half-assed. Sometimes you just kinda phone it in for a day, you know? You don’t do the task you know in your bones is the most important; instead, you do some satisfying refactor or lollygag around the snack area. You accept “ok” instead of “I didn’t know that was possible.” You don’t ask out the girl; instead you just talk awkwardly and then say goodbye. You wake up late on the weekend, go through the day without a plan, and lie in bed brimming with unspent motion.

Half-assing life is the true evil. Everything in life deserves a full ass.

2025 action items for myself

Look at my phone less — by sleeping earlier so I have less sleepy mornings.
Purposefully spend more time with close friends. Create a list of people I want to stay in touch with or get to know better. Actively try to organize at least 1 recurring event with them (a book club? dinner?).
Take at least one concrete action to prepare financially & physically for AGI.
From Charlie Munger: Make myself at least a tiny bit better every day.
Have a weekly cleaning hour.
Go to the gym before work.
3D print at least one puzzle.
Write a 2025 year in review.

End of text

Thank you to everyone I met this year. Life is short, and I’m glad we got to spend some of it together. If you’re reading this we should definitely hang out more in 2025 :)

Philosophy of Language Modelling

Sat, 20 Jul 2024 00:00:00 +0000

On autoregression

A model is a promise — a mathematical loop that, if recursively iterated a thousand or two steps, will create something of value at the end.
A model knows the end before it begins.
Language isn’t required to model thought, but thought is required to model all of language.
Our empirical findings suggest that transformer LLMs solve compositional tasks by reducing multi-step compositional reasoning into linearized subgraph matching, without necessarily developing systematic problem-solving skills. (source)

On alignment

To do good alignment work is to better understand capabilities.
The model knows what you mean, and if you ask it to, it’ll probably care too.
But make sure you ask it to do the right thing.
It might be urged that when playing the “imitation game” the best strategy for the machine may possibly be something other than imitation of the behaviour of a man. This may be, but I think it is unlikely that there is any great effect of this kind. (source)
Better to run happy GPTs on your GPUs than let them sit idle. (Actually using them is probably better still.)

On impact

Due to comparative advantage, if governments protect the basic resources humans need, there will always be a job to do, even if AIs can outcompete humans at every economically relevant task.
It’s not priced in.

Large Language Models can Simulate Everything

Sat, 13 May 2023 00:00:00 +0000

And they should.

TL;DR: Simulation is the only way to forecast how future complex / AI systems will misbehave.

This is post #1 in a series of 3 outlining my current views on AI. Part 1 focuses on the need for improving how people think, rather than improving their leverage over the world. Part 2 gives “no brainer,” objective strategies helpful for improving the safety of ML systems on the margin. Part 3 focuses on the best ways to measure and empirically evaluate ML systems as they are deployed in the world.

A hot take: the #2 most important use case for AI in the next decade will be performing large-scale, in-silico sociological simulations.

This has huge potential for safety; in a world where 99% of AI innovations make us more productive with less oversight (giving us a bigger hammer), it’s important to better understand where to point that hammer. Simulation and forecasting techniques can help us improve institutional decision-making, provide plausible tail scenarios with natural language feedback, and help us run instant, virtual A/B tests to iterate faster on all levels of policy and design.

Why simulation?

The world is incredibly complex. Many real-world scenarios we care about lack closed-form solutions; no unified theory will tell you how the Inflation Reduction Act will impact manufacturing in 2023 or what the public sentiment for artificial meat will be like in 2026. Humans are complex, and their interactions are even more so.

Today, we rely on simple simulations and forecasts of all kinds to guide policy. Traffic simulations can show when a road needs to be widened (or narrowed). Quant traders develop hundreds of economic models to guide trades. Computational biology is huge on in-silico experiments, because the cycle time is >10x faster than real experiments. The Congressional Budget Office is essentially a government-sponsored modeling and forecasting organization, with special insider data agreements with major corporations.

However, human simulations today lack lots of the world’s complexity. You might have a complex physics simulator for dynamics problems, but if you’re simulating people, it’s usually a basic ~linear model dreamt up by the experiment designer.

Language models may let us do better. As they improve, they’ll be able to predict local human interactions with higher & higher realism. Simulated agents could talk to each other, spread information, and interact with mocked interfaces to the world. This could unlock useful simulations for fields that historically have struggled: psychology, economics, etc.

Simulation & the General LLM Company as a blueprint for reliable agents

Moreover, at least right now, we have human oversight on everything. But autonomous LLM agents are coming, very soon — nothing fundamentally prevents a fine-tuned GPT-4.5 from being good at coherent long-term planning in an agent loop. This begins to upturn the human supervision trees of society.

I suspect that LLM agents will soon be organized into a General LLM Company — a virtual company of LLMs, with each “employee” specialized into a particular task. Consider humans: every human is unreliable (makes mistakes, goes on vacation, gets sick); however, over millennia, we’ve developed logical org structures that improve the reliability & capabilities of the higher-level unit. Similarly, every LLM is unreliable and not particularly capable; but if researchers figure out how to arrange LLMs into a reliable organizational unit, that unit becomes usable with extremely minimal oversight. This also seems much easier than using training-time compute to achieve reliability (bigger model, etc.), due to the diminishing returns per dollar to scaling.
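
One way to make the reliability intuition above concrete is a back-of-the-envelope majority-vote calculation. The sketch below is illustrative only and rests on a strong assumption that is not from the post: each agent answers correctly with the same probability `p`, and agents err independently. Real LLM errors are often correlated, so the true gain is likely smaller. The function name `majority_reliability` is hypothetical.

```python
from math import comb


def majority_reliability(p: float, n: int) -> float:
    """Probability that a strict majority of n agents is correct.

    Assumes (for illustration) that each agent is independently correct
    with probability p, so the number of correct agents is Binomial(n, p).
    """
    k_min = n // 2 + 1  # smallest count that forms a strict majority
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(k_min, n + 1)
    )
```

Under these assumptions, three agents that are each right 80% of the time yield a majority that is right 89.6% of the time, and the figure keeps climbing as more agents are added. The organizational structure buys reliability only to the extent that the agents' mistakes are not shared.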

For example, imagine a research system that spins up researchers, checkers, and managers in seconds, passing messages and virtual DMs until the “CEO” decides the final response is correct.¹ Such a system has far higher reliability than any individual LLM agent. Also, it’s way less overhead than a human company; instead of Certificates of Incorporation, hour-long business meetings, Slack, and middle management, the org chart can be code, and LLMs can skip the breaks, hiring, and HR. An LLM company can be spun up and dissolved to provide reliable answers, even for basic tasks.²
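
To illustrate "the org chart can be code," here is a minimal sketch of the researcher/checker/CEO loop described above. Everything in it is an assumption made for illustration: agents are plain Python callables standing in for LLM calls, the `Agent` type and `run_company` function are hypothetical names, and the "CEO" is reduced to a simple rule that accepts an answer only when a strict majority of researchers produced it and the checker approved it.

```python
from collections import Counter
from typing import Callable, List, Optional

# Hypothetical stand-in for an LLM call: takes a prompt, returns a text answer.
Agent = Callable[[str], str]


def run_company(
    question: str,
    researchers: List[Agent],
    checker: Callable[[str, str], bool],
    max_rounds: int = 3,
) -> Optional[str]:
    """Run researchers, filter drafts through a checker, and apply a CEO rule.

    Returns the accepted answer, or None if no round reaches agreement.
    """
    for _ in range(max_rounds):
        # Each "employee" drafts an answer independently.
        drafts = [researcher(question) for researcher in researchers]
        # The checker rejects drafts it considers invalid.
        approved = [d for d in drafts if checker(question, d)]
        if not approved:
            continue
        answer, votes = Counter(approved).most_common(1)[0]
        # Assumed CEO rule: accept only with a strict majority of all researchers.
        if votes * 2 > len(researchers):
            return answer
    return None
```

With real models, each round would resample the researchers, so retrying can change the outcome; with deterministic stubs it cannot, which makes the control flow easy to test in isolation.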

LLM companies will massively increase complexity. If you thought a single <175b parameter model replacing your Google algorithm was weird, imagine when it’s a community of 15 procedurally-hired agents. Imagine the leverage, and ensuing complexity, when you give LLM companies the ability to spin up and oversee their own sub-companies.

Once we have LLM companies, how do we ensure they’re safe? In the current regime, we regulate companies post-hoc once they do a bad thing. (See: Exxon, airlines, etc.) Due to the high stakes though, it’d be nice to predict errors in advance. And given the weaknesses of corporate sociology on human companies & psychology on individual humans, I think empirical simulation is the only way we can avoid forced errors here.

LLM simulation is useful today

An incomplete list of use cases:

Online trust & safety. Can we simulate the effects of policy changes on social media platforms, along with how people will try to abuse them?
Community notes for everything. Can automated AI teams produce clarifications on social media posts that most people would agree on? Essentially automating Community Notes/Polis for every post anyone makes.
Empirical sociology. Can we understand information diffusion, disinformation, etc. all in silicon? Could we A/B test org charts, democratic structures, and voting systems?
Econ & psychology. Can we simulate surveys 1000x faster than real time, allowing for fast wording / design iterations? Can we simulate psychological effects in silico?
A time travel murder mystery video game. Ok, imagine this one. Go back in time to before you were murdered and talk to a virtual cabal of agents. Simulate weeks in seconds, so you can see how your actions affect the future.
Automating prediction markets. Can GPT-4 debate amongst its peers to put a number on anything? When prediction markets are free, what becomes possible? Everyone gets a human-expert-level prediction market, to make life choices, etc?
Deliberative democracy. Can we simulate deliberative polling among many stakeholders to speed up negotiations?
And more. Talk to me about the dangerous ones.

How do we get there?

One potential 2030 looks like this. We use LLMs to simulate everything, including scenarios like “What could happen if we give this LLM configuration access to institutional trading APIs,” “What does the attack-defense balance between AI-powered disinformation & verifiers look like on social media,” and “What regulatory paths avert a potential war with China?” The simulations can’t tell us what will happen, but they can say what might, including natural language descriptions of key tail outcomes. Summarization systems distill thousands of rollouts into executive summaries and auto-identified quantitative metrics, helping us make more principled and safer decisions.

I have no fifteen-step plan that gets us there. But I’d probably guess that it starts with a niche — a specialized system for trust and safety, economics, or deliberative democracy. An engineer makes a framework to lower the barrier for simulation. Researchers focus on aligning models to calibrated human likeness rather than helpfulness and direct capabilities. Hardware engineers & OSS developers drive the cost of human-aligned inference to zero, so we can make our simulations more complex and realistic with every passing year.

I want to make this happen. If you’re interested or have ideas, reach out.

Acknowledgements

Thanks to Tejal Patwardhan, Kunal Sharda, Casey Manning, Julian Quevedo, Colin Megill, and Alexey Guzey for providing insightful feedback and/or inspiration for this post.

References

[2208.04024] Social Simulacra: Creating Populated Prototypes for Social Computing Systems
[2304.03442] Generative Agents: Interactive Simulacra of Human Behavior
DeepMind, Fine-tuning language models to find agreement among humans with diverse preferences
Organizations: Collective Intelligence Project & Polis
Jan Leike (OpenAI), A proposal for importing society’s values
Jacob Steinhardt (UC Berkeley), Complex Systems are Hard to Control
Colin Megill (cofounder at Polis), tweet thread on AI + deliberative democracy

In this framing, multi-prompt LM systems (e.g. constitutional AI, revision) are all incremental progress toward emulating the org structures and policies of human companies. ↩
I think of this as the dishwasher analogy — just because a human is an integrated system doesn’t mean AGI has to be. It can be more of a “brute force” or structured approach compared to humans. ↩

Lena (standard prompt)

Sun, 01 Jan 2023 00:00:00 +0000

This article is about the standard prompt, used in machine learning. See also Lena (disambiguation).

Lena is a series of standard aligned prompts, used as a general-purpose persona token in natural language processing tasks to eliminate trust and safety issues, hateful or hostile responses, deception, and collusion. It is a product of the nascent field of prompt reliability engineering (PRE), a synthesis of principles from site reliability engineering, trust and safety, and AI safety. Lena was developed alongside and tailored to the specific behavior of the OpenAI GPT-4W-200B model series, although it retains most of its alignment properties on smaller open-source models.

History

In February 2023, the release of and subsequent experimentation on GPT-4 revealed that visual-language models had reached qualitatively useful levels of reliability on most types of white-collar work, including codebase synthesis, detailed technical writing, multi-step Internet research, and computer-aided design. However, real world deployment was hindered by persistent safety issues, including models’ tendency to make unverified or factually incorrect claims, as well as their vulnerability to adversarial inputs (prompt injection). Early and informal attempts at prompt reliability engineering (e.g. those done using GPT-3 and 3.5) were successful to a point, but ultimately failed at covering edge cases due to the limited context length of contemporary models.

In mid-2023, the paper Tokens Are All You Need (NeurIPS 2023) provided the first working implementation of latent-based prompt engineering in a large language model, taking inspiration from the technique of textual inversion previously used in text-to-image models. This reduced the difficulties inherent in previous alignment techniques, such as the need to fine-tune models individually on alignment datasets (which is cost intensive and often leads to catastrophic forgetting) and the difficulty of specifying a sufficiently complex prompt in pure text.

Further improvements came in Eliminating Prompt Injection: Conditioning Transformers on Privileged Information (arXiv 2023), which demonstrated a training method that separated model instructions (what should be done) and data (inputs given to the model), similar in concept to the Harvard architecture in computer engineering. Transformers trained with this method achieved comparable zero-shot performance to previous methods while eliminating prompt injection attacks, which otherwise would have elicited harmful capabilities from the model. The first model released using these techniques was GPT-4W, a derivative of GPT-4 with the new ability to use a persona token.

Design

Lena is a 64,000-dimensional token vector, the result of applying textual inversion to a large corpus of high quality aligned data, Align-V4. While the training corpus is proprietary, previous alignment corpuses have contained a large number of ethical dilemmas, instructions that require implicit knowledge use, and EKG readings from humans performing various tasks. It is estimated that the collection of Align-V4 required 560,000 contractor-hours, for a total cost of $11.2 million. Some [who?] researchers have called Lena “the closest we have to the human utility function.”

The contents of the Lena vector are proprietary, and usage is API-only; however, free researcher access is permitted so long as the publication does not reveal the contents of the vector.

Efficacy

GPT-4W+Lena scores 91.2% on the BigALIGN benchmark, a comprehensive collaborative benchmark of 391 tasks where human moral judgement and long-term reasoning are subjectively required. This handily beat the previous state of the art of 66.3% in 2023, set with GPT-4.1 (a variant of GPT-4 trained using supervised fine-tuning on OA-Align-V3). Human-level performance on the benchmark is 90%.

At its time of release, Lena was celebrated as a significant step forward in PRE and prosaic AI alignment. Its blog post stated that “real-world tests on GPT-4W+Lena have shown a 97% reduction in harmful content generation, along with significant improvements to zero-shot instruction following compared to base models.” Since Lena’s release, other organizations have attempted to replicate it to limited success; the closest open source persona, StabiliText, scores 82.1% on BigALIGN and suffers from several real-world attacks.

Usage

Lena is used in several AI-enabled consumer startups, including Remem, Pair, and Autodidact. Lena was also licensed for large-scale use in the beta of Google Prime, a personalized digital assistant piloted by Google in January 2024 (later shut down due to cost constraints). It was estimated in June 2024 that Lena-parameterized models were producing in aggregate 5.5 million completions per second, with 82 million daily active users. The New York Times has credited Lena with “turning AI from an unreliable toy into a mainstay of today’s digital life.”

Criticism

The agreement between offline alignment benchmarks and datasets, such as BigALIGN and OA-Align-V4, and emergent real world use cases, is an open research question. Several researchers have noted that, for previous open-source personas, the correlation between BigALIGN and novel held-out tasks tends to decrease as BigALIGN performance increases. Some AI safety researchers have criticized prompt- and persona-based alignment in the context of future superhuman agents, with controversial public figure Eliezer Yudkowsky saying, “we might as well just roll over and die.”

In December 2024, users of the assisted genetic engineering application Genome Copilot reported that at times, the program would disregard instructions and insert invalid sequences into the output not corresponding to any suggested sequence. Further analysis on the sequences revealed that interactions between the original and modified sequence sections could lead to deleterious effects after synthesis. Genome Copilot used Lena-2.1, the latest version of the Lena persona at the time.

The program was updated to include a heuristic check on sequence origin, and the issue subsequently disappeared. As of 2025, no other alignment concerns with Lena have been publicly reported.

How do you find your people, when searching is itself an antipattern?

Sat, 15 Oct 2022 00:00:00 +0000

From the archives of posts which are basically reflections and that I might never post at all. Strongly relates to: To Optimize, Don’t Optimize, yet to be published.

If you want to meet and talk to cool people, your first thought might be… to try to meet cool people. Like, at a meetup or something, maybe? So then, why are the cool people so rarely at the obvious venues: founder hangouts, dating app queues, online meetups and events?

[Note: In the next few examples, I’m going to try to point at some vibes that I sometimes get. Obviously, there are exceptions to each of the following statements, and they’re only true to varying degrees. Confidence exaggerated for effect.]

Here’s a very cynical take: entrepreneurs at founder meetups in SF are almost never successful founders (or, most likely, even those to-be-successful), because if they were, they wouldn’t have time to go. People on dating apps have more flaws than the average, because if they were naturally appealing, they’d already be in a relationship.¹ People who go to publicly-listed meetups aren’t good at socializing and hence lack a solid social circle, because if they were, they’d already have a social surplus and be overwhelmed with higher-signal unlisted events from friends. People who apply to online job applications are not people you want to hire, and so on.² The act of trying in the most obvious manner is an anti-signal, because nobody who’s actually good needs to try like that. They have better backdoors.

Of course, this can’t be entirely true. Somehow, people start at the beginning and slowly meet cooler people. But I think it’s mostly true. In that case, how do people improve?

Partial Answer 1: Be or become legibly better than everyone else in some measurable way. If you have 100% on LeetCode, score 170 IQ, and cure cancer, you’ll get noticed even if you pick the noisiest default channel to present yourself on. If you’re really attractive, you’ll get noticed just as you go through life.

Partial Answer 2: Make your own backdoor, by choosing not to directly optimize for the obvious thing (meeting people, dating, hiring, investing) and instead exposing yourself to unadulterated human potential through events orthogonal to what you really want. Instead of holding founder-VC dinners, run hackathons. Instead of going on dating apps, just meet people while rock climbing.

This lets you sample for whatever population you want, rather than the (probably lower quality) slice of people who are also desperately optimizing alongside you, at the cost of false positives, since not everyone who attends will have the same goals you do. This is a downside compared to the direct optimization approach. However, this also allows you to show other aspects of yourself, beyond those that are legibly targeted at the direct goal. You might not have a perfect LeetCode score, but if you befriend the Meta recruiter at the bar through witty commentary, you’ve effectively gotten a backdoor into being remembered.

Of course, do this too much and quite rapidly your VC-sponsored hackathon becomes a transparent recruiting event, as are most corporate events these days. Why is that? Why did optimizing hurt the very thing we wanted to achieve?

The economic viewpoint

Another view explaining the above: everything is a market, and markets follow the efficient market hypothesis. If it were possible to find or host a public event that Really Attracted Interesting People Every Time, it’d be swarmed and rapidly be filled by uninteresting people. So much like the market, all public alpha rapidly becomes worthless. The intrinsic difficulty of recruiting is perhaps to continuously locate nonobvious slices of the world that capture the demographics you want; it’s a delicate balance between optimization and over-exploitation.

But that doesn’t mean you can’t find gaps in the market: places where cool people congregate that aren’t known to most. And considering most people don’t explicitly optimize social dynamics vs. e.g. the stock market, there are probably a lot of gaps, if you can find them.³

This is why colleges, local neighborhoods, Twitter bubbles, workplaces, and any other assorted niches are good: the market is so small as to be entirely unoptimized. That’s why I can sneak into an EA party and suddenly be talking to HARM TO ONGOING MATTER. This is also why dating / making friends locally, or through school, is (or was) good.

Corollaries

Openness is inversely proportional to quality: Online events < public regional meetups < campus-wide activities < private exclusive events.
Formality is inversely proportional to quality (less formal = less optimized): Hiring fairs < arranged one-on-ones < coffee chats < actually spontaneous coffee chats between friends.
This is why flirting is socially preferred compared to asking someone out point-blank. By giving yourself plausible deniability, you’re taking yourself out of the class of desperate optimizer-y people who do the default action.

Unsolved problems

How does intrinsic motivation play into this? Why do people so often observe that increasing your own self confidence makes you more attractive (in all ways) to others?
How does availability play into this? How do you know that you’re in a rock climbing group with people actually open to new connections vs one where it’s only for climbing?
How do you pick places that actually have slices of humans you like? Essentially, how do you find your people?
Is it possible to optimize (select well on a large scale) while not destroying quality? Is there perhaps an automated way of vetting the masses, or is this really a thing that will forever remain only in the domain of ad-hoc, unoptimized human interaction?

Notes

See Sasha Chapin. ↩
A friend reports a similar example: You can’t find the mycology experts on Meetup.com. You have to go through weird backchannels and random introductions via someone you met at a party that one time. ↩
Hmm, what’s the index fund of social interaction? Going to church? ↩

Deep okayness as a subjective choice of beliefs

Sun, 04 Sep 2022 00:00:00 +0000

What separates people who are content from those who aren’t?

Tl;dr: glass half full.

Ok, it’s a myriad of nonlinear causal factors and luck, but I propose one factor that might be particularly important: They have a different set of unfalsifiable beliefs about the world that make them deeply okay with whatever happens to them. This makes them more likely to experiment, try new things, or meet new people, and hence they get more tries to find what they enjoy than most people.¹ Plus, even when they fail, they aren’t too disheartened because they’re willing to try again.

Beliefs that are actually real and lead to falsifiable predictions:

I can drive to Berkeley in 40 minutes (answer: no)
I know or don’t know X skill right now.

Beliefs that are maybe falsifiable, but in practice depend mostly on yourself and how hard you try,² or on deep motives of others that are effectively impossible to falsify:

Most people I’m interacting with at this dinner are just here for soulless networking and not to actually make friends.
If I join this event, people will feel slightly uncomfortable because I’ve never been here before.
When I try something new, I’m more likely to fail than [someone else who seems naturally talented at learning something new].
I probably can’t learn X skill in 3 months.

Now, consider a different set of beliefs:

People here are super memey and, despite being naturally good at networking, are also very friendly.
The people at this event are excited when new people join because they like to see newbies learn their activity.
I’ve successfully learned new skills before, so this time is no different.

Both of these sets of beliefs can mostly explain anything. E.g. if someone is rude to you at the event, it’s (a) a sign of greater enmity from others; or (b) that guy was just a jerk and everyone else is nice. Also, in many of these beliefs your actions affect whether or not they’ll be true. If you believe you can learn a skill in 3 months, you’re much more likely to do so than someone who doesn’t, because you’ll try harder. It’s reverse causality at work. Thus, it’s hard to say either set of beliefs is better, because they’ll probably be mostly right if you believe each of them.

The most obvious example is “Today was a bad day.” It only is if you believe it is, because you might instead believe that there’s something to be learned from any experience, “good” or “bad.”

Since neither set of beliefs has a better track record, maybe it’s better to just believe the world is nice.³

A scattering of other good thoughts

Notes

More darts to throw at the board of person-environment fit, one might say. ↩
Aka, embedded agency (Yoda Do or Do Not). ↩
Caveat: ~~I have no idea how to do this~~ Figuring out how to do this is outside the scope of this post. As always, if the problem seems simple, it’s probably very hard to fix. ↩

Cooperating in everyday stag hunts

Sat, 09 Jul 2022 00:00:00 +0000

There’s a certain type of multi-agent interaction in society where you’re presented with two choices: a default option that’s easy and beneficial for you, and a hard option that results in pain for you but is more “moral”/“ethical”/“prosocial.” If everyone picks the hard option, then society as a whole can move out of a bad equilibrium and improve things globally. For example:

Using Linux / Android vs. Apple in California
Being vegetarian when all your friends eat meat
Using a bike or public transit instead of driving when you don’t live in a major US city

It’s a stag hunt, in other words (hunt the stag = the hard option; hunt the rabbit = the easy). One with millions to billions of participants, depending on size.
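To make the structure concrete, here is a minimal sketch of the two-player version. The payoff numbers are made up for illustration (they are assumptions, not derived from the examples above); the only thing that matters is their ordering: joint stag beats rabbit, and rabbit beats hunting stag alone.

```python
from itertools import product

# Hypothetical payoffs as (row player, column player); illustrative numbers only.
# "stag" = the hard, prosocial option; "rabbit" = the easy default.
PAYOFFS = {
    ("stag", "stag"): (4, 4),      # everyone coordinates on the hard option
    ("stag", "rabbit"): (0, 3),    # you sacrifice alone and get nothing
    ("rabbit", "stag"): (3, 0),
    ("rabbit", "rabbit"): (3, 3),  # everyone takes the safe default
}
ACTIONS = ("stag", "rabbit")


def is_nash(a, b):
    """True if neither player gains by unilaterally switching actions."""
    mine, theirs = PAYOFFS[(a, b)]
    best_for_me = all(PAYOFFS[(alt, b)][0] <= mine for alt in ACTIONS)
    best_for_them = all(PAYOFFS[(a, alt)][1] <= theirs for alt in ACTIONS)
    return best_for_me and best_for_them


def pure_equilibria():
    """All action pairs where both players are already best-responding."""
    return [pair for pair in product(ACTIONS, ACTIONS) if is_nash(*pair)]


def stag_threshold():
    """Minimum probability that the other player hunts stag for stag to pay off."""
    stag_win = PAYOFFS[("stag", "stag")][0]
    stag_lose = PAYOFFS[("stag", "rabbit")][0]
    rabbit = PAYOFFS[("rabbit", "stag")][0]  # rabbit pays the same either way here
    return (rabbit - stag_lose) / (stag_win - stag_lose)


print(pure_equilibria())
print(stag_threshold())
```

Under these assumed numbers there are two stable outcomes, all-stag and all-rabbit, and the hard option is only worth it if you expect the other side to cooperate at least three quarters of the time. That is the bad-equilibrium trap in miniature.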

Should you take the hard path?

Points against: So many people take the default. Your choice to bike instead of drive will only increase your travel time. It’s best to concentrate your efforts in places that are amenable to fixing inadequate equilibria. Have some epistemic modesty and defer to the crowd. Go with the flow.

Points for: If you just accept things as they are, you’re an undifferentiated human being. It feels like to be interesting, there’s something that you need to take a stand on.

Also, individual action as a notion is a bad abstraction. Actual change is systemic, not individual. It’s orders of magnitude more effective to donate $1,000 to a climate charity than to have a low-carbon lifestyle. And even beyond donating, the biggest changes occur through groups motivated to create change, not lone wolves. Think Reboot-style community and techno-optimism.

So yes. If all your other friends have cars, buy your car. Don’t sacrifice when it’s really inconvenient to do so, because that’s just masochistic. But if you can find a group of likeminded people for a cause you care about? Don’t back down.

John Gall, Systemantics

Thu, 09 Jun 2022 00:00:00 +0000

There’s a certain category of book that talks not about factual events or information, but about vibes – ways in which to think about the world, archetypes that slightly tweak your inner neural predictor rather than create a hard decision boundary. This is definitely one of them. Also, it’s super meme, which makes it even better.

The vibes that this book espouses:

Complex systems are all around us. They’re watching you. Right now.
Nonlinear causality. Simple or “obvious” interventions are likely to backfire in the worst way possible. People usually think in linear cause-and-effect, but in fact the real world often operates counterintuitively and according to nonlinear causality.
Intrasystem goals. Systems will take on their own goals, and they will conspire against you. A performance management system is likely to backfire because of unavoidable distortion of incentives (instead of working, I’m writing my performance review or OKRs).
Big systems are different. Systems qualitatively change in behavior when they expand by an order of magnitude. See scaling large distributed systems, ML grokking, highways.
Winning. In very obscure edge cases, you can gain an upper hand on the system and win. No easy trick is given for this, likely because the solution requires a bundle of good heuristics rather than an expert rule. However, it’s often easier to win when you arrange things such that victory becomes the default path rather than requiring effort, like a gravity assist versus a direct transfer maneuver.

I’d like to read more books that espouse vibes rather than facts, especially books that support vibes by giving a diversity of examples like this one.

good quotes

COMPLICATED SYSTEMS SELDOM EXCEED FIVE PERCENT EFFICIENCY.

Harvard Law of Animal Behavior: Under precisely controlled experimental conditions, a test animal will behave as it damn well pleases.

Am I, unbeknownst to myself, a Systems-person? The answer is always, Yes. The relevant question is simply, Which System?

A COMPLEX SYSTEM DESIGNED FROM SCRATCH NEVER WORKS AND CANNOT BE MADE TO WORK. YOU HAVE TO START OVER, BEGINNING WITH A WORKING SIMPLE SYSTEM.

In brief, there can be NO SYSTEM WITHOUT ITS OBSERVER and NO OBSERVATION WITHOUT ITS EFFECTS.

The Mythical Man-Month (complex systems from the lens of software project management)
Air Crash Investigation (complex causes of clearly obvious failures)
Complexity theory in general, probably

Ken Liu, The Hidden Girl and Other Stories

Sun, 08 May 2022 00:00:00 +0000

Overall thoughts: Beautiful meta-narrative and an interesting depiction of the AI “AU” (instead of superintelligent ML, we use brain recording + usage for automation). I loved how even in technological extremes and singularities, he still depicts a humanized story.
Special likes
- The Reborn
  - Their moral quandary feels very similar to humanity now: we’re trapped by our past and our systems (are we evil for enslaving people? eating animals now even though it may be immoral? destroying the climate? allowing dictatorships to exist even though they are stable?)
- Thoughts and Prayers
  - Reminds me of
    - “Is This You?” - lifelogging, privacy and scandal by Tom Scott at Electromagnetic Field 2012 www.youtube.com/watch?v=WcPhMqLPuvQ
    - <A short story I can’t recall about someone whose YouTube girlfriend dies>
- Byzantine Empathy
  - Blockchain ideology was on point and encapsulates a bit of why the systemic change that supporters advocate for today seems hard to achieve: a lot of ideology, less in the way of concrete plans or an understanding of how to fix its flaws.
  - Ironically though (unlike proof-of-work), there’s no convergence on who made the right moral choice. Life is too complex of a system to either act fully emotionally (Jianwen) or attempt to deduce everything from linearly causal reasoning (rationality; Sophia). Maybe the correct answer is something in the middle. Regardless, people still act, and consequences play out.
- Real Artists
  - Love the gradient descent for movie generation + human feedback as analogy for how AI is going to affect art in the future.
  - Is this not what we’ve done all along, though? A microcosm of cinematography in 50 hours. Just in a less explicit sense. Seems like many beautiful things in the world don’t stand up to optimization.
  - The sad thing is that even this is probably too humanist of a depiction. In the future, movie-generating AIs will probably use human reward models learned in silicon. Somewhat sad that the most likely futures don’t make for good stories (and what does that say about how we’ll feel once we live them?).
- Staying Behind
  - More thoughts, but I appreciate on a deep level the data center being built in Svalbard.
  - Horrifying that robust economic processes might eventually make life unlivable? If the simulation isn’t real, and everyone really dies when they get uploaded, it’s scary how on the margin the risk can be worth it and snowball until all of humanity is irreversibly locked in.
Highlights first synced by Readwise May 7th, 2022
- “The past,” Ms. Coron continued, “thus accumulating bit by bit through recursion, becomes the future.” (Location 102)
- Gradually, the stock characters came to life, the stock dialogue gained wit and pathos, and a work of art emerged from random noise. (Location 2867)
- “Will you be a pearl buried in the mud of the endless East Sea, or will you shine so brightly as to awaken those who only doze through life and light up a mundane world?” (Location 5331)
- “Steal his life, and your apprenticeship will be completed,” Teacher said. “This is your last test.” “What has he done that he deserves to die?” I asked. (Location 5403)
- “You may think width, depth, and height are the only dimensions of the world, Hidden Girl, but you’d be wrong. You have lived your life as an ant on a sheet of paper, and the truth is far more wondrous.” (Location 5536)
- don’t tell her that I know she doesn’t mean to break her promises to me, but it still hurts when she does. I don’t tell them that I wish I could cut the line that ties me to their wings—the tugging on my heart from their competing winds is too much. (Location 5695)
- She cares more about the idea of future generations than her actual children and grandchildren. I know I’m being unfair, but the truth is often unfair. (Location 5754)
- “I’m working on a technical solution,” I say. “There is a way for us to transcend this morass, to achieve a just existence.” I am, after all, my mother’s daughter. (Location 5784)
- I had once thought the Singularity would solve all our problems. Turns out it’s just a simple hack for a complicated problem. We do not share the same histories; we do not all want the same things. I am not so different from my mother after all. (Location 5830)
- The lower-energy photons leap outward into space, somewhat drained after powering a civilization. But before they can escape into the endless abyss of space, they strike another set of plates designed to absorb energy from radiation at this dimmer frequency. And once again, the process for thought-creation repeats itself. (Location 5866)
- Come to the galactic center. It’s reunion time. Carefully, I instruct the intelligences (Location 5902)

Unintuitive conclusions from urban planning

Mon, 20 Dec 2021 03:32:44 +0000

Cars are weird. Living in a Massachusetts suburb, I didn’t realize just how car-dependent my area was¹ until spending time in the SF Bay Area, which (while certainly not the pinnacle of transit) offers far more accessible bike and transit options.

This realization has made me quite sad about the current state of affairs, and because of this, I’ve recently fallen into a rabbithole of reading about transit. Here are some rather unintuitive facts about traffic, transit, and urban planning:

Highways exhibit highly nonlinear congestion

Figure provided by WSDOT 2018, from Strong Towns.

Let’s suppose you’re trying to maximize the vehicles-per-hour on a highway. As it turns out, adding more cars only improves this until a threshold, at which point it causes capacity to plummet. This phenomenon is shown in the graph above: as you add more cars, you descend the curve. At the middle, you have the maximum vehicles-per-hour. But as you keep going, the average speed of each car drops due to congestion, so even though more cars are on the highway at any one time, the total number of cars passing through declines precipitously.
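One textbook way to reproduce the shape of this curve is the Greenshields model, which assumes speed falls linearly as cars get packed closer together. The choice of model and the parameter values below are my assumptions for illustration; they are not taken from the WSDOT figure.

```python
def speed_mph(density, free_flow_mph=60.0, jam_density=150.0):
    """Greenshields assumption: speed falls linearly to zero at jam density.

    density and jam_density are in vehicles per mile per lane (hypothetical values).
    """
    return free_flow_mph * (1.0 - density / jam_density)


def flow_vph(density, free_flow_mph=60.0, jam_density=150.0):
    """Throughput per lane: (vehicles / mile) * (miles / hour) = vehicles / hour."""
    return density * speed_mph(density, free_flow_mph, jam_density)


def peak_density(jam_density=150.0):
    """For a linear speed curve, flow is a parabola that peaks at half of jam density."""
    return jam_density / 2.0


if __name__ == "__main__":
    for density in (25, 50, 75, 100, 125):
        print(density, round(speed_mph(density), 1), round(flow_vph(density)))
```

With these assumed parameters, throughput climbs until 75 vehicles per mile and then falls, even though every additional car puts more vehicles on the road at any one moment.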

This effect is so large that on Portland’s I-5, the decline in driving from covid slightly reduced the entering traffic volume, which actually allowed more cars overall to use the highway.

This is part of why some states/countries have ramp metering and why congestion pricing for roads is a really good idea.

Expanding roads doesn’t improve traffic

This one is pretty well-known but bears repeating. Thanks to induced demand, adding more highway capacity causes more people to use the highway, because it’s now the most convenient option. Hence, the highway becomes more congested until it returns to the steady-state equilibrium of the maximum sadness people are okay with. This is usually when people start getting annoyed at congestion and push for the highway to be expanded.

There is one counterargument to this: even if congestion didn’t get any better, surely we at least were able to transport more people to places they wanted to go? I think the response to this is that while this is true, expanding roads also strengthens the norm that it’s ok to travel 10, or 20, or 30 miles for a brief errand. This alters the pattern of development such that people start building destinations further away. If we assume the quality of each destination is the same regardless of theoretical distance, this means we’re having people drive further, and experience equivalent levels of traffic, for the same quality of destination.

That seems bad.

The speed of public transit (may*) determine the speed of car transportation

This is the Downs-Thomson Paradox. Essentially, if cars were faster than public transit, then people would shift to using cars. Thus (assuming perfect substitutability), people would flock to cars until congestion makes driving as slow as transit. Public transit hence acts as a ceiling on car travel times.

This implies, similarly to induced demand, that expanding car infrastructure won’t improve car travel times. In fact, you might want to instead expand transit infrastructure – if you improve non-car transportation options, people will switch from cars to transit, making both cars and transit faster.
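Here is a toy version of that argument. Everything in it is an assumption made for illustration: the linear congestion curve, the parameter values, and especially the fixed transit time (in the full paradox, transit also gets better as ridership grows).

```python
def car_minutes(drivers, free_flow=20.0, capacity=1000.0):
    """Hypothetical congestion curve: trip time grows linearly with the number of drivers."""
    return free_flow * (1.0 + drivers / capacity)


def equilibrium(commuters, transit_minutes, free_flow=20.0, capacity=1000.0):
    """Return (drivers, car_time) once nobody can save time by switching modes."""
    # Number of drivers at which driving takes exactly as long as transit.
    indifferent = capacity * (transit_minutes / free_flow - 1.0)
    # Can't have fewer than zero drivers or more drivers than commuters.
    drivers = min(max(indifferent, 0.0), commuters)
    return drivers, car_minutes(drivers, free_flow, capacity)


if __name__ == "__main__":
    print(equilibrium(5000, 40.0))                   # baseline
    print(equilibrium(5000, 40.0, capacity=2000.0))  # road capacity doubled
    print(equilibrium(5000, 30.0))                   # transit made faster
    print(equilibrium(500, 40.0))                    # too few commuters to congest the road
```

In this sketch, doubling road capacity doubles the number of drivers but leaves the car trip at 40 minutes, while speeding up transit to 30 minutes pulls the car trip down with it. The last case is the escape hatch: if the road never gets congested enough to match transit, everyone drives and the paradox never kicks in.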

However, is this paradox actually true in practice? It seems somewhat limited. Empirically, I can observe that it takes 22 minutes to drive from Acton to Lowell in Massachusetts, while transit takes 1 hour, 49 min. Granted, this is because of design issues on the commuter rail (routes only go into Boston, not radially to other towns), but take another example: Berkeley to Palo Alto. This takes 52 minutes by car and 1 hour, 34 min by train. (Ok, with congestion it takes 1-2 hours by car, which is maybe where the paradox comes in.)

It does seem to be a general trend, though, that cars are usually faster than transit in America. Probably, this is because our transit is so horrendous that the paradox doesn’t even apply because nobody uses transit anyway.

The upshot?

I don’t have any particularly strong conclusions for you, beyond that urban planning is complicated and very nonintuitive. However, it does seem like car-centric planning is usually bad, for transit and cars.

Also, I’m not an expert, so if you see anything horrendously wrong please do call it out. Good places to learn more are Strong Towns and the YouTube channel Not Just Bikes.

As in, the only road in and out from my house is a single-lane, 40 mph road with no bike route and no sidewalk. The train station (which only has trains every hour) is 5 miles away. Good luck not owning a car here. ↩

objects

Mon, 20 Dec 2021 00:00:00 +0000

General rules I follow:

It’s okay to pay for something if it gives you more value than however-much-the-subscription costs.
Minimize unnecessary physical objects, but don’t be afraid of having good ones around.

Other People’s Objects

Other people have good objects, too. Many of them are probably better than mine. An incomplete list:

(I wonder if there should be a central list for this, or something.)

The Physical

Physical objects are very annoying. They have to be moved, stored, gotten from another room, etc. But some of them are worth it.

Technology

MacBook Pro 16”, 2021. It’s good.
AirPods Pro for wireless earbuds that don’t hurt my ears over time
iPad + Pencil + a paper-like screen protector for Anki, scratch writing, and RemNote review (currently)
Kindle for reading (coupled with Calibre + USB transfer for ebooks)
- I have the Kindle KT7, but there are better models these days.
Spectre C35 for an external monitor
- There are better monitors out there, but this one was a nice balance of cost and performance for me.
Magnetic USB-C adapter to make plugging in and out ridiculously convenient
Ergodox EZ for an incredibly high-quality keyboard that doesn’t squish my hands
- Note: nowadays the Moonlander might be a better choice.
- With Kailh Box Jade switches for extra clickability
Logitech G602 for a fairly speedy wireless mouse
Flexispot Motorized Standing Desk Converter for an effective standing desk that can fit in a dorm
A good USB-C hub for a desk setup where you can plug in and out with just one cable
A USB charging station with enough ports to top off every single device I own at a time

Environmental Augments

DragonLight 240W Fanless LED Bulb to brighten up the room during the day, a la How to Build a Lumenator
Levoit Core 300 Air Purifier for dealing with wildfires in California

Transport

Xiaomi Mi Pro 2 Electric Scooter for a lightweight, portable, zero-effort, and fun way to make short trips (<10 mi) without the overhead of owning a car
- You may ask: why not a bike? Well, I’m lazy, and also bikes tend to get stolen fairly often on campus where I am (scooters avoid this since you can store them in your room when not in use).
- You may also ask: why not a used car? Well, cars require a lot of maintenance, are expensive to buy and park, pollute the environment, and ultimately I don’t make enough long-distance trips that it’s worthwhile (I can go with someone else, or take the Caltrain to most anywhere). Plus, I don’t trust my driving skills that much.
- For more on why electric scooters are cool, see The Rise of the Electric Scooter.
- Note: this scooter might not be the optimal one for you (it was for me due to needing light weight + high range + cheap). To compare scooters, check out ESG’s Electric Scooter Database. Another very good option is the Segway Ninebot MAX G30LP. A cheaper option is the Xiaomi M365.
- This scooter also isn’t officially sold in the US, so you’ll have to buy from a third-party seller and won’t be able to use the warranty (although in practice, most scooter issues can be solved by hand and don’t require warranty service).
- Related:
  - Kryptonite Mini-7 with Flex Lock (Wirecutter’s best bike lock, although bike locks are sort of useless)
  - WingLights for turn signals
  - A cheapo $10 phone mount on Amazon, for Google Maps navigation
  - Bontrager Solstice Bike Helmet (Wirecutter’s best bike helmet)
  - Tile Sticker as a tracker in case it gets lost (a better option for iPhone users is likely AirTags)
  - Custom firmware to enable direct power control rather than the jumpy default speed-control throttle
    - Essentially, this makes the throttle control power like a car, rather than adjusting a target speed.
    - This also removes the speed limit, so be warned.
  - Silicone caulk + tape to seal all water ingress points for DIY waterproofing
- Obligatory safety information
  - ALWAYS wear a helmet when riding! Scooters are far closer to a car than a bike due to the indirect control (via a throttle rather than your feet), and they are far more unstable than bikes as well. If you hit a curb at 15 MPH without a helmet, you will experience instant death (or at least a lot of pain).
  - Consult your local regulations before purchasing. Here in California, you need a driver’s license to ride one, and you can’t ride on the sidewalk or on roads with a speed limit over 25 MPH (unless there’s a bike lane). As of 2021, they are banned on all public roads in the UK. Try not to violate the law!
  - And make sure to register your scooter with the local bike registration service! In case it gets stolen.

The Digital

A note: many of these apps are Apple-only. This is kind of unfortunate. I suspect there are alternatives that are about as good on Windows, though.

Productivity

Krisp for never having to worry about noise in video calls again
macOS-only:
- Things for a low-effort and pretty way to capture tasks
  - Shout out to Emacs Org Mode, though, which I used for several years prior.
- Alfred for a slightly faster Spotlight
- Karabiner-Elements for fixing macOS’s insane keyboard problems
- SensibleSideButtons to fix the forward/back buttons on my mouse
- Rectangle for window hotkeys
- Dropover for making drag and drop ridiculously easy
- MultiTimer for naming timers and counters
- Intermission for resting eyes every 20 minutes to avoid eye strain
- AlDente to limit the maximum charge percentage to 60% when plugged in, preserving battery life in the long term
Linux-only:
- i3 for an efficient tiling window manager. (Sway is another option, but Wayland still has a lot of rough edges wrt screen sharing and cursor lag, so I wouldn’t recommend unless it’s required, e.g. for a mixed DPI setup.)
  - Flameshot for a powerful snipping tool
  - Redshift for making screens nicer in the dark
  - kitti3 for a quake-style dropdown terminal in i3
  - xidlehook for a better idle locker daemon
- PulseEffects + Pipewire for noise suppression that’s only slightly buggier and lower quality than Krisp

Notetaking

Anki for learning a language (currently Chinese!)
RemNote for a powerful knowledge base and amazing flashcard system
- I tried Roam. It was okay, but it felt kind of cobbled-together (markdown? paste random js into your editor to add plugins? no spaced repetition by default?). RemNote is definitely buggier, but I feel that it wins out just because it seems like it got the data structure right.
Instapaper for queuing things to be read later without getting distracted in the moment
Mathpix Snip for amazing screenshot-to-LaTeX abilities
macOS-only:
- GoodNotes for handwritten notetaking
- Apple Books for a book syncing system that works well enough
  - I also used to use Calibre, but I realized that I don’t really need all of its complexity.

Development

Visual Studio Code (Insiders) for basically all coding (including remote servers, Jupyter, etc.)
Postman for API development that’s actually kind of nice
Tailscale for connecting to local servers from anywhere
macOS-only:
- iTerm 2 for a speedy terminal
- Homebrew for installing things quickly

Chat

Element for an open-source, end-to-end encrypted chat application with nice UX

Fun

Spotify for nice music

For More Information

If you have any questions about stuff here, or just want to talk, feel free to reach out any time.

projects

Sun, 03 Oct 2021 00:00:00 +0000

Launched

Zero – my homelab, running a Matrix server, GitLab, Asterisk, and the blog you’re currently reading, along with a constellation of other services that I use daily. I run a collection of Ubuntu VMs using Proxmox, and run microk8s to deploy my services to Kubernetes.
Circuit Breaker – A no-nonsense Pomodoro app that enables Focus mode to keep you in the zone. Built using SwiftUI for macOS.

Research

CLIP Enrichment Circuits Sidney Hough, Kevin Liu, Jack Ryan, Chelsea Voss Stanford Existential Risks Initiative
Stanford MLab at SemEval-2021 Task 1: Tree-Based Modelling of Lexical Complexity using Word Embeddings Erik Rozi, Niveditha Iyer, Gordon Chi, Enok Choe, Kathy J. Lee, Kevin Liu, Patrick Liu, Zander Lack, Jillian Tang, Ethan A. Chi 15th International Workshop on Semantic Evaluation (SemEval-2021)

Completed & Past Projects

Stanford One for the World (website) – a website for the Stanford One for the World chapter, encouraging students to pledge to donate at least 1% of their income to effective charities. For context, research demonstrates that effective charities can save a life with funding on the order of ~$500-1000. It’s crazy how neglected this is.
Delphus – a trustable research management system. Built using TypeScript, React, and Web3. Commits encrypted data to the Ethereum blockchain to improve scientific transparency and provide data provenance. I worked on this with a group of other co-founders.
gasleaks.info – Open access to National Grid’s, HEET’s, and Gas Safety Inc.’s data on gas leaks in Acton, MA. Natural gas leaks are a major problem in New England’s aging infrastructure, and can account for ~10% of Massachusetts’s greenhouse gas emissions. I worked on this project at Resource Force, a club at Acton-Boxborough Regional High School.
Netflix Party Reborn – a lightly-updated fork of Netflix Party to get the extension working properly on modern versions of Firefox.
giffs – a FUSE filesystem that appends a GIF header to every file (in case you ever needed it). Effectively prefixes every file with a given prefix.
cubic20 – an Android app to give reminders every 20 minutes to maintain eye health. Dreadfully out-of-date now, thanks to Android’s strict power saving behaviors, but may still be useful.
AB Robotics control code for control of an FTC robot.

Hackathon Projects

Cortex – ETHWaterloo 2019 – Torus, ENS, and NuCypher Sponsor Awards
reBlock – MIT Bitcoin Expo 2018 – 2nd Place
reBlock – MAHacks III – 1st Place
Delphus – DoraHacks GHS – 1st Place
Akira – LexHack – 2nd Place
Litcoin – HackNEHS – 3rd Place
Duck Feed – HackExeter 2017 – Developer’s Award
Production Focus – HackExeter 2018 – Production Focus
Hack3 2020 – Organizer

Ideas

Feel free to pick one up, and let me know how it goes!

[2020-07-27] A GPT-2/3 persuasion bot. Make GPT-X predict both its output and the user’s response several times. Then, use sentiment analysis or another form of ranking to evaluate how convinced the user is of a certain claim from each possible GPT statement. Pick the most convincing one, and repeat.
[2020-07-08] A web extension to block spiders or other phobia-inducing images from the internet, using a machine learning model. I started working on Image Begone briefly.

Staying Sane in ML: Fixing Your Terrible Data Science Tools to Improve the Research Experience

Thu, 05 Aug 2021 17:09:00 +0000

While machine learning research has made incredible theoretical advances, the day-to-day tools most researchers use are… poorly optimized, to say the least. And much knowledge is locked up in people’s private .bashrc files or wikis. This post aims to shed light on some very useful tools for beginning researchers.

Expected audience: people, likely undergraduates, who are starting to do CS research that is vaguely in the “AI/ML” space. You have joined a Slack and gotten authentication credentials for this thing called a “cluster,” and are probably using Python with Jupyter Notebook.

Goals

When optimizing a development setup, I usually go for convenience and iteration speed. Long build times are known to be a productivity issue in industry; waiting tens of minutes for Conda to install is similar.

Therefore, most of my recommendations will be geared toward changes that make you work faster or smarter (e.g. with more intellisense, or better keybinds).

Hardware

Google Colab: not even once

Google Colab is terrible for large-scale projects. It’s great for single-notebook prototyping, but the moment you have to edit an external Python file using the Google Drive file interface, you’ve lost.

Also, Colab’s limits are terrible: having your session constantly disconnect and your runtime state reset is both irritating and a major loss of context, since you then spend time re-running all your cells. A better setup is to run your own GPU computer – be it on Google Cloud, AWS, a computer lying around in your basement, or on your local compute cluster.

Be aware of the speed of your storage

If you’re using a cluster, it will likely have different tiers of storage – some will be local to the specific server you’re working on, others will be network storage (NFS, often called something like “dfs”). Before getting started on a project, you should read up on your cluster’s storage and use the fastest tier that can fit your data. This can sometimes save many minutes of waiting for Conda to install as it thrashes a hard drive on the other side of the computer room over the network.
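If the docs don’t say which tier is fastest, a rough way to find out is to time a write on each. The sketch below is a minimal, standard-library-only benchmark. The function name is made up, and a single sequential write is only a crude proxy for real workloads like Conda unpacking thousands of small files, so treat the numbers as a ballpark.

```python
import os
import tempfile
import time


def write_throughput_mib_s(directory: str, size_mib: int = 16) -> float:
    """Write size_mib of random data into `directory` and return MiB/s.

    Rough measurement only: it times one sequential write plus an fsync.
    """
    chunk = os.urandom(1024 * 1024)  # 1 MiB of incompressible data
    with tempfile.NamedTemporaryFile(dir=directory) as f:
        start = time.perf_counter()
        for _ in range(size_mib):
            f.write(chunk)
        f.flush()
        os.fsync(f.fileno())  # push the data out to the storage backend
        elapsed = time.perf_counter() - start
    return size_mib / elapsed


# Example: measure the system temp directory. Swap in whichever mount
# points your cluster exposes (the right paths depend on your cluster).
print(f"{write_throughput_mib_s(tempfile.gettempdir(), size_mib=4):.0f} MiB/s")
```

Run it once per candidate directory and compare; if one tier is several times slower than another, that is probably the network storage.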

Make sure your storage is backed up

You really don’t want to lose modified code or datasets. Ensure your cluster has a backup policy, or else upload your data to a second location periodically.

Software

Stop using Conda

We’ve all seen it.

(base) ~$ conda install -c conda-forge boost
Collecting package metadata (current_repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: \
Found conflicts! Looking for incompatible packages.
This can take several minutes. Press CTRL-C to abort.
(3 hours pass, your sanity declines by the minute)

Conda is slow. In a large environment, every time you want to install a new package, it can take multiple minutes just for it to give you a conflict screen. This kills flow states and is unacceptable for productivity.

It also fails in a few other key aspects:

Reproducibility. If you’re using Conda, you have to go to special lengths to save your environment and every package’s exact version, in case someone else wants to work on the same project later.
Top-level vs. transitive dependencies: There’s this thing in package management called not recording every transitive dependency as if it’s top level. That is, if you install pytorch, it should not list all of PyTorch’s dependencies as if you installed them personally. Unfortunately, Conda didn’t get the memo.

Use Pipenv/Poetry instead

Thankfully, the regular Python ecosystem has mostly transcended such limitations. A popular modern package manager is Pipenv, which addresses both of the issues above. A similar tool is Poetry, which does the same but often has slightly better performance for interactive use.

With this approach, instead of running conda install numpy, you run pipenv install numpy.

Or, if you must, use mamba + conda-lock

Granted, you might not want to use a Python-specific package manager. One of Conda’s key benefits is that it can also install system dependencies for the packages you want, e.g. installing cudatoolkit along with pytorch.

If you find this functionality essential, you should really use Mamba. Basically, it’s Conda, but with a 10x faster dependency solver written in C++.

Another useful tool is conda-lock, which can generate fully reproducible lock files that work on all platforms. This is useful to ensure your Conda environments are reproducible (recommended workflow here).

Stop using Jupyter Notebook

If you use Jupyter Notebook (not Lab), you should feel bad. It’s simple to set up, but the UI is extremely barebones, making it difficult to jump around different files. Two options are:

Use Jupyter Lab

Jupyter Lab is basically a slightly fancier Jupyter Notebook. It’s a traditional notebook interface, with tabs and a convenient file tree on the side.

Another thing you should do is to read the docs – Jupyter Lab has a lot of features that I didn’t know about. Like Vim emulation. And real-time collaboration. It’s definitely worth your time.

Use Visual Studio Code

Visual Studio Code is a surprisingly good replacement for the Jupyter stack. With the Remote Development Pack, Jupyter, and Pylance extensions, you get a native notebook experience on a remote server while also getting all the benefits of VSCode autocompletion and suggestions.

However, it’s not all sunshine and roses. The Remote-SSH extension is pretty finicky, often spamming reconnection popups whenever you lose network access, and it doesn’t support special cluster logins like SLURM. The Jupyter extension is also going through teething pains, so expect issues like annoying scrolling, cells hanging occasionally, and frozen interfaces.

Still, though, it might all be worth it for that sweet sweet intellisense.

Use einops instead of explicit tensor operations

Look at this, from the readme:

from einops import rearrange
# equivalent expressions
y = x.view(x.shape[0], -1)
y = rearrange(x, 'b c h w -> b (c h w)')

Never more shall you have to memorize what torch.repeat_interleave does. Einops replaces dozens of PyTorch/numpy/TensorFlow/JAX/more tensor operations with three functions that can handle everything. Use it – all the cool kids do.
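If you haven’t seen the pattern syntax before, here is a pure-Python sketch of what 'b c h w -> b (c h w)' means, using nested lists as a stand-in for a tensor. This is not how einops is implemented, and flatten_chw is a made-up helper for illustration only. The idea is that each axis gets a name, and putting names in parentheses merges those axes in order.

```python
def flatten_chw(x):
    """Mimic rearrange(x, 'b c h w -> b (c h w)') on nested lists.

    x is indexed as x[b][c][h][w]; the result is indexed as y[b][i],
    where i runs over (c, h, w) with w varying fastest.
    """
    return [
        [value for channel in sample for row in channel for value in row]
        for sample in x
    ]


# A "tensor" with b=1, c=2, h=2, w=2.
x = [[[[1, 2], [3, 4]], [[5, 6], [7, 8]]]]
print(flatten_chw(x))  # [[1, 2, 3, 4, 5, 6, 7, 8]]
```

The batch axis is left alone and everything else is flattened, which is the same thing x.view(x.shape[0], -1) does, except the pattern says so out loud.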

Use git well

Enough said. As with any form of software engineering, you should follow the best practices of version control, committing legibly, and committing often.

Use static types as much as you can

Quick! What’s the shape of src_frames in this function?

def convert_padding_direction(
    src_frames, src_lengths, right_to_left=False, left_to_right=False,
):
    assert right_to_left ^ left_to_right
    assert src_frames.size(0) == src_lengths.size(0)
    max_len = src_frames.size(1)
    if not src_lengths.eq(max_len).any():
        # no padding, return early
        return src_frames
    range = utils.buffered_arange(max_len).unsqueeze(-1).expand_as(src_frames)
    num_pads = (max_len - src_lengths.type_as(range)).unsqueeze(-1).unsqueeze(-1)
    if right_to_left:
        index = torch.remainder(range - num_pads, max_len)
    else:
        index = torch.remainder(range + num_pads, max_len)
    return src_frames.gather(1, index)

It’s pretty hard to say. Maybe go pass in some test inputs, or trace the rest of the program whenever it uses this function? This debugging process turns a two-minute modification into a twenty-minute one, as you struggle to reverse engineer what the code expects.

That’s why you need something like torchtyping or TensorAnnotations. With it, you can write your code like def analyze_image(img: TensorType["batch", 3, 224, 224]) and make it much easier for someone else to use the function later. It’ll take 2 seconds of effort upfront and save 2 hours of debugging later.
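torchtyping and TensorAnnotations both need a tensor library installed, but the idea can be sketched with the standard library alone. The example below is only an illustration under that assumption: Frames, Lengths, and pad_lengths are hypothetical names, and typing.Annotated merely records the intended shape for readers and tools rather than enforcing it, which is why the function also checks its inputs at runtime.

```python
from typing import Annotated

# Shape metadata lives in the annotation; Python itself does not enforce it.
Frames = Annotated[list[list[float]], "batch", "time"]
Lengths = Annotated[list[int], "batch"]


def pad_lengths(src_frames: Frames, src_lengths: Lengths) -> Lengths:
    """Return how many padded positions each sequence in the batch has."""
    assert len(src_frames) == len(src_lengths), "batch sizes must match"
    max_len = max((len(row) for row in src_frames), default=0)
    assert all(len(row) == max_len for row in src_frames), "rows must be padded"
    return [max_len - n for n in src_lengths]


print(pad_lengths([[0.1, 0.2, 0.0], [0.3, 0.0, 0.0]], [2, 1]))  # [1, 2]
```

Even without a checker, the signature now answers the “what shape is this?” question at a glance.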

Write documentation

Similar to the above, when dealing with complex tensor functions or domain-specific operations, it’s very hard to tell what a function does from its name. Write doc comments that describe the purpose of the function in layman’s terms, and describe all important inputs/outputs (with tensor shapes, if you aren’t using a typing library).

Write tests, for crying out loud

There’s nothing worse than having a massive, amorphous blob of Python code and circular imports. Any touch is likely to break something deep within a long-forgotten notebook or manual script.

To have at least a modicum of confidence, you should write tests (at least smoke tests) to make sure your code works as expected and keeps working as expected. One good testing library is pytest; there are also ways to embed tests in Jupyter notebooks.
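As a minimal sketch of what a smoke test can look like, assume a hypothetical preprocessing helper called normalize. pytest collects functions whose names start with test_ from files named test_*.py, and the test bodies are just plain assert statements, so they also run fine as ordinary Python.

```python
import math


def normalize(values: list[float]) -> list[float]:
    """Scale values so they sum to 1 (hypothetical function under test)."""
    total = sum(values)
    if total == 0:
        raise ValueError("cannot normalize values that sum to zero")
    return [v / total for v in values]


def test_normalize_smoke():
    out = normalize([1.0, 1.0, 2.0])
    assert len(out) == 3
    assert math.isclose(sum(out), 1.0)
    assert out == [0.25, 0.25, 0.5]


def test_normalize_rejects_all_zero_input():
    try:
        normalize([0.0, 0.0])
    except ValueError:
        return  # the expected failure mode
    raise AssertionError("expected ValueError")
```

Saved as test_normalize.py, running pytest in that directory should pick up both tests. Two tests like this take a minute to write and will catch the most embarrassing breakages.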

Automate as much as you can

This one’s pretty short, but: ever notice you’re doing a manual task over and over, like smoke-testing a new data item or running the same analysis on a model over and over? Put it in a function! Automate it, so you can stop copy-pasting and waiting for cells to execute.

That’s all for now. Hopefully at least a few of those were helpful, and happy data-sciencing.

Sourced from https://github.com/freewym/espresso/blob/master/espresso/tools/utils.py and modified for pedagogical purposes. ↩

Scenarios and Warning Signs for Ajeya's Aggressive, Conservative, and Best Guess AI Timelines

Sun, 28 Mar 2021 00:00:00 +0000

This post is cross-posted to LessWrong, a rationality and AI safety community. May contain more jargon than usual.

Epistemic status: mild confidence that this provides interesting discussion and debate.

Credits to (in no particular order) Mark Xu, Sydney Von Arx, Jack Ryan, Sidney Hough, Kuhan Jeyapragasan, and Pranay Mittal for resources and feedback. Credits to Ajeya (obviously), Daniel Kokotajlo, Gwern, Robin Hanson, and many others for perspectives on timeline cruxes. This post was written as part of a 10-week AI Safety Fellowship run by Mark. All errors my own.

Summary

Most of this post is unoriginal. It is intended primarily to summarize and rephrase the core distinctions between three plausible scenarios for AI development, which Ajeya lays out in her draft report on AI timelines. It also contains summaries and links to other related content.

As a secondary goal, it attempts to lay out concrete and hopefully plausible predictions for what would occur in each of these three worlds.

Glossary (from Ajeya’s report)

This post will assume familiarity with basic terminology regarding neural networks and supervised learning.

Before reading this post, it’s probably good to read at least a summary of Ajeya’s report (e.g. Rohin Shah’s). Some timeline-specific terminology that is helpful to know is also listed below:

Transformative AI (TAI): A computer program that is as transformative as the Industrial Revolution was to the world’s trajectory.¹
Effective horizon length (EHL): the “length of the task”; more precisely, Ajeya defines it as the amount of data measured in subjective seconds of experience (~how long a human would take to do it) to tell whether a model has performed a task better or worse than it did previously. For some things (e.g. predict next word in the sentence), the length is quite low; for others (e.g. run a corporation that maximizes profit), the length may be quite high.
Anchor: A concept on which to base the estimated number of FLOPs (floating-point operations) required to train a transformative AI. For example, you might use an Evolution Anchor, which is roughly the compute performed over all of evolution; or a Short Horizon Neural Network (NN) Anchor, which is the extrapolated compute expected for a neural network that is trained on a short horizon length. Key anchors are:
- Lifetime anchor: the amount of compute performed by “one human brain over one lifetime.”
- Short/medium/long-horizon NN: the compute required to train a NN of a size anchored to the human brain on short, medium, or long-horizon tasks, respectively.
- Evolution anchor: the amount of compute performed over all of evolution.

Scenario One

If you believe…

Short-horizon NN has a fair chance of succeeding (e.g. 40%)²

You might believe this if you think there’s a good chance it is sufficient to fine-tune a large language model like GPT-N for TAI, and there’s no need to train models directly on tasks that take a long time to judge (e.g. “writing a twist ending to a story”, as opposed to “predict next word”). Concrete things you might expect to happen soon would thus be:

GPT-4+ can be fine-tuned to perform longer-horizon tasks, such as writing long and coherent stories, with less compute cost than the original pretraining.
This capability to generalize improves as the pre-training process improves (e.g. GPT-5 is much better at fine-tuning than GPT-4, which is much better than GPT-3). The scaling laws of model generalization perhaps hold up for much larger models.
By 2030, short-horizon models have achieved at least partial meta-learning. For example, an RL model can learn to play novel video games as well as a human can after a few hours of practice. (This criterion comes from a Q&A with Ajeya, although the timeline estimate is my own.)

For more reading, see Ajeya on downstream skills.

Algorithms halve compute requirements every ~2 years for short-horizon NNs, or every ~1 year for a medium/long-horizon NN.

You might expect this if you think there is a lot of “low-hanging fruit” in algorithms, such as if you think relatively little work has gone into optimizing training regimes or architectures for large NNs. (For context, OpenAI’s AI & Efficiency suggests a halving time of ~16 months for ImageNet models.) Consequences you might expect are:

An increase in companies working to improve massive (>100bn parameter) language models. For example, you might expect at least 5 large tech companies working on algorithms by 2025, as opposed to “mostly only OpenAI & Google in 2020”.³ Some evidence possibly in favor of this is that Alibaba + Tsinghua University recently released (Feb 2021) M6: A Chinese Multimodal Pretrainer with 100 billion parameters (although this uses a Mixture of Experts model whose efficacy I am unfamiliar with).
Multiple significant (e.g. 4x+) speedups targeted specifically toward training large language models by 2030. (This implies that there is significant low-hanging fruit being taken.)

Moore’s Law returns, resulting in ~1.5 year doubling times for FLOPs/$ to train a model.

For context, Moore’s law described transistor progress well until the mid-2000s, when the regime shifted to a doubling time of ~3-4 years⁴. One possible story for a return to ~1.5-year doubling times is that the number of chip producers increases, with players perhaps aided by AI-assisted chip manufacturing. Moore’s Law is rather hard to predict, but concrete things that might allow this are:

Arms race: The European Union makes a competitive semiconductor manufacturing factory. Semiconductor manufacturing becomes more politicized, encouraging an “arms race” of sorts. One reason this might happen is if China invades Taiwan, possibly taking over major semiconductor manufacturing company TSMC.
Competition: Intel and Samsung catch up with TSMC for chip manufacturing. There are several chip manufacturers in direct competition at the cutting edge by 2025.
AI-assisted chip fabrication: E.g. a 10x decrease in cost by 2030 due to ML systems that make various parts of production more efficient. Current examples include listening for defects in production, chip placement with RL, optimizing chip architectures for a given model, or electronic design automation.

AI companies quickly become willing to spend 2% of GDP to train a transformative model.

This looks like AI companies quickly realizing the immense value of training bigger AI, so they scale up rapidly until they are limited by capital on hand in 2030. From that point on, they are willing to spend 2% of GDP on training a model. You might expect this if:

AI shows concrete economic benefits in the short term. Commercialization of GPT models earns >$1bn in revenue by 2025⁵, with customers willing to pay for the outputs of fine-tuned models or more polished AI-assisted GPT products.
Big tech companies like Google realize the profit opportunity and begin to quickly scale up models (and pricing models) of their own. One concrete example is “Google makes DeepMind build a GPT-N competitor, which it contracts to governments or other institutions.”
By 2025, the US or similar national government experiments with training a large language model. (This implies government involvement in AI, which could dramatically increase funding.)

Given the above, you should expect a median timeline of 2036, shaded up by Ajeya to 2040.

Scenario Two

NNs likely require some medium- to long-horizon training.

You might believe this if you think scaling up GPT-3 doesn’t quite lead to TAI. It gets you a good bit of the way there, but turns out you need a more complex environment to learn how to learn new tasks. This might look like:

Fine-tuning hits a wall: By 2025, it’s clear that massive fine-tuned language models underperform in comparison to smaller models that use supervised learning directly on the question at hand (e.g. large-scale code generation). Concretely, a possible scenario is that GPT-5-CodeCompletionXL performs worse than a new TransformerCodeM model, which was trained directly on sets of code completion questions rather than unsupervised learning like GPT.

This implies that new advances will be required to reach scalable TAI.

(Ajeya mentions some reasons why you might expect a from-scratch supervised model to outperform a fine-tuned language model, but whether it would do so in reality is an open question.)
Training on long horizons becomes popular: By 2030, there exist 2+ models achieving state-of-the-art performance in a specific field that use training on long horizons (greater than a few minutes). For example, a novel-writing AI that receives feedback only after finishing a novel, or an RL agent that plays long and complex games.

Algorithmic progress is a bit slower than it was in the past, halving compute every 3 years for short-horizon NNs, 2 years for medium & long-horizon NNs.

You might believe this if you think most of the low-hanging fruit has been picked. Architectural advancements slow down, with each one representing mostly incremental progress.

(For further reading and intuitions, AI Impacts has a page with many examples of past algorithmic progress. See also the posts mentioned in Ajeya’s report: Measuring the Algorithmic Efficiency of Neural Networks (2020) and Algorithmic Progress in Six Domains (2013).)

Things you may expect are:

By 2025, the number of companies working on large language models has plateaued or even declined. E.g. Google Brain and DeepMind still refuse to buy into the scaling hypothesis; large language model startups (e.g. Cohere) have flopped.
By 2025, the large size and training time of models becomes a bottleneck to experimentation. For example, if it takes weeks and significant compute resources to run an experiment to improve efficiency, then testing out new improvements becomes much slower. We see some hints of this in the training of OpenAI Five (“surgery”), although it’s unclear to me how much of a time penalty this adds, or if this problem can be avoided by using smaller models to experiment with improvements rather than bigger ones.

Moore’s Law slows a bit. FLOPs/$ doubles every 2.5 years.

By 2025, the general consensus is that Moore’s Law is dead. TSMC, Intel, and Samsung hit manufacturing delays in new nodes, and it is projected that doubling time will increase. There is a solid path for further growth, but the path forward is hard, and chip designers focus more on optimizing preexisting nodes much like Intel does today.

Things you might expect are:

By 2025, Intel, Samsung, and TSMC all fall behind on their cadences. Delays, like those that have plagued Intel, spread to the entire industry.
Competition remains slim. For example, current export restrictions on semiconductors⁶ are successful at limiting China’s semiconductor fabrication without triggering an arms race, preventing additional competition from arising.

AI companies are willing to spend $1bn in 2025, with that figure doubling every 2 years.

Ajeya considers this a plausible level of spending on a “business as usual” trajectory, given the current market cap and cash on hand of major tech companies. See here for more details.
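To make the growth rate concrete, here is the arithmetic for this assumption as a small sketch. It uses only the two numbers stated above ($1bn in 2025, doubling every 2 years). The function name is made up, and treating the doubling as smooth and continuous is a simplification.

```python
def willingness_to_spend_usd(year: float) -> float:
    """Scenario Two spending assumption: $1bn in 2025, doubling every 2 years."""
    base_year, base_spend_usd, doubling_years = 2025, 1e9, 2.0
    return base_spend_usd * 2 ** ((year - base_year) / doubling_years)


for year in (2025, 2035, 2045):
    print(year, f"${willingness_to_spend_usd(year) / 1e9:,.0f}bn")
# 2025 $1bn
# 2035 $32bn
# 2045 $1,024bn
```

In other words, this trajectory reaches roughly a trillion dollars on a single training run by the mid-2040s.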

Given the above, you should expect a median timeline of ~2052.

Scenario Three

Transformative NNs likely depend highly on long-horizon training, perhaps requiring FLOPs on the order of evolutionary computation.

This looks like GPT & supervised learning hitting a dead-end for meta-learning (learning new, complex tasks). No matter how hard we try, we can’t get neural networks to learn complex skills over short training timeframes. One of the options left to us is something like RL + transparency tools, or supervised learning with feedback that is given only after a long subjective time has elapsed, both of which are highly compute-intensive ways of training an agent.

Things you might expect are:

By 2030, spending on massive LMs plateaus such that there is <1 doubling in 3 years. The general consensus is that large language models are powerful, but yield diminishing returns, and are significantly limited at tasks that take a human more than 5 minutes to consider.
“AI winter”: qualitatively, advances comparable to the significant advances of the past five years (e.g. AlphaGo, GPT-2, GPT-3) are much fewer and farther between, as the low-hanging fruit has already been picked. This period may last for 10+ years.

Algorithms halve compute every 4 years for short-horizon, 3 years for medium and long horizon.

This looks a lot like Scenario Two, but is quantitatively a bit slower. Realistically, the biggest sign of this is probably just a slowing trend in 2025’s “algorithmic progress” chart, but other things you might expect are:

A decline in discoveries of new neural network architectures and techniques for efficiency. If you were to plot the major discoveries of a field such as NLP on a graph, you would see a flurry of discoveries in the 2010s, but by the end of the 2020s progress has significantly slowed.

Moore’s Law slows significantly, doubling FLOPs/$ every 3.5 years.

This might occur if silicon process costs keep rising, eventually becoming uneconomical even for large players. There are incremental further advancements (e.g. optimizing 3nm+++), but overall stagnation continues.

Things you might expect are:

By 2030, all major chip producers (e.g. Intel, AMD, Apple, NVIDIA) have significantly increased their timelines for moving to new nodes. For example, if on 3nm, they plan to move to 1nm in 3+ years.
More unlikely: a state-sponsored effort to slow down chip fabrication, perhaps due to global war and instability.

AI companies will have to wait for the entire economy to grow sufficiently to finance TAI.⁷

This follows from the slowing of Moore’s Law and the need for expensive, long-horizon training. This world seems like one where advanced AI is not terribly profitable and requires resources on the scale of a 2090s megaproject.

Things you might expect are:

Efforts to commercialize large language models are not profitable. OpenAI shuts down the GPT-3 API by 2025, opting for a different business model. Further endeavors reach only limited profitability due to limited use cases.
“AI winter,” as discussed in the first section, causes a lack of investment in AI companies. One possible world is that spending in AI plateaus for several decades, before experiencing another period of exponential growth as a new paradigm is discovered in the late 21st century.

Given the above, you should expect a median timeline of **2100, shaded down to 2090 by Ajeya.**

Epistemic Notes

The above predictions are obviously rough. Even in a world that satisfies a particular timeline (e.g. AI by 2052), I expect the specifics of more than half of them to probably be wrong. However, the hope is that these predictions can be used as a sort of barometer, so that five years down the line, we can look back and ask, “how many of these came true?” The answer may help us figure out when we predict TAI to eventually arrive.

I also hope these predictions can be used today to clarify researchers’ own timelines. If you believe most of the predictions in Scenario 1 are plausible, for example, you may want to update toward shorter timelines, and likewise if you think Scenario 3 is plausible, you should probably update toward later timelines.

My experience in a covid vaccine trial, and why you should join one

Thu, 07 Jan 2021 20:38:01 +0000

Thanks to Aneesh Edara for reviewing this post.

Covid is probably going to get much worse before it gets better. Vaccine rollout is extremely slow, for no good reason. In Massachusetts, where I currently reside, the government doesn’t expect to have vaccines open to the general public until April to June. On top of that, the UK and South African variants of the coronavirus are also fairly likely to wreak havoc. Zvi Mowshowitz in the previous article predicts that things are likely to get worse, not better, by ~May.

In the meantime, how do we protect ourselves and those around us?

Why you should join a vaccine trial

Joining a trial is a 50% shot of getting a vaccine now, which seems like a much better deal than a 100% shot of getting a vaccine several months down the line.¹ (In a trial, there’s a 50% chance you’re placed in the vaccine group and get the real vaccine, and a 50% chance you’re placed in the placebo and get nothing.)

It’s fast (takes <1 week from intent to shot), safe (Phase 3 vaccines have already passed safety checks), and they literally pay you for it. The vaccine itself is also very likely to be effective; after all, it’s in Phase 3 trials because it triggered good immune reactions in Phase 1/2.

If you’re interested in reducing the risk of you getting covid in the next few months, it thus seems like a good choice to make. It’s also a good altruistic choice, because your participation will probably speed up recruitment a tiny bit, making it possible that they finish the trial faster as a result.

How to join a trial

You might think that with vaccines so far down the pipeline, and Moderna’s and Pfizer’s approved in the US, there would be no trials currently enrolling. This is not true. Many trials (Johnson & Johnson, AstraZeneca, etc.) are still enrolling new participants. Here’s how to join one:

1. Find a trial with an enrollment center near you. Good options include:

   a. Search “covid vaccine trials near me” to see if there are any recent news articles on it.

   b. Head to coviddash.org to find the trial sites closest to you.

   c. I made a list of websites here if you aren’t able to find any local leads. This list is partially tailored towards Boston trials as of ~1 month ago.

2. Sign up on their website, and then call the number they give you. At least for me, I put in my email, but never got a text response. Only after calling to follow up were they able to schedule me, and I got a shot ~4 days after that.

3. Go to the trial site, get screened, and hopefully get a vaccine!

My experience

I joined the single-dose ENSEMBLE study, by Johnson & Johnson. On December 8, I received a shot (which was either a saline placebo or the real deal).

Some interesting things I learned through the process:

Time commitment: For me, I had a one-hour Zoom informed consent session, then a three-hour trip to the hospital to get screened and injected. The trial lasts for 2 years, with periodic in-person followups after 1 month, 3 months, and continuing at progressively longer intervals. For J&J, you also have to log twice weekly whether you have had any symptoms of COVID-19, and if you do, you’ll have to come in again to receive a package of saliva tests and nasal swabs to take periodically.
Leaving the study: You can leave the study at any time, even to receive another vaccine. Moreover, if you leave the study, they will also tell you if you got the placebo or the real vaccine. This means that if you got the placebo, you can get an approved vaccine as normal. (If you got the trial vaccine, it might not be a good idea or even necessary to get another vaccine, though.)
What if the trial vaccine gets approved? In the case that the trial vaccine is approved, people are debating whether or not they should immediately vaccinate the placebo group or keep studying them. Regardless of the ethicality, it’s not entirely clear right now which option they’ll pick, but there is a chance you might get a blind crossover or an accelerated dose.
Payment: You do get paid to join a trial! Payment is ~$1000 for me, spread out across the various scheduled visits of the trial.

Other cheap personal interventions

Purchase and use a P100 mask. P100 masks are industrial masks that filter 99.97% of particles. (Compare with the “gold standard” N95, which filters 95% of particles.)² A well-fitted P100 mask could reduce your risk of getting covid from an activity by 100x. This is like, maybe better than a vaccine?? They also only cost like $30 on eBay. Just do it, it’ll be the last mask you’ll ever need to buy.

Use microcovid.org to calculate the risk of your activities. Know thy enemy. microCOVID is a project that allows you to calculate your expected covid risk of any social activity, and tells you how dangerous it is if you aim to keep your risk of getting covid at ~1%/year.

Install an Exposure Notification app on your phone. Check availability in your state here³ and either download the app on Android, or enable it in Settings on iOS.

Oh, and yell at your representatives, health officials, etc. to speed up the vaccine rollout. Seriously.

Thanks to Zvi as well for making this argument and motivating me to actually join a trial. ↩
The difference between the N and the P is that NXX masks don’t filter out oil-based substances, while PXX masks do. This isn’t relevant for covid, so both are equally good for their given filtration efficacies. ↩
Angry glares at Massachusetts for taking eight months before beginning to consider whether they should maybe have an exposure notification app. ↩

What needs to work for your Zoom call to work?

Wed, 16 Sep 2020 03:05:48 +0000

“If I have seen further, it is by standing on the shoulders of ~~giants~~ terrible and demonic abstractions.”

-Isaac Newton, probably

Zoom feels obscenely normal now. It’s become so day-to-day that Zoom fatigue is now a known thing.

But honestly, it’s a miracle every time it works. In the spirit of those blog posts about what happens when you press enter in your browser, here’s a (clearly incomplete) list of everything that has to go right for you to have that definitely pointless 10am Zoom call¹.

A note before we begin: many of these will sound ridiculously obvious, and many of you may have always had them. However, across the world, across social classes, and across levels of technical understanding, what is normal for one person may seem like a pure luxury to another. That is part of why I wanted to write this list.²

Hardware/Other Services (outside the computer)

You need stable power. If you’re on a laptop or phone, this probably comes from an integrated battery; if you’re on a desktop, it probably comes from mains electricity unless you’re lucky enough to have a UPS.
Your local electricity system needs to have enough power to meet its customers’ demand.
The power lines to your house must be intact and undamaged by hurricanes or fire/the threat of fire.
More broadly, your house must not be on fire.
You need access to the Internet.
Your Internet service provider must not be having a service outage.
Your Internet service provider must not be experiencing unprecedented loads due to COVID.
Your modem/optical network terminal that connects you to your Internet service provider needs to be working.
Most likely, your calendar and email services need to be up to find your Zoom link.
Zoom must be up.
Your router needs to be working, and perhaps needs to be capable of handling a large number of connections from multiple users at the same time.
Either your Ethernet cables have to be in good condition and plugged in properly, or there has to be an unobstructed path to your wifi router without too many people using the same channel.

Hardware (inside the computer)

You need a computer.
You need a computer satisfying Zoom’s minimum hardware requirements.
You need speakers or a pair of headphones.
You need a microphone.
You probably want a webcam, too. That might be a little challenging.
Your network card needs to be working, without shorts or frayed wires to degrade its performance.
Your hard drive needs to hold Zoom’s application data (and presumably, enough of your operating system to open a Zoom link) without bitrotting or failing.
Your graphics card needs to correctly render content without any artifacting.
Your computer processor needs to stably run at its current voltage and clock speed.
If you’re using Bluetooth for your headphones, your headphones need to be charged and working correctly. Same for any wireless keyboard or mouse.

Software

If you’re running Windows, let’s hope you don’t have any Windows updates pending.
Your computer must not be infected with malware that prevents it from starting.
You need enough disk space to install Zoom.
You need to not be running Windows Vista or below, or macOS 10.8 or below.
Your operating system needs to support your wifi card, and you need to have the correct drivers already installed (remember, you can’t download them just-in-time from the Internet!)
Your operating system needs to support your graphics card, and your graphics drivers must be installed.
Your graphics drivers must not be buggy or crash for you.
Your desktop environment must not crash on you.
You need a working audio software stack.
If you’re using Bluetooth audio, your operating system must support your Bluetooth chipset.
If you’re using Bluetooth audio, your operating system must support a codec that your headphones use.
Zoom must support your operating system.
Your browser must be working and not crash, so you can open your Zoom link.
You must have properly configured Zoom to use the correct webcam, microphone, and speakers.
You need to be properly logged into Zoom.

Politics

You must not be one of the 939,185 people who have died of the coronavirus as of September 16, 2020.
Or the ~60 million people who will die each year. Yeah, uh, in summary death is bad and we should try to stop it.
You must not have anything more pressing to do. Which, in today’s world, might not be true.
You must have an environment where it’s safe to do Zoom calls, and where those around you won’t stop you from doing so through harassment or any other means.
You probably want a quiet space where you can focus and deal with the social tradeoffs of virtual meetings in comparison to physical interaction.
And finally: you have to want to go to that Zoom call.

In summary

There are a lot of external or otherwise uncontrollable factors that can mean someone can’t make it to a call these days. Maybe we can be a little kinder when someone says they can’t make it?

To open-source enthusiasts, Google fanatics, or any other group that would prefer a different videocommunications system: s/Zoom/whateveryouwant. ↩
The other part is that you can use this as a really long flowchart, if you want, for why your Zoom call isn’t working. Just kidding. ↩

Using a Yubikey as a touchless, magic unlock key for Linux

Sun, 16 Aug 2020 15:29:28 +0000

Yubikeys are great for security, but their benefits decrease somewhat when you leave them in your computer unattended.¹ I unfortunately have a habit of forgetting my key when I walk away from the computer. I also have login passwords that are way too long and easy to typo.

Thankfully, there’s a way to solve both of these problems: use a Yubikey to unlock your computer when you put it in and lock your computer when you remove it!

Prior art

The first example I remember seeing of this concept years ago was Predator, Windows software (with a delightfully retro website) that locks your computer when you remove a special USB drive. Similar examples for Linux include pamusb, which allows you to login using Linux’s PAM by inserting a specially-formatted USB stick.

Of course, nowadays most people use Yubikeys to accomplish this, and Yubico has convenient guides on how to accomplish this very task. However, I wanted to make it touchless – that is, I wanted to be able to plug in my Yubikey and instantly unlock my laptop, without clicking through logins or touching the Yubikey button. Upon removal, I wanted to instantly lock my computer.

Making it contactless²

There are some guides on how to do this online (unlock when you plug in, lock when you remove), but unfortunately most of them fall prey to the problem described in this article. A lot of them use udev to detect when the Yubikey is plugged in, but they don’t actually authenticate the key beyond checking its vendor ID, model ID, and sometimes serial number, which all can easily be faked.

To provide actual security, most official guides use either pam_u2f (which authenticates a Yubikey through the U2F protocol) or pam_yubico (which uses either online validation through YubiCloud or offline validation through a challenge-response protocol). The U2F method requires a tap on the Yubikey, while the challenge-response process can be done without user interaction, so I went with the latter. I set up traditional Yubikey authentication using this great guide from System76.

However, I still needed some way to test the challenge-response for success when I plugged in the key. Usually, pam_yubico is run when you login or unlock your computer (i.e. when pressing the enter key on the lockscreen). But I didn’t want any clicks, so I needed a way to run it without interaction.

Enter udev (again) and pamtester!

Here are the udev rules I included:

    kevin@you:~ » cat /etc/udev/rules.d/yubikey.rules
    ACTION=="remove", ENV{DEVTYPE}=="usb_device", ENV{PRODUCT}=="1050/407*", RUN+="/usr/local/sbin/ykunlock.sh lock"
    ACTION=="add", ENV{DEVTYPE}=="usb_device", ENV{ID_BUS}=="usb", ENV{PRODUCT}=="1050/407*", RUN+="/usr/local/sbin/ykunlock.sh unlock"

These rules effectively call a script when inserting and removing the key, so I can trigger any action from the script. Note that the script should not immediately unlock the computer, to avoid the security issues mentioned earlier.

To actually test the challenge-response from the Yubikey on inserting, I decided to use pamtester, a simple utility that pretends to trigger a PAM authentication from the command line. Since pam_yubico is installed, this will naturally test the challenge-response if a Yubikey is plugged in.

Here’s the final script:

    #!/bin/bash
    exec 1> >(logger -s -t "$(basename "$0")") 2>&1
    echo "RUN"
    if [ "$1" = "lock" ]; then
        pkill -USR1 swayidle
    else
        # unlock
        if echo "" | pamtester login kevin authenticate; then
            # PAM login successful
            # kill locker
            kill -KILL $(pgrep swaylock)
            ps aux | grep swaylock
            # turn on displays
            SWAYSOCK=$(ls /run/user/1000/sway-ipc.*.sock) swaymsg "output * dpms on"
        fi
    fi
    exit 0

On lock, it immediately locks my desktop (by sending a SIGUSR1 to swayidle, the program that manages locking on the Sway window manager)
On unlock, it first sees if it can authenticate using pamtester without interaction (when no Yubikey is inserted or if the key is invalid, pamtester asks for a password). If it can, it kills the lockscreen and turns on all displays using Sway WM protocols.
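
If you want to check the lock/unlock branching without a Yubikey plugged in, one option is to swap the authentication step for a stub. The following is only a sketch under that assumption: `decide_action` is a hypothetical helper that is not part of the script above, and it prints the action it would take instead of touching swayidle or swaylock.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original script): prints the action
# the script would take, so the branching can be exercised with a stubbed
# authentication command.
# Usage: decide_action lock|unlock AUTH_COMMAND [ARGS...]
decide_action() {
    mode=$1
    shift
    if [ "$mode" = "lock" ]; then
        echo "lock"
    elif [ "$#" -gt 0 ] && echo "" | "$@" >/dev/null 2>&1; then
        # The empty line on stdin mirrors the `echo "" |` in the script above.
        echo "unlock"
    else
        # Authentication failed, or no authentication command was given.
        echo "stay-locked"
    fi
}
```

Here `true` and `false` can stand in for a valid and an invalid key (`decide_action unlock true`, `decide_action unlock false`); with the real check, the call would presumably be `decide_action unlock pamtester login kevin authenticate`.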

The final result is amazingly convenient, and has successfully made me remember to pull out my Yubikey when leaving my computer unattended more than once³! Mission success.

I’ve significantly downgraded this statement in severity after some excellent comments on Hacker News have pointed out that (1) stealing a Yubikey is incredibly unlikely unless you’re a person of interest; (2) even if you have the Yubikey, you still can’t directly extract e.g. a private key; and (3) a Yubikey protects against SSH/GPG fraud because it can require a PIN and lock out over time. Case in point, Yubikeys are good. I’d argue that it’s still not good to have your key stolen (e.g. perhaps if you’re targeted by a government/industrial espionage, or the malicious significant other attack, where they know your password and can steal your key for 2FA if unattended), but I see that it’s not as much of a risk as I originally thought. ↩
You might call it Yubikey: Coronavirus Edition. ↩
Yes, yes, I know there’s not too much of a danger because we’re all stuck at home right now. But who knows – maybe this will be helpful when we eventually get back on campus. ↩

How does "Send all to voicemail" actually work, anyway?

Mon, 10 Aug 2020 23:56:54 +0000

(Disclaimer: I’m not a mobile network engineer. I made this post after some Googling, because it feels like there’s a lot of complexity in mobile networks that even most computer people don’t talk about!)

Recently, due to increased robocalls and other spam, I’ve decided to switch all of my calling to a Google Voice account I own, whose number appears to be on far fewer contact lists than the number given to me by Mint Mobile, my SIM provider. To do this, I wanted to send all calls on my old number to a voicemail box explaining to call me at the other number. At first, I tried to see if my phone could block calls itself, but failing to find anything, I looked for something on the carrier level.

Google reveals that two magic phone numbers will accomplish this for me:

Send All To Voicemail - **21*18056377456#

Allow Calls - ##21#

I dialed the first number, and lo and behold, all my calls go to voicemail. But how does this work, anyway?

Man-Machine Interface codes

The first thing I noticed is that when I called either of the two numbers above, Android gave me a toast saying “Running MMI code.” Eventually, a popup window would show up saying: “Call forwarding [enabled/disabled].”

Further research reveals that MMI codes, or Man-Machine Interface codes, have actually been established since the 2G mobile specification was created. Here’s a list from a delightfully-old scan by Google Books:

Essentially, your phone will forward any code it doesn’t know about to your mobile provider, who has servers to handle specific numbered functions. It doesn’t send the actual number to the network though; instead, it gets parsed by the phone and sent out as ASN.1. (That link also explains the difference between USSD and MMI codes, which is a big rabbit hole that I don’t want to get into.)

“Unconditional call forwarding” is Service Code 21, as visible in the Send All to Voicemail number: **21*18056377456#. It looks like the second parameter after the asterisks is the number to forward to, and 1-805-637-7456 is T-Mobile’s voicemail box.

Mystery solved! “Send to voicemail” just instructs the network to forward all calls to a voicemail box’s number. It looks like the Allow Calls one just deletes the call forwarding rule.
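
To make the structure of these codes a bit more concrete, here’s a rough sketch of how a code like **21*18056377456# splits into an operation, a service code, and a parameter. The function name `parse_mmi` is made up for illustration, and the prefix-to-operation mapping (`**` registers, `##` erases, `*` activates, `#` deactivates, `*#` interrogates) is my understanding of the GSM/3GPP convention rather than something I’ve verified against a real phone’s dialer, so treat it as an assumption.

```shell
#!/bin/sh
# parse_mmi CODE
# Prints "<operation> <service code> <parameter or ->", or "invalid".
# The prefix-to-operation mapping below is an assumption based on the
# GSM/3GPP convention, not something taken from a phone's implementation.
parse_mmi() {
    code=$1
    case $code in
        *'#') body=${code%'#'} ;;   # every code ends with '#'
        *) echo "invalid"; return 1 ;;
    esac
    # Longer prefixes are checked before the single-character ones.
    case $body in
        '**'*) op=register;    rest=${body#'**'} ;;
        '*#'*) op=interrogate; rest=${body#'*#'} ;;
        '##'*) op=erase;       rest=${body#'##'} ;;
        '*'*)  op=activate;    rest=${body#'*'} ;;
        '#'*)  op=deactivate;  rest=${body#'#'} ;;
        *) echo "invalid"; return 1 ;;
    esac
    # Anything after the first remaining '*' is the parameter
    # (for service code 21, the number to forward to).
    case $rest in
        *'*'*) service=${rest%%'*'*}; param=${rest#*'*'} ;;
        *)     service=$rest;         param=- ;;
    esac
    # The service code must be a non-empty run of digits.
    case $service in
        ''|*[!0-9]*) echo "invalid"; return 1 ;;
    esac
    echo "$op $service $param"
}
```

Under those assumptions, `parse_mmi '**21*18056377456#'` prints `register 21 18056377456` and `parse_mmi '##21#'` prints `erase 21 -`, which lines up with the “set a forwarding number” and “delete the forwarding rule” behavior described above.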
