Radimentary

"Everything can be made radically elementary." ~Steven Rudich

The Fundamental Growth Curve (Part 2)

[Followup to the previous post. I will likely collect and reorganize these shortforms once the month is over.]

The narrative from the previous post outlines my basic model for how growth works. For a typical skill, be it running cross country or solving Rubik’s cubes or engaging in mathematical research, return on investment follows a curve like this:

In the first stage, growth is fast and cheap. The core fundamentals are the easiest to pick up and the best documented. Adepts willing and able to mentor novices are plentiful. As the weakest member of the group, you’re no threat to anybody’s status (excepting, perhaps, the second weakest, but they’re in no position to do anything to you).

This is the regime where the 80/20 principle holds: 80% of the absolute value in the activity is picked up with only 20% of the effort, applied with discernment. Most members in any community are novices somewhere in this first stage, enjoying the immediate gratification of visible gains while lacking the stomach for serious investment.

The gains here are absolute in nature. You might not garner any attention for picking up jogging, but your sleep apnea clears up. You might not win any money playing a game, but you become competent enough to enjoy it and be able to appreciate professional play. You might not prove any new theorems taking undergraduate math classes, but you finally understand how to calculate expected value and stop getting duped by slot-machines.

In the second stage, there is a long plateau, a “wall” that most people hit when the newbie gains fall off and serious commitment or exceptional talent seem necessary to progress.

Past the fundamentals, there exist conflicting and ambiguous schools of thought on how to progress to experthood. The athlete is told to follow three different diets by three different nutritionists. Long debates are held about which chapters of Hartshorne are mandatory to truly learn algebraic geometry. The single-digit-kyu Go player is told to focus on opening, on fighting, on life-and-death problems, on endgame, and each is apparently vitally important and the one true path to greatness.

Months of physical time are spent to see small improvements that have no practical impact on your life: whether you run a 9-minute mile or a 6-minute mile, the only thing you’ll win is a participation ribbon. You blog for years only to see your readership jump from dozens to hundreds. You practice your craft doggedly and jump from the worst surgeon in the hospital to the second-best, but when it comes time for their heart transplants, the billionaires still pass you by to wait in line for that star surgeon.

In the third stage, absolute improvements get even slower, but you finally hit a level of mastery where positional, or relative, gains kick in. And my, do they kick in fast.

You become one of the top players in your cohort, and people start to notice you. Coaches give you special treatment, you win minor awards, get sent to training camps, and participate in more rarefied cohorts.

You have a renewed and enormous motivation to improve: every tiny absolute improvement could move you up one giant discrete ranking. Jumping from the 11th best-selling author to the 10th doubles your book sales. Shaving a couple seconds off your personal best means a new state record and a full ride to college. Writing one additional paper in graduate school edges out the next candidate to land you the fancy fellowship that keeps your academia dreams alive.

And look, the divide between the second and third stages is not just a shallow artifact of human status regulation. It’s an essential feature of the collaborative optimization problems we face all the time. As long as people work in teams and specialize according to their relative advantage, an individual hardly contributes in a given dimension unless they are the best or nearly so.

Look at it this way: if eight college friends are taking a road trip through gravelly mountain roads, the best driver is going to drive the van most of the time, and the second best driver might help pick up some slack. How well these two drive is dreadfully important, but as for the other six – it makes not a whit of difference whether they even have licenses. Everyone in that van is incentivized to pour their resources into helping that best driver become even more skilled.

To be continued…

The Fundamental Growth Curve (Part 1)

In honor of NaNoWriMo, I’m joining a friend to write low-effort shortform blog posts every day this month.

The first topic I’ll write about is what growth feels like, and where people hit walls, plateau, and stagnate. Learn to optimize around the plateau, and you’ll be unstoppable.

A Parable

You have no hand-eye coordination to speak of, but your parents want you to touch grass, so you sign up for cross country. In the beginning, you’re the slowest of the pack, but you improve rapidly. You train nearly every day and your mile time drops from nine minutes to six-twenty in three months. The numbers go down. Walking up the stairs doesn’t wind you anymore. Finally the seniors are waiting for someone else to catch up. Across the whole team, you may only rise from dead last to the 20th percentile, but to you it feels like winning the Olympics.

Then you hit a brick wall.

You need to shave off a minute twenty to make varsity, and every last second puts up a fight for its life. You slog through nearly two years of training. Seven hundred and thirty days drowning in an endless barrage of shin splints and intrusive thoughts. The fact that you rise from the 20th percentile to the 80th means fuck-all; every freshman who speeds past you to snag a trophy sends you spiraling back into that pit of self-doubt.

And then, by virtue of some minor miracle, you finally push through that wall. You paid your dues. The best runner graduates. Puberty finally kicks in. You have a good night’s rest before tryouts. Whatever the reason, you make varsity.

Coach knows your name – finally! – and starts giving you individual attention. Every second you gain is a triumph – you can see yourself climbing those rankings every single meet. Out of the blue, the gossipy neighbor asks your parents for advice on preparing her middle-school daughter for high-school sports, and mom finally starts taking this running thing seriously. They rearrange their schedules around your training, and cut you some slack on the academic side. Everything is coming together.

You spent three years pumping your exhausted legs through final laps listening to crowds cheering for someone else. Finally, one day, you cross the finish line and you know you won the race. You know because the crowd has just started cheering, and the one they’re cheering for is you.

To be continued…

Where do your eyes go?

I. Prelude

When my wife first started playing the wonderful action roguelike Hades, she got stuck in Asphodel. Most Hades levels involve dodging arrows, lobbed bombs, melee monsters, and spike traps all whilst hacking and slashing as quickly as possible, but Asphodel adds an extra twist: in this particular charming suburb of the Greek underworld, you need to handle all of the above whilst also trying not to step in lava. Most of the islands in Asphodel are narrower than your dash is far, so it’s hard not to dash straight off solid ground into piping-hot doom.

I gave my wife some pointers about upgrade choices (*cough* Athena dash *cough*) and enemy attack patterns, but most of my advice was marginally helpful at best. She probably died in lava another half-dozen times. One quick trick, however, had an instant and visible effect.

“Stare at yourself.”

Watch your step.

By watching my wife play, I came to realize that she was making one fundamental mistake: her eyes were in the wrong place. Instead of watching her own character Zagreus, she spent most of her time staring at the enemies and trying to react to their movements and attacks.

Hades is almost a bullet hell game: avoiding damage is the name of the game. Eighty percent of the time your eyes need to be trained on Zagreus’s toned protagonist butt to make sure he dodges precisely away from, out of, or straight through enemy attacks. In the meantime, most of Zagreus’s own attacks hit large areas, so tracking enemies with peripheral vision is enough to aim your attacks in the right general direction. Once my wife learned to fix her eyes on Zagreus, she made it through Asphodel in only a few attempts.

This is a post about the general skill of focusing your eyes, and your attention, to the right place. Instead of the standard questions “How do you make good decisions based on what you see?” and “How do you get better at executing those decisions?”, this post focuses on a question further upstream: “Where should your eyes be placed to receive the right information in the first place?”

In Part II, I describe five archetypal video games, each distinguished in my memory by the answer to “Where do your eyes go?” it taught me, and derive five general lessons about attention-paying. Part II can be safely skipped by those allergic to video games.

In Part III, I apply these lessons to three specific minigames that folks struggle with in graduate school: research meetings, seminar talks, and paper-reading. In all three cases, there can be an overwhelming amount of information to attend to, and the name of the game is to focus your eyes properly to perceive the most valuable subset.

II. Lessons from Video Games

Me or You?

Hades and Dark Souls are similar games in many respects. Both live in the same general genre of action RPGs, both share the core gameplay loop “kill, die, learn, repeat,” and both are widely acknowledged to be among the best games of all time. Their visible differences are mostly aesthetic: for example, Hades’ storytelling is more lighthearted, Dark Souls’ more nonexistent.

But there is one striking difference between my experiences of these two games: in Hades I stared at myself, and in Dark Souls I stared at the enemy. Why?

One answer is obvious: in Dark Souls, the camera follows you around over your shoulder, so you’re forced to stare at the enemies, while in Hades the isometric camera is centered on your own character. This is good game design because the camera itself gently suggests the right place for your eyes to focus, but it doesn’t really explain why that place is right.

The more interesting answer is that your eyes go where you need the most precise information.

In both games, gameplay centers around reacting to information to avoid enemy attacks, but what precisely you need to react to is completely different. Briefly, you need spatial precision in Hades, and temporal precision in Dark Souls. 

In Hades, an enemy winds up and lobs a big sparkly bomb. The game marks where it’ll land three seconds later as a big red circle. You don’t need to know precisely when the bomb was lobbed and by whom – getting out of the red circle one second early is fine. But you do need to see precisely where it’ll land so you can dash out of the blast zone correctly. When there are dozens of bombs and projectiles flying across the screen, there might be only a tiny patch of safe ground for you to dash to, and being off by an inch in any direction spells disaster. So you center your vision on yourself and the ground around you, to get the highest level of spatial precision about incoming attacks.

In Dark Souls, a boss winds up and launches a three-hit combo: left swipe, right swipe, pause, lunge. As long as you know precisely when it’s coming, where you’re standing doesn’t matter all that much – the boss’s ultra greatsword hits such a huge area that you won’t be able to dash away in time regardless. Instead, the way to avoid damage is to press the roll button in the right three 0.2-second intervals and enjoy those sweet invincibility frames. The really fun part, though? The boss actually has five different attack patterns, and whether he’s doing this particular one depends on the direction his knees move in the wind-up animation. So you’d better be staring at the enemy in Dark Souls to react at precisely the right time.

Human eyes have a limited amount of high-precision central vision, so make it count. Don’t spend it where peripheral vision would do just as well.

Present or Future?

Rhythm games have been popular for a long time, so you’ve probably played one of the greats: Guitar Hero, Beat Saber, Osu!, piano. Let’s take Osu! as a prototypical example. The core gameplay is simple: circles appear on the screen to the beat of a song, and to earn the most points, you click them accurately and at the right rhythm. Harder beatmaps have smaller circles that are further apart and more numerous; a one-star beatmap might have you clicking every other second along gentle arcs, while a five-star map for the same song forces your cursor to fly back and forth across the screen eight times a second.

There’s one key indicator that I’m mastering a piece in a rhythm game: my eyes are looking farther ahead. When learning a piano piece for the first time, I start off just staring at and trying to hit the immediate next note. But as I get better at the piece, instead of looking at the very next note I have to play, I can look two or three notes ahead, or even prepare for an upcoming difficulty halfway down the page. My fingers lag way behind the part I’m thinking about.

Exercise: head on over to https://play.typeracer.com/ and play a few races, paying attention to how far ahead you can read compared to what you’re currently typing. I predict that with more practice, you’ll read further and further ahead of your fingers, and your typing will be smoother for it. It’s a quasi-dissociative experience to watch yourself queue up commands for your own body two seconds in advance.

Act on Information

As a weak Starcraft player (and I remain a weak Starcraft player), I went into every game with the same simple plan. I’d build up my economy to a certain size, and then switch over to producing military units. When my army hit the supply cap, I’d send it flooding towards the enemy base.

At some point, I heard that “Scouting is Good,” so at the beginning of each match I’d waste precious resources and mental energy sending workers to scout out what my enemy was doing. Unfortunately, acquiring information was as far as my understanding of scouting extended. Regardless of what I saw at the enemy base, I’d continue following my one cookie-cutter build order. At very best, if I saw a particularly dangerous army coming my way, I’d react by executing that build order extra urgently. This amounted to the veins in my forehead popping a bit more while nothing tangible changed about my gameplay.

To place your eyes in the right place is to gather the right information, and the point of information-gathering is to improve decision-making. Conversely, the best way to improve at information-gathering is to act on information. If you don’t act on information, not only do you not benefit from gathering it, you do not learn to gather it better. If I went back to learn scouting in Starcraft again, I’d start by building a flowchart of what choices I’d change depending on the information I received.
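To make this concrete, a scouting flowchart is just a mapping from observations to changed decisions. Here is a minimal sketch (the observation names, responses, and function are all hypothetical illustrations, not real Starcraft terminology):

```python
# A hypothetical scouting decision table. The point is structural:
# every observation worth gathering has a branch that CHANGES the plan.
SCOUT_RESPONSES = {
    "early_expansion": "attack now, before their economy pays off",
    "rush_incoming":   "build static defense and pull workers",
    "air_tech":        "add anti-air before their fliers arrive",
}

def respond_to_scout(observation: str) -> str:
    """Map what the scout saw to a concrete change in plan.

    An observation with no branch attached is information
    we gather but never act on -- exactly the failure mode above.
    """
    return SCOUT_RESPONSES.get(observation, "follow the standard build")
```

If every key maps to the same cookie-cutter build order, the scout was wasted; the table is only as valuable as the number of distinct responses in it.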

Filter Aggressively

I was introduced to Dota 2 and spent about four hundred hours on it in the summer of 2020 (of course, this makes me an absolute beginner, so take this part with a healthy pinch of salt). Dota is overwhelming because of the colossal amount of information and systems presented to you – hundreds of heroes and abilities, hundreds of items and item interactions, and multitudes of counterintuitive but fascinating mechanics that must have gone through the “it’s not a bug, it’s a feature” pipeline.

To play Dota is to be constantly buffeted by information. I watch the health bars of the enemy minions to last hit them properly, or I won’t make money. I watch my own minions to deny them from my opponent. I track my gold to buy the items I need as soon as I can afford them. I pay attention to the minimap to make sure nobody from the enemy team is coming around to gank. I watch my team’s health and mana bars, and the enemy team’s, to look for a weak link or opportunity to heal. I can click on the enemy heroes to look at their inventories, figure out who is strong and who is weak, and react accordingly. And this might all be extraneous information: maybe the only thing that matters is the timer at the top of the screen, which says the game started 1 minute and 50 seconds ago.

The clock might be the most important information on this screen.

To understand why the game timer might be the most decision-relevant information out of all of the above, you have to understand a particularly twisted game mechanic in Dota, the jungle monster respawn system. You see, the jungle monsters in Dota each spawn in a rectangular yellow box that you can make visible by holding ALT. The game is coded such that, every minute, a monster respawns if its spawn box is empty. You read that right – the monsters don’t have to die to respawn, they just have to leave the box. Exploiting this respawn mechanic to make many copies of the same monster is called “stacking,” and is a key job for support players: if you attack the jungle monsters about 7 seconds before the minute, they’ll chase you just far enough that a duplicate copy of the monster spawns. This means that near the beginning of the game, a good support player can “stack” two, three, or even four copies of a jungle monster for his teammates to kill later, even if nobody on the entire team is strong enough to fight a single one directly. Fifteen minutes later, leveled up teammates can come back and kill the entire stack for massive amounts of gold and experience.
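The respawn rule is simple enough to model in a few lines. Here is a toy simulation (illustrative only; the function names and simplifications are mine, not Dota 2’s actual code, which involves timing windows and per-camp stack limits):

```python
# Toy model of the spawn-box rule described above: each minute,
# if the box is empty, one new monster appears -- even when the
# old monsters merely walked out of the box instead of dying.
def tick_minute(in_box: int, outside: int):
    """Advance one in-game minute and apply the respawn check."""
    if in_box == 0:
        in_box = 1  # box is empty at the minute mark, so the camp respawns
    return in_box, outside

def stack_once(in_box: int, outside: int):
    """Lure the whole camp out of its box just before the minute mark."""
    outside += in_box  # monsters chase you out of the yellow box...
    in_box = 0
    return tick_minute(in_box, outside)  # ...and duplicates spawn behind them

# Starting from one monster, three successful stacks leave four monsters total:
in_box, outside = 1, 0
for _ in range(3):
    in_box, outside = stack_once(in_box, outside)
# in_box + outside == 4
```

Each minute-mark pull converts the monsters already accumulated into "outside" copies and conjures a fresh camp, which is why the stack grows by one full camp per minute.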

Stacking is further complicated by an endless litany of factors, but the most interesting is probably this: the enemy can easily disrupt your stacks. The game code only checks if the yellow spawn box is empty, not what’s inside the box. A discerning opponent can foil your whole stacking game-plan just by walking into the appropriate box at the 1:00 mark and standing there for a second. More deviously yet, he might buy an invisible item at the beginning of the game to drop in the box before you even reach the area.

Anyhow, some support players have a job to do every minute: either stacking or a closely related chore called “pulling.” Doing this job takes them out of the lane, leaving the hero they’re supporting vulnerable to a two-on-one. This is where the game timer comes in: the enemy support leaving the lane might be my best opportunity to be aggressive and land an early kill. And so, out of all the fancy information on the screen, I need to be checking the game timer frequently. Some early game strategies hinge on correctly launching all-out attacks at 1:50, and not, say, 1:30.

Treat the world as if it’s out to razzle dazzle you, and your job is to get the sequins out of your eyes. Filter aggressively for the decision-relevant information, which may not be obvious at all. 

Looking Outside the Game

There is a certain class of video games that are difficult, if not impossible, to play without a wiki or spreadsheet open on a second monitor. This can be due to poor game design, but just as often it’s the way the game is meant to be played for good reason, and it’s the mark of an inflexible mind to refuse to look outside the game when this is necessary.

Consider Kerbal Space Program. You can learn the basics by playing through the tutorials, and can have plenty of fun just exploring the systems. But unless you’re a literal rocket scientist, you’ll miss many of the deep secrets to be learned through this game. There’s no way you’ll come up with the optimal gravity turn trajectory yourself. If you don’t do a little Googling or a lot of trial-and-error, your attempts at aerobraking will probably devolve into unintentional lithobraking. You’ll have a nightmarish time building a spaceplane without knowing the correct relationship between center of mass, center of lift, and center of thrust, and it’s highly unlikely you’ll figure out that a rocket is a machine that imparts a constant amount of momentum instead of a constant amount of energy, or that you can exploit this fact by accelerating at periapsis. And god forbid you try to eyeball the perfect transfer window in the next in-game decade to make four sequential gravity assists like Voyager 2.

Some games are meant to be played on multiple monitors.
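The momentum-versus-energy point falls out of basic algebra: a burn adds a fixed delta-v, but kinetic energy grows with the square of speed, so the same burn buys more orbital energy when you’re already moving fast (the Oberth effect). A back-of-the-envelope sketch, with made-up numbers and an idealized instantaneous burn:

```python
# The same delta-v yields more kinetic energy the faster you're moving:
# gain = m*v*dv + 0.5*m*dv**2, which grows linearly in v.
def kinetic_energy(mass, v):
    return 0.5 * mass * v**2

mass, dv = 1000.0, 100.0    # kg, and m/s of delta-v from one burn
slow, fast = 500.0, 3000.0  # m/s: speed near apoapsis vs. near periapsis

gain_slow = kinetic_energy(mass, slow + dv) - kinetic_energy(mass, slow)
gain_fast = kinetic_energy(mass, fast + dv) - kinetic_energy(mass, fast)
# gain_fast is several times gain_slow, from the identical burn
```

This is why efficient transfers burn at periapsis: momentum is what the engine sells, but energy is what the orbit buys, and the exchange rate improves with speed.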

The right place to look can be outside the game entirely. Whether it’s looking up guides or wikis, plugging in numbers to a spreadsheet to calculate optimal item builds, or using online tools to find the best transfer windows between planets, these can be the right place to put your eyes instead of on the game window itself.

III. Applications to Research

In this last part of the post, I apply the principles above to three core minigames in academic mathematics: talks, papers, and meetings. For each of these minigames, we’ll try to figure out the best places for our eyes to go, informed by the following questions.

  1. Should I focus on myself or the other person? 
  2. How far into the future should I be looking?
  3. How can I act on the information I receive?
  4. Out of all the information being thrown at me, what is decision-relevant?
  5. Might the best way to get better at the game be outside the game itself?

Talks

When giving a talk, self-consciousness is akin to keeping your eyes on yourself. Telling yourself not to be self-conscious is about as useful as trying not to think about the polar bear; negative effort rarely helps. Move your eyes elsewhere: to the future and to other people. Rehearse your presentation and anticipate the most difficult parts to explain. Pay attention to your audience and actually look at them. See if you can figure out who is engaged and who is daydreaming. Find one or two audience members with lively facial expressions, study them, and act on that information – their furrowed brows will tell you whether you’re going too fast.

When listening to a talk, realize that there will typically be more information than any one audience member can digest. Sometimes this is the fault of the speaker, but just as often, this information overload is by design and functions similarly to price segmentation. Noga Alon recently joked to me, “Going to a talk is difficult for everyone because nobody understands the whole thing, but it’s especially difficult for undergraduates because they still expect to.” Information at a variety of levels of abstraction is presented in the same talk, so that audience members with widely varying backgrounds can all get something from it. An undergraduate student might understand only the opening slide, and a graduate student the first ten minutes, while that one inquisitive faculty member will be the only person who understands the cryptic throwaway remarks about connections to class field theory at the very end. Filter aggressively for the parts of the talk aimed at you in particular.

Remember that the topic of the talk itself is rarely as interesting as the background material mentioned in passing in the first ten minutes. These classics – the core theorems and examples mentioned time and time again, the simplest proof techniques that appear over and over – are the real gems if you don’t know them already. Sometimes you can learn a whole new subfield by sitting in on a number of talks in the area and only listening to the first ten minutes of each. Bring something discreet to keep yourself occupied for the other fifty.

Remember also: information you never act on is useless information. A couple years back, I was taking a nap in a computer science lecture about a variation on an Important Old Theorem. As I was nodding off, I noticed to my surprise that my adviser, sitting nearby, was quite engaged with the talk. I was very curious what caught his attention, and he enlightened me on our walk back to the math department: instead of listening to the talk, he’d spent most of the hour looking for a better proof of the Important Old Theorem introduced in the first five minutes. From this, I learned that the most important information in a talk might be an unsolved problem, because it is certainly the easiest information to act on.

My PhD adviser after a seminar talk.

This conversation with my adviser had a great effect on me, and every so often I practice this perspective by going to a talk for the sole purpose of hearing new problems. As soon as I hear an interesting problem, I zone out and try to solve it immediately. Anecdotally, it worked a couple times.

Papers

Most of this section was already covered in Of Math and Memory (part I, part II, part III), but I’ll reiterate the relevant bits here. Mathematical proofs are rarely meant to be written or read linearly. Instead, they are ideally arranged as a collection of outlines of increasing detail: a five-word title, a paragraph-long abstract, two pages of introduction, a four-page technical outline, and only then the complete 20-page proof. Each outline is higher-resolution than the last, giving readers the chance to pick the level of understanding that suits their needs.

This organization is meant to solve one basic difficulty: it’s very hard to follow a proof without knowing where it’s going. Without reading the proof outline, you can’t tell which of the ten lemmas are boilerplate and which are critical innovations. Without running through a calculation at a high level, there’s no way to know which of alpha, n, epsilon, x, and y are important to track, and which are throwaway error terms. Reading through a paper line-by-line without knowing where you’re going, it’s easy to get lost in the weeds and dragged down endless rabbit holes – black boxes from previous papers to unpack, open problems to mull over, lightly explained computations which might contain typos – and while these rabbit holes might be worth exploring, you would do well to map them all out before picking one to dive into. 

When reading a paper, orient your eyes towards the future whenever possible, like reading several words ahead in a game of TypeRacer. Scan the paper at a high level to understand the big picture, then read all the theorem and lemma statements to see how they fit together, and only then decide which weeds to get into. Only check a difficult computation once you already know what its payoff will be.

Meetings

In research, the vast majority of your time is spent in one of two ways: bashing your head against a wall alone, or bashing your head against a wall with company. These two activities can be more or less traded freely for each other to suit your level of introversion, and I’ve found that I usually prefer meeting with others to work on research together over working alone.

One of the pitfalls of working with others, especially when you are young and underconfident, is that you can naturally slide into the role Richard Hamming calls a “sound absorber”:

For myself I find it desirable to talk to other people; but a session of brainstorming is seldom worthwhile. I do go in to strictly talk to somebody and say, “Look, I think there has to be something here. Here’s what I think I see …” and then begin talking back and forth. But you want to pick capable people. To use another analogy, you know the idea called the “critical mass.” If you have enough stuff you have critical mass. There is also the idea I used to call “sound absorbers.” When you get too many sound absorbers, you give out an idea and they merely say, “Yes, yes, yes.” What you want to do is get that critical mass in action; “Yes, that reminds me of so and so,” or, “Have you thought about that or this?” When you talk to other people, you want to get rid of those sound absorbers who are nice people but merely say, “Oh yes,” and to find those who will stimulate you right back.

~ Richard Hamming, You and Your Research

On top of underconfidence, I suspect that the chief mistake “sound absorbers” make is that they have the wrong idea about where their eyes should be in a research meeting. I think a “sound absorber” is completely fixated on personally solving the problem. Not having generated interesting ideas for solving the problem, they contribute nothing at all. Again, this mistake is akin to being too self-conscious, and keeping your eyes on yourself when there is no useful information to be had there.

Personally solving the problem is certainly a great outcome for a research meeting, but it’s by no means the only goal. First of all, there’s a world of difference between personally solving the problem and getting the problem solved. If your collaborators are any good, they are just as likely to come up with the next crucial idea as you are, so truly optimizing for getting the problem solved involves spending a substantial fraction of your time supporting the thought processes of others. Repeat their thoughts back to them, write down and check their calculations, draw a nice picture or analogy for what they’re doing on the blackboard, project your enthusiasm for their insights. You can do all this without generating a single original thought, and still help with getting the problem solved.

Getting the problem solved is a higher value than personally solving the problem, but higher still is the value of improving at problem-solving in general, and this holds doubly if you’re still performing your Gravity Turn. Especially when meeting with your PhD adviser or another senior mentor, focus a substantial minority of your attention on modelling the thought processes of your mentor. Figure out and note down what examples and lemmas they pull out of their toolbox time and time again, what calculations and simplifications they do instinctively, and how they react when stuck on a problem. Learn the particularities of how they perform literature searches, who they ask for help about what, and how they decide if and when to give up. None of these decisions are arbitrary; they form an embodied model of the terrain in your field. Watching the other person can often be a better use of your time than staring blankly at the problem.

In hurried conclusion, research is like a simpler version of Dota: we are bombarded by information on all fronts, most of which we don’t even notice, and tasked to make complicated, heavy-tailed decisions. A fundamental skill in any such game is orienting your eyes – literally and figuratively – at the most valuable and decision-relevant information. Reacting to and executing on this information comes later, but you can never act properly if you don’t even see what you need to do.

Gravity Turn

[The first in a sequence of retrospective essays on my five years in math graduate school.]

My favorite analogy for graduate school is the gravity turn: the maneuver a rocket performs to get from the launch pad to orbit. I like to imagine a first-year graduate student as a Falcon 9 rocket, newly constructed and tasked with delivering a six-ton payload into low Earth orbit.

Picture this: you begin graduate school, fresh as a rocket arriving at Cape Canaveral and bubbling with excitement for your maiden voyage. Your PhD adviser, on the other hand, is the Hubble Space Telescope. Let’s call her Dr. Hubble (not to be confused with the astronomer of the same name). Dr. Hubble is ostensibly the ideal guide for your first orbit insertion. After all, she is famously good at staying in orbit – she’s been up there since 1990. 

But problems quickly arise as you probe Dr. Hubble for advice on how to approach the launch. Namely:

  1. She left Earth more than thirty years ago, and space technology has since been completely revolutionized. 
  2. She states all advice at an extremely high level with bird's-eye-view detachment, observing, as she is, from a vantage point hundreds of miles overhead. 
  3. Most fatally, the Hubble Space Telescope vessel does not include the lower-stage rockets that brought her into space. In fact, she doesn’t include large engines of any kind. Her thirty years of experience free-falling in orbit will do you very little good until you break out of the stratosphere.

The problem is even worse than this, however. It is not that Dr. Hubble, despite her best intentions, gives outdated advice. It is not even that Dr. Hubble cannot consciously articulate all the illegible skills she’s reflexively performing to stay in orbit. The problem is that even if you could perfectly imitate what Dr. Hubble is doing right now, you would likely still crash and burn.

What I didn’t understand going into graduate school is that academic mathematicians are often working in a state akin to the free-fall of orbit. The Hubble Space Telescope remains in orbit around Earth because it travels horizontally so quickly that, even as it’s continuously accelerating towards the Earth, it continually misses. The laws of physics have arranged it so that it is not possible – barring deliberate sabotage – for her to fall back into a sub-orbital trajectory.
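For readers who want the "continually misses" condition in numbers: a circular orbit requires a horizontal speed of v = √(GM/r). A back-of-the-envelope Python sketch, with rounded constants and Hubble's altitude approximated at 540 km:

```python
import math

G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
M_EARTH = 5.972e24   # mass of Earth, kg
R_EARTH = 6.371e6    # mean radius of Earth, m

def circular_orbit_speed(altitude_m: float) -> float:
    """Horizontal speed at which a body falls toward Earth yet continually misses."""
    r = R_EARTH + altitude_m
    return math.sqrt(G * M_EARTH / r)

# Hubble orbits roughly 540 km up.
v = circular_orbit_speed(540e3)
print(f"{v / 1000:.1f} km/s")  # → 7.6 km/s
```

Seven and a half kilometers per second, sideways, forever: that is the regime Dr. Hubble lives in.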

Similarly, a successful research professor is embedded in an intricate system that, as surely as Newton’s laws, keeps her in a state of steadily producing new research. Many of her ground-breaking papers are not one-off productions – they produce sequels, variants, and interdisciplinary applications year after year. She has cultivated dozens of long-time collaborators of the highest caliber who freely share ideas and research directions, and has the reputation to find more at will. She attends conferences every other month that keep her updated on the leading edge of the field. Every year her research group grows, as if by clockwork, adding a couple graduate students and postdocs to whom she can delegate projects with only the gentlest supervision. As a result, the careers of many other people depend on Dr. Hubble to continue producing research at a steady rate. Every incentive is aligned for objects in motion to stay in motion, and it would take deliberate sabotage to bring Dr. Hubble out of her successful research trajectory.

This is not to say that academic researchers all start cruising in free-fall after they leave graduate school or make tenure. It is perfectly normal for a spaceship that reaches orbit to proceed onto its next adventure after some rest, continuing on to visit another planet or leave the solar system altogether. The best researchers I know are similarly courageous, taking on more responsibilities and pushing past their comfort zones time and time again. I’m merely remarking that once one reaches a certain horizontal velocity in space, it is actively hard to fall back down from the sky.

Contrast this to the sorry state of Dr. Hubble’s new graduate student stranded on the launchpad under the blistering Florida sun. He has no prior publications producing continuous dividends, no access to brilliant and dependable collaborators, no knowledge or intuition about what problems are within reach, no students to farm ideas out to, and no reputation to trade off for any of the above. Above all, nobody else really depends on him, so his motivation to succeed is mainly shallow self-interest. This is particularly hard on him, as there are many things he would do in a heartbeat for someone else that he can’t work up the energy to do for himself. The singular advantage he has over his adviser is youth – a finite amount of extra fuel that he must burn quickly and judiciously like a first-stage booster rocket in order to reach her altitude.

There is a paradox inherent to orbit insertion: rockets launch straight up, while orbit is all horizontal. For some diabolical reason, a spaceship must spend its initial phase accelerating in a direction completely perpendicular to its desired velocity. That reason is called the atmosphere: in order to avoid continuously paying the toll of air resistance, a rocket spends a period of time flying straight up. But any additional vertical motion past the upper atmosphere is wasted motion, so at some point (and sooner is better than later), the rocket starts turning smoothly towards the horizon and accelerating towards orbit. Thus is birthed the smooth quasi-hyperbolic curve known as the gravity turn, the ideal orbit insertion trajectory.
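The shape of the trade can be seen in a cartoon simulation. Everything below is made up for illustration (constant thrust, flat Earth, no drag, an arbitrary linear pitch program standing in for a real gravity turn); the only point is that the rocket burns straight up first, then feeds all further thrust into horizontal speed:

```python
import math

G0 = 9.81      # surface gravity, m/s^2
THRUST = 30.0  # constant thrust acceleration, m/s^2 (made-up value)

def pitch_angle(y, turn_start=10e3, turn_end=60e3):
    """Cartoon pitch program: thrust straight up below turn_start,
    tilt linearly toward the horizon, fully horizontal above turn_end."""
    if y <= turn_start:
        return math.pi / 2
    if y >= turn_end:
        return 0.0
    frac = (y - turn_start) / (turn_end - turn_start)
    return (1.0 - frac) * math.pi / 2

def fly(burn_time=300.0, dt=0.05):
    """Integrate the cartoon trajectory with simple Euler steps."""
    x = y = vx = vy = 0.0
    t = 0.0
    while t < burn_time:
        p = pitch_angle(y)
        vx += THRUST * math.cos(p) * dt
        vy += (THRUST * math.sin(p) - G0) * dt
        x += vx * dt
        y += vy * dt
        t += dt
    return x, y, vx, vy

x, y, vx, vy = fly()
# By burn-out, nearly all of the accumulated velocity is horizontal:
# the vertical phase only bought enough altitude to clear the atmosphere.
```

Run with a later `turn_start` and the final horizontal speed drops; the extra altitude is wasted motion, exactly as the essay describes.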

How is graduate school like a gravity turn? For one, it is an enormous error in a gravity turn to try to directly imitate the velocity vector of a ship in space while still at sea level. Regardless of its power, a rocket launched horizontally will quickly nose-dive into the Atlantic. Similarly, a student can rarely succeed in graduate school by solely imitating the activities of established researchers. The student must engage instead in certain activities, such as studying fundamental background material and actively networking, that are mostly orthogonal to a research professor’s day-to-day.

For another, it is an equally enormous error to dip your nose cone towards the horizon too late, and spend too much fuel accelerating vertically. Once you break the atmosphere, all excess vertical velocity is wasted motion. At some point during graduate school, the student must transition away from activities that only grant temporary altitude. Becoming knowledgeable puts you in a great place to start doing research at a higher level. But spending too much time studying without attempting original research renders you a mere encyclopedia. Taking classes, networking, applying for fellowships, and going to student summer schools all follow the same principle – there is an appropriate amount to do, past which they increasingly approach wasted motion as far as getting into orbit is concerned. (Of course, if you enjoy any given activity intrinsically, then by all means continue to do it as much as you want.)

An additional consideration is that, while the gravity turn is the most technically fuel-efficient method of orbit insertion, not everyone who arrived in orbit took this most efficient path. In every department there are superstar students who were outfitted with nuclear reactors in place of conventional rocketry, and these folks get to space by pointing their nose cones in any old direction and blasting off. If you’re such a person, just blast off; calculating the optimum gravity turn curve might be the real wasted motion. Also, many of your professors will likely have fallen into this rarefied category in their own graduate school experience, so their advice on efficient gravity turns will be entirely theoretical in nature.

It is worth remarking, though, that even a nuclear rocket might learn something useful from practicing the gravity turn maneuver. Just because you have an easy time leaving Earth’s atmosphere and have no need of finesse, doesn’t mean your travels won’t land you on Venus someday. And breaching that monstrous atmosphere will take every ounce of efficiency you can muster.

A natural question remains: if many graduate school activities only count for temporary vertical altitude, what constitutes horizontal motion that is useful for permanently entering orbit? Examples include:

  1. Producing good research, as every nice paper you write continues to pay dividends year after year. 
  2. Becoming an attractive collaborator, partly by acquiring enough reputation that people are willing to work with you, and partly by being productive and pleasant enough that they stick around. 
  3. Learning to support the research of others, as much of your potential impact lies not in personal contribution, but in the network effects accumulated from being a positive community member.

This last skill begins at the very start of graduate school, where the biggest immediate impact you can likely have is facilitating your adviser’s and other collaborators’ research.

I will close by reminding the reader that the gravity turn maneuver is not a truth delivered from on high that holds for all time across all circumstances, but an engineered solution to an inelegant and ever-varying practical problem. Launching from a moon base, for example, does not require a gravity turn at all because the moon has no atmosphere to fight against. There, you could comfortably reach orbit by blasting off almost horizontally from the lip of a crater. Only you know exactly where you’re launching from and the thrust-to-weight ratio of your vessel. Adjust your gravity turn accordingly.

I hope it is a comforting thought that free-fall is possible: that one day through all the striving of graduate school you may reach a position where the system propels you forward in your research and all you have to do is sit back and relax. I hope that on that day you continue to strive anyway.

Two Explorations

Much has been written about the fundamental opposition between explore and exploit, chaos and order, yin and yang. In this post I make two observations about the psychology of this opposition.

In the first part, I challenge the metaphor of the comfort zone: a slowly-changing region in activity-space where everything inside is comfortable and everything outside induces anxiety. The point is that anxiety depends not only on the spookiness of the activity itself, but on one’s proximity to safety. I am afraid because it is dark; I am terrified because the light switch is all the way down the hall. In particular, it is possible to reduce anxiety by bringing your comfort zone with you in the form of a safety behavior or a trusted companion. This serves as an alternative to Comfort Zone Expansion.

In the second part, I note that explore and exploit are often embodied in the human personality as two competing subagents. In almost everyone I’ve met, one of these subagents dominates the other. I tell four typical stories of this imbalance, and then suggest that something better is possible. This is perhaps the central example of integrating disagreeing subagents.

Part 1: Distance to Safety

1. Mother and Child

(The following is my retelling of an old story, dating back probably to at least Jean Piaget. I first encountered a variant of this story in The Monkey Wars, where identical behavior was observed in primates.)

A mother brings her child, a girl of perhaps three or four years of age, to an empty playground at the park. Autumn has progressed to the stage where leaves fall in twos and threes. The girl peeks out of the folds of her mother’s coat, staring now at the swing set, now at the neon tube slide. With a nudge, the mother pushes her daughter onto the mulch and gives her an encouraging nod. After glancing back to her mother several times, the girl slowly approaches the playground and begins to play.

Soon, she is clambering about, but periodically looks back at the bench where her mother is sitting. Each time their eyes meet, the girl waves, pig-tails dancing, and the mother waves back. The girl then returns to free play.

At some point, the mother briefly leaves her post to chat with an acquaintance. The girl, pepping herself up to go down the slide for the first time, looks back, finds her mother gone, and panics. She curls up into a ball at the top of the playground and fights back tears, trying to make herself as small as possible. The thought of going down the slide vanishes from her mind.

2. The Difficulty of Dark Souls

Dark Souls (III) is one of those difficult video games which spawns endless internet debates about whether every game needs an easy mode. (“I have enough challenge in my day job, I just want to relax when I play a game,” says the one. “Git gud scrub,” says the other.) Dark Souls is not in fact mechanically difficult, or at least not exceptionally so. Just off the top of my head, Celeste, XCOM 2, and FTL were all significantly more difficult mechanically for me than Dark Souls. And yet I do believe that Dark Souls was the hardest game I ever played.

Call me a coward, but the main difficulty of Dark Souls for me is the psychological difficulty of its dominant aesthetic: loneliness and nihilism. You, the protagonist, are born “Nameless, accursed Undead, unfit even to be cinders. And so it is… that ash seeketh embers.” All the friendlies in the world have been sitting in the same positions for eons before you died, and will be sitting there long after you return to ash. The brief encounters with friendly NPCs in the wild are ephemeral: each person passes in and out of your life, and you are as likely to meet them again as to stumble upon their ashes. The entire rest of the world is Out to Get You: treasure chests turn out to be mimics, halberdiers hide between crates, face-eaters hang from ceilings. Your job is to save a world that doesn’t care to be saved.

I could never play Dark Souls for more than a couple hours at a time, and found myself constantly teleporting back to base. I told myself I came home to level up, to upgrade gear, and to purchase items, but the truth of the matter is: I kept coming home just to hear a friendly human voice again.

Code Vein is one of many Dark Souls knockoffs, notorious mainly for the addition of anime waifus to the familiar formula. For all its abysmal enemy variety and boring level design, I loved playing Code Vein for one simple reason: I can bring a companion. Bringing a companion on my journey solved all my anxiety in-game. It didn’t matter so much that my digital companion got stuck in corners and committed suicide in boss fights and repeated the same half-dozen canned lines. The feeling that someone has my back let me enjoy a Souls-like world for entire afternoons at a time, and I almost never felt the need to teleport back home.

Upshot

Human beings explore too little. One common solution to this problem is Comfort Zone Expansion, gently straying beyond the boundary of your comfort zone to notice things out there are not as scary as you thought. This can be a fine solution, but is inadequate for situations where you are required to immediately leave your comfort zone far behind for long periods of time.

Perhaps you travel internationally for the first time. Perhaps you attend a self-improvement workshop with cult-ish vibes in a remote location. Perhaps you try to prove the Riemann hypothesis, and are beset on all sides by the diabolical malice inherent in the primes.

If so, remember that exploration anxiety is a function of how far (you feel like) you are from safety.

The little girl comfortably clambers around the playground when her mother is nearby. When the mother disappears, the girl curls up in fear and is incapable of sliding down the exact same slide she went down a minute ago. The slide didn’t change, the girl’s distance to safety did. Playing Dark Souls, I teleport to base after finding each new bonfire (checkpoint). Playing Code Vein, a companion follows me around and the psychological need to return home disappears. Every time a married person gives an acceptance speech, they thank their spouse for being “their rock.” This is a rather unflattering term for “unwavering center of my comfort zone.”

What are concrete applications of this principle?

  1. One purpose of collaboration is for each collaborator to serve as a mobile comfort zone for the others. This might explain why most successful startups are founded by a small group and not an individual or a larger group. The comfort zone effect hits rapidly diminishing returns past one trusted collaborator. In this lens, the purpose of open communication in collaboration is simply to feel psychologically close to the other person.
  2. The optimal solution to comfort zone expansion may be planting a small number of well-spaced “bases of operation.” Instead of continuously expanding one connected chunk of activity-space, plant comfort flags on the points of an ε-net. Comfort zones, like lighthouses and highway truck stops, cover more space if you place them far apart. If anxiety is a major limiting factor for you, consider focusing your energy on a small number of extremely different activities so that the comfort zones that radiate out of each one together cover as much space as possible.
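For the geometrically inclined, the "well-spaced comfort flags" of point 2 correspond to greedy farthest-point sampling, a standard way to build an ε-net. A toy Python sketch, with activities standing in as points (the particular coordinates are invented for illustration):

```python
def farthest_point_sample(points, k):
    """Greedy ε-net construction: repeatedly plant the next 'comfort flag'
    at the point farthest from every flag planted so far."""
    def dist2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    flags = [points[0]]  # arbitrary first base of operation
    while len(flags) < k:
        # for each candidate, its distance to the nearest existing flag
        farthest = max(points, key=lambda p: min(dist2(p, f) for f in flags))
        flags.append(farthest)
    return flags

# Activities as points on a line: clustered flags would waste coverage,
# so the greedy rule spreads them out.
activities = [(0.0,), (0.1,), (0.2,), (5.0,), (5.1,), (10.0,)]
print(farthest_point_sample(activities, 3))  # → [(0.0,), (10.0,), (5.0,)]
```

Note that the greedy rule never plants two flags near each other, mirroring the advice to pick a small number of extremely different activities.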

Part 2: Explore Versus Exploit

The previous part upgrades the usual model for comfort zone expansion, taking for granted the value of exploration.

This part turns to a new topic altogether: the internal conflict between the drive to explore and the drive to exploit.

Hereafter I assume a multi-agent model of the mind, and refer to two common subagents: the “explore” subagent which tends towards freedom, creativity, contrarianism, and chaos, and the “exploit” subagent which tends towards structure, discipline, lawfulness, and order. Whether to interpret these subagents as full-blown independent subpersonalities or merely conflicting desires within a single mind is entirely up to you, and should not affect the meaning of this post.
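For readers who know the vocabulary from machine learning: "explore" and "exploit" are borrowed from the multi-armed bandit problem, where the same tension appears in miniature. A minimal ε-greedy sketch, offered as my own illustration of the tradeoff (the arm means and parameters are arbitrary):

```python
import random

def epsilon_greedy(true_means, epsilon=0.1, steps=10_000, seed=0):
    """Minimal ε-greedy bandit: with probability ε pull a random arm (explore),
    otherwise pull the arm with the best observed average (exploit)."""
    rng = random.Random(seed)
    counts = [0] * len(true_means)
    totals = [0.0] * len(true_means)
    reward = 0.0
    for _ in range(steps):
        if rng.random() < epsilon or 0 in counts:
            arm = rng.randrange(len(true_means))            # explore
        else:
            arm = max(range(len(true_means)),
                      key=lambda i: totals[i] / counts[i])  # exploit
        r = rng.gauss(true_means[arm], 1.0)  # noisy payoff from the chosen arm
        counts[arm] += 1
        totals[arm] += r
        reward += r
    return reward / steps

# With ε = 0 the agent can lock onto a mediocre arm forever; with ε = 1 it
# never cashes in on what it has learned. The interesting region is between.
```

The four stories below are, loosely, what ε = 1, ε = 0.9, ε = 0.1, and ε = 0 look like when lived out as personalities.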

I begin with four archetypal stories to illustrate varying levels of imbalance between “explore” and “exploit.” These are amalgams of real stories from my own life and others’.

1. The Unmoored

The Unmoored was stifled as a girl, surveilled at every quarter-hour by parents, teachers, tutors, and coaches. Every day from dawn to dusk was packed with activity that would gentle her mind and ennoble her condition. As her fingers marched from piano to textbook to tennis racket, her mind danced farther and farther away into Wonderland.

But even her dreams are turned against her. After she shows a passing aptitude for story-telling, her parents sign her up for creative writing classes and poetry jams. The beloved characters in her flights of fancy are clamped into straightjackets and paraded onto stage to be judged by panels of condescending curmudgeons.

When she finally escapes her shackles, there is no temperance to her wildness, no second-guessing, no backward glances. She drops out of pre-med, then out of college, then out of polite society altogether. The world is joy and light and one-way airfares.

Many years later, she snaps awake in the lap of a street urchin in the outskirts of Ulaanbaatar. She’d had a strange dream: she was back in front of the piano playing Mendelssohn, and she liked that feeling of rote and mindless obedience to the sheet music. She shakes off that absurd notion and takes another hit.

2. The Dreamer

Like the Unmoored, the Dreamer yearns to be free, to one day live fully unconstrained to pursue his creative vision. He has not quite decided whether he wants to write screenplays, musicals, or novels; his creative side finds even the idea of making such a decision fettering. In the meantime, he works a programming job that makes good money.

Unlike the Unmoored, the Dreamer knows the instrumental value of discipline and constraint. Under the flickering lamplight, he writes short stories with a tomato timer by his side, following a classic book of writing prompts. The more exotic the prompt, the more alive he feels bending around it.

At times, the Dreamer worries that his day job is changing him. He cannot help but take joy in turning on his monitors in the morning, in passing code review on the first try, in following procedures to the letter to build something that comes alive before his very eyes.

When he notices himself feeling this way, the Dreamer clamps down on this joy and reminds himself, “I hate this boring, technical job. I’m only working it to get my art off the ground. One day, I will finally be free of this drudgery.”

The tomato timer goes off and he goes back to writing. He never considers that part of his soul might not want to be free.

3. The Magpie

Unlike the Unmoored and the Dreamer, the Magpie is primarily motivated by the joy and comfort of the known. She delights in decorating and redecorating her cozy little apartment, in organizing her books and plants in tidy rows, and in folding origami animals that she can leave around all the rooms, each a permanent addition to her little family.

One day, she hopes to add a few extra bedrooms and a full bathtub to that apartment, and a partner and children to that family. She is the Dreamer’s coworker, but unlike him she hopes to stay at this programming job for the rest of her career. She imagines inviting direct reports into her well-lit corner office to admire her cactus collection and ask for her advice on the operating system she helped architect.

The Magpie understands the instrumental value of exploration and creativity, but fears it. Every year, she takes a vacation to a scary new place to challenge herself, but more importantly to bring back souvenirs, pictures, and memories, to better decorate her place. At work, she pushes herself to learn new technologies and programming languages, but she yearns for the day she won’t have to do this any more.

The Magpie believes that at the end of the day, one should only explore until one finds the best place to nest.

4. The Recluse

Like the Magpie, the Recluse is primarily motivated by comfort and familiarity. But his world was constantly in flux for too long, so he does everything in his power to hide away in his comfort zone and wall off the rest of the world.

His parents divorced before he entered middle school, and he ping-ponged between their two new families, never truly belonging to either. Traveling, learning new things, meeting strangers all terrify him, and yet he’s forced to do more and more of all of these to survive.

When he finally finds a place to settle down, he never leaves it again. In every relationship, he becomes deeply codependent. He is a consistent member of a few local clubs; new faces join and leave, but he is one of the few who remain through the years. If he had his way, the clubs would meet in his living room and he’d never have to go outside.

Occasionally, he reconnects with an old friend who is a Dreamer or an Unmoored, and becomes deeply fascinated with their alien way of being. It is hard for him to understand how comfort and familiarity can be claustrophobic, but he’s glad he has these friends. They bring him new knowledge and experiences in a safe and digested way, or if not, at least they send him postcards.

Upshot

In each of the above four stories, there is an imbalance between the “explore” subagent and the “exploit” subagent.

The Unmoored lives to “explore.” The “exploit” subagent is greatly suppressed or externalized.

The Dreamer also lives to “explore”, but understands the instrumental value of “exploit.” He views “exploit” as an unsightly means to an end, and suppresses the needs of the personality (comfort, safety, order) associated with it.

The Magpie lives to “exploit,” but understands the instrumental value of “explore.” She views “explore” as a dangerous means to an end, and suppresses the needs of the personality (freedom, creativity, chaos) associated with it.

The Recluse also lives to “exploit.” The “explore” subagent is greatly suppressed or externalized.

In all four cases, one subagent or the other dominates the personality, and holds the other subagent and its needs in contempt. Internal alignment of the two subagents can only occur if the whole person recognizes not only the instrumental value of each subagent, but respects their needs as ends in themselves. Here is my loose prescription for alignment, which might be attempted with an exercise like Internal Double Crux:

  1. If you are near the extremes (the Unmoored or the Recluse), learn to recognize at least the instrumental value of the suppressed subagent. If you lean heavily towards exploring, recognize that more systematic exploiting can often make you better at exploring in the long run. Similarly, if you lean heavily towards exploiting, recognize that more systematic exploring can often make you better at exploiting in the long run. Hopefully, you will level up into the Dreamer or the Magpie.
  2. If you are near the middle (the Dreamer or the Magpie), learn to respect the needs of the weaker subagent as ends in themselves. If you lean towards exploring, realize that it’s genuinely ok for you to enjoy checking boxes, following rules, and tidying things up. If you lean towards exploiting, realize that it’s genuinely ok for you to enjoy trying crazy things, breaking rules, and making a mess.

Pain is not the unit of Effort

(Content warning: self-harm, parts of this post may be actively counterproductive for readers with certain mental illnesses or idiosyncrasies.)

What doesn’t kill you makes you stronger. ~ Kelly Clarkson.

No pain, no gain. ~ Exercise motto.

The more bitterness you swallow, the higher you’ll go. ~ Chinese proverb.

I noticed recently that, at least in my social bubble, pain is the unit of effort. In other words, how hard you are trying is explicitly measured by how much suffering you put yourself through. In this post, I will share some anecdotes of how damaging and pervasive this belief is, and propose some counterbalancing ideas that might help rectify this problem.

I. Anecdotes

1. As a child, I spent most of my evenings studying mathematics under some amount of supervision from my mother. While studying, if I expressed discomfort or fatigue, my mother would bring me a snack or drink and tell me to stretch or take a break. I think she took it as a sign that I was trying my best. If on the other hand I was smiling or joyful for extended periods of time, she took that as a sign that I had effort to spare and increased the hours I was supposed to study each day. To this day there’s a gremlin on my shoulder that whispers, “If you’re happy, you’re not trying your best.”

2. A close friend who played sports in school reports that training can be harrowing. He told me that players who fell behind the pack during daily jogs would be singled out and publicly humiliated. One time the coach screamed at my friend for falling behind the asthmatic boy who was alternating between running and using his inhaler. Another time, my friend internalized “no pain, no gain” to the point of losing his toenails.

3. In high school and college, I was surrounded by overachievers constantly making (what seemed to me) incomprehensibly bad life choices. My classmates would sign up for eight classes per semester when the recommended number is five, jigsaw extracurricular activities into their calendar like a dynamic programming knapsack-solver, and then proceed to have loud public complaining contests about which libraries are most comfortable to study at past 2am and how many pages they have left to write for the essay due in three hours. Only later did I learn to ask: what incentives were they responding to?

4. A while ago I became a connoisseur of Chinese webnovels. Among those written for a male audience, there is a surprisingly diverse set of character traits represented among the main characters. Doubtless many are womanizing murderhobos with no redeeming qualities, but others are classical heroes with big hearts, or sarcastic antiheroes who actually grow up a little, or ambitious empire-builders with grand plans to pave the universe with Confucian order, or down-on-their-luck starving artists who just want to bring happiness to the world through song.

If there is a single common virtue shared by all these protagonists, it is their superhuman pain tolerance. Protagonists routinely and often voluntarily dunk themselves in vats of lava, have all their bones broken, shattered, and reforged, get trapped inside alternate dimensions of freezing cold for millennia (which conveniently only takes a day in the outside world), and overdose on level-up pills right up to the brink of death, all in the name of becoming stronger. Oftentimes the defining difference between the protagonist and the antagonist is that the antagonist did not have enough pain tolerance and allowed the (unbearable physical) suffering in his life to drive him mad.

5. I have a close friend who often asks for my perspective on personal problems. A pattern arose in a couple of our conversations:

alkjash: I feel like you’re not actually trying. [Meaning: using all the tools at your disposal, getting creative, throwing money at the problem to make it go away.]

alkjash’s friend: What do you mean I’m not trying? I think I’m trying my best, can’t you tell how hard I’m trying? [Meaning: piling on time, energy, and willpower to the point of burnout.]

After several of these conversations went nowhere, I learned that asking this friend to try harder directly translated in his mind to accusing him of low pain tolerance and asking him to hurt himself more.

II. Antidotes

I often hear on the internet laments like “Why is nobody actually trying?” Once upon a time, I was honestly and genuinely confused by this question. It seemed to me that “actually trying” – aiming the full force of your being at the solution of a problem you care about – is self-evidently motivating and requires zero extra justification if you care about the problem.

I think I finally understand why so few people are “actually trying.” The reason is this pervasive and damaging belief that pain is the unit of effort. With this belief, the injunction “actually try” means “put yourself in as much pain as you can handle.” Similarly, “she’s trying her best” translates to “she’s really hurting right now.” Even worse, people with this belief optimize for the appearance of suffering. Answering emails at midnight and appearing fatigued at meetings are somehow taken to be more credible signals of effort than actual results. And if you think that’s pathological, wait until you meet someone for whom telling them about opportunities actively hurts them, because you’ve just created another knife they feel pressured to cut themselves with.

I see a mob of people walking up to houses and throwing themselves bodily at the closed front doors. I walk up to block one man and ask, “Stop it! Why don’t you try the doorknob first? Have you rung the doorbell?” The man responds in tears, nursing his bloody right shoulder, “I’m trying as hard as I can!” With his one good arm, he shoves me aside and takes a running start to lunge at the door again. Finally, the timber shatters and the man breaks through. The surrounding mob cheers him on, “Look how hard he’s trying!”

Once you understand that pain is how people define effort, the answer to the question “why is nobody actually trying?” becomes astoundingly obvious. I’d like to propose two beliefs to counterbalance this awful state of affairs.

1. If it hurts, you’re probably doing it wrong.

If your wrists ache on the bench press, you’re probably using bad form and/or too much weight. If your feet ache from running, you might need sneakers with better arch support. If you’re consistently sore for days after exercising, you should learn to stretch properly and check your nutrition.

Such rules are well-established in the setting of physical exercise, but their analogs in intellectual work seem to be completely lost on people. If reading a math paper is actively unpleasant, you should find a better-written paper or learn some background material first (most likely both). If you study or work late into the night and it disrupts your circadian rhythm, you’re trading off long-term productivity and well-being for low-quality work. That’s just bad form.

If it hurts, you’re probably doing it wrong.

2. You’re not trying your best if you’re not happy.

Happiness is really, really instrumentally useful. Being happy gives you more energy, increases your physical health and lifespan, makes you more creative and risk-tolerant, and (even if all the previous effects are unreplicated pseudoscience) causes other people to like you more. Whether you are tackling the Riemann hypothesis, climate change, or your personal weight loss, one of the first steps should be to acquire as much happiness as you can get your hands on. And the good news is: at least anecdotally, it is possible to substantially raise your happiness set-point through jedi mind tricks.

Becoming happy is a fully general problem-solving strategy. And although one can in principle trade off happiness for short bursts of productivity, in practice this is never worth it.

Culturally, we’ve been led to believe that over-stressed and tired people are the ones trying their best. It is right and proper to be kind to such people, but let’s not go so far as to support the delusion that they are inputting as much effort as their joyful, boisterous peers bouncing off the walls.

You’re not trying your best if you’re not happy.

Is Success the Enemy of Freedom? (Full)

I. Parables

A. Anna is a graduate student studying p-adic quasicoherent topology. It’s a niche subfield of mathematics where Anna feels comfortable working on neat little problems with the small handful of researchers interested in this topic. Last year, Anna stumbled upon a connection between her pet problem and algebraic matroid theory, solving a big open conjecture in the matroid Langlands program. Initially, she was over the moon about the awards and the Quanta articles, but now that things have returned to normal, her advisor is pressuring her to continue working with the matroid theorists with their massive NSF grants and real-world applications. Anna hasn’t had time to think about p-adic quasicoherent topology in months.

B. Ben is one of the top Tetris players in the world, infamous for his signature move: the reverse double T-spin. Ben spent years perfecting this move, which requires lightning fast reflexes and nerves of steel, and has won dozens of tournaments on its back. Recently, Ben felt like his other Tetris skills needed work and tried to play online without using his signature move, but was greeted by a long string of losses: the Tetris servers kept matching him with the other top players in the world, who absolutely stomped him. Discouraged, Ben gave up on the endeavor and went back to practicing the reverse double T-spin.

C. Clara was just promoted to be the youngest Engineering Director at a mid-sized software startup. She quickly climbed the ranks, thanks to her amazing knowledge of all things object-oriented and her excellent communication skills. These days, she finds her schedule packed with what the company needs (back-to-back high-level strategy meetings preparing the optics of the next product launch) instead of what she loves (rewriting whole codebases in Haskell++).

D. Deborah started her writing career as a small-time crime novelist, who split her time between a colorful cast of sleuthy protagonists. One day, her spunky children’s character Detective Dolly blew up in popularity due to a Fruit Loops advertising campaign. At the beginning of every month, Deborah tells herself she’s going to finally kill off Dolly and get to work on that grand historical romance she’s been dreaming about. At the end of every month, Deborah’s husband comes home with the mortgage bills for their expensive bayside mansion, paid for with “Dolly money,” and Deborah starts yet another Elementary School Enigma.

E. While checking his email in the wee hours of the morning, Professor Evan Evanson notices an appealing seminar announcement: “A Gentle Introduction to P-adic Quasicoherent Topology (Part the First).” Ever since being exposed to the topic in his undergraduate matroid theory class, Evan has always wanted to learn more. He arrives bright and early on the day of the seminar and finds a prime seat, but as others file into the lecture hall, he’s greeted by a mortifying realization: it’s a graduate student learning seminar, and he’s the only faculty member present. Squeezing in his embarrassment, Evan sits through the talk and learns quite a bit of fascinating new mathematics. For some reason, even though he enjoyed the experience, Evan never comes back for Part the Second.

F. Whenever Frank looks back to his college years, he remembers most fondly the day he was kicked out of the conservative school newspaper for penning a provocative piece about jailing all billionaires. Although he was a mediocre student with a medium-sized drinking problem, on that day Frank felt like a man with principles. A real American patriot in the ranks of Patrick Henry or Thomas Jefferson. After college, Frank met a girl who helped him sort himself out and get sober, and now he’s the proud owner of a small accounting firm and has two beautiful daughters, Jenny and Taylor. Yesterday, arsonists set fire to the Planned Parenthood clinic across the street, and his employees have been clamoring for Frank to make a political statement. Frank almost threw caution to the wind and tweeted #bodilyautonomy from the company account right there, but then the picture on his desk caught his eye: his wife and daughters at Taylor’s elementary school graduation. It’s hard to be a man of principles when you have something to lose.

G. Garrett is a popular radio psychologist who has been pressured by his sponsors into being the face of the yearly Breast Cancer Bike-a-thon. Unfortunately, Garrett has a dark secret: he’s never ridden a bicycle. Too embarrassed to ask anyone for help or even be seen practicing – he is a respected public figure, for god’s sake – Garrett buys a bike and sneaks to an abandoned lot to practice by himself after sunset. He thinks to himself, “how hard can it be?” Garrett shatters his ankle ten minutes into his covert practice session and has to pull out of the event. Fortunately, Garrett’s sponsors find an actual celebrity to fill in for him and breast cancer donations reach record highs.

II. Motivation

What is personal success for?

We say success opens doors. Broadens horizons. Pushes the envelope. Shatters glass ceilings.

Success sets you free.

But what if it doesn’t?

Take a good hard look at the successful people around you. Doctors too busy to see their children on weekdays. Mathematicians too brilliant in one field to switch to another. Businessmen too wealthy to avoid nightly wining and dining. Professional gamers too specialized to learn a new hero. Public figures too popular to change their minds.

Remember that time Michael Jordan took a break from basketball and played professional baseball? They said he would have made an excellent professional player given time. Jordan said baseball was his childhood dream. Even so, in just over a year Jordan was back in basketball. It is hard not to wonder what kind of baseball player Michael Jordan could have been, had he been less successful going in.

I think it was in college that I first noticed something wasn’t right about this picture. I spent my first semester studying and playing Go for about eight hours a day. I remember setting out a goban on the carpet of my dorm room and studying patterns in the morning as my roommate left for classes; when he returned to the room in the evening, he was surprised to see me still sitting there contemplating the flow of the stones. Because this was not the first or tenth time this had happened, he commented something like, “You must be really smart to not need to study.”

I remember being dumbstruck by that statement. It suggested that my freedom to play board games for eight hours a day was gated by my personal success, and other Harvard students would be able to live like me if only they were smarter. But you know who else can play board games for eight hours a day? Basement-dwelling high school dropouts, who are – for all their unsung virtues – definitely not smarter than Harvard students.

When I entered college, they told me a Harvard education would empower me to do anything I want. The world would be my oyster. I took that message to heart in those four years – I fell in love, played every PC game that money could buy, studied programming languages and systems programming, and read more than one Russian novel. When I talked to my peers, however, I was constantly surprised at the overwhelming sameness of their ambitions. Four years later, twenty out of thirty-odd graduating seniors at our House planned to work in finance or consulting.

(Now, it could be that college really empowers these bright young scholars to realize their childhood dreams of arbitraging the yen against the kroner. But this is, as they say in the natural sciences, definitely not the null hypothesis.)

All of this would have made a teenager hate the idea of success altogether. I was not a teenager anymore, so I formulated a slightly more sophisticated answer: Regardless of how successful I become, I resolve to live like a failure.

This is a post about all the forces, real and imagined, that can make success the enemy of personal freedom. As long as these forces exist, and as long as the human heart yearns for liberty, few people will ever want wholeheartedly to succeed. Were it not already reality, that is a state of affairs too depressing to contemplate.

(Just to be clear, people are plenty motivated to succeed when basic needs are at stake – to put food on the table, to get laid, to pay for the mortgage. But after those needs get met, success just doesn’t look all that great and only certain sorts of delightful weirdos keep striving. The rest of us mostly just lie back and enjoy the fruits of their labor.)

III. Factorization

I think all of the experiences in Section I can be summed up by the umbrella term “Sunk Cost Fallacy,” but that theory is a little too low-resolution for my tastes. In this section I identify three main psychological factors of the phenomenon.

1. You rose to meet the challenge. Your peer group rose to meet you.

We are constantly sorted together with people of the same age group, at similar levels of competence, at similar stages in our careers. To keep up with the group, you have to run as fast as you can just to stay in place, as the saying goes. And if you run twice as fast as that, you just end up in a new, even harder-to-impress peer group. When your friends are all level 80, it’s dreadfully difficult to restart at level 1.

Your friends may even be sympathetic, but it rarely helps matters.

Maybe you want to try something totally new, and your friends are too invested in their pet genre to emigrate with you.

Maybe you’re excited to learn a new skill one of your hyper-competent friends is specialized in, and you ask them to coach you. Unfortunately, this turns out to be a massive mistake, because your friend only remembers how she got from level 75 to 80, and sort of assumes everything below is trivial. It’s technically possible to learn area formulas as a special case of integral calculus, but only technically.

Maybe you transition to a new role within the team, you struggle to learn a new set of tricks, and you start hating yourself for not pulling as much weight as you’re used to. You start to see a mix of pity and frustration in your teammates’ eyes as you drag the whole team down.

2. Yesterday, you were bad at everything, and that really sucked. Today you’re good at one thing, and you’re hanging on for dear life.

It’s hard to move out of your comfort zone when your comfort zone is one hundred square feet on top of Mount Olympus and every cardinal direction points straight off a cliff. Seems like just yesterday you stood at the base of this mountain among the rest of the mortals, craning your neck to get a peek at what it’s like up here.

Kindly god-uncle Zeus calls a special thunderstorm for your arrival. Dionysus pours you a frothy drink and shares a bawdy tale. Hephaestus personally fashions you a blade as a symbol of your newfound status. Aphrodite invites you to her parlor for a night of good old-fashioned philosophy. They all act so welcoming, so natural, so in their element, and you know you’re only up here by a stroke of pure luck.

When Hermes returns the next morning and invites you to fly with him on his winged boots to see the world, you decline graciously. Not because you don’t want to – they’re winged boots! – but because the moment you try anything out of the ordinary you’ll be found out for the impostor that you are and god-uncle Zeus will show you his not-so-kindly side and chain you to a liver-eating eagle or a boulder that only obeys the laws of gravity intermittently.

3. Success gave you something to lose.

They say beware the man with nothing to lose.

I say envy him, because he alone is free.

You fondly recall the good old days of two thousand and two when you could go online and post diatribes against religion as a “militant atheist.” In those days, you had nothing, and you were free. You were unattached. You were intellectually wealthy but financially insolvent. You could see one end of the place you call home from the other.

Now that you’ve made it big, you’d have to carefully position mirrors at the ends of three hallways to see that far. You’re attached to wonderful person(s) of amenable sexual orientation(s). You have a reputation to maintain in the ever-smaller circles that you walk. Children in your community look up to you, or so you tell yourself. And so, even though deep in your heart you still believe that only idiots believe in an old man in the sky, your Twitter profile identifies you as “spiritual, yearning, exploring.”

IV. Resolution?

It seems to me we have a problem.

We are not a species known for risk-taking, so human flourishing really depends on an explicit emphasis on exploration and openness to new experience. And yet it seems that the game is set up so that the most successful people are the least incentivized to explore further. That all the trying new things and pushing boundaries and calling for revolution is likely to come from those with neither the power to get it done nor the competence to do it correctly.

But it’s not a hopeless case by any means. Many of the most successful people got there precisely by valuing freedom, creativity, and exploration, and still practice these values – so far as they can – within the confines of their walled gardens. We live in an information age where getting good at things is as easy as it’s ever been. And at the very least we pay lip service to healthy adages like “Stay hungry, stay foolish.”

But what does one do personally to maintain one’s freedom?

I don’t claim to have a fully general solution to this problem, but here is a rule that’s helped me in the past.

When learning something new, treat yourself like a five-year-old.

If you’ve never spoken a word of Korean in your life, it doesn’t matter if you’re a professor of English Literature. As far as learning Korean goes, you’re a five-year-old. Treat yourself like one. Make yourself a snack for memorizing the vertical vowels. Take a break after reading your first sentence and come back tomorrow. When you’re done for the day, suck your thumb while staring at the first Korean word you’ve ever learned and feel the honest pride well up in your heart.

If you’ve never washed a dish in your life, it doesn’t matter if you’re a professional chef. As far as washing dishes is concerned, you’re a five-year-old. Treat yourself like one. Make yourself a snack for figuring out how to dispense dish soap without getting it everywhere. Take a break after finishing the bowls and come back tomorrow. When you’re all done, take a moment to take in that beautiful empty sink and feel the honest pride well up in your toddler heart.

Do you see how profoundly counterproductive it would be for the Korean learner to beat herself up for not being able to converse fluently with her Asian friends after two weeks? Do you see how completely unkind it would be for the novice dishwasher to call himself a useless piece of shit for not being able to execute the most basic of adult tasks?

Be kind to yourself and adjust your expectations to reality. When learning something new, treat yourself like a five-year-old.

Of Math and Memory, Part 3 (Final)

In Part 1, we noted how extraordinarily taxing mathematics can be on short-term working memory. Being able to hold one extra Greek letter in your head can make the difference between following a lecture and getting completely lost. Having background and mathematical maturity means many standard techniques need not be forcibly remembered, freeing up space for the few genuinely novel ideas.

In Part 2, we gave a simple conceptual model for short-term memory, based on a fundamental principle of information theory: compression is equivalent to prediction. The more predictable the data is (or the better you get at predicting it), the less new information you have to store.

What do these ideas mean concretely for mathematicians? In this concluding post, we give a practical algorithm for making the most of your short-term memory in mathematics, which I call dyadic scanning.

We will start with the concrete problem of how to read a paper, and later generalize to how to write papers, how to listen and give talks, and how to have mathematical conversations, all while making the most of our short-term memory.

Dyadic Scanning

Consider the markings on a standard ‘Murican ruler:


The two longest vertical lines mark the beginning and end of an inch. It is then divided dyadically into half-inches, quarter-inches, eighth-inches, and sixteenth-inches by progressively smaller “teeth.”

A mathematical paper is organized much like the markings on a ruler: first it is divided into a few main theorems, each of which is divided into several major lemmas, which are then interspersed with minor or technical lemmas and definitions, themselves pieced together from many tedious details. The obvious and standard way of reading a paper is a sequential scan:


Sequential Scan (BAD)

Up until a couple years ago, this was how I attempted to read any given paper: read it through once from beginning to end, pausing on each detail and tracing it down to the lowest level until I could follow it line-by-line. The sequential scan is a fairly useful way to build foundations and mathematical maturity: one spends a lot of time piecing together details and developing a taste for rigor. It is, however, a misguided and inefficient approach to reading mathematics in general.

“I could follow line by line, but I have no idea what’s going on” is a common complaint that comes out of reading mathematics like this.

What are the downsides of the sequential scan?

You get easily lost in details, missing the forest for the trees.

You sacrifice agency, accepting the order in which ideas are presented as if from on high.

Most importantly, you don’t know where you’re going.

You can’t ask “Where was condition (a) of Lemma 2 used in the proof of Lemma 3? Can we weaken it?” if you don’t even know that Lemma 2 only exists to prove Lemma 3 later on. This is the kind of question that senior mathematicians are asking all the time to shore up their understanding.

Unless the paper is very cleverly and thoughtfully written, reading it sequentially is going in blind. You will have the hardest possible time predicting each next step, and therefore have to bear the heaviest possible burden remembering every detail. Without knowing what will be used where or how, you will have to default to remembering everything.

Here is a more enlightened way to read a paper, which I call dyadic scanning.


Dyadic Scan (GOOD)

Instead of reading the paper in a single pass, you split your reading into logarithmically many passes of progressively higher resolution. In the first pass, you figure out the overarching organization and the main results. In the second, you locate the main lemmas, how they fit together, and where the genuine innovations of the paper are. In the third, you piece together how all the minor technical lemmas are involved in the proof, and where each one is relevant. Only in the fourth pass, or later, do you dig into the details of the rigorous proofs. More passes are added if the paper is especially dense, or if you’re especially unfamiliar with the field. The longer and more technical the paper is, the longer you should wait before diving into details.
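To make the pass structure concrete, here is a toy sketch (the tree encoding and the function name are my own invention for illustration, not part of any formal method): model a paper as a tree of headings, and let pass d collect everything down to depth d.

```python
# A hypothetical paper as a tree of (title, children) nodes.
paper = ("Paper", [
    ("Theorem 1", [
        ("Lemma 2", [("technical detail A", [])]),
        ("Lemma 3", [("technical detail B", [])]),
    ]),
    ("Theorem 4", []),
])

def dyadic_pass(node, depth):
    """Collect every heading down to `depth` levels below the root."""
    title, children = node
    if depth == 0:
        return [title]
    return [title] + [t for c in children for t in dyadic_pass(c, depth - 1)]

# Pass 0 sees only the title; each later pass reads one level deeper,
# and the technical details appear only in the final pass.
for d in range(4):
    print(f"Pass {d}: {dyadic_pass(paper, d)}")
```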

Why make dyadic scans?

You know where you’re going. If you read mathematics this way, you will know what each line of mathematics is for before digging into why it’s true. Knowing the purpose of Lemma 2 lets you figure out which terms are important and which terms are negligible error terms. Knowing when you’re done using that functional equation means you can free up that memory for something new.

You’re forced to develop an eye for what matters. Not every paper is written so that the main lemmas stand out from the minor lemmas, which in turn stand out from the boring technical details. Doing dyadic scans forces you to develop a taste for what matters, and to discriminate between the innovative and the boilerplate.

You read in an order closer to how mathematics is actually done. Very few proofs are devised in the order they’re presented. There may be an important and difficult technical lemma that requires a seven-page calculation using the calculus of variations, but I can guarantee you that the author did not work out the details of this argument before fleshing out the main arc of the proof first. Reading via dyadic scans reinforces the habit of keeping the big picture in your head at all times. The rigorous correctness of each detail matters less than you think – often any given technical argument can be done in several other ways.

Generalizations

For clarity, I’ve focused on dyadic scanning for reading papers. The method applies equally well to other settings. I will not bore you with the details, but here is a sketch of how.

On paper-writing

A well-written paper should make it easy for the reader to pick out its dyadic structure. To some extent, this is already standard practice: the abstract is an outline for the introduction, which is an outline for the entire paper. A great example where such a structure is pushed further is this paper of Tao’s, which proves an almost version of the infamous Collatz conjecture. It is a fairly dense 49-page paper, so in addition to the standard abstract and introduction, there is a ten-page extended outline detailing the main arc of the proof and highlighting the important ideas.

When arguments get even longer than this, it is not uncommon to see a proof divided among multiple papers, with the first one serving as an extended introduction for the whole series.

This does not mean that a paper should necessarily be written in the explicit order of the dyadic scans: all the main theorems first, then the main lemmas, then the minor lemmas, with the technical arguments bunched up at the very end. Often this results in a very unnatural structure that obfuscates the dependencies between ideas. It may be better to split the paper into subsections which are as functionally independent as possible, and to carefully point out the dependencies and relative importance of the various parts as they appear.

On giving and receiving talks

Attending talks is even more taxing on working memory than reading papers: the audience will generally have a wider variation in background; they will rarely have the luxury of pen and paper; and regardless of whether the talk is given via slides or a blackboard, only a fraction of the total content is visible at any given time.

It is therefore even more essential when giving talks that the audience knows exactly where you’re going. Be sure to have multiple levels of signposting and continuous reminders of how each argument fits into the big picture. Most of the details in the lower levels should be omitted altogether unless they are absolutely essential.

On the receiving end, my general advice is that trying to follow every word in a talk is a mistake akin to making a single sequential pass in reading a paper. Instead, treat going to a talk as taking a single dyadic pass into a topic, out of potentially many. Which pass to treat it as depends on your current level of exposure. If you are a beginner, watch the talk as if taking the first dyadic pass, noting the key words and main ideas and how they fit together. If you have some exposure to the field, you can treat the talk like a second or third pass, paying attention to the major details and innovations. If you are an expert in the field, almost nothing in the talk will be new to you, and you can really dig into the key details or open problems. Be realistic about how much you expect to get out of the talk, and plan accordingly for what to focus on.

On mathematical conversations

Much of the advice in this post applies just as well to the more informal setting of a mathematical conversation, where often one person must convey a fairly complicated argument verbally to one or more others. The key difference in this setting is that the listener(s) can play a much more active role.

The simplest level of active listening is asking for clarifications and more details when things are not clear. A more sophisticated level of active listening is asking directly for the pieces of the dyadic structure that one is missing: if the speaker dives into the details of a lemma before its purpose is made clear, it is often correct for you to ask for that purpose instead.

In Conclusion

In all of these activities, the fundamental resource is your very limited working memory, and the more you can predict, the less you have to remember. By looking ahead, by asking for clarification, by making multiple passes, we can “cheat” and see the future, freeing up our memory for what really matters.

Of Math and Memory, Part 2

Last time, I wrote that having a good memory is essential in mathematics.

Today I will describe my model for working memory.

Compression and Prediction

Data compression is the science of storing information in as few bits as possible. I claim that optimizing your working memory is mainly a problem of data compression: there’s a bounded amount of data you can store over a short period of time, and the problem is to compress the information you need so that this storage is as efficient as possible.

One of the fundamental notions in data compression is that compression is equivalent to prediction. Another way of saying this is: the more you can predict, the less you have to remember.

Here are three examples.

I. Text compression

Cnsdr ths prgrph. ‘v rmvd ll th vwls nd t rmns bsclly rdbl, bcs wth jst th cnsnnts n cn prdct wht th mssng vwls wr. Th vwls wr rdndnt nd cld b cmprssd wy.

All text compression algorithms work basically the same way: they store a smaller amount of data from which the rest of the information can be predicted. The better you are at predicting the future, the less arbitrary data you have to carry around.
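The equivalence is easy to witness with any off-the-shelf compressor (zlib here is just a convenient stand-in; the post doesn’t name a specific one): perfectly predictable data compresses to almost nothing, while unpredictable data barely compresses at all.

```python
import os
import zlib

# 10,000 bytes of perfectly predictable text: once you've seen "ab",
# you can predict the rest, so there is almost nothing to store.
predictable = b"ab" * 5000

# 10,000 bytes of random data: nothing predicts anything, so the
# compressor must store essentially every byte verbatim.
unpredictable = os.urandom(10000)

print(len(zlib.compress(predictable)))    # a few dozen bytes
print(len(zlib.compress(unpredictable)))  # roughly 10,000 bytes
```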

II. Memory for Go

Every strong amateur Go player can, after a slow-paced game, reproduce the entire game from memory. An average game consists of between one and two hundred moves, each of which can be placed on any of the 19×19 grid points.


A typical amateur game, midway through.

Anyone who practices playing Go for a year or two will gain this amazing ability. It is not because their general memory improved either: if you showed them a sequence of nonsensical, randomly generated Go moves, they would have almost as hard of a time remembering them as an absolute novice.

The reason it’s so easy to remember your own games is because your own moves are so predictable. Given a game state, you don’t have to actually remember the coordinates where the stone landed. You just have to think “what would I do in this position?” and reproduce the train of thought.

The only moves in the game you really need to explicitly store in memory are the “surprising” moves that you didn’t expect. Surprise, of course, is just another word for entropy. The better you are at prediction, the less surprise (entropy) you’ll meet, and the less you have to remember.

III. Mathematical theorems

A general feature of learning things well is that you get better at predicting. Fill in the blank:

If a and b are both the sum of two squares, then so is ___.

A beginning student looks at this statement and recalls that the answer is ab, retrieving it directly from memory.

A practiced number theorist doesn’t need to store this exact statement directly in memory; instead, they know that any of an infinite variety of such statements can be reconstructed from a small number of core insights. Here, the two core insights are that a sum of two squares is the norm of a Gaussian integer, and that norms are multiplicative.

Getting better at prediction in mathematics often follows the same general pattern: identifying the small number of core truths from which everything else follows.
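As a sanity check of those two insights (the helper function below is my own illustration, not anything from the post): the norm of a + bi is a² + b², and since norms are multiplicative, the product of two sums of two squares is again a sum of two squares, with an explicit formula.

```python
# If m = a^2 + b^2 and n = c^2 + d^2, then m and n are norms of the
# Gaussian integers a+bi and c+di. Their product (a+bi)(c+di) has
# norm m*n, which unpacks to m*n = (ac - bd)^2 + (ad + bc)^2.
def product_as_sum_of_squares(a, b, c, d):
    return (a * c - b * d) ** 2 + (a * d + b * c) ** 2

# 5 = 1^2 + 2^2 and 13 = 2^2 + 3^2, so 65 = 5 * 13 is (-4)^2 + 7^2.
assert product_as_sum_of_squares(1, 2, 2, 3) == 5 * 13 == 65
```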


We reduced the problem of improving your working memory to the problem of predicting the future. At face value, this reduction seems less than useless, because predicting the future is harder than memorizing flash cards. Thankfully, human beings are embodied agents who can interact with our world. In particular, we can cheat by instead making the world easier to predict.

More on this next time.

Of Math and Memory, Part 1

Memory is not sexy in mathematics.

“Rote memorization” is the most degrading slur you can fling at a math class. “Reciter of digits of pi” is the most awful caricature of mathematicians in the public eye. In grad school, the cardinal sin is to read a paper with a focus on memorizing names and results: we are bombarded with exhortations like “if you learned the Arzelà-Ascoli theorem deeply, it would be impossible to forget.” Apparently, if you really understood mathematics, everything (down to the accents on the names of 19th-century Italian mathematicians) would be so natural as to render rote memorization completely unnecessary.

All these attitudes can be quite detrimental to the young mathematician who, at the end of the day, needs to memorize an enormous amount of arbitrary data in order to get up to speed in their field. In this post, I will tell some archetypal stories about how memory, especially short-term working memory, is perhaps the scarcest resource in mathematical work.

In a future post, I will attempt to provide some solutions to address this scarcity.

Anecdotes

Have you ever tried to copy a phone or bank account number from one place to another, without the benefit of Ctrl-C? You stare at the number for ten seconds, repeating it back to yourself in a rap-like rhythm. That sick beat, you hope, will help you remember an extra digit or two.

Conjure up that feeling of impending doom as you repeat those numbers back to yourself, knowing full well you can’t move 10 digits in one go. That’s the feeling of not having enough working memory. It’s the same feeling in each of the following scenarios.

These are not technically true stories, but they are all pieced together from literally true events.


A brilliant analytic number theorist is half-way through a riveting talk on the distribution of low-lying zeroes of L-functions. About three-quarters of the way through the blackboard space, the speaker finally switches gears from giving motivation and carefully treads into a long, technical calculation. Every Cauchy-Schwarz application and Fourier transform is clearly explained and surprisingly simple, until –

Uh oh!

The speaker reaches the bottom of the blackboard and begins erasing. You can almost hear the collective sigh of despair as most of the listeners think the same thought.

We’ve reached the end of the line.

After half of the calculations are erased, only a handful of senior mathematicians who know the subject inside and out can follow the rest of the talk.


Three mathematicians are throwing around ideas in a meeting. One is suddenly struck by inspiration, and starts explaining how to carry out a tricky change-of-variables. Another joins in with excitement, quickly catching on and offering a crude approximation which simplifies things significantly. All of this is happening in the air, so to speak. Writing things down would severely hamper their progress.

The third person, a younger graduate student, has a number of questions about the equations everyone’s keeping in their heads. The first time they ask for clarification, they are reminded gently that all calculations are in characteristic p. The second time, they are informed of a standard fact about eigenvalues of random matrices, and given a minute to catch up.

The third time, they can’t remember whether x was defined in the Fourier domain. They don’t ask.

In the next meeting, there are only two mathematicians.


I’m out for lunch, and need to attend a seminar talk afterwards. My weekly meeting with my PhD adviser is two hours away, and I haven’t made any progress this week. Away from pen and paper, I rack my brain and scrape the bottom of the proverbial barrel for any stray thought that might be worth presenting to him.

By some miracle, a casual remark during lunch sets off a series of revelations. I begin methodically working out the details in my head, getting more and more excited that I’m onto something. I completely ignore the seminar talk, running back and forth over the calculations in my mind. I get more and more confident that it works.

I walk into my adviser’s office and try to explain the idea to him, only to realize that I’d forgotten an essential intermediate step and mixed up two important variables. I get up to the board and attempt to work things out from the beginning, but I’m so flustered by this point that I keep forgetting what I’m doing.

We spend the hour going back and forth on minor technicalities, trying to see if there's anything to my idea. In the end, my adviser becomes pessimistic that there's anything to it at all and gently shoos me out for his next meeting.

When I get back to my office afterwards, I pull out pen and paper to try to salvage the idea.

I figure out all the details in fifteen minutes.

Problem Statement

It is difficult to collaborate with someone with significantly more or less short-term memory. Someone with more will appear to skip ahead three steps at a time, and you will continually feel in their debt for asking them to explain details. Conversely, someone with less will often ask you to rewind and write ideas down that you find inessential.

It’s difficult to read a mathematical paper without a good short-term memory. A reader who needs to keep referring back to the statement of Lemma 4.3(a) does not have the mental capacity to think about the big picture. If the paper is improperly structured, introduces clumsy notation, or is liberally sprinkled with abstruse citations, trying to follow it can feel like taking a forgetful random walk. How many times will I flip back to the conventions section before I remember the difference between S_k and \mathcal{S}_k?

It's difficult to either follow or give a mathematical talk without a good short-term memory. An audience member can get lost by zoning out briefly and losing track of an important definition or theorem statement. A speaker who doesn't remember the contents of their slides constantly reads off them and has no attention left to pay to the audience. Worse, an audience question about a previous slide breaks the artificial flow of the talk and causes a minor catastrophe.

People often worry that they cannot do mathematics because they are not clever enough. This is a very serious worry, because as far as we know everyone is born with a certain amount of clever and nobody really knows how to get more.

I think people should instead worry they cannot do mathematics because their memories are too poor. And I think this is very good news, because memory can be trained, and deficiencies in memory can be optimized around.

To be continued…