<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://joellaity.com/feed.xml" rel="self" type="application/atom+xml" /><link href="https://joellaity.com/" rel="alternate" type="text/html" /><updated>2026-01-01T03:26:27+00:00</updated><id>https://joellaity.com/feed.xml</id><title type="html">Joel Laity</title><subtitle></subtitle><entry><title type="html">Music cheat sheet</title><link href="https://joellaity.com/2025/12/29/music-notes.html" rel="alternate" type="text/html" title="Music cheat sheet" /><published>2025-12-29T00:00:00+00:00</published><updated>2025-12-29T00:00:00+00:00</updated><id>https://joellaity.com/2025/12/29/music-notes</id><content type="html" xml:base="https://joellaity.com/2025/12/29/music-notes.html"><![CDATA[<h2 id="keyboard-diagram">Keyboard diagram</h2>

<p><img src="/assets/keyboard.jpg" alt="Keyboard" /></p>

<h2 id="keys">Keys</h2>

<table>
  <thead>
    <tr>
      <th>Major Key</th>
      <th>F♯</th>
      <th>C♯</th>
      <th>G</th>
      <th>D</th>
      <th>A</th>
      <th>E</th>
      <th>B</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Minor key</td>
      <td>D♯</td>
      <td>A♯</td>
      <td>E</td>
      <td>B</td>
      <td>F♯</td>
      <td>C♯</td>
      <td>G♯</td>
    </tr>
    <tr>
      <td>Sharp count</td>
      <td>6</td>
      <td>7</td>
      <td>1</td>
      <td>2</td>
      <td>3</td>
      <td>4</td>
      <td>5</td>
    </tr>
  </tbody>
</table>

<p><img src="/assets/csharp_major.cropped.svg" alt="C# Major" /></p>
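<p>The major/minor pairings in the table above follow a fixed rule: the relative minor of a major key sits three semitones below its tonic (equivalently, on the sixth degree of the major scale). A quick sketch in Python (using sharp-only note spellings, which matches the sharp-key table but not the flat keys):</p>

```python
# Relative minor = tonic shifted down 3 semitones (== up 9, mod 12).
# Spellings are simplified to sharps only.
NOTES_SHARP = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def relative_minor(major_root):
    i = NOTES_SHARP.index(major_root)
    return NOTES_SHARP[(i + 9) % 12]  # down a minor third

print(relative_minor("G"))   # E
print(relative_minor("F#"))  # D#
```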

<table>
  <thead>
    <tr>
      <th>Major Key</th>
      <th>B♭</th>
      <th>E♭</th>
      <th>A♭</th>
      <th>D♭</th>
      <th>G♭</th>
      <th>C♭</th>
      <th>F</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Minor key</td>
      <td>G</td>
      <td>C</td>
      <td>F</td>
      <td>B♭</td>
      <td>E♭</td>
      <td>A♭</td>
      <td>D</td>
    </tr>
    <tr>
      <td>Flat count</td>
      <td>2</td>
      <td>3</td>
      <td>4</td>
      <td>5</td>
      <td>6</td>
      <td>7</td>
      <td>1</td>
    </tr>
  </tbody>
</table>

<p><img src="/assets/cflat_major.cropped.svg" alt="Cb Major" /></p>

<h2 id="distances-between-consecutive-notes-in-scales">Distances between consecutive notes in scales</h2>

<h3 id="major-scales">Major scales</h3>

<p>The distances between the notes in a major scale are tone, tone, semitone, tone, tone, tone, semitone.</p>

<p><img src="/assets/cmajor_scale.cropped.svg" alt="C Major Scale" /></p>

<h3 id="harmonic-minor-scales">Harmonic minor scales</h3>

<p>The distances between the notes in a harmonic minor scale are tone, semitone, tone, tone, semitone, tone and a half, semitone.</p>

<p><img src="/assets/aminor_scale.cropped.svg" alt="A Minor Scale" /></p>
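<p>The two interval patterns above can be sketched in a few lines of Python. A semitone is one step and a tone is two; note spellings are simplified to sharps only:</p>

```python
NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

MAJOR = [2, 2, 1, 2, 2, 2, 1]           # tone, tone, semitone, tone, tone, tone, semitone
HARMONIC_MINOR = [2, 1, 2, 2, 1, 3, 1]  # the 3 is the tone-and-a-half

def scale(root, pattern):
    # Walk the pattern of semitone distances from the root.
    i = NOTES.index(root)
    out = [root]
    for step in pattern:
        i = (i + step) % 12
        out.append(NOTES[i])
    return out

print(scale("C", MAJOR))           # C D E F G A B C
print(scale("A", HARMONIC_MINOR))  # A B C D E F G# A
```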

<h2 id="terms">Terms</h2>

<table>
  <thead>
    <tr>
      <th>Italian term</th>
      <th>Abbrev. or sign</th>
      <th>Meaning</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Andante</td>
      <td>—</td>
      <td>at an easy walking pace</td>
    </tr>
    <tr>
      <td>Moderato</td>
      <td>—</td>
      <td>at a moderate speed</td>
    </tr>
    <tr>
      <td>Allegro</td>
      <td>—</td>
      <td>lively and fast</td>
    </tr>
    <tr>
      <td>Allegretto</td>
      <td>—</td>
      <td>moderately fast</td>
    </tr>
    <tr>
      <td>Rallentando</td>
      <td>rall.</td>
      <td>gradually becoming slower</td>
    </tr>
    <tr>
      <td>Ritardando</td>
      <td>rit. / ritard.</td>
      <td>gradually becoming slower</td>
    </tr>
    <tr>
      <td>A tempo</td>
      <td>—</td>
      <td>return to former speed</td>
    </tr>
    <tr>
      <td>Crescendo</td>
      <td>cresc.</td>
      <td>gradually becoming louder</td>
    </tr>
    <tr>
      <td>Diminuendo</td>
      <td>dim.</td>
      <td>gradually becoming softer</td>
    </tr>
    <tr>
      <td>Forte</td>
      <td>f</td>
      <td>loud</td>
    </tr>
    <tr>
      <td>Piano</td>
      <td>p</td>
      <td>soft</td>
    </tr>
    <tr>
      <td>Mezzo forte</td>
      <td>mf</td>
      <td>moderately loud</td>
    </tr>
    <tr>
      <td>Mezzo piano</td>
      <td>mp</td>
      <td>moderately soft</td>
    </tr>
    <tr>
      <td>Legato</td>
      <td>—</td>
      <td>smooth, well connected</td>
    </tr>
    <tr>
      <td>Staccato</td>
      <td>—</td>
      <td>short and detached</td>
    </tr>
  </tbody>
</table>

<h2 id="scale-degree-names">Scale degree names</h2>

<table>
  <thead>
    <tr>
      <th>Scale Degree Number</th>
      <th>Technical Name</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>1</td>
      <td>Tonic</td>
    </tr>
    <tr>
      <td>2</td>
      <td>Supertonic</td>
    </tr>
    <tr>
      <td>3</td>
      <td>Mediant</td>
    </tr>
    <tr>
      <td>4</td>
      <td>Subdominant</td>
    </tr>
    <tr>
      <td>5</td>
      <td>Dominant</td>
    </tr>
    <tr>
      <td>6</td>
      <td>Submediant</td>
    </tr>
    <tr>
      <td>7</td>
      <td>Leading Note</td>
    </tr>
  </tbody>
</table>

<h2 id="interval-names">Interval names</h2>

<table>
  <thead>
    <tr>
      <th>Major scale</th>
      <th>Minor scale</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Perfect unison</td>
      <td>Perfect unison</td>
    </tr>
    <tr>
      <td>Major 2nd</td>
      <td>Major 2nd</td>
    </tr>
    <tr>
      <td>Major 3rd</td>
      <td>Minor 3rd</td>
    </tr>
    <tr>
      <td>Perfect 4th</td>
      <td>Perfect 4th</td>
    </tr>
    <tr>
      <td>Perfect 5th</td>
      <td>Perfect 5th</td>
    </tr>
    <tr>
      <td>Major 6th</td>
      <td>Minor 6th</td>
    </tr>
    <tr>
      <td>Major 7th</td>
      <td>Major 7th</td>
    </tr>
    <tr>
      <td>Perfect 8ve</td>
      <td>Perfect 8ve</td>
    </tr>
  </tbody>
</table>]]></content><author><name></name></author><summary type="html"><![CDATA[Keyboard diagram]]></summary></entry><entry><title type="html">Book review: Vaxxers</title><link href="https://joellaity.com/2021/08/21/vaxxers-book-review.html" rel="alternate" type="text/html" title="Book review: Vaxxers" /><published>2021-08-21T00:00:00+00:00</published><updated>2021-08-21T00:00:00+00:00</updated><id>https://joellaity.com/2021/08/21/vaxxers-book-review</id><content type="html" xml:base="https://joellaity.com/2021/08/21/vaxxers-book-review.html"><![CDATA[<h1 id="intro">Intro</h1>

<p>In January 2020 I was on a holiday in China with my boyfriend. When we had boarded the plane to Shanghai a week earlier, Coronavirus was barely on our radar. Things changed very quickly. We had planned to stay for 14 days, but by the tenth day of our vacation all non-essential stores were shut, buying masks required hours of searching, and the Chinese government sent a text informing everyone in our area that a major highway 2 hours from our hotel would be closed “indefinitely”. We decided to leave on the evening of Tuesday 28 January, and booked flights for Wednesday morning, 3 days before our original departure.</p>

<figure>
  <p style="text-align:center;"><img src="/assets/starbucks.jpg" alt="Starbucks" width="300" /></p>
  <p style="text-align:center;"><small><i>You know shit has hit the fan when even Starbucks is closed.</i></small></p>
</figure>

<p>If we had waited for our original departure flight, which left on Saturday, we would have been at the airport in Shanghai when the news broke that the Australian government would not let non-Australians flying from China back into the country. As a New Zealander, I would not have been allowed back into the country where I had worked and lived for the past 12 months.</p>

<p>By the end of March, Sydney was in lockdown, and almost all news sources were warning that developing a vaccine would take many years. This seemed reasonable to me at the time; new vaccines usually took at least ten years to develop and roll out. The fastest vaccine ever developed was the mumps vaccine in the 1960s, and that took 4 years. Little did I know that at that time Oxford University scientists had already analyzed the virus genome, modified it so it was suitable for a vaccine, and were in the process of growing enough vaccine-ready material to be able to conduct a clinical trial. The development of the first batch of the AstraZeneca vaccine was nearly complete.</p>

<h1 id="summary-of-the-book">Summary of the book</h1>

<p>Vaxxers is a book that chronicles the effort to develop the AstraZeneca vaccine. But unlike most such books, which are written by a journalist based on interviews, Vaxxers was written by the two scientists who led the effort to create the vaccine, Sarah Gilbert and Catherine Green.</p>

<p>Their main motivation for writing the book was to make vaccine development seem less mysterious and to humanize the scientists who developed the vaccine. Their hope was that this would allay fears about receiving a shot, so I was a bit concerned their PR instincts would take over and the book would be rather bland. In some respects this was true: many sections of the book were targeted at people who were skeptical of vaccines in general. But they also went into great detail explaining what it was like to be in their shoes, racing to develop the vaccine. What were they actually doing day-to-day? How do you go from a coronavirus genome sequence on a computer to a working vaccine in a few months? What does it feel like to do that work when you know that every decision you make has enormous consequences for all of humanity, but you are also in a race against the worst pandemic in living memory, so decisions have to be made quickly? I could feel that both Gilbert and Green were proud and excited to share with the reader exactly how they made the vaccine and what their day-to-day lives felt like during that time.</p>

<h1 id="the-vaccine-technology">The vaccine technology</h1>

<p>Before I read the book I thought that the AZ vaccine was made using old, traditional methods, in contrast to the mRNA vaccines which were a new, general purpose technology.</p>

<p>It turns out this was not true. The AZ vaccine is made using a “platform technology”: you can create a vaccine for any virus using the same technology (whether or not it is effective is a different story). The basic idea is to get the genome of the COVID-19 virus, identify the part you want (the spike-protein part), do a few minor modifications (on a computer) and then send that data to a commercial lab. It only takes around two weeks (!!) for them to manufacture and send back a test tube with 100 billion real strands of DNA. You then combine that DNA with a non-replicating version of a chimpanzee <a href="https://en.wikipedia.org/wiki/Adenoviridae">adenovirus</a> and voila! You have the active ingredient in the vaccine. It then took another few months for the Oxford scientists to produce enough of it to conduct clinical trials.</p>
<h1 id="ebola">Ebola</h1>

<p>The Oxford scientists actually used this exact technology to create a vaccine for Ebola in 2014. Unfortunately, the vaccine never got to Phase III clinical trials. Ironically, the concern was not that the vaccine would be unsafe; the concern was that in order to conduct a randomised controlled trial you need to give a placebo to half the participants, and by this time Ebola was so bad that it was argued (by the WHO, I think?) that it would be unethical to give people a placebo. The Oxford scientists were unimpressed with these arguments:</p>

<blockquote>
  <p>Anyone taking part in the trial would be closely monitored and, if infected, would receive care as early as possible, thus improving their chances further. And, the longer discussions about ethical trial design went on, the longer no one was receiving a vaccine that gave them any chance of protection at all.</p>
</blockquote>

<p>Instead they used a “ring vaccination study”, where participants are chosen in a different way and everyone in the trial gets the vaccine, but some get it later than others:</p>

<blockquote>
  <p>So instead, another type of trial design was eventually decided upon: a ring vaccination study with delayed vaccination in half of the rings. In order to use this type of trial design it is necessary to first identify someone who has been infected with Ebola. Then all of that person’s contacts are identified, and the contacts of their contacts, and the limits of the geographical area where they can be found are defined. That group of people forms a ‘ring’ around the initial case. Many rings are identified in this way, and each ring is randomly assigned to receive either immediate vaccination or delayed vaccination.</p>
</blockquote>

<p>I don’t really see how this avoids the ethical concerns. Everybody who is not in the clinical trial is still given nothing, and the more complicated trial design, combined with the back and forth trying to nail down an acceptable design, just delayed the vaccine even more.</p>

<blockquote>
  <p>Whilst all of the discussions about how to conduct the phase III trials had been going on, so had the Ebola crisis. It was frustrating to see the process slow down dramatically when a vaccine was so desperately needed…. by April 2015 when the ring vaccination study started – a full year after the outbreak became widespread – the case numbers were low and falling.</p>
</blockquote>

<p>By the time they got approval for the trial Ebola cases were low and there was only enough Ebola going around to test one vaccine. A vaccine developed by a different group was chosen and the Adenovirus-based vaccine was never tested for efficacy.</p>

<p>The Oxford scientists thought this delay was unacceptable. WHO had a different take.</p>

<blockquote>
  <p>Why did it take so long to test for efficacy? It was four months from the outcome of the phase II trials until the start of the phase III study. I made this point at a conference where I had been invited to speak about our vaccine trial, and received a rather angry response from a WHO representative who insisted that everything had been done as fast as possible. But the fault does not lie with individuals not doing their job properly in the thick of things. The problem was a lack of preparation. The fact was that the delays meant not only that it took longer to contain the deadly Ebola virus, but also that only one vaccine ended up being tested for efficacy – a vaccine that required very low temperature storage, making it difficult and expensive to use in hot countries.</p>
</blockquote>

<p>The line “the fault does not lie with individuals but … lack of preparation” seems like a cop-out to me. I understand the impulse not to blame individuals, and more preparation for Ebola-like events is justified. But the problem here was that the WHO’s “ethical concerns” don’t really make sense. The WHO’s behavior is better explained by thinking of it as a <a href="https://en.wikipedia.org/wiki/Vetocracy">vetocracy</a> rather than an organization primarily concerned about ethics.</p>

<h1 id="developing-a-covid-vaccine">Developing a COVID vaccine</h1>

<p>Anyway, the Oxford researchers had this vaccine technology but it had never been subjected to efficacy trials. Then the coronavirus came. Within 48 hours of the COVID-19 genome being released, the Oxford group had figured out which part they needed, modified it slightly, and sent the sequence to a commercial company, ThermoFisher, to be manufactured. Remarkably, it only takes around a fortnight for these companies to turn DNA sequences stored on a computer into a test tube with around 100 billion strands of real DNA. While waiting for the DNA to come back from ThermoFisher, the Oxford scientists made preparations for how they were going to manufacture the vaccine. The lab available to them had already been committed to other projects. Thankfully they decided to say (paraphrasing) “fuck it - we’ll figure out finances later”:</p>

<blockquote>
  <p>My other concern was financial. The CBF [the lab they worked in] is run like a small business within the university, and in order to operate it has to cover its costs of around £1.5 million a year by charging its clients – researchers like Sarah and Tess – who in turn have to apply for grants to fund their research. The projects that Sarah was asking me to delay or deprioritise were already agreed, and their funding was secure right through to manufacture. It would be a big risk to the CBF’s operation to drop those in favour of this new project, when it wasn’t at all clear where the money might come from to pay for any of it.</p>
</blockquote>

<p>They tried two different methods to manufacture the vaccine in parallel: a rapid method that was less likely to work, and a slower but more reliable method.</p>

<p>The rapid method was originally developed to help fight cancer. You get some mutated DNA from a tumor, create a vaccine from it, and give it to the cancer patient, whose own immune system will then learn to recognize the mutated DNA and attack the tumor. This immune response sometimes happens without pharmaceutical intervention, so the mechanism could plausibly work, but you need a vaccine personalized for every patient’s exact tumor mutation. The only way this would be economically viable is with very fast vaccine manufacture.</p>

<p>In both methods the basic process is the same. They inserted some of the adenovirus/COVID-19 DNA into human cells. Remarkably, these cells all originate from the kidney of a single fetus that was aborted in the Netherlands in the 1970s. These human cells have been sitting in labs replicating for the past 50 years.</p>

<p>These human cells start producing the virus that would become the main ingredient in the vaccines. It’s not entirely clear to me why the virus can replicate in these cells but cannot replicate when injected into my arm, but I think it’s because a special procedure is used to get it inside the human cells.</p>

<p>They then carefully help these human cells replicate until they have enough of the virus. In the end they made 300 ml of fluid containing the virus. That cup of fluid was “destined to seed the manufacture of every dose of the Oxford vaccine ever produced.” (Imagine holding that in your hand, knowing that if you break it your clumsiness could be measured in millions of lives.)</p>

<p>Sadly the rapid method did not work. Fortunately, the traditional one did. This was one example of a broader theme in the book where they tried many different things in parallel, fully expecting to have some “wasted” effort.</p>

<blockquote>
  <p>To move quickly, we would still perform all the same tests as usual, we just wouldn’t wait for the results before moving on to the next part of the process. If the starting material failed any of its tests, we would have to throw out anything we had made from it. But that risk – a risk of wasted time and effort and serious money, but not of quality – was one we were prepared to take.</p>
</blockquote>

<blockquote>
  <p>The next few weeks were some of the most hectic and surreal of my life. I was running in parallel half a dozen stages of vaccine development that would usually happen over years and in sequence.</p>
</blockquote>

<blockquote>
  <p>Because we had never used this method before, we also made preparations with lots of different conditions: different ratios of adenovirus DNA to spike protein DNA, different ratios of cells to DNA, and so on.</p>
</blockquote>

<p>Thank God they appreciated early on just how important speed was.</p>
<h1 id="partnering-with-astrazeneca">Partnering with AstraZeneca</h1>

<p>The Oxford group then had to partner with a company to scale up the manufacturing. Surprisingly, the scientists themselves did not have input into selecting the company.</p>

<blockquote>
  <p>But Andy’s email was the first time I heard mention that it would be AstraZeneca, and it was something of a surprise. I knew they were big in cancer medicines and they were obviously a name in the pharma world, but they did not have a particular reputation for vaccine manufacture. We felt a bit disconnected: as though decisions that would really affect our working lives (by this point all of us were working on this project all of the time and it had completely taken over every waking and sleeping hour) were being taken at the highest level of the university with no consultation with those of us who actually knew how to make this vaccine.</p>
</blockquote>

<p>Working with a large multinational was the exact opposite of what they were used to: a close-knit group of scientists who all understood what everyone else was working on.</p>

<blockquote>
  <p>AstraZeneca is an enormous entity, with multiple teams across the UK and the US with quite specialised roles, whereas everyone at our end was involved in and knew about everything. Also, they had no experience of manufacturing viral vectors, so the technical aspects of producing viral vectors, and the quality tests needed for these kinds of products, were all new to them. It was frustrating to have to keep repeating ourselves to slightly different combinations of AstraZeneca people.</p>
</blockquote>

<p>But the Oxford scientists did come to appreciate their corporate partners. Companies have certain strengths that tend to complement research scientists.</p>

<blockquote>
  <p>I remember being in a meeting very early on, probably in May, when someone at AstraZeneca confidently used the phrase ‘billions of doses’. That’s a real shock to the system when a really big day for you is manually putting 500 doses into vials. They were prepared to throw everything at it straightaway, rather than waiting for results from clinical trials before they fully invested.</p>
</blockquote>

<p>Companies also just have way more money to spend (provided the benefit-cost ratio is high enough).</p>

<blockquote>
  <p>The vaccine was still in Italy, the trials were in the UK – and there were no commercial flights operating between the two. We were stuck.
…
It turns out chartering a private jet costs around £20,000, normally well beyond the budget of a small academic clinical trial. But by this time we had the might of a global pharma company behind us. All those meetings with our AstraZeneca colleagues were starting to come good: we got permission to proceed. The jet arrived in London the next day with no passengers, just a large box of dry ice and 500 precious vials for next-day distribution across the UK. The trial must go on.</p>
</blockquote>

<h1 id="clinical-trials-and-regulatory-agencies">Clinical trials and regulatory agencies</h1>

<p>The clinical trials were similar to clinical trials in normal times, except that the gaps between the Phase I, II and III trials were much smaller and all the data was processed much faster. Recruiting volunteers was also much easier.</p>

<blockquote>
  <p>The recruitment of volunteers to clinical trials is often quite a challenge…. [But] Within hours of announcing that we were recruiting volunteers for trials, we had thousands of applications.</p>
</blockquote>

<p>The scientists praise the MHRA (the UK health regulator) in the book. They said the MHRA was quite cooperative and was willing to weigh the harms of using a slightly different process than usual against the benefits of manufacturing a vaccine faster. For example, there were some issues with the dosing in the Phase I trials and the MHRA was reasonably flexible. I take what the authors say with a pinch of salt because the MHRA acts as a gatekeeper for all their work, so it is probably unwise for them to heavily criticize it and sour their relationship with the very people they need to cooperate with for the rest of their careers. On the other hand, anecdotes like this are good evidence the praise is genuine:</p>

<blockquote>
  <p>The biggest risk (albeit a very small one) was that the Covid-19 vaccine might get contaminated with a bit of the previously manufactured product. We had a very sensitive and specific test for this valuable product, so we knew we would be able to test our final vaccine to check if there had been any contamination. Proceeding without fumigation would save at least three weeks. We drew up a formal risk assessment and submitted it to the Medicines and Healthcare products Regulatory Agency (MHRA, the body responsible for approving every step of our vaccine development process and ultimately for deciding whether to allow it to be used), who agreed our approach. (This was the first of a very large number of communications we would have with the MHRA over the coming months. That relationship, and the MHRA’s proactive approach, is a critical part of this story.)</p>
</blockquote>

<p>Compare this to the FDA (the US health regulator). The FDA has been <a href="https://marginalrevolution.com/marginalrevolution/2021/08/81683.html">criticized heavily</a> throughout the pandemic for being too biased in favor of inaction and inflexibly following pre-pandemic processes. This anecdote was telling:</p>

<blockquote>
  <p>The FDA approach was more process-driven, whereas the MHRA’s approach was more interactive, and more focused on gathering the evidence needed to assess the risks and answer the scientific questions. By way of illustration, many years previously we had been asked to collaborate with a US group working on malaria vaccine development. We had already completed a phase I clinical trial on a vaccine. It had been well tolerated, but – as happens a lot in vaccine development – the immune response was not as high as we had hoped, and we were not planning to proceed any further with it. But we did still have some of the batch left and the US group wanted to do a trial using our vaccine in combination with another one to see if that might improve the immune response. The issue we came up against was that although we had completed a clinical trial successfully in the UK, the FDA required toxicology studies to have been completed in two different species whereas in the UK we only have to complete a toxicology study in one species. We had done that and proceeded to human trials, and shown no safety concerns. On a call with the FDA, we explained that we had safety data from mice, and also from humans, which are a species after all, so would that work for them? The answer was no. They needed toxicology studies from another animal species – a rat or a rabbit. The problem was that if we did a toxicology study in rats or rabbits, it would use up the limited amount of vaccine that was remaining, and we wouldn’t then be able to do the clinical trial.</p>
</blockquote>

<blockquote>
  <p>…</p>
</blockquote>

<blockquote>
  <p>We were, however, unable to come to an agreement so the clinical trial was never carried out.</p>
</blockquote>

<p>It was hard to read passages like the one above. The TGA (the Australian health regulator) has somehow been <a href="https://marginalrevolution.com/marginalrevolution/2021/08/the-tga-is-worse-than-the-fda-and-the-australian-lockdown.html">even slower than the FDA</a>. As a result, I’m writing this during a lockdown in Australia where (as of Aug 21) our vaccination rate is <a href="/assets/vaccination_rates_by_country.PNG">lower than 36/38 OECD countries</a>. These excerpts from an article by <a href="https://www.afr.com/policy/health-and-education/on-covid-19-the-medical-regulatory-complex-has-failed-us-20210809-p58haw">Steven Hamilton and Richard Holden</a> give a summary for those of you who are unfamiliar with Australia:</p>

<blockquote>
  <p>At the end of 2020, as vaccines were rolling out en masse in the Northern Hemisphere, the TGA [Therapeutic Goods Administration, AT] flatly refused to issue the emergency authorisations other regulators did. As a result, the TGA didn’t approve the Pfizer vaccine until January 25, more than six weeks after the US Food and Drug Administration (FDA), itself not exactly the poster child of expeditiousness.</p>
</blockquote>

<blockquote>
  <p>Similarly, the TGA didn’t approve the AstraZeneca vaccine until February 16, almost seven weeks after the UK.</p>
</blockquote>

<blockquote>
  <p>In case you’re wondering “what difference does six weeks make?“, think again. Were our rollout six weeks faster, the current Sydney outbreak would likely never have exploded, saving many lives and livelihoods. In the face of an exponentially spreading virus that has become twice as infectious, six weeks is an eternity. And, indeed, nothing has changed. The TGA approved the Moderna vaccine this week, eight months after the FDA.</p>
</blockquote>

<blockquote>
  <p>It approved looser cold storage requirements for the Pfizer vaccine, which would allow the vaccine to be more widely distributed and reduce wastage, on April 8, six weeks after the FDA. And it approved the Pfizer vaccine for use by 12 to 15-year-olds on July 23, more than 10 weeks after the FDA.</p>
</blockquote>

<blockquote>
  <p>Where’s the approval of the mix-and-match vaccine regimen, used to great effect in Canada, where AstraZeneca is combined with Pfizer to expand supply and increase efficacy? Where’s the guidance for those who’ve received two doses of AstraZeneca that they’ll be able to receive a Pfizer booster later?</p>
</blockquote>

<blockquote>
  <p>But the slow, insular, and excessively cautious advice of our medical regulatory complex, which comprehensively failed to grasp the massive consequences of delay and inaction, must be right at the top of that list.</p>
</blockquote>

<h1 id="conclusion">Conclusion</h1>

<p>The book has great anecdotes from the scientists and delves into lots of other topics in detail:</p>
<ul>
  <li>The media and political response to the vaccine, e.g. Macron said the vaccine “seems quasi-ineffective on people older than 65” based on a news article that claimed the vaccine was “only 8% effective” for old people. This was completely made up.</li>
  <li>The infamous <a href="https://www.bbc.com/news/health-55086927">dosing problem</a> in the clinical trial.</li>
  <li>Security against anti-vax protestors.</li>
  <li>How much time scientists spend securing funding. Gilbert writes “Actually, raising funds had been my main activity for years”.</li>
  <li>Just how stressful this was for everyone involved.</li>
</ul>

<p>Unfortunately there’s not that much discussion of the blood clot concerns, because those were a relatively recent development. (Personally, I think those concerns are overblown and have been vaccinated with AZ myself.)</p>

<p>The take-home message of the book was that we need to be prepared to make vaccines for a pandemic (duh) and we need to be prepared to make them <em>quickly</em>. In particular, the authors suggest working on annual flu vaccine development with the same emphasis on speed that we would expect in a pandemic. This would validate that we have the capability to rapidly create a vaccine and scale up manufacturing before the next pandemic hits us.</p>

<blockquote>
  <p>For example, might it be cost-effective, given how much flu costs the economy, to work on flu vaccine development with as much urgency as we applied to the Covid vaccine? It would require more funding upfront, and the acceptance that not everything that was tried would work, but it might be the way to make some real progress rather than continuing to limp along as we have in the past, with small projects and no joined-up approach.</p>
</blockquote>

<p>Most of the delay when creating the vaccine was logistical: securing funding, getting regulatory approval, conducting clinical trials and collaborating with multinational companies. An end-to-end test of rapid vaccine production, using the flu as a test case, is a great way to make sure every part of the pipeline can move quickly during a future pandemic. Besides, the normal flu is pretty deadly so we should arguably be putting a lot more money into fighting it anyway.</p>

<p><a href="https://www.amazon.com.au/Vaxxers-Inside-AstraZeneca-Vaccine-Against/dp/1529369878">Buy the book here.</a></p>]]></content><author><name></name></author><summary type="html"><![CDATA[Intro]]></summary></entry><entry><title type="html">libc++’s implementation of std::string</title><link href="https://joellaity.com/2020/01/31/string.html" rel="alternate" type="text/html" title="libc++’s implementation of std::string" /><published>2020-01-31T00:00:00+00:00</published><updated>2020-01-31T00:00:00+00:00</updated><id>https://joellaity.com/2020/01/31/string</id><content type="html" xml:base="https://joellaity.com/2020/01/31/string.html"><![CDATA[<h2 id="i-introduction">I. Introduction</h2>

<p><a href="http://libcxx.llvm.org/">libc++</a> is the <a href="http://llvm.org">LLVM</a> project’s implementation of the C++ standard library.
libc++’s implementation of <code class="language-plaintext highlighter-rouge">std::string</code> is a fascinating case study of how to optimize container classes.
Unfortunately, the source code is very hard to read because it is extremely:</p>

<ul>
  <li>Optimized.
 Even for relatively niche use-cases.</li>
  <li>General.
 The <code class="language-plaintext highlighter-rouge">std::string</code> class is a specialization of <code class="language-plaintext highlighter-rouge">basic_string</code>.
 <code class="language-plaintext highlighter-rouge">basic_string</code> can accept a custom character type and custom allocator.</li>
  <li>Portable.
 This leads to <a href="https://en.cppreference.com/w/cpp/preprocessor/conditional_"><code class="language-plaintext highlighter-rouge">#ifdef</code></a> macros everywhere.</li>
  <li>Resilient.
 Every non-public identifier is prefixed with underscores to prevent name clashes with other code.
 This is necessary even for local variables since macros defined by the user of the library could modify the library’s header file.</li>
  <li>Undocumented.
 There are very few comments in the <code class="language-plaintext highlighter-rouge">&lt;string&gt;</code> header.
 I assume this is because library vendors would prefer it if users did not rely on internal implementation details of their classes, and not documenting internal helper functions is a desperate effort to mitigate <a href="https://www.hyrumslaw.com/">Hyrum’s law</a>.</li>
</ul>

<p>This post examines the implementation of libc++’s <code class="language-plaintext highlighter-rouge">std::string</code>.
To keep it simple I will assume you are using a modern compiler and a modern x86 processor<sup id="a1"><a href="#f1">1</a></sup>.
Keep in mind that the way objects are laid out in memory is very specific to the compiler, CPU architecture and standard library used; everything I describe below is an implementation detail and not defined by the C++ standard.</p>

<h2 id="ii-data-layout">II. Data layout</h2>

<p><code class="language-plaintext highlighter-rouge">std::string</code> has two modes: long string and short string. It uses a <a href="https://en.cppreference.com/w/cpp/language/union">union</a> to reuse the same bytes for both modes. Short string mode is an optimization which makes it possible to store up to 22 characters without <a href="https://en.wikipedia.org/wiki/Memory_management#Dynamic_memory_allocation">heap allocation</a>.</p>

<h3 id="long-string-mode">Long string mode</h3>

<p><img src="/assets/long_string.jpg" alt="Long string" /></p>

<p>The long string mode is a pretty standard string implementation. There are three members:</p>
<ul>
  <li><code class="language-plaintext highlighter-rouge">size_t __cap_</code> - The amount of space in the underlying character buffer. If the string grows enough that the length of the string (including the null-terminator) exceeds <code class="language-plaintext highlighter-rouge">__cap_</code> then the buffer must be reallocated. <code class="language-plaintext highlighter-rouge">__cap_</code> is an unsigned 64 bit integer. The least significant bit of <code class="language-plaintext highlighter-rouge">__cap_</code> is used as a flag, see the discussion below.</li>
  <li><code class="language-plaintext highlighter-rouge">size_t __size_</code> - The size of the current string, not including the <a href="https://en.wikipedia.org/wiki/Null-terminated_string">null terminator</a>. This is also an unsigned 64 bit integer.</li>
  <li><code class="language-plaintext highlighter-rouge">char* __data_</code> - A pointer to the underlying buffer where the characters of the string are stored. This is 64 bits wide.</li>
</ul>

<p>Since each member is 8 bytes, <code class="language-plaintext highlighter-rouge">sizeof(std::string) == 24</code>.</p>

<p><code class="language-plaintext highlighter-rouge">std::string</code> uses the least significant bit of <code class="language-plaintext highlighter-rouge">__cap_</code> to distinguish whether it is in long string mode or short string mode.
If the least significant bit is set to 1, then it is in long string mode. If it is set to zero, then it is in short string mode.
It is possible to use the least significant bit in this way because the size of the buffer is guaranteed by the implementation to always be an even number - so the true value for the capacity always has a 0 in the least significant bit.
The method <code class="language-plaintext highlighter-rouge">std::string::capacity()</code> has an implementation that is equivalent to this (the real code looks quite different):</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">size_t</span> <span class="nf">capacity</span><span class="p">()</span> <span class="p">{</span>
  <span class="k">if</span> <span class="p">(</span><span class="n">__cap_</span> <span class="o">&amp;</span> <span class="mi">1</span><span class="p">)</span> <span class="p">{</span> <span class="c1">// Long string mode.</span>
    <span class="c1">// buffer_size holds the true size of the underlying buffer pointed</span>
    <span class="c1">// to by data_. The size of the buffer is always an even number. The</span>
    <span class="c1">// least significant bit of __cap_ is cleared since it is just used</span>
    <span class="c1">// as a flag to indicate that we are in long string mode.</span>
    <span class="kt">size_t</span> <span class="n">buffer_size</span> <span class="o">=</span> <span class="n">__cap_</span> <span class="o">&amp;</span> <span class="o">~</span><span class="mi">1ul</span><span class="p">;</span>
    <span class="c1">// Subtract 1 because the null terminator takes up one spot in the</span>
    <span class="c1">// character buffer.</span>
    <span class="k">return</span> <span class="n">buffer_size</span> <span class="o">-</span> <span class="mi">1</span><span class="p">;</span> 
  <span class="p">}</span>

  <span class="c1">// &lt;Handle short string mode.&gt;</span>
<span class="p">}</span>
</code></pre></div></div>

<h3 id="short-string-mode">Short string mode</h3>

<p><img src="/assets/short_string.jpg" alt="Short string" /></p>

<p>The short string mode uses the same 24 bytes to mean something completely different. There are two members:</p>
<ul>
  <li><code class="language-plaintext highlighter-rouge">unsigned char __size_</code> - The size of the string, left-shifted by one (<code class="language-plaintext highlighter-rouge">__size_ == (true_size &lt;&lt; 1)</code>).
The true size of the string is left-shifted by one because the least significant bit of the first byte is used as a flag.
The least significant bit must be set to 0 in short string mode.</li>
  <li><code class="language-plaintext highlighter-rouge">char __data_[23]</code> - A buffer to hold the characters of the string.</li>
</ul>

<p><code class="language-plaintext highlighter-rouge">__size_</code> stores the size of the string left shifted by 1, so the method <code class="language-plaintext highlighter-rouge">std::string::size()</code> has an implementation equivalent to this:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">size_t</span> <span class="nf">size</span><span class="p">()</span> <span class="p">{</span>
  <span class="k">if</span> <span class="p">((</span><span class="n">__size_</span> <span class="o">&amp;</span> <span class="mi">1u</span><span class="p">)</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>  <span class="c1">// Short string mode.</span>
    <span class="k">return</span> <span class="n">__size_</span> <span class="o">&gt;&gt;</span> <span class="mi">1</span><span class="p">;</span>
  <span class="p">}</span>
  <span class="c1">// &lt;Handle long string mode.&gt;</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Because we are assuming the target architecture is little-endian, the least significant bit of <code class="language-plaintext highlighter-rouge">__cap_</code> is in the same position as the least significant bit of <code class="language-plaintext highlighter-rouge">__size_</code>.</p>

<h2 id="iii-implementation">III. Implementation</h2>

<p>To see how the libc++ implementation achieves the data layout described above, I’m going to copy and paste real code snippets from libc++ and add comments.</p>

<p>Long mode is reasonably straightforward, it’s implemented like this:</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// size_type and pointer are type aliases.</span>
<span class="k">struct</span> <span class="n">__long</span> <span class="p">{</span>
  <span class="n">size_type</span> <span class="n">__cap_</span><span class="p">;</span>
  <span class="n">size_type</span> <span class="n">__size_</span><span class="p">;</span>
  <span class="n">pointer</span> <span class="n">__data_</span><span class="p">;</span>
<span class="p">};</span>
</code></pre></div></div>

<p>Short mode looks like this:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">static</span> <span class="k">const</span> <span class="n">size_type</span> <span class="n">__short_mask</span> <span class="o">=</span> <span class="mh">0x01</span><span class="p">;</span>
<span class="k">static</span> <span class="k">const</span> <span class="n">size_type</span> <span class="n">__long_mask</span> <span class="o">=</span> <span class="mh">0x1ul</span><span class="p">;</span>

<span class="k">enum</span> <span class="p">{</span>
  <span class="n">__min_cap</span> <span class="o">=</span> <span class="p">(</span><span class="k">sizeof</span><span class="p">(</span><span class="n">__long</span><span class="p">)</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)</span> <span class="o">/</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">value_type</span><span class="p">)</span> <span class="o">&gt;</span> <span class="mi">2</span>
                  <span class="o">?</span> <span class="p">(</span><span class="k">sizeof</span><span class="p">(</span><span class="n">__long</span><span class="p">)</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)</span> <span class="o">/</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">value_type</span><span class="p">)</span>
                  <span class="o">:</span> <span class="mi">2</span>
<span class="p">};</span>

<span class="k">struct</span> <span class="n">__short</span> <span class="p">{</span>
  <span class="k">union</span> <span class="p">{</span>
    <span class="kt">unsigned</span> <span class="kt">char</span> <span class="n">__size_</span><span class="p">;</span>
    <span class="n">value_type</span> <span class="n">__lx</span><span class="p">;</span>
  <span class="p">};</span>
  <span class="n">value_type</span> <span class="n">__data_</span><span class="p">[</span><span class="n">__min_cap</span><span class="p">];</span>
<span class="p">};</span>
</code></pre></div></div>

<p>According to <a href="https://www.reddit.com/r/cpp/comments/blnwra/stdstring_implementation_in_libc/emqrpz8/">this</a> Reddit comment, <code class="language-plaintext highlighter-rouge">__lx</code> is needed to ensure any <a href="https://en.wikipedia.org/wiki/Data_structure_alignment">padding</a> goes after <code class="language-plaintext highlighter-rouge">__size_</code>, but has no other purpose (I don’t fully understand <em>why</em> this forces the padding to go after <code class="language-plaintext highlighter-rouge">__size_</code> 🤷‍♂).
<code class="language-plaintext highlighter-rouge">__min_cap</code> is 23 on the platforms we are considering (64-bit).</p>

<p>So the first byte of <code class="language-plaintext highlighter-rouge">__short</code> is occupied by <code class="language-plaintext highlighter-rouge">__size_</code>, and the next 23 are occupied by the <code class="language-plaintext highlighter-rouge">__data_</code> array.</p>

<p>The string is then represented like this:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// __ulx is only used to calculate __n_words.</span>
<span class="k">union</span> <span class="n">__ulx</span> <span class="p">{</span>
  <span class="n">__long</span> <span class="n">__lx</span><span class="p">;</span>
  <span class="n">__short</span> <span class="n">__lxx</span><span class="p">;</span>
<span class="p">};</span>

<span class="k">enum</span> <span class="p">{</span> <span class="n">__n_words</span> <span class="o">=</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">__ulx</span><span class="p">)</span> <span class="o">/</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">size_type</span><span class="p">)</span> <span class="p">};</span>

<span class="k">struct</span> <span class="n">__raw</span> <span class="p">{</span>
  <span class="n">size_type</span> <span class="n">__words</span><span class="p">[</span><span class="n">__n_words</span><span class="p">];</span>
<span class="p">};</span>

<span class="k">struct</span> <span class="n">__rep</span> <span class="p">{</span>
  <span class="k">union</span> <span class="p">{</span>
    <span class="n">__long</span> <span class="n">__l</span><span class="p">;</span>
    <span class="n">__short</span> <span class="n">__s</span><span class="p">;</span>
    <span class="n">__raw</span> <span class="n">__r</span><span class="p">;</span>
  <span class="p">};</span>
<span class="p">};</span>
</code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">__rep</code> struct represents the string. It is a union of <code class="language-plaintext highlighter-rouge">__long</code> and <code class="language-plaintext highlighter-rouge">__short</code> as expected.</p>

<p>The <code class="language-plaintext highlighter-rouge">__raw</code> struct is just an array of three words (24 bytes in total) which allows some of the methods to treat the string as a plain sequence of bytes without having to care about whether the string is in long or short mode. For example, after a string is moved-from it is zeroed out, and the <code class="language-plaintext highlighter-rouge">__zero()</code> method is implemented like this:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">__zero</span><span class="p">()</span> <span class="p">{</span>
  <span class="n">size_type</span> <span class="p">(</span><span class="o">&amp;</span><span class="n">__a</span><span class="p">)[</span><span class="n">__n_words</span><span class="p">]</span> <span class="o">=</span> <span class="n">__r_</span><span class="p">.</span><span class="n">first</span><span class="p">().</span><span class="n">__r</span><span class="p">.</span><span class="n">__words</span><span class="p">;</span>
  <span class="k">for</span> <span class="p">(</span><span class="kt">unsigned</span> <span class="n">__i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">__i</span> <span class="o">&lt;</span> <span class="n">__n_words</span><span class="p">;</span> <span class="o">++</span><span class="n">__i</span><span class="p">)</span>
    <span class="n">__a</span><span class="p">[</span><span class="n">__i</span><span class="p">]</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>
<p>Finally, the only member variable in <code class="language-plaintext highlighter-rouge">std::string</code> is declared like this:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// allocator_type is the allocator defined by the user of basic_string</span>
<span class="n">__compressed_pair</span><span class="o">&lt;</span><span class="n">__rep</span><span class="p">,</span> <span class="n">allocator_type</span><span class="o">&gt;</span> <span class="n">__r_</span><span class="p">;</span>
</code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">__compressed_pair</code> behaves like <code class="language-plaintext highlighter-rouge">std::pair</code>, except it has an optimization: if one of the types in the pair is an empty class then that class does not contribute to the size of the pair. <code class="language-plaintext highlighter-rouge">std::pair</code>, by contrast, is larger than it needs to be, for example:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#include</span> <span class="cpf">&lt;utility&gt;</span><span class="cp">
#include</span> <span class="cpf">&lt;iostream&gt;</span><span class="cp">
</span>
<span class="k">struct</span> <span class="n">E</span> <span class="p">{};</span>

<span class="kt">int</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
  <span class="n">std</span><span class="o">::</span><span class="n">pair</span><span class="o">&lt;</span><span class="kt">int</span><span class="p">,</span> <span class="n">E</span><span class="o">&gt;</span> <span class="n">p</span><span class="p">;</span>
  <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="k">sizeof</span><span class="p">(</span><span class="kt">int</span><span class="p">)</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span>  <span class="c1">// Outputs 4.</span>
  <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">E</span><span class="p">)</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span>  <span class="c1">// Outputs 1.</span>
  <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">p</span><span class="p">)</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span>  <span class="c1">// Outputs 8.</span>
  <span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o">&lt;&lt;</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">__compressed_pair</span><span class="o">&lt;</span><span class="kt">int</span><span class="p">,</span> <span class="n">E</span><span class="o">&gt;</span><span class="p">)</span> <span class="o">&lt;&lt;</span> <span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span>  <span class="c1">// Outputs 4.</span>
<span class="p">}</span>

</code></pre></div></div>

<p>The reason <code class="language-plaintext highlighter-rouge">E</code> uses any space in the example above is for language-technical reasons: every object must have a unique memory address. (This will change in C++20, see <a href="https://en.cppreference.com/w/cpp/language/attributes/no_unique_address">here</a> and <a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0840r2.html">here</a>.) <code class="language-plaintext highlighter-rouge">std::pair</code> stores the objects next to each other in memory, and padding means that the <code class="language-plaintext highlighter-rouge">E</code> struct in the example above contributes 4 bytes to the pair.</p>

<p><code class="language-plaintext highlighter-rouge">__compressed_pair</code> will not use any extra space if <code class="language-plaintext highlighter-rouge">allocator_type</code> is empty.</p>

<p>And that’s all there is to it! The implementation of <code class="language-plaintext highlighter-rouge">std::string</code> looks like this (with <code class="language-plaintext highlighter-rouge">#ifdef</code>s removed):</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">template</span> <span class="o">&lt;</span><span class="n">class</span> <span class="n">_CharT</span><span class="p">,</span> <span class="n">class</span> <span class="n">_Traits</span><span class="p">,</span> <span class="n">class</span> <span class="n">_Allocator</span><span class="o">&gt;</span>
<span class="n">class</span> <span class="n">_LIBCPP_TEMPLATE_VIS</span> <span class="n">basic_string</span> <span class="o">:</span> <span class="n">private</span> <span class="n">__basic_string_common</span><span class="o">&lt;</span><span class="nb">true</span><span class="o">&gt;</span> <span class="p">{</span>
  <span class="c1">// &lt;Code omitted.&gt;</span>

<span class="nl">private:</span>
  <span class="k">struct</span> <span class="n">__long</span> <span class="p">{</span>
    <span class="n">size_type</span> <span class="n">__cap_</span><span class="p">;</span>
    <span class="n">size_type</span> <span class="n">__size_</span><span class="p">;</span>
    <span class="n">pointer</span> <span class="n">__data_</span><span class="p">;</span>
  <span class="p">};</span>

  <span class="k">static</span> <span class="k">const</span> <span class="n">size_type</span> <span class="n">__short_mask</span> <span class="o">=</span> <span class="mh">0x01</span><span class="p">;</span>
  <span class="k">static</span> <span class="k">const</span> <span class="n">size_type</span> <span class="n">__long_mask</span> <span class="o">=</span> <span class="mh">0x1ul</span><span class="p">;</span>

  <span class="k">enum</span> <span class="p">{</span>
    <span class="n">__min_cap</span> <span class="o">=</span> <span class="p">(</span><span class="k">sizeof</span><span class="p">(</span><span class="n">__long</span><span class="p">)</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)</span> <span class="o">/</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">value_type</span><span class="p">)</span> <span class="o">&gt;</span> <span class="mi">2</span>
                    <span class="o">?</span> <span class="p">(</span><span class="k">sizeof</span><span class="p">(</span><span class="n">__long</span><span class="p">)</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)</span> <span class="o">/</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">value_type</span><span class="p">)</span>
                    <span class="o">:</span> <span class="mi">2</span>
  <span class="p">};</span>

  <span class="k">struct</span> <span class="n">__short</span> <span class="p">{</span>
    <span class="k">union</span> <span class="p">{</span>
      <span class="kt">unsigned</span> <span class="kt">char</span> <span class="n">__size_</span><span class="p">;</span>
      <span class="n">value_type</span> <span class="n">__lx</span><span class="p">;</span>
    <span class="p">};</span>
    <span class="n">value_type</span> <span class="n">__data_</span><span class="p">[</span><span class="n">__min_cap</span><span class="p">];</span>
  <span class="p">};</span>

  <span class="k">union</span> <span class="n">__ulx</span> <span class="p">{</span>
    <span class="n">__long</span> <span class="n">__lx</span><span class="p">;</span>
    <span class="n">__short</span> <span class="n">__lxx</span><span class="p">;</span>
  <span class="p">};</span>

  <span class="k">enum</span> <span class="p">{</span> <span class="n">__n_words</span> <span class="o">=</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">__ulx</span><span class="p">)</span> <span class="o">/</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">size_type</span><span class="p">)</span> <span class="p">};</span>

  <span class="k">struct</span> <span class="n">__raw</span> <span class="p">{</span>
    <span class="n">size_type</span> <span class="n">__words</span><span class="p">[</span><span class="n">__n_words</span><span class="p">];</span>
  <span class="p">};</span>

  <span class="k">struct</span> <span class="n">__rep</span> <span class="p">{</span>
    <span class="k">union</span> <span class="p">{</span>
      <span class="n">__long</span> <span class="n">__l</span><span class="p">;</span>
      <span class="n">__short</span> <span class="n">__s</span><span class="p">;</span>
      <span class="n">__raw</span> <span class="n">__r</span><span class="p">;</span>
    <span class="p">};</span>
  <span class="p">};</span>

  <span class="n">__compressed_pair</span><span class="o">&lt;</span><span class="n">__rep</span><span class="p">,</span> <span class="n">allocator_type</span><span class="o">&gt;</span> <span class="n">__r_</span><span class="p">;</span>

<span class="nl">public:</span>
  <span class="c1">// &lt;Code omitted.&gt;</span>
<span class="p">};</span>

<span class="c1">// In another file:</span>
<span class="k">typedef</span> <span class="n">basic_string</span><span class="o">&lt;</span><span class="kt">char</span><span class="o">&gt;</span> <span class="n">string</span><span class="p">;</span>
</code></pre></div></div>

<p><a href="https://github.com/llvm-mirror/libcxx/blob/master/include/string">Here</a> is the full source on GitHub if you want to take a look.</p>

<p><a href="https://news.ycombinator.com/item?id=22198158">Comment on Hacker News</a></p>

<hr />

<p><b id="f1">1</b> In particular, I will assume that (1) you are using the <a href="https://github.com/llvm-mirror/libcxx/blob/78d6a7767ed57b50122a161b91f59f19c9bd0d19/include/__config#L70">standard ABI layout</a>, (2) your computer is 64-bit and <a href="https://en.wikipedia.org/wiki/Endianness#Little-endian">little endian</a> and (3) the <code class="language-plaintext highlighter-rouge">char</code> type is signed and <a href="https://en.cppreference.com/w/cpp/types/climits"><code class="language-plaintext highlighter-rouge">CHAR_BIT</code></a> is 8. (There may be something else I missed. In practice I’m just assuming the layout on your machine is the same as on my machine.)  <a href="#a1">↩</a></p>]]></content><author><name></name></author><summary type="html"><![CDATA[I. Introduction]]></summary></entry><entry><title type="html">How linking works</title><link href="https://joellaity.com/2020/01/25/linking.html" rel="alternate" type="text/html" title="How linking works" /><published>2020-01-25T00:00:00+00:00</published><updated>2020-01-25T00:00:00+00:00</updated><id>https://joellaity.com/2020/01/25/linking</id><content type="html" xml:base="https://joellaity.com/2020/01/25/linking.html"><![CDATA[<h2 id="i-introduction">I. Introduction</h2>

<p>C++ programs must be compiled and linked before they can be executed.
Compilation takes each human-readable <code class="language-plaintext highlighter-rouge">.cc</code> file as input and produces a machine-readable <code class="language-plaintext highlighter-rouge">.o</code> file as output.
Since a <code class="language-plaintext highlighter-rouge">.cc</code> file can use a function defined in another file, linking is necessary to match up the call sites of a function with its definition and produce the final executable.</p>

<p>It is not always obvious that linking is a separate step from compilation because command line tools like <code class="language-plaintext highlighter-rouge">g++</code> do <em>both</em> the compilation and linking in one go.</p>

<h2 id="ii-separate-compilation-and-linking-example">II. Separate compilation and linking example</h2>

<p>We’ll use this simple program as a running example.</p>

<script src="https://gist.github.com/joelypoley/24ebabf2958db33c2bedcc8b479d0be7.js"></script>

<p>To produce an executable file from the source code above, type:</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>g++ square.cc main.cc  <span class="c"># Compile and link square.cc and main.cc.</span>
</code></pre></div></div>

<p>If you want to separately compile and then link you can type:</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>g++ <span class="nt">-c</span> square.cc  <span class="c"># Compile square.cc to machine code.</span>
<span class="nv">$ </span>g++ <span class="nt">-c</span> main.cc    <span class="c"># Compile main.cc to machine code.</span>
<span class="nv">$ </span>g++ square.o main.o  <span class="c"># Link square.o and main.o.</span>
</code></pre></div></div>

<p><img src="/assets/linking.jpeg" alt="Linking diagram" /></p>

<p>The <code class="language-plaintext highlighter-rouge">-c</code> flag tells <code class="language-plaintext highlighter-rouge">g++</code> to compile the file without linking.
When you pass <code class="language-plaintext highlighter-rouge">.o</code> files to <code class="language-plaintext highlighter-rouge">g++</code> it will link them together.</p>

<p><code class="language-plaintext highlighter-rouge">g++ -c square.cc</code> takes the source code in <code class="language-plaintext highlighter-rouge">square.cc</code>, converts it into machine code that can be executed by your computer and finally puts that executable code in the <code class="language-plaintext highlighter-rouge">square.o</code> file along with some bookkeeping information.</p>

<p>The <code class="language-plaintext highlighter-rouge">square</code> function is <em>declared</em> in <code class="language-plaintext highlighter-rouge">main.cc</code> but it is not <em>defined</em> in <code class="language-plaintext highlighter-rouge">main.cc</code>.</p>

<p>A declaration looks like <code class="language-plaintext highlighter-rouge">int square(int x);</code>, it tells the compiler the types of the return values and the arguments of the function.
This allows the compiler to type check the function call <code class="language-plaintext highlighter-rouge">square(3)</code> without having to know how <code class="language-plaintext highlighter-rouge">square</code> is implemented<sup id="a1"><a href="#f1">1</a></sup>.
Typically declarations will be in a header file.</p>

<p>A definition contains the actual body of the function.
In our example, the <code class="language-plaintext highlighter-rouge">square</code> function is defined in <code class="language-plaintext highlighter-rouge">square.cc</code>.</p>

<p>After <code class="language-plaintext highlighter-rouge">main.cc</code> is compiled, the <code class="language-plaintext highlighter-rouge">main.o</code> file contains the machine code for the <code class="language-plaintext highlighter-rouge">main</code> function and some metadata which records that the <code class="language-plaintext highlighter-rouge">square</code> function is declared, but not defined, in <code class="language-plaintext highlighter-rouge">main.o</code>.</p>

<p>In the final step, <code class="language-plaintext highlighter-rouge">g++ square.o main.o</code> links the object files into an executable program by matching up the function declaration in <code class="language-plaintext highlighter-rouge">main.o</code> with the function defined in <code class="language-plaintext highlighter-rouge">square.o</code>.</p>

<h2 id="iii-linking-with-system-libraries">III. Linking with system libraries</h2>

<p>Most libraries will have many source files, and therefore many <code class="language-plaintext highlighter-rouge">.o</code> files.
When distributing a library on the internet, it is typical for the object files in the library to be bundled together into an <em>archive file</em> ending in the <code class="language-plaintext highlighter-rouge">.a</code> extension.
A <code class="language-plaintext highlighter-rouge">.a</code> file bundles a bunch of <code class="language-plaintext highlighter-rouge">.o</code> files together for convenient linking.</p>

<p>When you install a C library on Linux, the headers for the library are typically placed in <code class="language-plaintext highlighter-rouge">/usr/local/include</code> and the <code class="language-plaintext highlighter-rouge">.a</code> file in <code class="language-plaintext highlighter-rouge">/usr/local/lib</code>.</p>

<p>The compiler will automatically look in <code class="language-plaintext highlighter-rouge">/usr/local/include</code> for headers (again, this applies to Linux; type <code class="language-plaintext highlighter-rouge">g++ -E -Wp,-v -</code> to see the full include path).</p>

<p>To tell the compiler to link with a <code class="language-plaintext highlighter-rouge">.a</code> in <code class="language-plaintext highlighter-rouge">/usr/local/lib</code>, pass the flag <code class="language-plaintext highlighter-rouge">-l&lt;name of library&gt;</code> to the compiler (type <code class="language-plaintext highlighter-rouge">g++ -print-search-dirs</code> to see the full link path).</p>

<p>For example, I recently installed <a href="http://www.fftw.org/">FFTW</a> on my computer.
It added <code class="language-plaintext highlighter-rouge">libfftw3.a</code> to my <code class="language-plaintext highlighter-rouge">/usr/local/lib</code> directory.
To link with this library I type:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>g++ main.cc -lfftw3
</code></pre></div></div>

<p>Note that the <code class="language-plaintext highlighter-rouge">lib</code> prefix is omitted, that there is no space between <code class="language-plaintext highlighter-rouge">-l</code> and <code class="language-plaintext highlighter-rouge">fftw3</code>, and that the <code class="language-plaintext highlighter-rouge">-lfftw3</code> flag comes after the source file which uses it.</p>

<h2 id="iv-inspecting-object-files">IV. Inspecting object files</h2>

<p>When researching this blog post, I found it really helpful to actually inspect the object files produced by the compiler.
<code class="language-plaintext highlighter-rouge">objdump</code> can do this.</p>

<p>First, add some global variables to <code class="language-plaintext highlighter-rouge">square.cc</code> to make it more interesting.</p>

<script src="https://gist.github.com/joelypoley/cf72b7a203d481731f295071ab3697a7.js"></script>

<p>To use <code class="language-plaintext highlighter-rouge">objdump</code>, type:</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>g++ <span class="nt">-c</span> square.cc
<span class="nv">$ </span>objdump <span class="nt">--disassemble</span> <span class="nt">--full-contents</span> <span class="nt">--all-headers</span> <span class="nt">--section</span><span class="o">=</span>.text <span class="nt">--section</span><span class="o">=</span>.rodata <span class="nt">--section</span><span class="o">=</span>.data square.o
</code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">--disassemble</code> flag shows the assembly for our functions, <code class="language-plaintext highlighter-rouge">--full-contents</code> shows the contents of each section of the object file in both hex and ASCII, <code class="language-plaintext highlighter-rouge">--all-headers</code> shows the symbol table and sections, and <code class="language-plaintext highlighter-rouge">--section=.text --section=.rodata --section=.data</code> filters the results to only include the functions we defined, global read-only data, and global data.</p>

<p>The output is:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>square.o:     file format elf64-x86-64
square.o
architecture: i386:x86-64, flags 0x00000011:
HAS_RELOC, HAS_SYMS
start address 0x0000000000000000

Sections:
Idx Name          Size      VMA               LMA               File off  Algn
  0 .text         00000010  0000000000000000  0000000000000000  00000040  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .data         00000004  0000000000000000  0000000000000000  00000050  2**2
                  CONTENTS, ALLOC, LOAD, DATA
  3 .rodata       0000000e  0000000000000000  0000000000000000  00000058  2**3
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
SYMBOL TABLE:
0000000000000000 l    d  .text	0000000000000000 .text
0000000000000000 l    d  .data	0000000000000000 .data
0000000000000000 l    d  .rodata	0000000000000000 .rodata
0000000000000000 l     O .rodata	000000000000000e _ZL8greeting
0000000000000000 g     O .data	0000000000000004 x
0000000000000000 g     F .text	0000000000000010 _Z6squarei


Contents of section .text:
 0000 554889e5 897dfc8b 45fc0faf 45fc5dc3  UH...}..E...E.].
Contents of section .data:
 0000 03000000                             ....            
Contents of section .rodata:
 0000 48656c6c 6f2c2077 6f726c64 2100      Hello, world!.  

Disassembly of section .text:

0000000000000000 &lt;_Z6squarei&gt;:
   0:	55                   	push   %rbp
   1:	48 89 e5             	mov    %rsp,%rbp
   4:	89 7d fc             	mov    %edi,-0x4(%rbp)
   7:	8b 45 fc             	mov    -0x4(%rbp),%eax
   a:	0f af 45 fc          	imul   -0x4(%rbp),%eax
   e:	5d                   	pop    %rbp
   f:	c3                   	retq   

Disassembly of section .data:

0000000000000000 &lt;x&gt;:
   0:	03 00 00 00                                         ....

Disassembly of section .rodata:

0000000000000000 &lt;_ZL8greeting&gt;:
   0:	48 65 6c 6c 6f 2c 20 77 6f 72 6c 64 21 00           Hello, world!.

</code></pre></div></div>

<p>The first line</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>square.o:     file format elf64-x86-64
</code></pre></div></div>

<p>tells us that the file is in the Executable and Linkable Format (ELF).
This is the default object code format on Linux.
On macOS it is Mach-O, and on Windows it is the COFF-based PE format (PE32+ in its 64-bit variant).</p>

<p>The object file is organized into sections.
Metadata about the sections is displayed in a table.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Sections:
Idx Name          Size      VMA               LMA               File off  Algn
  0 .text         00000010  0000000000000000  0000000000000000  00000040  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .data         00000004  0000000000000000  0000000000000000  00000050  2**2
                  CONTENTS, ALLOC, LOAD, DATA
  3 .rodata       0000000e  0000000000000000  0000000000000000  00000058  2**3
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
</code></pre></div></div>

<p>The symbol table contains metadata about every global variable and function.
For example, the entry <code class="language-plaintext highlighter-rouge">_Z6squarei</code> is the entry for the <code class="language-plaintext highlighter-rouge">square</code> function.
The compiler transforms the names of functions in a process called <a href="https://en.wikipedia.org/wiki/Name_mangling">name mangling</a> to encode type information (and potentially other data such as which namespace the function was declared in) into the function name.
This ensures that even if we declare two different functions with the same name, such as <code class="language-plaintext highlighter-rouge">int square(int x)</code> and <code class="language-plaintext highlighter-rouge">double square(double x)</code>, every entry in the symbol table will have a unique name.
You can add the flag <code class="language-plaintext highlighter-rouge">--demangle</code> to <code class="language-plaintext highlighter-rouge">objdump</code> to make the names more human-readable.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>SYMBOL TABLE:
0000000000000000 l    d  .text	0000000000000000 .text
0000000000000000 l    d  .data	0000000000000000 .data
0000000000000000 l    d  .rodata	0000000000000000 .rodata
0000000000000000 l     O .rodata	000000000000000e _ZL8greeting
0000000000000000 g     O .data	0000000000000004 x
0000000000000000 g     F .text	0000000000000010 _Z6squarei
</code></pre></div></div>
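<p>Name mangling is also what makes overloading link correctly: the two <code>square</code> overloads below share a source-level name but get distinct symbols (with the Itanium ABI used by g++, roughly <code>_Z6squarei</code> and <code>_Z6squared</code>), so the linker never confuses them. A minimal sketch:</p>

```cpp
#include <cassert>

// Two overloads: the same name in the source, but different mangled
// names in the object file's symbol table.
int square(int x) { return x * x; }
double square(double x) { return x * x; }
```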

<p>The contents of each section are shown in <a href="https://en.wikipedia.org/wiki/Hexadecimal">hexadecimal</a> and ASCII.
The <code class="language-plaintext highlighter-rouge">.text</code> section contains the machine code for the <code class="language-plaintext highlighter-rouge">square</code> function, the <code class="language-plaintext highlighter-rouge">.data</code> section contains our global variable <code class="language-plaintext highlighter-rouge">x</code> (which has value 3), and the <code class="language-plaintext highlighter-rouge">.rodata</code> section (read-only data) holds our <code class="language-plaintext highlighter-rouge">greeting</code> variable (which has value “Hello, world!”).</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Contents of section .text:
 0000 554889e5 897dfc8b 45fc0faf 45fc5dc3  UH...}..E...E.].
Contents of section .data:
 0000 03000000                             ....            
Contents of section .rodata:
 0000 48656c6c 6f2c2077 6f726c64 2100      Hello, world!.  
</code></pre></div></div>
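<p>One detail worth noticing: the <code>.data</code> bytes <code>03 00 00 00</code> are the value 3 stored least-significant byte first, because x86-64 is little-endian. A quick check (this assumes a little-endian host):</p>

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Returns the first byte of v's in-memory representation. On a
// little-endian machine this is the least significant byte.
unsigned char first_byte(std::int32_t v) {
  unsigned char b;
  std::memcpy(&b, &v, 1);
  return b;
}
```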

<p>The assembly code for the <code class="language-plaintext highlighter-rouge">square</code> function is shown at the end of the output.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Disassembly of section .text:

0000000000000000 &lt;_Z6squarei&gt;:
   0:	55                   	push   %rbp
   1:	48 89 e5             	mov    %rsp,%rbp
   4:	89 7d fc             	mov    %edi,-0x4(%rbp)
   7:	8b 45 fc             	mov    -0x4(%rbp),%eax
   a:	0f af 45 fc          	imul   -0x4(%rbp),%eax
   e:	5d                   	pop    %rbp
   f:	c3                   	retq   
</code></pre></div></div>

<h2 id="v">V.</h2>

<p>Many statically typed languages such as C, Rust and Swift follow the same model as C++: separate compilation of source files followed by linking.
This means you can call e.g. C functions from Swift by compiling the human-readable source files into object files and linking them together.
It’s useful to know about linking if you want to interop between C and more modern languages.</p>

<p>Even if you just stick to C++, some of the <a href="https://en.wikipedia.org/wiki/One_Definition_Rule#Example_showing_unexpected_side_effects">darker corners of the language</a> and some of the more <a href="https://stackoverflow.com/questions/12573816/what-is-an-undefined-reference-unresolved-external-symbol-error-and-how-do-i-fix">inscrutable error messages</a> from the compiler are due to C++’s linking model.
Having a good mental model of linking and compilation can save hours of debugging.</p>

<p><small>
P.S.

To keep things simple I didn't discuss link-time optimization at all in this post.
For a great overview of link-time optimization and its implementation in LLVM, see Teresa Johnson's talk <a href="https://youtu.be/p9nH2vZ2mNo">ThinkLTO: Scalable and Incremental Link-Time Optimization</a>.
(This is one of my favorite CppCon talks ever!)
</small></p>

<p><a href="https://news.ycombinator.com/item?id=22145543">Comment on Hacker News</a></p>

<hr />
<p><b id="f1">1</b> C (before C99) does not require explicit declarations; C++ does.
A C program calling a function that has not been declared will not compile with a C++ compiler. <a href="#a1">↩</a></p>]]></content><author><name></name></author><summary type="html"><![CDATA[I. Introduction]]></summary></entry><entry><title type="html">Discrete Fourier analysis notes</title><link href="https://joellaity.com/2019/03/02/discrete-fourier-analysis.html" rel="alternate" type="text/html" title="Discrete Fourier analysis notes" /><published>2019-03-02T00:00:00+00:00</published><updated>2019-03-02T00:00:00+00:00</updated><id>https://joellaity.com/2019/03/02/discrete-fourier-analysis</id><content type="html" xml:base="https://joellaity.com/2019/03/02/discrete-fourier-analysis.html"><![CDATA[<p><a href="/assets/notes/discrete_fourier_analysis.pdf">These</a> are my notes on discrete Fourier analysis. It’s basically just an expanded version of the first chapter of my master’s thesis.</p>]]></content><author><name></name></author><summary type="html"><![CDATA[These are my notes on discrete Fourier analysis. It’s basically just an expanded version of the first chapter of my master’s thesis.]]></summary></entry><entry><title type="html">Network flow notes</title><link href="https://joellaity.com/2019/03/02/network-flow-notes.html" rel="alternate" type="text/html" title="Network flow notes" /><published>2019-03-02T00:00:00+00:00</published><updated>2019-03-02T00:00:00+00:00</updated><id>https://joellaity.com/2019/03/02/network-flow-notes</id><content type="html" xml:base="https://joellaity.com/2019/03/02/network-flow-notes.html"><![CDATA[<p><a href="/assets/notes/network_flow.pdf">These</a> are my notes on network flow. The max-flow min-cut theorem is proved at the end.</p>]]></content><author><name></name></author><summary type="html"><![CDATA[These are my notes on network flow. 
The max-flow min-cut theorem is proved at the end.]]></summary></entry><entry><title type="html">Checkmate, undefined behavior</title><link href="https://joellaity.com/2019/02/28/checkmate-undefined-behavior.html" rel="alternate" type="text/html" title="Checkmate, undefined behavior" /><published>2019-02-28T00:00:00+00:00</published><updated>2019-02-28T00:00:00+00:00</updated><id>https://joellaity.com/2019/02/28/checkmate-undefined-behavior</id><content type="html" xml:base="https://joellaity.com/2019/02/28/checkmate-undefined-behavior.html"><![CDATA[<p>Undefined behavior is the bane of C and C++ programmers. The compiler can choose to do whatever it wants if a program has undefined behavior. This is normally not a good thing, but I recently wrote some code with undefined behavior and amazingly the compiler chose to do exactly what I had intended, not what I told it to do.</p>

<p>I have spent the last week working on a <a href="https://github.com/joelypoley/pawn_grabber">chess engine</a> in C++. Most chess engines take advantage of the convenient coincidence that the number of squares on a chess board, 64, is the same as the word size on modern processors. So, you can do things like store the location of all the white pawns with a single 64 bit integer: you just set the i-th bit to 1 if there is a white pawn on the i-th square. This technique allows you to do neat tricks, such as move all pieces up one square by left shifting the integer by 8.</p>

<p>I wrote a simple utility function that takes the name of the square as a string and returns the corresponding 64 bit integer. Chess players use a simple naming convention for the squares on a chessboard: the rows are labeled 1-8 and the columns are labeled a-h, so the square in the bottom left hand corner is the a1 square.</p>

<p><img src="/assets/algebraic_notation.png" alt="chessboard" /></p>

<p>Here is (roughly) how I implemented my string to 64 bit integer function. Can you see what’s wrong with it?</p>

<figure class="highlight"><pre><code class="language-c--" data-lang="c++"><span class="c1">// At the top of the file.</span>
<span class="k">constexpr</span> <span class="kt">int</span> <span class="n">board_size</span> <span class="o">=</span> <span class="mi">8</span><span class="p">;</span>

<span class="c1">// algebraic_square would be one of "a1", "a2", ..., "h7", "h8".</span>
<span class="kt">uint64_t</span> <span class="nf">str_to_square</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">string_view</span> <span class="n">algebraic_square</span><span class="p">)</span> <span class="p">{</span>
  <span class="k">const</span> <span class="kt">char</span> <span class="n">column</span> <span class="o">=</span> <span class="n">algebraic_square</span><span class="p">[</span><span class="mi">0</span><span class="p">];</span>
  <span class="k">const</span> <span class="kt">char</span> <span class="n">row</span> <span class="o">=</span> <span class="n">algebraic_square</span><span class="p">[</span><span class="mi">1</span><span class="p">];</span>
  <span class="k">const</span> <span class="kt">int</span> <span class="n">column_index</span> <span class="o">=</span> <span class="n">column</span> <span class="o">-</span> <span class="sc">'a'</span><span class="p">;</span>
  <span class="k">const</span> <span class="kt">int</span> <span class="n">row_index</span> <span class="o">=</span> <span class="n">row</span> <span class="o">-</span> <span class="mi">1</span><span class="p">;</span>
  <span class="k">return</span> <span class="kt">uint64_t</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span> <span class="o">&lt;&lt;</span> <span class="p">((</span><span class="n">row_index</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span> <span class="o">*</span> <span class="n">board_size</span> <span class="o">-</span> <span class="n">column_index</span> <span class="o">-</span> <span class="mi">1</span><span class="p">);</span>
<span class="p">}</span></code></pre></figure>

<p>I forgot to put quotes around the <code class="language-plaintext highlighter-rouge">1</code> in the line <code class="language-plaintext highlighter-rouge">const int row_index = row - 1;</code>! Instead of subtracting the character <code class="language-plaintext highlighter-rouge">'1'</code>, I subtracted the integer <code class="language-plaintext highlighter-rouge">1</code>. Since the ASCII encoding of the character <code class="language-plaintext highlighter-rouge">'1'</code> is 49, the <code class="language-plaintext highlighter-rouge">row_index</code> is always off by 48.</p>
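<p>The off-by-48 follows directly from the ASCII table, where <code>'1'</code> is 49. A small sketch of the two versions of the index calculation:</p>

```cpp
#include <cassert>

// The index as I wrote it (buggy) and as I intended it (fixed), for a
// row character like '1'.
int buggy_row_index(char row) { return row - 1; }    // '1' - 1  == 48
int fixed_row_index(char row) { return row - '1'; }  // '1' - '1' == 0
```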

<p>This bug disturbed me, not because bugs like this are so unusual, but because none of my tests caught this and I only discovered the bug when I was tidying up some of the surrounding code. I was left shifting a 64 bit integer by at least 384 every time I called this function and yet it seemingly caused none of my tests to fail. After some investigation I concluded that for <em>every</em> single square on the chess board my code gave the right answer. This was unexpected to say the least.</p>

<p>I was already aware that left shifting off the end of a <em>signed</em> integer is undefined behavior, but I thought that left shifting off the end of an unsigned integer was perfectly well defined: the most significant bits just get discarded. From <a href="https://en.cppreference.com/w/">cppreference.com</a>:</p>

<blockquote>
  <p>For unsigned a, the value of a &lt;&lt; b is the value of a * 2<sup>b</sup>, reduced modulo 2<sup>N</sup> where N is the number of bits in the return type (that is, bitwise left shift is performed and the bits that get shifted out of the destination type are discarded).</p>
</blockquote>

<p>According to cppreference, my function should simply push the single set bit <code class="language-plaintext highlighter-rouge">uint64_t(1)</code> off the end and return 0 every time. Since <code class="language-plaintext highlighter-rouge">str_to_square</code> clearly wasn’t doing this, my next step was to run my program with the <a href="https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html">UndefinedBehaviorSanitizer</a>. I got the following warning.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>runtime error: shift exponent 384 is too large for 64-bit type 'uint64_t' (aka 'unsigned long')
</code></pre></div></div>

<p>This confirmed that I was indeed invoking undefined behavior.</p>

<p>After consulting the <a href="http://www.open-std.org/Jtc1/sc22/wg21/docs/papers/2014/n4296.pdf">C++ standard</a> (something I had been trying to avoid doing) I still did not understand. Paragraph 5.8.2 says:</p>

<blockquote>
  <p>5.8.2 The value of E1 &lt;&lt; E2 is E1 left-shifted E2 bit positions; vacated bits are zero-filled. If E1 has an unsigned type, the value of the result is E1 × 2<sup>E2</sup>, reduced modulo one more than the maximum value representable in the result type. Otherwise, if E1 has a signed type and non-negative value, and E1 × 2<sup>E2</sup> is representable in the corresponding unsigned type of the result type, then that value, converted to the result type, is the resulting value; otherwise, the behavior is undefined.</p>
</blockquote>

<p>This paragraph only mentions undefined behavior for signed integers, but I was using unsigned integers so it shouldn’t affect me.</p>

<p>I was just about to give up. It was getting late, and although it was a remarkable coincidence that forgetting the quote marks didn’t affect the behavior of my program, I had already fixed the bug. Then I noticed the paragraph above 5.8.2:</p>

<blockquote>
  <p>5.8.1. The shift operators &lt;&lt; and &gt;&gt; group left-to-right. … The behavior is undefined if the right operand is negative, or greater than or equal to the length in bits of the promoted left operand.</p>
</blockquote>

<p>I finally had my answer! It is undefined behavior to shift a 64 bit integer by 64 or greater.</p>

<p>All bets are off once your program has undefined behavior, but it was remarkable that my program was seemingly doing what I intended it to do, rather than what I had actually told it to do. I thought that left shifting by more than the “length in bits of the promoted left operand” would result in zero, but instead I was getting the correct answer each time.</p>

<p>To see what was going on I copy and pasted my function into <a href="https://godbolt.org/z/z1Vobs">compiler explorer</a>, turned optimizations up to <code class="language-plaintext highlighter-rouge">-O3</code> so the output was less noisy, and got:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>str_to_square(std::basic_string_view&lt;char, std::char_traits&lt;char&gt; &gt;): # @str_to_square(std::basic_string_view&lt;char, std::char_traits&lt;char&gt; &gt;)
        movzx   eax, byte ptr [rsi]
        movzx   ecx, byte ptr [rsi + 1]
        mov     edx, 96
        sub     edx, eax
        lea     ecx, [rdx + 8*rcx]
        mov     eax, 1
        shl     rax, cl
        ret
</code></pre></div></div>

<p>The left shift is being done by the <code class="language-plaintext highlighter-rouge">shl</code> instruction. Helpfully, if you right click on an assembly instruction in compiler explorer it points you to the documentation for that instruction, which said:</p>

<blockquote>
  <p>The destination operand can be a register or a memory location. The count operand can be an immediate value or the CL register. The count is masked to 5 bits (or 6 bits if in 64-bit mode and REX.W is used).</p>
</blockquote>

<p>Masking to 6 bits is the same as reducing modulo 64, and by coincidence <code class="language-plaintext highlighter-rouge">((row - 1) + 1) * board_size</code> is the same as the correct value <code class="language-plaintext highlighter-rouge">(row - '1' + 1) * board_size</code> modulo 64 (because <code class="language-plaintext highlighter-rouge">(('1' - 1) * board_size) % 64 == 0</code>).</p>
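<p>We can check the coincidence directly: with <code>board_size == 8</code>, the buggy shift exponent exceeds the correct one by exactly 48 × 8 = 384, and since 384 is a multiple of 64, masking the count to 6 bits (as <code>shl</code> does) yields the same effective shift:</p>

```cpp
#include <cassert>

// Shift exponent for a square, given a row index and a column index
// (as in the str_to_square function above).
int shift_amount(int row_index, int column_index) {
  const int board_size = 8;
  return (row_index + 1) * board_size - column_index - 1;
}
```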

<p>The undefined behavior gods must have been smiling down on me.</p>]]></content><author><name></name></author><summary type="html"><![CDATA[Undefined behavior is the bane of C and C++ programmers. The compiler can choose to do whatever it wants if a program has undefined behavior. This is normally not a good thing, but I recently wrote some code with undefined behavior and amazingly the compiler chose to do exactly what I had intended, not what I told it to do.]]></summary></entry><entry><title type="html">Principal component analysis: pictures, code and proofs</title><link href="https://joellaity.com/2018/10/18/pca.html" rel="alternate" type="text/html" title="Principal component analysis: pictures, code and proofs" /><published>2018-10-18T00:00:00+00:00</published><updated>2018-10-18T00:00:00+00:00</updated><id>https://joellaity.com/2018/10/18/pca</id><content type="html" xml:base="https://joellaity.com/2018/10/18/pca.html"><![CDATA[<p><small>The code used to generate the plots for this post can be found <a href="https://github.com/joelypoley/joelypoley.github.io/blob/master/assets/notebooks/pca.ipynb">here</a>.</small></p>

<h2 id="i">I.</h2>

<p>Principal component analysis is a form of <a href="https://en.wikipedia.org/wiki/Feature_engineering">feature engineering</a> that reduces the number of dimensions needed to represent your data. If a neural network has fewer inputs then there are fewer weights to train, which makes the model easier and faster to train.</p>

<p><img src="/assets/plot1.png" alt="scatter plot" /></p>

<p>The data above is two dimensional, but it is “almost” one dimensional in the sense that every point is close to a line.</p>

<p><img src="/assets/plot2.png" alt="scatter plot" /></p>

<p>The first step in principal component analysis is to center the data. Given the list of 2d points, \(x_1, x_2, \dots , x_n \in \mathbb{R}^2\) we first center the data by calculating the mean \(\overline{x} = \frac{1}{n}\sum_{i=1}^n x_i\) and replacing each \(x_i\) with \(x_i - \overline{x}\). Now the data looks like this.</p>

<p><img src="/assets/plot3.png" alt="scatter plot" /></p>

<p>We then put the data in a matrix
\(X = \begin{pmatrix}
| &amp; | &amp;  &amp; | \\
x_1 &amp; x_2 &amp;\cdots &amp; x_n \\
| &amp; | &amp;  &amp; |\end{pmatrix}.\)
And calculate the eigenvectors and eigenvalues of the <em>covariance matrix</em> \(\frac{1}{n-1}XX^\top\).</p>
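<p>As a tiny worked example (with made-up numbers), take the two centered points \(x_1 = (1, 1)\) and \(x_2 = (-1, -1)\). Then</p>

\[X = \begin{pmatrix} 1 &amp; -1 \\ 1 &amp; -1 \end{pmatrix}, \qquad XX^\top = \begin{pmatrix} 2 &amp; 2 \\ 2 &amp; 2 \end{pmatrix},\]

<p>and \(XX^\top\) has eigenvalues \(4\) and \(0\), with eigenvectors \((1, 1)/\sqrt{2}\) and \((1, -1)/\sqrt{2}\) respectively: all of the variance lies along the diagonal direction.</p>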

<p><img src="/assets/plot4.png" alt="scatter plot" /></p>

<p>The eigenvectors tell us the <em>direction</em> of the data. The first eigenvector in the picture above has the same slope as the data and the second eigenvector is perpendicular to the first. Now let’s scale each of the eigenvectors by its corresponding eigenvalue <sup id="a1"><a href="#f1">1</a></sup>.</p>

<p><img src="/assets/plot5.png" alt="scatter plot" /></p>

<p>And draw an ellipse around the eigenvectors.</p>

<p><img src="/assets/plot6.png" alt="scatter plot" /></p>

<p>The eigenvalues tell us how spread out the data is in the direction of the corresponding eigenvector. Thus we can reduce the dimension of the data by projecting onto the line spanned by the eigenvector with the largest eigenvalue.</p>
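<p>Concretely, if \(u_1\) is the unit eigenvector with the largest eigenvalue, each point is replaced by its projection onto the line spanned by \(u_1\),</p>

\[x_i \mapsto \langle x_i, u_1 \rangle \, u_1,\]

<p>and the scalar \(\langle x_i, u_1 \rangle\) serves as the one-dimensional representation of \(x_i\).</p>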

<p><img src="/assets/plot7.png" alt="scatter plot" /></p>

<p>The data is now one dimensional since it fits on a single line. Each point has not moved too far from its original spot, so these new points still represent the data well.</p>

<p>In two dimensions this is the same as projecting onto the line of best fit, but this technique generalizes. If your data is \(n\)-dimensional then PCA lets you find the best \(m\)-dimensional subspace to project the data down onto; you just project your data onto the subspace spanned by the \(m\) eigenvectors with the largest eigenvalues. If \(m \ll n\) this can compress your data a lot, and PCA guarantees that this \(m\)-dimensional subspace is optimal, in the sense that it minimizes the mean squared error between the original data points and the projected data points.</p>

<h1 id="ii">II.</h1>

<p>The data in the plots above was generated using a random number generator. Let’s try PCA on a real dataset.</p>

<p>We will use the MNIST dataset, which is a collection of grayscale, 28x28 images of hand written digits. To simplify the analysis we will discard images of 2,3,4,5,6,7,8,9 and only look at images of 0 and 1. Below are some examples of the images from MNIST.</p>

<p><img src="/assets/plot8.png" alt="scatter plot" /></p>

<p>To process the images we will:</p>

<ul>
  <li>Flatten each image into a \(784 = 28\times 28\) dimensional vector.</li>
  <li>Use PCA to project each 784-dimensional vector to a 2-dimensional vector.</li>
  <li>Plot the 2 dimensional vectors, with images of ‘0’ in red and images of ‘1’ in blue.</li>
</ul>

<p>The result looks like this.</p>

<p><img src="/assets/plot9.png" alt="scatter plot" /></p>

<p>You can see that the zeros are clustered to the left, and the ones are clustered to the right. We could create a reasonable classifier by drawing a vertical line at \(x = - 250\), and all we did was linearly project the raw pixels down to a two dimensional subspace!</p>

<p>We can project onto any number of dimensions. Here is the three dimensional projection.</p>

<p><img src="/assets/plot11.png" alt="scatter plot" /></p>

<h2 id="iii">III.</h2>

<p>It’s not obvious why the eigenvalues and eigenvectors of the covariance matrix have all these useful properties. There are proofs at the end of the post, but they’re not particularly enlightening. Thankfully there’s a more intuitive way of thinking about it.</p>

<p>Continuing with the MNIST example, let \(p_1\) be the vector whose \(i\)-th entry is the first pixel in the \(i\)-th image. Similarly let \(p_2, p_3, \dots , p_{784}\) be the vectors consisting of the 2nd, 3rd, … , 784th pixels across all images. Then</p>

\[XX^\top = 
    \begin{pmatrix}
    \langle p_1, p_1 \rangle &amp; \langle p_1, p_2 \rangle &amp; \cdots &amp; \langle p_1, p_{784} \rangle \\
    \langle p_2, p_1 \rangle &amp; \langle p_2, p_2 \rangle &amp; \cdots&amp;  \langle p_2, p_{784} \rangle \\
    \vdots &amp; \vdots &amp; \ddots &amp; \vdots \\
    \langle p_{784}, p_1 \rangle &amp; \langle p_{784}, p_2 \rangle &amp; \cdots &amp; \langle p_{784}, p_{784} \rangle \\
    \end{pmatrix}.\]

<p>This matrix can be diagonalized \(XX^\top = UDU^{-1}\) where \(U\) is a change of basis matrix and \(D = \operatorname{diag}(\lambda_1, \cdots , \lambda_n)\) is diagonal.
We can view the change of basis as creating new features \(p_1’, p_2’, \dots , p_{784}’\) from the original pixels, and the diagonal matrix is the covariance matrix for these new features.</p>

<p>Since \(\langle p_i’, p_j’ \rangle = 0\) for \(i \neq j\), the new features are uncorrelated, and the variance of \(p_i’\) is \(\langle p_i’, p_i’ \rangle = \lambda_i\).</p>

<p>So given a vector of pixels \(x\), we can convert \(x\) into a vector of new features \(x’\) by applying a change of basis. The eigenvalues \(\lambda_i\) are then the variances of the new features. It seems reasonable that the features with the largest variance are the most important, while the features with the smallest variance can be discarded.</p>

<h2 id="iv">IV.</h2>

<p>Now that we have some intuition, the preceding discussion can be formalized into a theorem.</p>

<p><strong>Theorem:</strong> Let \(x_1, \dots , x_n \in \mathbb{R}^d\) be a sequence of data points. 
Let</p>

\[X = \begin{pmatrix}
| &amp; | &amp;  &amp; | \\
x_1 &amp; x_2 &amp;\cdots &amp; x_n \\
| &amp; | &amp;  &amp; |\end{pmatrix}\]

<p>be the \(d \times n\) matrix where each column is a data point.
Let \(W = XX^\top\) (the \(\frac{1}{n-1}\) factor from before does not affect the eigenvectors or the relative order of the eigenvalues).
Then \(W\) is <a href="https://en.wikipedia.org/wiki/Positive-definite_matrix#Positive_semidefinite">positive semidefinite</a> and hence has eigenvectors \(u_1, \dots , u_d\) which form an <a href="https://en.wikipedia.org/wiki/Orthonormal_basis">orthonormal basis</a> for \(\mathbb{R}^d\).
Let \(\lambda_1, \dots , \lambda_d\) be the corresponding eigenvalues and without loss of generality assume \(\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_d\).
The <em>projection error</em> for \(x_i\) onto a subspace \(V \subset \mathbb{R}^d\) is defined as \(\|x_i - P_Vx_i\|_2^2\) where \(P_V:\mathbb{R}^d \to \mathbb{R}^d\) is the projection-onto-\(V\) operator.
Then for any positive integer \(m &lt; d\) the subspace \(U_m := \operatorname{span}\{u_1, \dots , u_m\}\) minimizes the sum of the projection errors. In symbols,</p>

\[\sum_{i=1}^n \|x_i - P_{U_m}x_i\|_2^2 = \min_{\substack{V \subset \mathbb{R}^d \\ \operatorname{dim}V = m}} \sum_{i=1}^n \|x_i - P_Vx_i\|_2^2.\]
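<p>Before the proof, the theorem can be spot-checked numerically. The following NumPy sketch compares the projection error of the top-\(m\) eigenvector subspace against random \(m\)-dimensional subspaces (the dimensions and sample counts are arbitrary choices):</p>

```python
# Numerical spot-check of the theorem, assuming NumPy: the span of the
# top-m eigenvectors should beat any random m-dimensional subspace.
import numpy as np

rng = np.random.default_rng(1)
d, n, m = 10, 200, 3
X = rng.normal(size=(d, n))

eigvals, U = np.linalg.eigh(X @ X.T)   # eigh returns ascending eigenvalues
Um = U[:, ::-1][:, :m]                 # columns u_1, ..., u_m
P_best = Um @ Um.T

def total_projection_error(P, X):
    E = X - P @ X
    return np.sum(E ** 2)              # sum of squared projection errors

best = total_projection_error(P_best, X)
for _ in range(100):                   # random m-dimensional subspaces V
    V, _ = np.linalg.qr(rng.normal(size=(d, m)))
    assert total_projection_error(V @ V.T, X) >= best - 1e-9
```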

<p><em>Proof:</em></p>

<p>Fix \(m &lt; d\) and let \(V \subset \mathbb{R}^d\) be an \(m\)-dimensional subspace. Define the \(d \times n\) error matrix
\[
E = 
\begin{pmatrix}
    | &amp; | &amp;  &amp; |\\
     x_1 - P_Vx_1  &amp; x_2 - P_Vx_2 &amp; \cdots &amp; x_n - P_Vx_n\\
     | &amp; | &amp;  &amp; | \\
\end{pmatrix}
= X - P_VX.
\]
We want to minimize
\[
\sum_{i=1}^n \|x_i - P_Vx_i\|_2^2 = \|E\|_F^2
\]
where \(\|\cdot \|_F\) is the <a href="https://en.wikipedia.org/wiki/Matrix_norm#Frobenius_norm">Frobenius norm</a>.
We now rewrite the error using matrix algebra
\[
\begin{align}\newcommand{\tr}{\mathrm{tr}}
\|E \|_F^2 
&amp;= \| X- P_VX\|_F^2 \\
&amp;=\tr\left(( X- P_VX)( X- P_VX)^\top\right) &amp; (\|A \|_F^2 = \tr(AA^\top)) \\
&amp;=\tr\left(( X- P_VX)( X^\top - X^\top P_V^\top)\right) \\
&amp;=\tr\left(XX^\top - XX^\top P_V^\top - P_VXX^\top + P_VXX^\top P_V^\top \right) \\
&amp;=\tr\left(W- W P_V^\top - P_VW + P_VW P_V^\top \right) &amp; (W = XX^\top)\\
&amp;=\tr\left(W- W P_V - P_VW + P_VW P_V \right) &amp; (P_V = P_V^\top )\\
&amp;=\tr(W)- \tr(W P_V) - \tr(P_VW) + \tr(P_VW P_V ) \\
&amp;=\tr(W)- \tr(P_VW ) - \tr(P_VW) + \tr(P_VW) &amp; (\tr(AB) = \tr(BA) \text{ and } P_V^2 = P_V)\\
&amp;=\tr(W)- \tr(P_VW).
\end{align}
\]</p>
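<p>The identity just derived, \(\|E\|_F^2 = \mathrm{tr}(W) - \mathrm{tr}(P_VW)\), is easy to verify numerically. A sketch, assuming NumPy and a randomly chosen subspace \(V\):</p>

```python
# Checking the derived identity ||X - P_V X||_F^2 = tr(W) - tr(P_V W),
# assuming NumPy and a random subspace V.
import numpy as np

rng = np.random.default_rng(2)
d, n, m = 8, 50, 3
X = rng.normal(size=(d, n))
W = X @ X.T

V, _ = np.linalg.qr(rng.normal(size=(d, m)))  # orthonormal basis for V
P = V @ V.T                                   # symmetric and idempotent

lhs = np.linalg.norm(X - P @ X, "fro") ** 2
rhs = np.trace(W) - np.trace(P @ W)
assert np.isclose(lhs, rhs)
```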

<p>The quantity \(\mathrm{tr}(W)\) is a constant, so minimizing \(\|E \|_F^2\) is the same as maximizing \(\tr(P_VW)\). Let \(\{v_1, \dots , v_m\} \subset \mathbb{R}^d\) be an orthonormal basis for \(V\). Then 
\[
P_V = \sum_{i = 1}^m v_iv_i^\top
\]
so
\[
\begin{align}\newcommand{\tr}{\mathrm{tr}}
    \tr(P_VW)
    &amp;= \tr\left(\sum_{i = 1}^m v_iv_i^\top W \right) \\
    &amp;= \sum_{i=1}^m \tr\left(v_iv_i^\top W\right) \\
    &amp;= \sum_{i=1}^m \tr(v_i^\top W v_i) &amp; (\tr(AB) = \tr(BA)).
\end{align}
\]</p>
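<p>Both the outer-product formula \(P_V = \sum_{i=1}^m v_iv_i^\top\) and the resulting trace expression can be checked directly (NumPy assumed; \(W\) here is just a random positive semidefinite matrix):</p>

```python
# Checking P_V = sum_i v_i v_i^T and tr(P_V W) = sum_i v_i^T W v_i,
# assuming NumPy, for a random PSD matrix W and random orthonormal v_i.
import numpy as np

rng = np.random.default_rng(4)
d, m = 7, 3
A = rng.normal(size=(d, d))
W = A @ A.T                                   # a random PSD matrix
V, _ = np.linalg.qr(rng.normal(size=(d, m)))  # columns v_1, ..., v_m

P = sum(np.outer(V[:, i], V[:, i]) for i in range(m))
assert np.allclose(P, V @ V.T)
assert np.isclose(np.trace(P @ W),
                  sum(V[:, i] @ W @ V[:, i] for i in range(m)))
```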

<p>Let 
\[
U = \begin{pmatrix}
| &amp; | &amp;  &amp; | \\
u_1 &amp; u_2 &amp;\cdots &amp; u_d \\
| &amp; | &amp;  &amp; |\end{pmatrix}
\]
where the \(u_i \in \mathbb{R}^d\) are the eigenvectors of \(W\) as stated in the theorem. The matrix \(U\) diagonalizes \(W\) so \(W = UDU^{-1} =  UDU^\top\) where 
\[D = \begin{pmatrix}
\lambda_1 &amp; 0 &amp; \dots &amp; 0 \\
0 &amp; \lambda_2 &amp; \dots &amp; 0 \\
\vdots &amp; \vdots &amp; \ddots &amp; \vdots \\
0 &amp; 0 &amp; \dots &amp; \lambda_d
\end{pmatrix}.
\]
Now
\[
\begin{align}\newcommand{\tr}{\mathrm{tr}}
\tr(P_VW)
&amp;= \sum_{i=1}^m \tr(v_i^\top W v_i)  \\
&amp;= \sum_{i=1}^m \tr(v_i^\top UDU^\top v_i)  \\
&amp;= \sum_{i=1}^m  \tr((U^\top v_i)^\top D (U^\top v_i)).
\end{align}
\]</p>

<p>If \(v_i = u_i\) for all \(1 \leq i \leq m\) then 
\[U^\top v_i = U^\top u_i = (0, \dots, 0, 1, 0, \dots, 0)^\top\]
is the \(i\)-th standard basis vector. Thus 
\[
\begin{align}\newcommand{\tr}{\mathrm{tr}}
\tr(P_VW)
&amp;= \sum_{i=1}^m  \tr((U^\top v_i)^\top D (U^\top v_i)) \\
&amp;= \sum_{i=1}^m \lambda_i
\end{align}
\]</p>

<p>Therefore it suffices to show that \(\mathrm{tr}(P_VW) \leq \sum_{i=1}^m \lambda_i\) for all \(m\)-dimensional subspaces \(V\).</p>

<p>We will show this is true in the case \(m = 2\), i.e. \(\mathrm{tr}(P_VW) \leq \lambda_1 + \lambda_2\) when \(V\) is 2-dimensional. The case \(m &gt; 2\) uses the same argument but is more notationally heavy. Let \(\alpha = U^\top v_1 \in \mathbb{R}^d\) and \(\beta =U^\top v_2 \in \mathbb{R}^d\). Note that since \(U\) is orthogonal, \(\|\alpha\|_2^2 = \|\beta\|_2^2 = 1\) and \(\langle \alpha, \beta \rangle = 0\).</p>

<p>The first step is to show that \(\alpha_i^2 + \beta_i^2 \leq 1\) for all \(i\). Let \(e_i = (0, \dots , 0, 1, 0, \dots , 0)\) be the \(i\)-th standard basis vector. Since \(\alpha\) and \(\beta\) are orthogonal and have length 1, the projection of \(e_i\) onto \(\operatorname{span}\{\alpha, \beta \}\) is given by
\[\hat{e}_i = \langle e_i, \alpha \rangle \alpha + \langle e_i, \beta \rangle \beta = \alpha_i \alpha + \beta_i \beta .\]
Then
\[ \alpha_i^2 + \beta_i^2 = \|\hat{e_i}\|_2^2 \leq \|e_i\|_2^2 = 1 \]
since a projected vector always has length less than or equal to the original vector.</p>

<p>The second step is to observe that \(\sum_{i=1}^d (\alpha_i^2 + \beta_i^2) = \|\alpha \|_2^2 + \|\beta \|_2^2 = 2\).</p>

<p>Finally, we want to maximize 
\[ \mathrm{tr}(P_VW) = \sum_{i=1}^d \lambda_i(\alpha_i^2 + \beta_i^2) \]
and we know that
\[\alpha_i^2 + \beta_i^2 \leq 1 \text{ and } \sum_{i=1}^d(\alpha_i^2 + \beta_i^2) = 2 .\]</p>

<p>The eigenvalues of a positive semidefinite matrix are nonnegative so the sum \(\sum_{i=1}^d \lambda_i(\alpha_i^2 + \beta_i^2)\) is maximized when the first and second coefficients are as large as possible, i.e. when \(\alpha_1^2 + \beta_1^2 = \alpha_2^2 + \beta_2^2 = 1\). But then the second condition implies that \(\alpha_i^2 + \beta_i^2 = 0\) for \(i &gt; 2\). Thus
\[ \mathrm{tr}(P_VW) = \sum_{i=1}^d \lambda_i(\alpha_i^2 + \beta_i^2) \leq \lambda_1 + \lambda_2. \]
\(\square\)</p>
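<p>The bound just proved, \(\mathrm{tr}(P_VW) \leq \lambda_1 + \lambda_2\) for 2-dimensional \(V\), can be spot-checked over many random subspaces. A NumPy sketch with an arbitrary random PSD matrix:</p>

```python
# Spot-checking tr(P_V W) <= lambda_1 + lambda_2 over random
# 2-dimensional subspaces, assuming NumPy.
import numpy as np

rng = np.random.default_rng(5)
d = 9
A = rng.normal(size=(d, d))
W = A @ A.T                         # a random PSD matrix
eigvals = np.linalg.eigvalsh(W)     # ascending order
top_two = eigvals[-1] + eigvals[-2] # lambda_1 + lambda_2

for _ in range(200):
    V, _ = np.linalg.qr(rng.normal(size=(d, 2)))
    assert np.trace(V @ V.T @ W) <= top_two + 1e-9
```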

<p>We also need to prove that the size of the eigenvalue is proportional to the variance in the direction of the corresponding eigenvector.</p>

<p><strong>Theorem:</strong> As in the previous theorem let \(X = \begin{pmatrix}x_1 &amp; x_2 &amp; \cdots &amp; x_n \end{pmatrix}\) be the data matrix, \(W = XX^\top\) the covariance matrix, \(u_1, \dots , u_d\) the eigenvectors of \(W\) and \(\lambda_1, \dots , \lambda_d\) the eigenvalues. Let \(P_{u_i}: \mathbb{R}^d \to \mathbb{R}^d\) be the projection operator onto the subspace \(\mathrm{span}\{u_i\}\). Then \[
    \sum_{j=1}^n\|P_{u_i}x_j\|_2^2 = \lambda_i.
\]</p>

<p><em>Proof:</em> </p>

<p>
The working is similar to the previous proof so I'll omit some steps.
\[
\begin{align}
\sum_{j=1}^n\|P_{u_i}x_j\|_2^2 
&amp;= \|u_iu_i^\top X\|_F^2 \\
&amp;= \mathrm{tr}((u_iu_i^\top X)(u_iu_i^\top X)^\top)  \\
&amp;= \mathrm{tr}(u_i^\top W u_i)  \\
&amp;=  \mathrm{tr}((U^\top u_i)^\top D (U^\top u_i)) \\
&amp;= \lambda_i .
\end{align}
\] 
</p>
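<p>This theorem, too, is easy to verify numerically. A sketch assuming NumPy, with random data standing in for a real dataset:</p>

```python
# Checking sum_j ||P_{u_i} x_j||^2 = lambda_i for every eigenvector,
# assuming NumPy and random data.
import numpy as np

rng = np.random.default_rng(3)
d, n = 6, 100
X = rng.normal(size=(d, n))

eigvals, U = np.linalg.eigh(X @ X.T)
for i in range(d):
    u = U[:, [i]]                  # i-th eigenvector as a d x 1 column
    proj = u @ u.T @ X             # project every data point onto span{u_i}
    assert np.isclose(np.sum(proj ** 2), eigvals[i])
```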

<p><a href="https://news.ycombinator.com/item?id=19356584">Comment on Hacker News.</a></p>

<hr />
<p><b id="f1">1</b> I actually scaled by two times the square root of the eigenvalue. The eigenvalue tells you the variance and I wanted the standard deviation. I multiplied by two so that ellipse would capture most of the data. <a href="#a1">↩</a></p>]]></content><author><name></name></author><summary type="html"><![CDATA[The code used to generate the plots for this post can be found here.]]></summary></entry></feed>