<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Sebastian Wild&apos;s Site</title>
    <description>Website of Sebastian Wild, algorithms researcher
</description>
    <link>https://www.wild-inter.net/</link>
    <atom:link href="https://www.wild-inter.net/feed.xml" rel="self" type="application/rss+xml"/>
    <pubDate>Tue, 31 Mar 2026 18:53:37 +0200</pubDate>
    <lastBuildDate>Tue, 31 Mar 2026 18:53:37 +0200</lastBuildDate>
    <generator>Jekyll v3.10.0</generator>
    
      <item>
        <title>Powersort&apos;s Pursuit</title>
        <description>&lt;script type=&quot;text/x-mathjax-config&quot;&gt;
	var font = &quot;Neo-Euler&quot;;
	MathJax.Hub.Config({
		tex2jax: {
			inlineMath: [[&apos;$&apos;,&apos;$&apos;]],
			displayMath: [[&apos;\\[&apos;,&apos;\\]&apos;]],
			processEscapes: true,
		},
		&quot;SVG&quot;:{ 
			font:font
		},
		&quot;HTML-CSS&quot;: {
			webFont: font,
			imageFont: font,
			preferredFont: font,
			availableFonts: [],
			scale: 85,
			mtextFontInherit: true
		}
	});
&lt;/script&gt;

&lt;script type=&quot;text/javascript&quot; src=&quot;https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML&quot;&gt;&lt;/script&gt;

&lt;p&gt;My colleague &lt;a href=&quot;https://pcwww.liv.ac.uk/~tonymcc/&quot;&gt;Tony McCabe&lt;/a&gt; put a &lt;a href=&quot;https://pgb.liv.ac.uk/~tony/mgame/index.php&quot;&gt;game implementation of merging policies&lt;/a&gt;
together, where you can try to find the order of merges with minimal total merge cost.&lt;/p&gt;

&lt;p&gt;Below are a few example inputs to try.
You have to drag&amp;amp;drop one run over one of its neighbors to merge them; this costs you the sum of their lengths.
The goal is to merge up all runs with minimal total cost.&lt;/p&gt;

&lt;!-- 
&lt;div class=&quot;control_div&quot;&gt;
  &lt;form action=&quot;https://pgb.liv.ac.uk/~tony/mgame/index.php&quot; method=&quot;post&quot;&gt;
  &lt;label for=&quot;values&quot;&gt;Run lengths: &lt;/label&gt;
  &lt;input type=&quot;text&quot; name=&quot;values&quot; size = 55 value=&quot;8 2 2 2 3 1&quot;&gt;
  &lt;input type=&quot;submit&quot; value = &quot;Play!&quot; style=&quot;height:40px&quot;&gt;
  &lt;/form&gt;
&lt;/div&gt; 
--&gt;

&lt;!-- 
[2 3 4 5](https://pgb.liv.ac.uk/~tony/mgame/index.php?values=2 3 4 5) 
--&gt;

&lt;div style=&quot;margin:3ex;&quot;&gt;
  &lt;form action=&quot;javascript:loadGame(&apos;&apos;)&quot; method=&quot;post&quot;&gt;
  &lt;label for=&quot;values&quot;&gt;Run lengths:&amp;nbsp;&amp;nbsp;&lt;/label&gt;
  &lt;input type=&quot;text&quot; name=&quot;values&quot; size=&quot;55&quot; value=&quot;8 2 2 2 3 1&quot; /&gt;
  &lt;input style=&quot;margin: 1ex;&quot; type=&quot;submit&quot; value=&quot;Load input&quot; /&gt;
  &lt;/form&gt;
&lt;/div&gt;

&lt;p&gt;&lt;button onclick=&quot;loadGame(&apos;4 2 2 1 3 3&apos;)&quot;&gt;LEGO input: 4 2 2 1 3 3&lt;/button&gt;
 
&lt;button onclick=&quot;loadGame(&apos;8 2 2 2 3 1&apos;)&quot;&gt;Example from talk: 8 2 2 2 3 1&lt;/button&gt;
 
&lt;button onclick=&quot;loadGame(&apos;5 3 3 14 1 2&apos;)&quot;&gt;Example from ESA paper: 5 3 3 14 1 2&lt;/button&gt;
 
&lt;button onclick=&quot;loadGame(&apos;2 4 6 3 2 4 234 112 2&apos;)&quot;&gt;Extremes: 2 4 6 3 2 4 234 112 2&lt;/button&gt;&lt;/p&gt;

&lt;p&gt;&lt;button onclick=&quot;loadGame(&apos;110 10 10 10 10 15 15 15 15&apos;)&quot;&gt;Bad for Timsort: 110 10 10 10 10 15 15 15 15&lt;/button&gt;
 
&lt;button onclick=&quot;loadGame(&apos;60 19 20 9 5 25 99 28&apos;)&quot;&gt;Tricky input: 60 19 20 9 5 25 99 28&lt;/button&gt;&lt;/p&gt;

&lt;iframe id=&quot;mgame&quot; src=&quot;https://pgb.liv.ac.uk/~tony/mgame/index.php?values=2 3 4 5&quot; style=&quot;transform: scale(0.5); transform-origin: 0 0; height: 10px; width: 199%; &quot;&gt;&lt;/iframe&gt;

&lt;h3 id=&quot;connection-to-sorting&quot;&gt;Connection to sorting&lt;/h3&gt;

&lt;p&gt;The game captures exactly the &lt;a href=&quot;https://speakerdeck.com/sebawild/quicksort-timsort-powersort?slide=62&quot;&gt;optimization problem that Timsort and Powersort face&lt;/a&gt;: The boxes are existing sorted runs in the data that we have to merge in pairs until we eventually have a single sorted run. The cost of merging is (slightly simplistically) set to the size of the output. For a &lt;em&gt;stable&lt;/em&gt; sort, we can only merge adjacent runs.&lt;/p&gt;
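&lt;p&gt;To check your score in the game, the minimal total cost can be computed by brute force over all ways to merge adjacent runs. The following is a minimal sketch using an interval dynamic program; the function name &lt;code&gt;min_merge_cost&lt;/code&gt; is made up for illustration, and this is not how Timsort or Powersort choose their merges (it takes cubic time in the number of runs).&lt;/p&gt;

```python
from functools import lru_cache
from itertools import accumulate


def min_merge_cost(runs):
    """Minimal total cost of merging adjacent runs into a single run.

    Merging two neighboring runs costs the sum of their lengths,
    exactly as in the game (hypothetical helper, for illustration only).
    """
    prefix = [0, *accumulate(runs)]  # prefix[i] = total length of runs[:i]

    @lru_cache(maxsize=None)
    def best(i, j):
        # Cheapest way to merge runs[i:j] into one run.
        if j - i == 1:
            return 0  # a single run needs no merge
        final_merge = prefix[j] - prefix[i]  # the last merge outputs all of runs[i:j]
        return final_merge + min(best(i, k) + best(k, j) for k in range(i + 1, j))

    return best(0, len(runs)) if runs else 0


print(min_merge_cost([2, 3, 4, 5]))  # 28
```

&lt;p&gt;For the initial input 2 3 4 5, first merging 2 with 3 and 4 with 5 costs 5 + 9, and the final merge costs 14, which gives the total of 28.&lt;/p&gt;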

&lt;script&gt;
  var frame = document.getElementById(&apos;mgame&apos;);
  var input = document.querySelector(&apos;input[name=&quot;values&quot;]&apos;);

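  function loadGame(values) {
    if (values === &apos;&apos;) {
      values = input.value;
    }
    // Percent-encode the run lengths (they contain spaces) before adding them to the URL.
    var newSrc = &apos;https://pgb.liv.ac.uk/~tony/mgame/index.php?values=&apos; + encodeURIComponent(values);
    input.value = values;
    frame.src = newSrc;
  }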

  function resizeIframe() {
    var frameWidth = frame.offsetWidth;
    var desiredHeight = frameWidth * 0.61;
    frame.style.height = desiredHeight + &apos;px&apos;;
  }

  window.addEventListener(&apos;resize&apos;, resizeIframe);
  resizeIframe(); // Call the function initially to set the height
&lt;/script&gt;

</description>
        <pubDate>Sun, 16 Jul 2023 00:00:00 +0200</pubDate>
        <link>https://www.wild-inter.net/posts/powersort-game</link>
        <guid isPermaLink="true">https://www.wild-inter.net/posts/powersort-game</guid>
        
        
        <category>algorithms</category>
        
        <category>programming</category>
        
      </item>
    
      <item>
        <title>My PyCon US 2023 Powersort Talk</title>
        <description>&lt;p&gt;I presented Powersort and its story at &lt;a href=&quot;https://us.pycon.org/2023/&quot;&gt;&lt;em&gt;PyCon US 2023&lt;/em&gt;&lt;/a&gt;, the largest Python community conference.
Here are some resources and impressions from the last couple of days here in Salt Lake City.&lt;/p&gt;

&lt;figure&gt;
&lt;a href=&quot;/assets/salt-lake-city-ensign-peak.jpg&quot;&gt;&lt;img src=&quot;/assets/salt-lake-city-ensign-peak.jpg&quot; /&gt;&lt;/a&gt; 
&lt;figcaption&gt;
View of downtown Salt Lake City (with the conference venue!) and Utah State Capitol, from Ensign Peak.
&lt;/figcaption&gt;
&lt;/figure&gt;

&lt;h2 id=&quot;resources&quot;&gt;Resources&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.youtube.com/watch?v=QtG858LRQI0&amp;amp;list=PL2Uw4_HvXqvY2zhJ9AMUa_Z6dtMGF3gtb&amp;amp;index=94&quot;&gt;&lt;i class=&quot;fab fa-youtube&quot;&gt;&lt;/i&gt; Official Talk recording&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.youtube.com/watch?v=XjOnY-OLAPc&quot;&gt;&lt;i class=&quot;fab fa-youtube&quot;&gt;&lt;/i&gt; My reupload with the antiphase audio fixed&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://speakerdeck.com/sebawild/quicksort-timsort-powersort&quot;&gt;&lt;i class=&quot;fas fa-file&quot;&gt;&lt;/i&gt; Talk slides&lt;/a&gt;&lt;br /&gt;
(Speakerdeck seems to have an issue with my fancy transparency patterns, but you can &lt;a href=&quot;https://files.speakerdeck.com/presentations/2812be56dd6b48909bbbe8c766a55189/Quicksort__Timsort__Powersort-600-dpi.pdf&quot;&gt;&lt;i class=&quot;fas fa-file-pdf&quot;&gt;&lt;/i&gt; download the pdf&lt;/a&gt; there.)&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://tiny.cc/timsort&quot;&gt;Colab notebook&lt;/a&gt; with the (educational) implementations of Timsort and Powersort&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://tiny.cc/sort-lake-city&quot;&gt;How-to for donating data&lt;/a&gt; to the Adaptive Sorting Benchmark&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here’s the blurb from the talk submission:&lt;/p&gt;

&lt;blockquote&gt;

  &lt;p&gt;&lt;strong&gt;&lt;a href=&quot;https://us.pycon.org/2023/schedule/presentation/50/&quot;&gt;Quicksort, Timsort, Powersort - Algorithmic ideas, engineering tricks, and trivia behind CPython’s new sorting algorithm&lt;/a&gt;&lt;/strong&gt;&lt;br /&gt;
Writing a sorting function is easy - coding a fast and reliable reference implementation less so.
In this talk, I tell the story behind CPython’s &lt;a href=&quot;https://github.com/python/cpython/issues/78742&quot;&gt;latest updates&lt;/a&gt; of the list sort function.&lt;/p&gt;

  &lt;p&gt;&lt;strong&gt;Aims:&lt;/strong&gt; entertain people with twists of history and algorithmic puzzles, which tell a lovely story of how a seemingly useless piece of theory led to the fastest and most elegant solution of a practical challenge.&lt;/p&gt;

  &lt;p&gt;&lt;strong&gt;Target audience:&lt;/strong&gt; geeks believing in the power of solid algorithmic thinking; 
programmers interested in engineering performance-critical code; all Python enthusiasts curious about what makes (sorting lists in) Python fast.&lt;/p&gt;

  &lt;p&gt;&lt;strong&gt;Content&lt;/strong&gt;:
After using Quicksort for a long while, Tim Peters invented &lt;em&gt;Timsort&lt;/em&gt;, a clever Mergesort variant, for the CPython reference implementation of Python.  Timsort is both effective in Python and a popular export product: it is used in many languages and frameworks, notably OpenJDK, the Android runtime, and the V8 JavaScript engine.&lt;/p&gt;

  &lt;p&gt;Despite this success, algorithms researchers eventually pinpointed two flaws in Timsort’s underlying algorithm:
The first could lead to a &lt;a href=&quot;http://www.envisage-project.eu/proving-android-java-and-python-sorting-algorithm-is-broken-and-how-to-fix-it/&quot;&gt;stack overflow in CPython&lt;/a&gt; (and Java); although it has meanwhile been fixed, it is curious that 10 years of widespread use didn’t bring it to the surface.
The second flaw is related to &lt;em&gt;performance&lt;/em&gt;: the order in which detected sorted segments, the “runs” in the input, are merged, can be &lt;a href=&quot;https://arxiv.org/abs/1801.04641&quot;&gt;50% more costly&lt;/a&gt; than necessary.  Based on ideas from the little known puzzle of optimal alphabetic trees, the &lt;a href=&quot;https://arxiv.org/abs/1805.04154&quot;&gt;&lt;em&gt;Powersort&lt;/em&gt; merge policy&lt;/a&gt; finds nearly optimal merging orders with negligible overhead, and is now (Python 3.11.0) part of the CPython implementation.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2 id=&quot;impressions&quot;&gt;Impressions&lt;/h2&gt;

&lt;p&gt;PyCon US is huge; there were 2200 attendees on site, plus over 400 online participants;
the industrial sponsors alone donated over $1 million (!) towards the event.
(Cheers to JetBrains for bringing proper coffee to the US and giving it away for free 
(cappuccinos throughout the conference, yay ☕), and AWS &amp;amp; Superblocks for inviting the whole conference to their party 🍺).&lt;/p&gt;

&lt;p&gt;Talk topics were very mixed and broad; some talks presented technical details on
changes to the CPython implementation 
(like the two &lt;a href=&quot;https://us.pycon.org/2023/schedule/presentation/73/&quot;&gt;talks by Mark&lt;/a&gt; and 
&lt;a href=&quot;https://us.pycon.org/2023/schedule/presentation/6/&quot;&gt;Brandt&lt;/a&gt; from the &lt;em&gt;Faster CPython&lt;/em&gt; initiative,
or the flurry of talks on PyScript, the attempts to make Python run in the browser(!)), 
whereas others provided an overview of an area or reported on particular projects
(such as &lt;a href=&quot;https://us.pycon.org/2023/schedule/presentation/142/&quot;&gt;games on the micro:bit&lt;/a&gt; 
or &lt;a href=&quot;https://us.pycon.org/2023/schedule/presentation/141/&quot;&gt;algorithmic embroidery&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;I was also deeply impressed by how much PyCon has going on in terms of community building
at the conference (and outside of it), and how much personal appreciation
people showed for each other.
It is a remarkable achievement to not only get the technical aspects of the language right,
but also the community around it. While one first needs a thing to gather around 
before a community can evolve, it makes me wonder whether a healthy community indeed 
doesn’t follow from, but &lt;em&gt;causes&lt;/em&gt;, the success of Python.&lt;/p&gt;

&lt;p&gt;I was very happy that &lt;em&gt;Guido van Rossum&lt;/em&gt; made it to my talk.
Of course, I couldn’t let the opportunity pass to interview him
on sorting in CPython afterwards:&lt;/p&gt;
&lt;blockquote&gt;
  &lt;p&gt;“I don’t remember the exact reasoning, but the mere fact that we used &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;qsort&lt;/code&gt; 
for sorting lists shows you that I didn’t care much about sorting.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Fair enough 😉&lt;br /&gt;
Had that been any different, who knows whether I would have had an excuse to enjoy PyCon US today?&lt;/p&gt;
</description>
        <pubDate>Sat, 22 Apr 2023 00:00:00 +0200</pubDate>
        <link>https://www.wild-inter.net/posts/powersort-pycon-talk</link>
        <guid isPermaLink="true">https://www.wild-inter.net/posts/powersort-pycon-talk</guid>
        
        
        <category>algorithms</category>
        
        <category>programming</category>
        
      </item>
    
      <item>
        <title>How to contribute your inputs to the Adaptive Sorting Benchmark</title>
        <description>&lt;p&gt;If you consider contributing sorted lists from your own Python application to our
benchmark for adaptive sorting, the steps below show you how to do collect this data.
Note: Our instrumentation stores a list of integers with equivalent comparison-behavior
to all lists sorted when running Python code through our custom CPython.&lt;/p&gt;

&lt;h3 id=&quot;background&quot;&gt;Background&lt;/h3&gt;

&lt;p&gt;The goal of the benchmark is to collect real-world data from Python applications
to better understand the effectiveness of adaptive features in the list sort functions.
In my &lt;a href=&quot;/posts/powersort-pycon-talk&quot;&gt;PyCon US 2023 talk&lt;/a&gt;, I reached out to Pythonistas to contribute their sorting inputs.
If the lists being sorted were completely random data, we would never see (significant) improvements
from these features, but real data is rarely completely random.&lt;br /&gt;
How much pre-sortedness is there in your use case?  Let’s find out!&lt;/p&gt;

&lt;h3 id=&quot;step-1-build-instrumented-cpython&quot;&gt;Step 1: Build instrumented CPython&lt;/h3&gt;

&lt;p&gt;Clone the instrumented branch of CPython; currently we have support for 3.11 or 3.10.
(If you dearly need another version, &lt;a href=&quot;mailto:sebawild@gmail.com&quot;&gt;drop me a line&lt;/a&gt; and we can add it.)&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;git clone https://github.com/sebawild/cpython --branch 3.11-instrumented --single-branch cpython-sorting
cd cpython-sorting
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The steps below assume Linux and a working development environment 
(check the &lt;a href=&quot;https://devguide.python.org/getting-started/setup-building/#setup&quot;&gt;official instructions&lt;/a&gt;).
For a core installation only standard C build tools are needed, plus OpenSSL headers.
(On Ubuntu, you get the latter via &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sudo apt-get install libssl-dev&lt;/code&gt;).&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;./configure --enable-optimizations &amp;amp;&amp;amp; make -j
make test 
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h3 id=&quot;step-2-set-up-your-project&quot;&gt;Step 2: Set up your project&lt;/h3&gt;

&lt;p&gt;First, we create a venv (a virtual environment to keep installed packages local).
Inside &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;cpython-sorting&lt;/code&gt;, call&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;./python -m venv sorting-python
source sorting-python/bin/activate
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;to create and activate the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sorting-python&lt;/code&gt; venv.
Now you can use pip in the usual way to install any needed packages.&lt;/p&gt;

&lt;h3 id=&quot;step-3-run-your-application-and-submit-arraystxt&quot;&gt;Step 3: Run your application and submit &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;arrays.txt&lt;/code&gt;&lt;/h3&gt;

&lt;p&gt;You run your application as normal: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;python your-awesome-script.py&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;To collect the benchmark data, first delete &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;arrays.txt&lt;/code&gt; (results are otherwise appended)
and run your application.
Then store &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;arrays.txt&lt;/code&gt; and &lt;a href=&quot;mailto:sebawild@gmail.com&quot;&gt;send it over&lt;/a&gt;,
with a quick description of your application.&lt;/p&gt;

&lt;p&gt;Afterwards &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;arrays.txt&lt;/code&gt; will contain all sorted lists (and some stats).
Note that even during the process of starting python, a few dozen calls to list sort
are made (mostly on tiny lists); for the benchmark, we are mostly interested in
big lists.&lt;/p&gt;

&lt;p&gt;A rudimentary script to read an &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;arrays.txt&lt;/code&gt; file and compute some presortedness
metrics is implemented in &lt;a href=&quot;https://github.com/sebawild/cpython/blob/3.11-instrumented/run_information.py&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;run-information.py&lt;/code&gt;&lt;/a&gt;.
Simply running &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;python run-information.py&lt;/code&gt; (in the same folder) will
print stats on the longest sorted list (by default).
This is sufficient to check whether your application sorted any substantially long lists at all.
If so, please send your &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;arrays.txt&lt;/code&gt; to me.&lt;/p&gt;
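&lt;p&gt;To get a feeling for what pre-sortedness means before collecting any data, one simple measure is the number and length of the sorted runs in a list. The following is a minimal sketch under stated assumptions: &lt;code&gt;run_lengths&lt;/code&gt; is a made-up helper, not part of the instrumentation or of the script mentioned above, and it only considers non-decreasing runs, whereas the list sort in CPython also exploits strictly descending runs.&lt;/p&gt;

```python
def run_lengths(data):
    """Lengths of the maximal non-decreasing runs in data (hypothetical helper)."""
    if not data:
        return []
    lengths = [1]
    for prev, cur in zip(data, data[1:]):
        if prev > cur:
            lengths.append(1)  # descent: a new run starts at cur
        else:
            lengths[-1] += 1   # cur extends the current run
    return lengths


print(run_lengths([1, 2, 3, 2, 5, 1]))  # [3, 2, 1]
```

&lt;p&gt;Few long runs indicate a lot of existing order; a random permutation typically has many short runs.&lt;/p&gt;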

&lt;h4 id=&quot;limitations&quot;&gt;Limitations&lt;/h4&gt;

&lt;p&gt;The instrumentation is a quick hack at this point, not production-ready code.
It is hence best to run code via our python in a sandbox environment.&lt;/p&gt;

&lt;p&gt;Known limitations:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;The output &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;arrays.txt&lt;/code&gt; is appended to each time you run &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;python&lt;/code&gt;, so it can grow large.&lt;/li&gt;
  &lt;li&gt;Our instrumentation is not ready for multi-threading.
The instrumentation may crash python in obscure scenarios such as comparison functions
that modify the sorted list.&lt;/li&gt;
&lt;/ul&gt;
</description>
        <pubDate>Fri, 21 Apr 2023 00:00:00 +0200</pubDate>
        <link>https://www.wild-inter.net/posts/pycon23-sorting-benchmark</link>
        <guid isPermaLink="true">https://www.wild-inter.net/posts/pycon23-sorting-benchmark</guid>
        
        
        <category>algorithms</category>
        
        <category>programming</category>
        
      </item>
    
      <item>
        <title>Build CPython from source and install packages</title>
        <description>&lt;p&gt;For experimenting with novel CPython features, you can quickly set up an isolated environment.
This post shows you how to do that.&lt;/p&gt;

&lt;p&gt;I did this on Ubuntu 20.04 LTS
with standard build tools installed, but the same instructions probably
work more generally.&lt;/p&gt;

&lt;h2 id=&quot;compile-python&quot;&gt;Compile python&lt;/h2&gt;

&lt;p&gt;Download latest CPython sources&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;git clone git@github.com:python/cpython.git
cd cpython
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Change to a stable branch instead of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;main&lt;/code&gt; (so that we don’t have to build all libraries from source); here we’re using &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;3.11&lt;/code&gt;, the latest stable branch:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;git checkout 3.11
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;To run the build, use the following (standard) commands.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;./configure --enable-optimizations
make
make test
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--enable-optimizations&lt;/code&gt; first builds an instrumented interpreter, runs a demo workload, and then compiles again using the compiler options deemed best for that profile (profile-guided optimization).&lt;/p&gt;
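&lt;p&gt;If you later want to check which options a given build was configured with, the standard &lt;code&gt;sysconfig&lt;/code&gt; module can report them. This is a small sketch, assuming a Makefile-based build as on Linux; the helper name &lt;code&gt;configure_args&lt;/code&gt; is made up, and the value can be missing on other platforms.&lt;/p&gt;

```python
import sysconfig


def configure_args():
    """Options recorded from ./configure for the running interpreter, or None."""
    return sysconfig.get_config_var("CONFIG_ARGS")


args = configure_args()
if args and "--enable-optimizations" in args:
    print("built with --enable-optimizations")
else:
    print("no --enable-optimizations recorded for this build")
```

&lt;p&gt;Run it with the freshly built interpreter, for example via &lt;code&gt;./python&lt;/code&gt; from the source folder.&lt;/p&gt;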

&lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;make test&lt;/code&gt; may not be necessary, but is probably not a bad idea.
For me, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;test_ssl&lt;/code&gt; fails, but I’ll ignore that for now.&lt;/p&gt;

&lt;p&gt;Note: If you want several builds to compare, you need to have a full copy of the source (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;cpython&lt;/code&gt; root) folder; you can build in a subfolder, but that doesn’t change that all &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;python&lt;/code&gt;s share the Lib folder and hence only the latest compile works correctly.
This seems to remain the case even with a venv that isolates the installed packages. You cannot run a Python version if you change the git checkout to a different version; the build still uses the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Lib&lt;/code&gt; subfolder from the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;cpython&lt;/code&gt; repo.&lt;/p&gt;

&lt;h2 id=&quot;pip-bootstrap&quot;&gt;pip bootstrap&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The better option: Create a virtual environment, &lt;a href=&quot;#create-a-venv&quot;&gt;see below&lt;/a&gt;.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;So far, the compilation generated a naked &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;python&lt;/code&gt; executable that is just the Python interpreter.
For almost anything interesting, we will have to install packages, and the most convenient way for that is &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pip&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Python already comes with a bootstrap module to do that (https://pip.pypa.io/en/stable/installation/):&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;./python -m ensurepip --upgrade
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;That’s it! Now you can run&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;./python -m pip install numpy pandas
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;etc. to install packages.
These all get installed into the system-wide folder,
as&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;./python -m pip show pandas
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;reveals.&lt;/p&gt;

&lt;h3 id=&quot;create-a-venv&quot;&gt;Create a venv&lt;/h3&gt;

&lt;p&gt;A &lt;a href=&quot;https://packaging.python.org/en/latest/tutorials/installing-packages/#creating-virtual-environments&quot;&gt;virtual environment&lt;/a&gt; is a folder with everything Python needs, isolated from other installations.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;./python -m venv my-python
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;generates a virtual environment in the subfolder &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;my-python&lt;/code&gt;, and&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;source my-python/bin/activate
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;makes this the &lt;em&gt;active&lt;/em&gt; venv for the current running shell.
Check &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;python --version&lt;/code&gt; to see if it worked.&lt;/p&gt;
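&lt;p&gt;Besides the version, you can also ask the interpreter itself whether it runs inside a venv: there, &lt;code&gt;sys.prefix&lt;/code&gt; points to the venv folder, while &lt;code&gt;sys.base_prefix&lt;/code&gt; points to the installation the venv was created from. A minimal sketch (the helper name &lt;code&gt;in_venv&lt;/code&gt; is made up for illustration):&lt;/p&gt;

```python
import sys


def in_venv():
    """True if the running interpreter belongs to a virtual environment."""
    return sys.prefix != sys.base_prefix


print(sys.version.split()[0], "inside a venv" if in_venv() else "not inside a venv")
print("prefix:", sys.prefix)
```

&lt;p&gt;After activating, the printed prefix should end in the venv folder.&lt;/p&gt;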

&lt;p&gt;From now on, you can use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;python&lt;/code&gt; instead of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;./python&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pip&lt;/code&gt; directly instead of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;./python -m pip&lt;/code&gt; etc.&lt;/p&gt;

&lt;p&gt;Moreover, a call to&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;python -m pip show pandas
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;reveals that these are now local to your project (the venv &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;my-python&lt;/code&gt; really), and that is much better isolation.&lt;/p&gt;

&lt;h3 id=&quot;why-a-stable-branch&quot;&gt;Why a stable branch?&lt;/h3&gt;

&lt;p&gt;CPython is reasonably easy and quick to compile, so why not simply work with the current &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;main&lt;/code&gt; branch?
The main reason (no pun intended) is to easily be able to install any Python packages with pip without much hassle.
For major releases (like 3.11), PyPI has precompiled “wheels” of many popular packages, so installing them is very quick and does not require their build dependencies.&lt;/p&gt;

&lt;p&gt;Since Python version jumps often affect the C API, many libraries also lag a bit behind CPython main and will not easily be usable with the development branch.&lt;/p&gt;
</description>
        <pubDate>Tue, 07 Feb 2023 00:00:00 +0100</pubDate>
        <link>https://www.wild-inter.net/posts/python-virtualenv</link>
        <guid isPermaLink="true">https://www.wild-inter.net/posts/python-virtualenv</guid>
        
        
        <category>algorithms</category>
        
        <category>teaching</category>
        
        <category>web</category>
        
      </item>
    
      <item>
        <title>Powersort in official Python 3.11 release</title>
        <description>&lt;p&gt;&lt;img src=&quot;/assets/powersort.jpg&quot; style=&quot;   float: right;   width:16em; opacity:1;    /* background-color: #0f00; */   /* padding: 0em; */   margin-left: 0.5em;   /* border: 0px; */ &quot; /&gt; 
Our &lt;a href=&quot;https://en.wikipedia.org/wiki/Powersort&quot;&gt;sorting method &lt;em&gt;Powersort&lt;/em&gt;&lt;/a&gt; is used as the default &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;list.sort()&lt;/code&gt; algorithm in CPython, the reference implementation of the 
Python programming language.&lt;/p&gt;

&lt;div class=&quot;digression&quot; style=&quot;font-size: 100%;&quot;&gt;
  &lt;p&gt;&lt;i class=&quot;fab fa-python&quot;&gt;&lt;/i&gt;  &lt;strong&gt;&lt;em&gt;Join the &lt;a href=&quot;https://powersort-competition.github.io/PowersortCompetitionWebsite/#/&quot;&gt;Powersort Competition&lt;/a&gt;&lt;/em&gt;&lt;/strong&gt; &lt;i class=&quot;fab fa-python&quot;&gt;&lt;/i&gt;&lt;br /&gt;
Help us study Timsort and Powersort and win substantial prizes!&lt;/p&gt;
&lt;/div&gt;

&lt;p&gt;See my &lt;a href=&quot;/posts/powersort-pycon-talk&quot;&gt;PyCon US talk&lt;/a&gt; for the full story.&lt;br /&gt;
Here’s the entry from the official &lt;a href=&quot;https://docs.python.org/release/3.11.0/whatsnew/changelog.html#id100&quot;&gt;Python changelog&lt;/a&gt;:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;&lt;a href=&quot;https://bugs.python.org/issue?@action=redirect&amp;amp;bpo=34561&quot;&gt;bpo-34561&lt;/a&gt;: 
  &lt;strong&gt;List sorting now uses the merge-ordering strategy from Munro and Wild’s &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;powersort()&lt;/code&gt;.&lt;/strong&gt; Unlike the former strategy, this is provably near-optimal in the entropy of the distribution of run lengths. Most uses of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;list.sort()&lt;/code&gt; probably won’t see a significant time difference, but may see significant improvements in cases where the former strategy was exceptionally poor. However, as these are all fast linear-time approximations to a problem that’s inherently at best quadratic-time to solve truly optimally, it’s also possible to contrive cases where the former strategy did better.&lt;/p&gt;
&lt;/blockquote&gt;
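&lt;p&gt;The “entropy of the distribution of run lengths” mentioned in the changelog entry can be made concrete: for runs of lengths l_1, ..., l_r with total length n, it is the Shannon entropy of the probabilities l_i / n. The following is a minimal sketch, assuming all run lengths are positive; the function name &lt;code&gt;run_length_entropy&lt;/code&gt; is made up for illustration and does not appear in CPython.&lt;/p&gt;

```python
from math import log2


def run_length_entropy(runs):
    """Shannon entropy (in bits) of the distribution l_i / n of the run lengths."""
    n = sum(runs)
    return sum((length / n) * log2(n / length) for length in runs)


print(run_length_entropy([4, 4]))        # 1.0
print(run_length_entropy([1, 1, 1, 1]))  # 2.0
print(run_length_entropy([8]))           # 0.0
```

&lt;p&gt;A single long run has entropy 0 (nothing to merge), whereas many equally long runs maximize the entropy.&lt;/p&gt;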

&lt;p&gt;The change had been included in the development version of CPython,
but with the official release of &lt;a href=&quot;https://www.python.org/downloads/release/python-3110/&quot;&gt;Python 3.11&lt;/a&gt;, Powersort is now en route to be deployed to hundreds of millions of devices, on top of already being in active use in &lt;a href=&quot;https://foss.heptapod.net/pypy/pypy/-/blob/branch/default/rpython/rlib/listsort.py&quot;&gt;PyPy&lt;/a&gt;.&lt;/p&gt;

&lt;h3 id=&quot;update-june-2025&quot;&gt;Update (June 2025)&lt;/h3&gt;

&lt;p&gt;Powersort has also been adopted for &lt;a href=&quot;https://github.com/numpy/numpy/pull/29208&quot;&gt;numpy&lt;/a&gt;, replacing the former Timsort implementation.&lt;/p&gt;

&lt;p&gt;The &lt;a href=&quot;https://powersort-competition.github.io/PowersortCompetitionWebsite/#/&quot;&gt;&lt;em&gt;University of Liverpool Powersort Competition&lt;/em&gt;&lt;/a&gt; is also still underway, with lots of prizes up for grabs!&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/assets/strong-arm-outline.svg&quot; style=&quot;   float: left;   width:6em; opacity:1;    background-color: #0f00;   padding: 0em;   margin-right: 0.5em;   border: 0px; &quot; /&gt; 
Powersort is explained in my &lt;a href=&quot;/posts/powersort-pycon-talk&quot;&gt;PyCon US 2023 talk&lt;/a&gt; 
(in my biased opinion in a much clearer way than in 
our &lt;a href=&quot;/publications/munro-wild-2018&quot;&gt;original publication&lt;/a&gt; 😅).
More context is given in my &lt;a href=&quot;/teaching/comp526/&quot;&gt;&lt;em&gt;Efficient Algorithms&lt;/em&gt;&lt;/a&gt; module in the unit on sorting,
which has an &lt;a href=&quot;https://youtu.be/EzrPdDMaxMI&quot;&gt;intro to adaptive sorting (34min)&lt;/a&gt; and then covers &lt;a href=&quot;https://youtu.be/exbuZQpWkQ0&quot;&gt;Powersort itself (15min)&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;We showed how to extend Powersort to &lt;em&gt;multiway merges&lt;/em&gt;, looking &lt;a href=&quot;https://www.wild-inter.net/publications/multiway-powersort&quot;&gt;very promising in first experiments&lt;/a&gt;.&lt;/p&gt;

&lt;h3 id=&quot;coverage&quot;&gt;Coverage&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://technews.acm.org/archives.cfm?fo=2022-12-dec/dec-14-2022.html#1230749&quot;&gt;ACM TechNews&lt;/a&gt; (2022-12-14)&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://news.liverpool.ac.uk/2022/12/12/liverpool-computer-scientists-improve-python-sorting-function/&quot;&gt;University of Liverpool News story&lt;/a&gt; (2022-12-12)&lt;br /&gt;
also shared as a &lt;a href=&quot;https://www.linkedin.com/posts/university-of-liverpool_liverpool-computer-scientists-improve-python-activity-7009115579949195264-3uIc&quot;&gt;LinkedIn post&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://techxplore.com/news/2022-12-scientists-python-function.html&quot;&gt;TechXPlore&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.londondaily.news/liverpool-computer-scientists-improve-python-sorting-function/&quot;&gt;London Daily News&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;!--
Further coverage:

 * https://www.i-programmer.info/news/216-python/15954-python-now-uses-powersort.html

--&gt;
</description>
        <pubDate>Mon, 24 Oct 2022 00:00:00 +0200</pubDate>
        <link>https://www.wild-inter.net/posts/powersort-in-python-3.11</link>
        <guid isPermaLink="true">https://www.wild-inter.net/posts/powersort-in-python-3.11</guid>
        
        
        <category>algorithms</category>
        
        <category>teaching</category>
        
        <category>web</category>
        
      </item>
    
      <item>
        <title>Amortized analysis of resizing-array stacks</title>
        <description>&lt;p&gt;A rigorous proof that a stack implemented with doubling arrays has constant amortized time operations;
written up here since it does not seem to appear in any of the standard algorithms books.&lt;/p&gt;

&lt;script type=&quot;text/x-mathjax-config&quot;&gt;
	var font = &quot;Neo-Euler&quot;;
	MathJax.Hub.Config({
		tex2jax: {
			inlineMath: [[&apos;$&apos;,&apos;$&apos;]],
			displayMath: [[&apos;\\[&apos;,&apos;\\]&apos;]],
			processEscapes: true,
		},
		&quot;SVG&quot;:{ 
			font:font
		},
		&quot;HTML-CSS&quot;: {
			webFont: font,
			imageFont: font,
			preferredFont: font,
			availableFonts: [],
			scale: 85,
			mtextFontInherit: true
		}
	});
&lt;/script&gt;

&lt;script type=&quot;text/javascript&quot; src=&quot;https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML&quot;&gt;&lt;/script&gt;

&lt;p&gt;A well-known, fundamental data structure is the implementation of a stack 
using resizing arrays (a.k.a. doubling arrays), where we maintain an array of
$C$ items for the $n$ elements of a stack, and whenever the array becomes full, 
we double its size, and whenever the array becomes less than one quarter full,
we halve its size.
This maintains the invariant that $\frac14 C \le n \le C$.&lt;/p&gt;

&lt;p&gt;A folklore analysis shows that this achieves constant amortized cost for all
stack operations, despite the occasional expensive resizing operations.&lt;/p&gt;

&lt;p&gt;This analysis is not a particularly hard or surprising proof by any means, but it makes a great first nontrivial example
of amortized analysis, and hence I wanted to show it in my &lt;a href=&quot;/teaching/comp526&quot;&gt;&lt;em&gt;Efficient Algorithms (COMP526)&lt;/em&gt;&lt;/a&gt; lectures;
see &lt;a href=&quot;/teaching/comp526/02-fundamental-ds&quot;&gt;&lt;em&gt;Unit 2 – Fundamental Data Structures&lt;/em&gt;&lt;/a&gt; for the full context.&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;The goal is to show that while any individual push/pop in a resizing-array based stack might be expensive ($\Omega(n)$ cost), any &lt;em&gt;sequence&lt;/em&gt; of operations is necessarily much cheaper, namely $O(1)$ time per operation &lt;strong&gt;on average&lt;/strong&gt;.
As the dominant operation, we count array accesses, i.e., any read or write access to an array.&lt;/p&gt;

&lt;h2 id=&quot;part-1-amortized-costs-for-all-operations&quot;&gt;Part 1: Amortized costs for all operations&lt;/h2&gt;
&lt;p&gt;Basically, each operation has two types of costs for the amortized analysis: &lt;strong&gt;actual costs&lt;/strong&gt; (# array accesses) and a &lt;strong&gt;change in potential/credits&lt;/strong&gt;.
We define the potential $\Phi = \min\lbrace n-\frac14C,\;C-n\rbrace$,
and the amortized cost $a_i$ of an operation is the actual cost plus $-4$ times the change in potential.
The intuition behind $\Phi$ is to measure the distance of the current filling mark $n$ from the “expensive boundaries” $\frac14C$ resp. $C$.&lt;/p&gt;

&lt;p&gt;We have to analyze both costs separately.&lt;/p&gt;

&lt;h3 id=&quot;actual-costs&quot;&gt;Actual costs:&lt;/h3&gt;
&lt;ul&gt;
  &lt;li&gt;cheap push/pop: exactly 1 array access to write/read the topmost element.&lt;/li&gt;
  &lt;li&gt;copying push: currently there are $n$ elements on the stack, these have to be read from the old array ($n$ accesses) and written to the new array ($n$ accesses); also one more element has to be added (like in cheap push). In total that is $2n+1$ actual cost.&lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;copying pop: actually exactly the same: there are $n$ elements on the stack, these have to be read from the old array ($n$ accesses) and written to the new array ($n$ accesses); also one element has to be read to be returned. In total that is $2n+1$ actual cost.&lt;/p&gt;

    &lt;p&gt;(One could avoid this very last extra read by not copying the element that we pop right after anyways; but typical implementations do not do this for convenience. It would clearly not save much either way.)&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;credits--potential-change&quot;&gt;Credits / Potential change&lt;/h3&gt;
&lt;p&gt;The credits are the &lt;em&gt;change&lt;/em&gt; in potential $\Phi = \min\lbrace n-\frac14C,\;C-n\rbrace$.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;cheap push: $n$ gets one bigger, but $C$ is unchanged. If $C-n &amp;lt; n-\frac14 C$, then $\Phi$ drops by one (“we lose one credit”).&lt;/li&gt;
  &lt;li&gt;cheap pop: $n$ gets one smaller, but $C$ is unchanged. If $n-\frac14 C&amp;lt;C-n$, then $\Phi$ drops by one (“we lose one credit”).&lt;/li&gt;
  &lt;li&gt;copying push: We must have had $n=C$ (i.e. $\Phi_{i-1}=0$) before this push, and we will now set $C=2n$ before the push. Then, the push increments $n$. That means the new potential $\Phi_i=(n+1)-\frac14\cdot2n=\frac12n+1$.  We have earned $\frac12n+1$ credits.&lt;/li&gt;
  &lt;li&gt;copying pop: We must have had $n=\frac14C$ (i.e. $\Phi_{i-1}=0$) before this pop, and we will now make $C=2n$ before the pop; the pop itself then decrements $n$. So $\Phi_i=(n-1)-\frac14\cdot 2n =  \frac12n-1$, and we have earned $\frac12n-1$ credits.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;adding-up&quot;&gt;Adding up&lt;/h3&gt;
&lt;p&gt;Adding up actual cost and $-4(\Phi_i-\Phi_{i-1})$ shows that in each case the amortized costs are at most 5.&lt;/p&gt;
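&lt;p&gt;Spelled out, as a sketch using the costs and credits listed above with $a_i = c_i - 4(\Phi_i-\Phi_{i-1})$: a cheap operation changes $\Phi$ by at most one, and for the copying push we assume $n \ge 4$, so that the minimum in $\Phi_i$ is indeed attained by $(n+1)-\frac14\cdot 2n$ (for smaller $n$ the new potential is smaller, but checking these cases by hand still gives at most 5).&lt;/p&gt;

\[\text{cheap push/pop:}\quad a_i \le 1 - 4\cdot(-1) = 5\]

\[\text{copying push:}\quad a_i = (2n+1) - 4\bigl(\tfrac12 n+1\bigr) = -3\]

\[\text{copying pop:}\quad a_i = (2n+1) - 4\bigl(\tfrac12 n-1\bigr) = 5\]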

&lt;h2 id=&quot;part-2-from-amortized-to-total-actual-costs&quot;&gt;Part 2: From amortized to total actual costs&lt;/h2&gt;

&lt;p&gt;The second part is indeed the same for &lt;em&gt;all&lt;/em&gt; amortized analyses: The total actual cost over a &lt;em&gt;sequence&lt;/em&gt; of $m$ operations is essentially bounded by the sum of their amortized costs, plus initial/final potential; this is shown using a telescoping-sum argument:&lt;/p&gt;

\[5m 
\ge \sum_{i=1}^m a_i 
= \sum_{i=1}^m c_i - 
4 \underbrace{\sum_{i=1}^m(\Phi_i - \Phi_{i-1})}_{=\Phi_m - \Phi_0}\]

&lt;p&gt;Rearranging gives&lt;/p&gt;

\[\sum_{i=1}^m c_i \le 5m + 4\Phi_m-4\Phi_0\]

&lt;p&gt;Now, we can also show using the invariant $\frac14 C \le n \le C$, i.e.,
$n\le C \le 4n$, that $0\le \Phi\le \frac35n$:
Since $\Phi$ is piecewise linear, it suffices to consider the endpoints of the 
linear segments, i.e., $C = 4n$, $C = n$ and $n-\frac14 C = C-n$, i.e., 
$C = \frac85 n$; at these points $\Phi$ has values $0$, $0$, and $\frac35 n$, respectively.&lt;/p&gt;

&lt;p&gt;Hence $\displaystyle\sum_{i=1}^m c_i \le 5m + 4\Phi_m -4\Phi_0 \le 5m + 2.4n \in \Theta(m+n)$, where $n$ is the number of elements on the stack after the $m$ operations.&lt;/p&gt;
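&lt;p&gt;The bound can also be checked empirically. Below is a minimal Python sketch (not part of the lecture material) of a resizing-array stack that counts array accesses as above and tracks $\Phi$ with exact fractions. The names &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ResizingArrayStack&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;simulate&lt;/code&gt;, the initial capacity of 1 (which gives $\Phi_0=-\frac14$ for the empty stack), and the random mix of pushes and pops are assumptions made for illustration only.&lt;/p&gt;

```python
from fractions import Fraction
import random


class ResizingArrayStack:
    """Stack on a doubling/halving array that counts array accesses."""

    def __init__(self):
        self.capacity = 1        # C (assumed starting capacity)
        self.array = [None]
        self.size = 0            # n
        self.accesses = 0        # total actual cost so far

    def _resize(self, new_capacity):
        new_array = [None] * new_capacity
        for i in range(self.size):
            new_array[i] = self.array[i]   # one read plus one write
            self.accesses += 2
        self.array = new_array
        self.capacity = new_capacity

    def push(self, item):
        if self.size == self.capacity:     # array is full: double first
            self._resize(2 * self.capacity)
        self.array[self.size] = item       # one write
        self.accesses += 1
        self.size += 1

    def pop(self):
        if self.size == 0:
            raise IndexError("pop from empty stack")
        if 4 * self.size == self.capacity:  # n = C/4 before the pop: halve first
            self._resize(self.capacity // 2)
        self.size -= 1
        item = self.array[self.size]       # one read
        self.accesses += 1
        return item

    def potential(self):
        """Phi = min(n - C/4, C - n), computed exactly."""
        n, c = self.size, self.capacity
        return min(n - Fraction(c, 4), c - n)


def simulate(num_ops, seed=0):
    """Run num_ops random operations; return (total cost, bound, max amortized cost)."""
    rng = random.Random(seed)
    stack = ResizingArrayStack()
    phi_0 = stack.potential()
    max_amortized = None
    for i in range(num_ops):
        cost_before, phi_before = stack.accesses, stack.potential()
        if stack.size == 0 or rng.random() > 0.5:
            stack.push(i)
        else:
            stack.pop()
        actual = stack.accesses - cost_before
        amortized = actual - 4 * (stack.potential() - phi_before)
        if max_amortized is None or amortized > max_amortized:
            max_amortized = amortized
    bound = 5 * num_ops + 4 * stack.potential() - 4 * phi_0
    return stack.accesses, bound, max_amortized


if __name__ == "__main__":
    total, bound, worst = simulate(100_000)
    print(f"total accesses: {total}, bound: {bound}, max amortized cost: {worst}")
```

&lt;p&gt;Running the script prints the total number of array accesses next to the bound $5m+4\Phi_m-4\Phi_0$; the former should never exceed the latter, and no single operation should have amortized cost above 5.&lt;/p&gt;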
</description>
        <pubDate>Thu, 20 Oct 2022 00:00:00 +0200</pubDate>
        <link>https://www.wild-inter.net/posts/amortized-analysis-resizing-arrays</link>
        <guid isPermaLink="true">https://www.wild-inter.net/posts/amortized-analysis-resizing-arrays</guid>
        
        
        <category>algorithms</category>
        
        <category>teaching</category>
        
      </item>
    
      <item>
        <title>Increase the number of recent folders in Thunderbird</title>
        <description>&lt;p&gt;Showing more than 15 recent folders in move-to and copy-to context menus is easy in Thunderbird 91.&lt;/p&gt;

&lt;p&gt;I’m a heavy user of many IMAP folders for organizing email (and of Günter Gersdorf’s brilliant Thunderbird extension
&lt;a href=&quot;https://www.ggbs.de/extensions/CopySent2Current.html&quot;&gt;Copy Sent to Current&lt;/a&gt;), so moving emails to folders quickly is important to me.&lt;/p&gt;

&lt;p&gt;Thunderbird has long remembered which folders were used most recently, offering to move or copy mails there in a separate menu, but the default number of folders shown there was a miserly 15.
Previously, increasing that number required a rather &lt;a href=&quot;http://forums.mozillazine.org/viewtopic.php?f=28&amp;amp;t=2710625&quot;&gt;hidden hack&lt;/a&gt;,
but in the latest version of Thunderbird (91 at the time of writing), it is easy:&lt;/p&gt;

&lt;p&gt;Simply open the “Config Editor” in the preferences, and change the key &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mail.folder_widget.max_recent&lt;/code&gt; to your preferred value; &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;40&lt;/code&gt; for me.&lt;/p&gt;
</description>
        <pubDate>Thu, 27 Jan 2022 00:00:00 +0100</pubDate>
        <link>https://www.wild-inter.net/posts/number-recent-folders-thunderbird</link>
        <guid isPermaLink="true">https://www.wild-inter.net/posts/number-recent-folders-thunderbird</guid>
        
        
        <category>web</category>
        
        <category>linux</category>
        
      </item>
    
      <item>
        <title>How to move your lecture online &amp;ndash; in little time</title>
        <description>&lt;p&gt;I describe my solution for online lecturing amid the COVID-19 crisis
using youtube livestreams and PINGO.&lt;/p&gt;

&lt;p&gt;Although I kind of saw it coming 
after reading this &lt;a href=&quot;https://medium.com/@tomaspueyo/coronavirus-act-today-or-people-will-die-f4d3d9cd99ca&quot;&gt;excellent data analysis&lt;/a&gt; 
(on March 12, before things got really crazy),
things did get hectic:
The official decision of the University of Liverpool to move all
face-to-face classes online with &lt;em&gt;immediate effect&lt;/em&gt; came
on Saturday evening (March 14), with my class due on Monday, March 16.
So what follows is not the well-thought out, technically sophisticated
and educationally up-to-date mode of online teaching I (and you) might dream of,
but it is what allowed me to deliver an (according to isolated feedback) effective 
online lecture with less than one day of prep time.&lt;/p&gt;

&lt;h2 id=&quot;where-i-started-good-old-screencasts-&quot;&gt;Where I started: Good old screencasts …&lt;/h2&gt;

&lt;p&gt;I have been recording screencasts of my lectures for &lt;a href=&quot;/teaching/comp526&quot;&gt;COMP526&lt;/a&gt;
and posting them on youtube all term.
The methodology for that (on Ubuntu) is basically still as described
&lt;a href=&quot;/teaching/advanced-algorithms/#technical-details&quot;&gt;here&lt;/a&gt;, only with an
update of my laptop (now an HP Elitebook x360) and &lt;a href=&quot;https://github.com/xournalpp/xournalpp&quot;&gt;Xournal++&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;So a reasonable mic, 
screencasting software (&lt;a href=&quot;https://www.maartenbaert.be/simplescreenrecorder/&quot;&gt;SimpleScreenRecorder&lt;/a&gt;)
and a &lt;a href=&quot;/teaching/comp526#syllabus&quot;&gt;website&lt;/a&gt; to post videos and lecture notes were already set up.&lt;/p&gt;

&lt;h2 id=&quot;-and-in-class-formative-assessments-aka-clicker-questions&quot;&gt;… and in-class formative assessments (aka clicker questions)&lt;/h2&gt;

&lt;p&gt;What I also came to like as an effective tool is an in-class 
response system to quickly ask for opinions, prior knowledge,
to recap definitions, and to test understanding.
(I have been using &lt;a href=&quot;http://trypingo.com/&quot;&gt;PINGO&lt;/a&gt; for that.)&lt;/p&gt;

&lt;p&gt;So my initial contingency plan was to record the lectures at home
and upload them. What was missing was a way to 
keep the clicker questions;
and (I know first hand how these things go)
I was afraid that without an incentive for students to keep 
on track with watching the videos, some would all too easily 
fall behind.&lt;/p&gt;

&lt;p&gt;I was determined to not let that happen (quite so easily).&lt;/p&gt;

&lt;h2 id=&quot;going-live&quot;&gt;Going live!&lt;/h2&gt;

&lt;p&gt;My solution was to use youtube livestreams for the lectures;
how to do that is explained below.
That way, we (the students and myself that is) would be seeing the same screen
(almost – more on that later) in real time, and I could simply continue 
with the clicker questions.&lt;/p&gt;

&lt;p&gt;Youtube also has a “live chat” that offers a (limited) backchannel for
students to ask questions (which quite a few did!), signal technical problems 
(none yet, luckily!), or give a quick “hands” on who is still following.&lt;/p&gt;

&lt;h3 id=&quot;quick-how-to-for-youtube-livestreams&quot;&gt;Quick how-to for youtube livestreams&lt;/h3&gt;

&lt;p&gt;(Here is &lt;a href=&quot;https://support.google.com/youtube/answer/2474026?hl=en&amp;amp;ref_topic=9257984&quot;&gt;youtube’s detailed manual&lt;/a&gt; on that;
you want the “encoder streaming”.)&lt;/p&gt;

&lt;p&gt;After signing into your youtube account, click on CREATE → Go Live (top right).
There you pick “Stream” (the middle tab at the top).
I did not change the defaults, except for setting the stream latency to “ultra low”.
In the top right, you can get a link that you can share with students even ahead of time.&lt;/p&gt;

&lt;p&gt;Now, to stream your screen content (or part of it), you need an &lt;em&gt;encoder&lt;/em&gt;.
I had good experiences with SimpleScreenRecorder, and indeed you can use it for this, too.
The screenshot below shows the settings I used; what goes into the “Save as” box
is shown in the youtube stream settings as “Stream URL” and “Stream key”;
the entry simply is &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;&amp;lt;Stream URL&amp;gt;/&amp;lt;Stream key&amp;gt;&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Selecting AAC as audio codec is vital!&lt;/strong&gt; The mp3 encoder (selected by default) does 
&lt;em&gt;not&lt;/em&gt; work with youtube, but the error messages don’t tell you that.&lt;/p&gt;

&lt;figure&gt;
	&lt;img src=&quot;/assets/simplescreenrecorder-youtube.png&quot; /&gt;
&lt;caption&gt;
	Settings for streaming to youtube.
&lt;/caption&gt;
&lt;/figure&gt;

&lt;p&gt;Then you click Continue and simply start your recording.
The youtube stream settings site should now show your screen content (with a few seconds delay).&lt;/p&gt;

&lt;p&gt;Clicking on “GO LIVE” (top right) starts the actual livestream.&lt;/p&gt;

&lt;p&gt;On my (fairly new) laptop, downscaling the 4K display to 1080p and encoding
as x264 did put considerable load on the machine, 
but I did not experience severe problems, so I did not try to play with the
encoder settings at all.
Your mileage may vary.&lt;/p&gt;

&lt;p&gt;As for all youtube videos, you can configure the visibility of the stream as “unlisted”, 
then only students with the link can view the video, but no-one can find it through search;
“public” videos appear in searches.
If you choose “private”, people have to be signed in, and I did not want to force students to do that.&lt;/p&gt;

&lt;h3 id=&quot;phone-for-backchannel&quot;&gt;Phone for backchannel&lt;/h3&gt;

&lt;p&gt;During the lecture, I kept the youtube app on my Android phone open to see
messages coming in on the live chat. 
(This is also a great way to test if your stream works.)&lt;/p&gt;

&lt;h3 id=&quot;aftermath&quot;&gt;Aftermath&lt;/h3&gt;

&lt;p&gt;For consistency, I split the recorded livestream into individual
videos for each subsection, using the youtube studio editor, but also
keep the livestream itself (as an unlisted video).
(Pretend to trim the live stream, but then instead of “SAVE” click the 
three dots and “SAVE AS NEW”).
The nicely cut videos are then linked to from my course website, e.g.,
&lt;a href=&quot;/teaching/comp526/07-compression#material&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Cutting videos takes a few extra minutes that I did not need
when locally recording in class, where I could easily start a new recording 
with one click. But it makes the recordings much easier to navigate and
use later on.&lt;/p&gt;

&lt;h2 id=&quot;first-impressions&quot;&gt;First impressions&lt;/h2&gt;

&lt;p&gt;After using this setup for 3 hours of lectures, I am overall fairly happy.
It does not cost much more preparation for me (although a bit extra time for cutting videos) 
and is clearly superior to only uploading videos.
The livestream always had well below 10s of delay, which is totally fine
for the interactive questions, and sound and video quality are excellent.&lt;/p&gt;

&lt;p&gt;Compared to (my experiences with) video conferencing solutions 
Cisco WebEx, Skype, and Microsoft Teams, the stream is clearly superior
in quality and stability, and the resulting recordings are essentially
indistinguishable from the ones I recorded in face-to-face lectures.&lt;/p&gt;

&lt;p&gt;A clear downside of my approach is the missing instant feedback 
from looking around the audience’s faces. (I usually have 30-50 students 
in class, so this was very doable.)
I used to look around for this instant feedback very frequently –
I’m lecturing facing the audience in the lecture rooms –
so there is no way to replace this with the same number of PINGO questions.&lt;/p&gt;

&lt;p&gt;Ideally, I’d like to have an additional (informal, anonymous) 
“quick-emotions” backchannel with buzzers for “I’m lost”, “I got it”,
and “I need a break” (or so) that students could continuously push 
(as opposed to questions I have to trigger).
So far, I have not found a service for that.&lt;/p&gt;

</description>
        <pubDate>Tue, 17 Mar 2020 00:00:00 +0100</pubDate>
        <link>https://www.wild-inter.net/posts/youtube-livestream-lectures</link>
        <guid isPermaLink="true">https://www.wild-inter.net/posts/youtube-livestream-lectures</guid>
        
        
        <category>teaching</category>
        
        <category>web</category>
        
        <category>linux</category>
        
      </item>
    
      <item>
        <title>Install pdf2htmlEX on recent Ubuntu</title>
        <description>&lt;p&gt;Because of unresolved dependencies, installing pdf2htmlEX became
challenging in recent Ubuntu.&lt;/p&gt;

&lt;h3 id=&quot;update-2024-11&quot;&gt;&lt;strong&gt;Update&lt;/strong&gt; [2024-11]&lt;/h3&gt;

&lt;p&gt;For Ubuntu 24.04, the situation seems to again have changed.
While the version from &lt;a href=&quot;https://pdf2htmlex.github.io/pdf2htmlEX/&quot;&gt;pdf2htmlex.github.io&lt;/a&gt; still works, it does fail to convert some PDFs for me.
I have not yet found a solution for this, but I will update this post when I do.&lt;/p&gt;

&lt;p&gt;The old docker image built by bwits is still available and works fine, including all the other steps described below; so for now (until the team at pdf2htmlex.github.io has an updated build), the docker container is the way to go.&lt;/p&gt;

&lt;h3 id=&quot;update-2022-09&quot;&gt;&lt;strong&gt;Update&lt;/strong&gt; [2022-09]&lt;/h3&gt;

&lt;p&gt;Much of the complication below can now be avoided! 
A few developers – worthy of our collective Thanks! – revived
pdf2htmlEX and ported it to new versions of poppler and fontforge.
Their effort lives on &lt;a href=&quot;https://pdf2htmlex.github.io/pdf2htmlEX/&quot;&gt;pdf2htmlex.github.io&lt;/a&gt; and they offer
various prepackaged &lt;a href=&quot;https://github.com/pdf2htmlEX/pdf2htmlEX/releases&quot;&gt;releases&lt;/a&gt;,
including AppImages.&lt;/p&gt;

&lt;h2 id=&quot;pdf2htmlex-in-docker&quot;&gt;pdf2htmlEX in docker&lt;/h2&gt;

&lt;p&gt;I use pdf2htmlEX to make &lt;a href=&quot;pdf2htmlex&quot;&gt;pdfs nicely readable in the browser&lt;/a&gt;.
pdf2htmlEX relies on a custom version of the poppler library, and
support for more recent versions of poppler has not been built into it yet.
Since no new maintainer has been found, people started to look for alternatives
to keep using pdf2htmlEX productively, without being forced to stay on
old libraries systemwide.
Docker containers are a solution for precisely such use cases.&lt;/p&gt;

&lt;p&gt;I here describe the steps that it took me to get pdf2htmlEX running on
Ubuntu 18.04.1 LTS; I was fine with a certain overhead (in time and space)
for running it, but I wanted direct command-line interaction on individual
files. Since docker containers are isolated from the host system, this requires
some extra steps.&lt;/p&gt;

&lt;p&gt;First install docker; I used the snap version, so I ran:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;snap install docker
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Next, I pulled the prepackaged docker container by
&lt;a href=&quot;https://hub.docker.com/r/bwits/pdf2htmlex&quot;&gt;bwits&lt;/a&gt;:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;sudo docker pull bwits/pdf2htmlex
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;For running pdf2htmlEX conveniently and (somewhat) securely,
you should be able to run docker as user;
this is not possible directly since docker uses Unix sockets owned by root
for communicating with containers.
But if you create a group &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;docker&lt;/code&gt; and add yourself to it,
the socket will be &lt;a href=&quot;https://docs.docker.com/install/linux/linux-postinstall/#manage-docker-as-a-non-root-user&quot;&gt;owned by that group&lt;/a&gt; instead.
So:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;sudo groupadd docker
sudo usermod -aG docker $USER
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;You probably have to reboot (or log out and restart the docker daemon) before
this takes effect; you can test it with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;docker run hello-world&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;If everything worked out, we can now run pdf2htmlEX as&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;docker run -ti --rm -v `pwd`:/pdf bwits/pdf2htmlex pdf2htmlEX [args] file.pdf
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;to convert &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;file.pdf&lt;/code&gt; in the current working directory.
Note that the application inside the container only gets access to
the folder you map to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/pdf&lt;/code&gt; using the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-v&lt;/code&gt; option,
i.e., in the above command the current directory.&lt;/p&gt;
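&lt;p&gt;To call this on individual files without retyping the whole command, a small wrapper helps. The following Python sketch only assembles the command shown above and mounts the directory that contains the PDF as &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/pdf&lt;/code&gt;; the script name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pdf2html.py&lt;/code&gt; and the helper &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;build_command&lt;/code&gt; are made up for illustration, and it assumes (as the command above does) that the container works inside &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/pdf&lt;/code&gt;.&lt;/p&gt;

```python
import subprocess
import sys
from pathlib import Path

IMAGE = "bwits/pdf2htmlex"  # the prepackaged image pulled above


def build_command(pdf_path, extra_args=()):
    """Return the docker command line that converts a single PDF file."""
    pdf = Path(pdf_path).resolve()
    return [
        "docker", "run", "-ti", "--rm",
        "-v", f"{pdf.parent}:/pdf",   # map the PDF's folder to /pdf
        IMAGE, "pdf2htmlEX", *extra_args, pdf.name,
    ]


if __name__ == "__main__":
    if len(sys.argv) == 1:
        sys.exit("usage: pdf2html.py [args] file.pdf")
    *args, pdf_file = sys.argv[1:]
    sys.exit(subprocess.call(build_command(pdf_file, args)))
```

&lt;p&gt;Since only the PDF’s own folder is mapped, the output ends up next to the input file, and the container still sees nothing else of the host system.&lt;/p&gt;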
</description>
        <pubDate>Tue, 01 Jan 2019 00:00:00 +0100</pubDate>
        <link>https://www.wild-inter.net/posts/pdf2htmlEX-on-docker</link>
        <guid isPermaLink="true">https://www.wild-inter.net/posts/pdf2htmlEX-on-docker</guid>
        
        
        <category>linux</category>
        
      </item>
    
      <item>
        <title>Why DOIs Rule</title>
        <description>&lt;p&gt;DOIs (digital object identifiers) are much more than a unique id for
scientific papers.&lt;/p&gt;

&lt;p&gt;Making the style for bibliographies consistent probably ranks among the
least favorite tasks of researchers who would like to disseminate their
findings.
Thanks to LaTeX and BibTeX, the task of citing other research
is mostly reduced to curating a high-quality &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;bib&lt;/code&gt;-file of references.&lt;/p&gt;

&lt;p&gt;But that shifts the problem to getting high-quality bib entries!
My experiences with publisher-provided bib-entries and services like Google Scholar
were very mixed – most required manual tweaking (and double checking!) of the entries.&lt;/p&gt;

&lt;p&gt;This should be much easier. All it needs is a
well-curated data base of metadata for scientific research
(maintained by those who care for consistency: the publishers!),
but with a machine-readable well-defined interface to be used by some
tool created by someone who
understands BibTeX well (it seems this is not exactly the publishers’ strength …).&lt;/p&gt;

&lt;p&gt;Luckily, both exist!
DOIs (digital object identifiers) are not just an id for papers,
they also serve as keys in exactly such a data base.
And with &lt;a href=&quot;https://doi2bib.org&quot;&gt;doi2bib.org&lt;/a&gt;, there is a service that
produces high-quality bib-entries from a DOI.&lt;/p&gt;
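&lt;p&gt;The machine-readable interface can also be queried directly: the DOI resolver supports content negotiation, so asking &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;https://doi.org/&lt;/code&gt; for the media type &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;application/x-bibtex&lt;/code&gt; returns a BibTeX record where the registration agency offers one. The Python sketch below illustrates this; it is not a description of how doi2bib.org works internally, the function names are made up, and the raw result may well need the same manual checking as any other bib entry.&lt;/p&gt;

```python
import sys
import urllib.request


def bibtex_request(doi):
    """Build a request that asks the DOI resolver for a BibTeX record."""
    return urllib.request.Request(
        f"https://doi.org/{doi}",
        headers={"Accept": "application/x-bibtex"},
    )


def fetch_bibtex(doi):
    """Follow the resolver's redirects and return the BibTeX entry as text."""
    with urllib.request.urlopen(bibtex_request(doi), timeout=30) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # The DOI is taken from the command line.
    print(fetch_bibtex(sys.argv[1]))
```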

&lt;p&gt;This brings us one step closer to a system in which
the TeX source would only give the DOI and everything else is taken care of
automatically
(retaining the option for manual tweaking as with the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;bbl&lt;/code&gt; files of BibTeX).&lt;/p&gt;

&lt;h2 id=&quot;update-nothing-is-infallible&quot;&gt;Update: Nothing is infallible&lt;/h2&gt;

&lt;p&gt;I found a case where I was not happy with the result of doi2bib:
For papers in Springer journals that first appear online and later in
the printed journal, doi2bib mixes the two entries. 
Month and year of publication are set to the first online version,
but volume and issue number are also filled in, so that the resulting
bib entry looks as if the printed issue appeared earlier.
This is confusing; I would rather use only the final printed information,
and I ended up manually adapting the bib files.&lt;/p&gt;
</description>
        <pubDate>Fri, 13 Jul 2018 00:00:00 +0200</pubDate>
        <link>https://www.wild-inter.net/posts/why-dois-rule</link>
        <guid isPermaLink="true">https://www.wild-inter.net/posts/why-dois-rule</guid>
        
        
        <category>web</category>
        
        <category>publishing</category>
        
      </item>
    
  </channel>
</rss>
