<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://mpaviotti.github.io/feed.xml" rel="self" type="application/atom+xml" /><link href="https://mpaviotti.github.io/" rel="alternate" type="text/html" /><updated>2026-04-06T20:45:49+00:00</updated><id>https://mpaviotti.github.io/feed.xml</id><title type="html">Marco Paviotti</title><subtitle>Lecturer in Computer Science</subtitle><author><name>Marco Paviotti</name><email>m.paviotti@kent.ac.uk</email></author><entry><title type="html">So you want to get a PhD..</title><link href="https://mpaviotti.github.io/posts/2026/02/PhD/" rel="alternate" type="text/html" title="So you want to get a PhD.." /><published>2026-02-15T13:15:00+00:00</published><updated>2026-02-15T13:15:00+00:00</updated><id>https://mpaviotti.github.io/posts/2026/02/PhD</id><content type="html" xml:base="https://mpaviotti.github.io/posts/2026/02/PhD/"><![CDATA[<p>While I think this is a great idea, let me tell you: <em>This is not a degree for the faint-hearted</em>.</p>

<h1 id="-what-a-phd-really-is">🎓 What a PhD Really Is</h1>

<p>A PhD (Doctor of Philosophy) is a research degree focused on creating <strong>new
knowledge</strong>, not just learning existing knowledge.</p>

<p>It is radically different from any other university degree. <br />
As a PhD student you will not just study books and pass exams; you will
focus on doing independent <strong>original research</strong>, <strong>publishing and
presenting</strong> your findings and, finally, writing and defending a <strong>doctoral
thesis</strong>.</p>

<p>In other words, you will spend a lot of time reading, writing, creating
knowledge and disseminating it. Essentially, having a PhD demonstrates that you
possess a <strong>deep knowledge</strong> of your own specific field and can engage with
other <strong>international experts</strong> as a peer.</p>

<p>If it still sounds easy to you, here are some facts. A PhD is the <strong>highest
academic degree</strong> awarded by universities — held by roughly <strong>2% of the world’s
population</strong> (I took this number from Google, don’t ask!) and, based on
calculations I’ve made at a university I worked for, <strong>failure rates for PhD
students can be as high as 40%</strong>.</p>

<p>The reasons why people don’t make it to the end of the line vary from person to person, but common mistakes include underestimating the amount of work required, supervisor and/or topic mismatch, and not understanding what the requirements for getting a PhD actually are.</p>

<h2 id="-it-can-be-the-best-moment-of-your-life-or-the-worst-pirate_flag-">🌟 It can be the best moment of your life… or the worst 🏴‍☠️!</h2>

<p>A PhD can become one of your <strong>biggest life achievements</strong> and one of the most
<strong>intellectually rewarding periods</strong> of your life, provided that you can
make <strong>consistent progress</strong>, tolerate uncertainty, and reach the finish line.</p>

<p>Otherwise, it can become <strong>a long tunnel with no visible exit</strong>, a slow erosion
of confidence, and a prolonged period of stress and doubt.</p>

<p>You might think I’m trying to scare you — and to some extent, that’s true — but
it’s better to be honest than to lead you into a trap. Yes, it can feel like a
trap, because once you begin, you’re committing to three or four years of your
life, possibly more.</p>

<p>It’s truly a make-or-break journey. If halfway through you realize it’s not
right for you, you’re faced with a difficult decision: acknowledge that you’ve
lost a significant amount of time and quit, or keep going for several more
years without knowing whether you’re going to make it.</p>

<p>That choice can be heartbreaking — for you and for us.</p>

<h2 id="-i-have-a-phd-in-making-mistakes">🔥 I have a PhD in “Making Mistakes”</h2>

<p>One of the most precious life lessons I learned from my PhD is that</p>

<p><em>Failure happens when you try something new</em></p>

<p>If anyone ever mocked you for having failed at something, don’t worry: <strong>those who are never wrong have never tried anything challenging</strong>.
I personally think I have failed in my life more than I have succeeded. In a sense, <em>I have a PhD in making mistakes</em>, and I am not ashamed of saying it.</p>

<p>However, failure is not a side effect; it is <strong>the mechanism.</strong> If you cannot tolerate
being wrong, uncertain, or behind for long periods, a PhD will feel unbearable.</p>

<p>Doing research means</p>

<ul>
  <li>submitting papers that get rejected ❌</li>
  <li>running experiments that fail 💥</li>
  <li>writing code that doesn’t work 🐛</li>
  <li>reading things you don’t understand (yet) 🤯</li>
</ul>

<p>But, 💪 if you can treat failure as data, confusion as progress, and rejection
as part of life, then a PhD is for you. In other words, if failure challenges
you and drives you forward, and if not knowing things excites you rather than
frustrates you, then this path is for you.</p>

<hr />
<h2 id="-the-harsh-truth-of-modern-academia">🧠 The harsh truth of modern academia</h2>
<p>Sorry for saying this, but the harsh truth is that undergraduate degrees are
getting easier and easier to attain, while PhDs remain <strong>as demanding
as ever</strong>.</p>

<p>When doing an undergraduate degree, you are <strong>safeguarded</strong>: material is
prepared for you, grades follow standardized metrics, and admission almost
guarantees eventual graduation. A PhD, on the other hand, <strong>does not follow
these rules</strong>; there is <strong>no guarantee of success</strong>.</p>

<p>There is also an uncomfortable truth about modern academia that we (as a society) will need to address at some point: <strong>The AI factor</strong>.</p>

<p>Don’t get me wrong, it is not a sin to use AI to restructure text, but I would
be very careful about using it for doing rigorous work.</p>

<p>The problem is that the unscrupulous use of AI often leaves undergraduates
<strong>less trained to work independently</strong> and to push themselves rigorously. Not
only does AI merely regurgitate existing knowledge and fail to produce new knowledge,
but the knowledge it regurgitates is often wrong and sloppy. I recently spent days
trying to debug AI slop, and at some point I gave up because the AI would produce wrong math
faster than I could disprove it.</p>

<p>This is also known as <strong>the bullshit asymmetry principle</strong> (a valuable piece of wisdom, by the way). The principle is as follows:</p>

<p><em>The amount of energy needed to refute bullshit is an order of magnitude bigger
than to produce it.</em></p>

<p>Read more about it
<a href="https://statmodeling.stat.columbia.edu/2019/01/28/bullshit-asymmetry-principle/">here</a>.</p>

<hr />

<h1 id="-the-most-important-thing-of-a-phd">🔱 The most important thing about a PhD</h1>

<p>The other most important thing I’ve learned when doing a PhD is that
relationships are not built on magic or illusions — they are built on mutual
benefit.🤝</p>

<p><code class="language-plaintext highlighter-rouge">I give you what you need, if you give me something that I need</code> ⚖️</p>

<p>Anyone that ignores this fundamental balance risks becoming <strong>toxic</strong>.
☠️🧪⚠️</p>

<p>This basic life lesson is at the core of the PhD programme, and therefore I’d say that the most important thing in your PhD is</p>

<p><code class="language-plaintext highlighter-rouge">The relationship with your supervisor</code>.</p>

<p>It is crucial that you find a supervisor whose  <strong>research interests align</strong>
with yours, someone you feel <strong>comfortable working with</strong> and someone whose
mentoring style matches your needs.</p>

<p>When looking for a PhD, my suggestion is to state upfront what kind of topics
you like and what kind of supervision you want. Don’t negotiate on it.
Ultimately, it is your PhD.</p>

<h2 id="-the-2-player-game">🤝 The 2-Player Game</h2>

<p>In my opinion, a PhD is essentially a <strong>two-player game</strong> between the student and the supervisor.</p>

<p>Crucially though, if this relationship breaks, the one who suffers the most is
the student.🛋️💡 While the supervisor is responsible for you, they will not
lose their job if you don’t get your PhD; if you do not carry your own work
forward, you are the one who pays the price.</p>

<p>In many European systems in particular, the <strong>supervisor may be the single most
important factor</strong> in your PhD experience.</p>

<p>A bad or toxic supervisor can derail even a strong student, while a good supervisor can elevate an average project into an excellent one. Equally,<br />
a good supervisor can do little if the student is not willing to contribute.</p>

<h1 id="-moving-abroad">🌍 Moving abroad</h1>

<p>If you are not a native English speaker, I strongly support the idea of doing a
PhD in an English-speaking country, or one where English is ubiquitous (like the
Netherlands, Canada or Scandinavia).</p>

<p>For a handful of reasons:</p>
<ul>
  <li>these countries tend to be strong at supporting research 🔬</li>
  <li>you will hone your English abilities 🇬🇧</li>
  <li>you will learn to interact with a wide variety of different people 🧑‍🤝‍🧑</li>
</ul>

<p>In other words, moving abroad will strengthen your PhD considerably in many
different directions.</p>

<p>Countries like Denmark or the Netherlands often have considerably more funding
for PhD students, treat students more fairly and … they pay more. It is overall
a better environment, one that allows everyone to thrive.</p>

<p>Moreover, to get a good PhD you need to be able to collaborate with different
people, as good ideas tend to arise in environments with a high level of
diversity. This is, in my opinion, simply because we work in a very specialised
and narrow field, so finding collaborators locally is extremely hard. <br />
Hence it is extremely important that you learn to communicate and collaborate
with all kinds of people regardless of their culture, ethnicity and so on.</p>

<p>At the same time, <strong>doing a PhD while living abroad</strong>, being confronted with
different cultures and learning a new language can become <strong>a daunting task</strong>.<br />
Some people experience significant <strong>culture shock</strong> while pursuing their PhD,
which can take a heavy toll on their mental health. I would encourage anyone who
feels overwhelmed to seek support. Many universities provide resources
(sometimes limited, but still helpful), or it can be beneficial to speak with a
professional.</p>

<p>Make sure you like the country you are moving to before accepting a PhD offer.</p>

<hr />
<h1 id="-takeaways">✅ Takeaways</h1>

<p>Explaining what a PhD is, is not an easy task: it is a challenging and unpredictable journey.<br />
Everyone’s experience is different, so in the end, <strong>no one can truly tell you
what a PhD is</strong> — you have to experience it yourself, fail, learn, and try
again.</p>

<p>To anyone who is considering doing a PhD, I’d suggest choosing</p>
<ul>
  <li>a topic that makes you tick 🤘</li>
  <li>a supervisor who is willing to supervise you 🤝</li>
  <li>a country that you like</li>
</ul>

<p>During times of self-doubt, take care of yourself, don’t beat yourself up and
avoid overindulging in pubs or nightclubs 🍻. Get a good night’s sleep 🛌  and
try again tomorrow.</p>

<p>(Do as I say, not as I did).</p>

<hr />]]></content><author><name>Marco Paviotti</name><email>m.paviotti@kent.ac.uk</email></author><category term="PhD" /><summary type="html"><![CDATA[while I think this is a great idea, let me tell you: This is not a degree for the faint-hearted.]]></summary></entry><entry><title type="html">Recursion as an Effect</title><link href="https://mpaviotti.github.io/posts/2025/12/Lifting/" rel="alternate" type="text/html" title="Recursion as an Effect" /><published>2025-12-28T19:24:21+00:00</published><updated>2025-12-28T19:24:21+00:00</updated><id>https://mpaviotti.github.io/posts/2025/12/Guarded-Lifting</id><content type="html" xml:base="https://mpaviotti.github.io/posts/2025/12/Lifting/"><![CDATA[<p>In one of my previous <a href="/posts/2022/11/CCC-FixedPoints/">post</a> I showed that any
theory featuring general recursion is inconsistent when viewed as a logical
system, which inevitably leads to the idea that all definable functions in such a
theory should be total (or productive).</p>

<p>However, losing Turing-completeness could be somewhat problematic for some, but
it can be addressed in several ways. One of these, and possibly the most popular,
is to isolate recursion into a monad, effectively regarding <strong>recursion as an
effect</strong>. We discuss several lifting monads which are fit for this purpose.</p>

<h2 id="domain-theoretic-liftings">Domain-Theoretic Liftings</h2>
<p>The domain-theoretic approach to non-termination is to model computations as
maps between sets with an additional element. Thus we define a <strong>lifting</strong>
operation which takes a set and adds an element to it:</p>

\[M A = A + 1\]

<p>which is the set of computations that either return an element of type \(A\) or
do not terminate.</p>

<p>Of course, without proper restrictions on the functions that we can apply to it,
this monad allows one to “decide non-termination”: one can write a function
\(f : M A \to \{\textbf{True}, \textbf{False}\}\) which returns
\(\textbf{True}\) if the program does not terminate and \(\textbf{False}\)
otherwise. This clearly is not what we are trying to model.</p>

<p>To avoid this problem, in domain theory a set \(A\) is endowed with a <strong>complete partial order</strong> (\(\sqsubseteq\)) where non-termination is modelled as the least element (\(\bot\)). The operation \(A \mapsto A_\bot\) which adds a least element to a CPO is called the <strong>lifting</strong> of a CPO.  <br />
Moreover, functions have to respect a <strong>continuity</strong> condition, that is, the
function must preserve least upper bounds of arbitrary \(\omega\)-chains:</p>

\[f(\bigsqcup_{i\in \omega} d_i) = \bigsqcup_{i \in \omega} f(d_i)\]

<p>Essentially what this means is that the function \(f\), when applied to the best
approximation of a subset, can be computed <em>locally</em> for each element of this
subset. One consequence of this fact is that \(f\) is monotonic: it preserves
the order of the CPO. One feature of this category is that every continuous map
\(f : A_\bot \xrightarrow{\text{cont}} A_\bot\) has a least fixed point, via the
Fixed-Point Theorem:</p>

\[\text{fix}(f) = \bigsqcup_{n \in \omega} f^n(\bot)\]

<p>which is given by the least upper bound of an \(\omega\)-chain</p>

\[\bot \sqsubseteq f(\bot) \sqsubseteq f^2(\bot) \sqsubseteq \dots  \sqsubseteq f^n(\bot)  \sqsubseteq \dots\]
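<p>As a toy illustration (the encoding is mine, not from the post), this Kleene-style iteration can be sketched in Python on the flat domain where <code class="language-plaintext highlighter-rouge">None</code> plays the role of \(\bot\):</p>

```python
# Toy Kleene iteration on the flat domain "Maybe Int", where None plays
# the role of bottom and None ⊑ x for every x (encoding is mine).
def kleene(f):
    """Compute the chain bottom, f(bottom), f(f(bottom)), ... and stop
    once it stabilises (here the chain has length at most 2)."""
    x = None
    while True:
        x1 = f(x)
        if x1 == x:
            return x
        x = x1

def refine(x):
    # a monotone map on the flat domain: bottom is refined to 3
    return 3 if x is None else x
```

Note that <code class="language-plaintext highlighter-rouge">kleene</code> applied to the identity returns <code class="language-plaintext highlighter-rouge">None</code>, matching the fact that the least fixed point of the identity is \(\bot\).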

<p>To go back to our original problem: since \(\bot \sqsubseteq a\) for all \(a \in
A\), we cannot define a continuous map \(A_\bot \to 2_\bot\) such as the one above,
because in the codomain of this function the elements \(\textbf{True}\) and
\(\textbf{False}\) are not related.</p>

<p><strong>Remark.</strong> When doing mathematics in a proof assistant, one usually distinguishes two approaches:</p>
<ol>
  <li>implementing all the theory inside the prover’s logic, or</li>
  <li>creating a new synthetic language whose structure is interpreted inside the mathematical theory we want to work with</li>
</ol>

<p>The second approach is the one, for example, used in HoTT, where types are
certain topological spaces and functions are continuous.</p>

<p>The problem with formalising domain theory is that it becomes more complicated
when the proof assistant is based on type theory. In particular, the problem is
that a type is not really a set.</p>

<p>On the other hand, doing things synthetically would mean that recursion is
somewhat spread across the whole language. What I mean by this is that since
every continuous function has a fixed-point then non-termination can happen at
every type making the internal language of this category effectively an
inconsistent language when viewed as a logic. Hence the need for treating
recursion as an effect.</p>

<h2 id="the-coinductive-lifting-capretta">The Coinductive Lifting (Capretta)</h2>
<p>One solution proposed by Capretta is to take the coinductive solution to the
following domain equation</p>

\[D A \cong A + D A\]

<p>In other words, \(DA\) is the set coinductively generated by the constructors
\(\text{now} : A \to D A\) and \(\text{delay} : DA \to DA\).  Intuitively,
\(\text{now}(x)\) is a terminating computation which returns an element \(x \in
A\) in \(0\) steps, while \(\text{delay}(c)\) takes a computation \(c \in DA\)
and delays it by adding one computational step to it. For example,
\(\text{delay}(\text{delay}(\text{delay}(\text{now}(10))))\) is a computation which returns
the number \(10\) in three steps.</p>
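<p>A minimal executable sketch of this monad, with thunks standing in for coinduction (the encoding and names are mine, not from the post):</p>

```python
# A sketch of Capretta's delay monad; a "later" computation carries a
# zero-argument thunk so that unfolding is lazy (encoding is mine).
def now(a):
    # a computation returning `a` in zero steps
    return ("now", a)

def delay(thunk):
    # `thunk` is a zero-argument function producing the delayed computation
    return ("delay", thunk)

def never():
    # the divergent computation: an endless stream of delays
    return delay(never)

def run_for(fuel, d):
    """Observe a computation with a finite amount of fuel; return None
    if it does not terminate within `fuel` steps."""
    tag, payload = d
    while tag == "delay":
        if fuel <= 0:
            return None
        fuel -= 1
        tag, payload = payload()
    return payload
```

For instance, running the three-step computation above with fuel 3 returns 10, while with fuel 2 the observer gives up; <code class="language-plaintext highlighter-rouge">never()</code> exhausts any finite amount of fuel.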

<p><strong>Remark.</strong> \(D\) can be given the structure of an \(\omega\)-CPO with \(\bot\).</p>

<p>First, the non-terminating computation \(\bot\) can be defined coinductively as</p>

\[\bot = \text{delay}(\bot)\]

<p>which is clearly a productive definition. Intuitively, \(\bot\) corresponds to
the never-ending stream of delays:</p>

\[\bot = \text{delay}(\text{delay}(\text{delay}\dots))\]

<p>Clearly, we cannot produce a function which discriminates between a terminating
computation and a non-terminating one. Capretta proves that \(D\) is a domain
(up to bisimilarity); that is, he defines a partial order \(\sqsubseteq_D\) on
\(D\) which leads to a notion of least upper bounds for \(\omega\)-chains,
written \(\bigsqcup_{n\in \omega} d_n\) for</p>

\[d_0 \sqsubseteq_D d_1 \sqsubseteq_D d_2 \dots \sqsubseteq_D d_n \dots\]

<p>It can then be proven that every continuous function on \(D\) has a fixed point,
similarly to the construction in domain theory.</p>

<p><strong>Considerations.</strong> Now that recursion is isolated into an effect, we have
solved one problem. However, programming with this monad in practice is far from
easy, as one has to</p>
<ol>
  <li>prove that each program on \(DA\) they define is a continuous function,</li>
  <li>work with a coinductive bisimilarity relation rather than equality, and</li>
  <li>ensure productivity of definitions.</li>
</ol>

<h2 id="metric-lifting-monad-martin-escardó">Metric Lifting Monad (Martin Escardó)</h2>

<p>Escardó’s <em>metric lifting</em> models partiality using <strong>metric spaces</strong> rather than
coinduction, but the idea is not that different from Capretta’s.  The <strong>metric
lifting</strong> of a set \(A\), written \(LA\), is defined as</p>

\[LA = (A \times \mathbb{N}) \cup \{\infty\}\]

<p>together with a distance function \(d : LA \times LA \to [0, \infty]\) where
equal computations have distance \(0\), a terminating computation \((a,k)\) and
the non-terminating one have distance \((1/2)^k\), and distinct terminating computations
\((a,k)\) and \((b,l)\) have distance \((1/2)^{\text{min}(k,l)}\). Intuitively,
\((a,k)\) is a computation which returns \(a\) in \(k\) steps and \(\infty\) is
the divergent computation.</p>
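<p>The distance function above can be spelled out concretely as follows (a toy encoding of mine, with pairs \((a,k)\) for terminating computations and a marker for \(\infty\)):</p>

```python
# A toy model of LA = (A × N) ∪ {∞} with its ultrametric distance,
# following the case analysis above (encoding is mine).
INF = "infinity"  # the divergent computation

def dist(x, y):
    # terminating computations are pairs (value, steps)
    if x == y:
        return 0.0                  # equal computations
    if x == INF:
        return 0.5 ** y[1]          # ∞ vs (b, l)
    if y == INF:
        return 0.5 ** x[1]          # (a, k) vs ∞
    return 0.5 ** min(x[1], y[1])   # distinct (a, k) vs (b, l)
```

One can check on samples that the strong (ultrametric) triangle inequality \(d(x,z) \le \max(d(x,y), d(y,z))\) holds for this function.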

<p><strong>Remark.</strong> \(LA\) is a complete bounded ultrametric space.</p>

<p>The unit of the monad \(LA\) is defined by \(\eta_A(a) = (a,0)\) and the delay
operation is defined by</p>

\[\delta_A(a,n) = (a, n + 1) \qquad \delta_A(\infty) = \infty\]

<p>In metric spaces terminology, a function is non-expansive if it does not expand
the space relative to a distance function \(d\), but possibly contracts it:</p>

\[d(f(x), f(y)) \le d(x,y)\]

<p>On the other hand, a contractive map is a map which contracts the space:</p>

\[d(f(x), f(y)) \le c \cdot d(x,y)\]

<p>for a certain \(c &lt; 1\). At this point it is possible to define a fixed-point
operator on all non-expansive maps</p>

\[\text{fix} : (LA \to LA) \to LA\]

<p>which sends every non-expansive map \(f\) to the fixed-point of \(\delta_A \circ
f\), which is contractive because \(\delta_A\) is contractive (and this fixed point
exists uniquely by Banach’s fixed-point theorem, since \(LA\) is complete). The
divergent computation is then defined as</p>

\[\bot_A = \text{fix}(id_{LA})\]

<p><strong>Considerations.</strong> This approach does not suffer from the issues with
coinduction, but it still requires the programmer to prove that functions are
non-expansive.</p>

<h2 id="guarded-lifting-atkey--mcbride">Guarded Lifting (Atkey &amp; McBride)</h2>
<p>The coinductive lifting monad suffers from productivity and equality issues,
while both the coinductive and metric liftings need additional structure on the
maps defined on them to work properly with fixed-points.</p>

<p>In guarded type theory however, maps are always non-expansive and
contractiveness is enforced at the type level. In particular, a contractive map
is a function of type \(\triangleright X \to X\)  for which there is always a
fixed-point at all types \(X\):</p>

\[\text{fix}_g : (\triangleright X \to X) \to X\]

<p>sending a map \(f : (\triangleright X \to X)\) to the unique fixed-point of \(f
\circ \text{next}\).  The guarded lifting is defined as the unique solution to
the domain equation</p>

\[L_g A = A + \triangleright L_g A\]

<p>There is an obvious unit of the monad \(\eta_A : A \to L_g A\) and a delay map
which has type</p>

\[\delta_A : \triangleright L_g A \to L_g A\]

<p>Conceptually, this monad can be seen as Capretta’s lifting monad with an
explicit notion of time or delay built into the type theory.  At this point the
divergent computation \(\bot_A : L_{g} A\) is defined as the guarded fixed-point
of \(\delta\):</p>

\[\bot_A = \text{fix}_g (\delta_A)\]

<p>Now we can check from the fixed-point property that \(\bot_A = \delta_A (\text{next}(\bot_A))\). Here, the term \(\delta_A \circ \text{next}\) corresponds to the delay operation which adds one step to the computation.</p>
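<p>Here is a hypothetical Python sketch of the guarded fixed point, where \(\triangleright A\) is modelled as a zero-argument thunk and \(\text{next}(x)\) as <code class="language-plaintext highlighter-rouge">lambda: x</code>. Python cannot enforce guardedness at the type level, so this is only an illustration of the definitions above:</p>

```python
# Hypothetical encoding of guarded recursion: ▷A is a thunk, and gfix
# feeds f its own fixed point one step "later" (encoding is mine).
def gfix(f):
    return f(lambda: gfix(f))

def eta(a):
    return ("eta", a)        # eta : A -> L_g A

def step(later):
    return ("step", later)   # delta : ▷ L_g A -> L_g A

def bot():
    # divergence as the guarded fixed point of the delay map
    return gfix(step)

def run_for(fuel, c):
    # observe with finite fuel; None stands for "did not terminate yet"
    tag, payload = c
    while tag == "step":
        if fuel <= 0:
            return None
        fuel -= 1
        tag, payload = payload()
    return payload
```

Unfolding <code class="language-plaintext highlighter-rouge">bot()</code> once exhibits exactly the fixed-point equation \(\bot_A = \delta_A(\text{next}(\bot_A))\): it is a <code class="language-plaintext highlighter-rouge">step</code> whose thunk produces \(\bot_A\) again.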

<h2 id="conclusion">Conclusion</h2>

<p><strong>The Synthetic Approach.</strong> What I personally find truly amazing about the guarded lifting is that this
monad is truly synthetic. There is no need for additional structure as in
Capretta’s lifting, and no need to check continuity or non-expansiveness of
maps. Furthermore, using the model of guarded type theory one can show that (in
a certain sense) it corresponds to Escardó’s metric lifting on one side and to
Capretta’s monad on the other. I will probably need another post to explain this
point.</p>

<p><strong>Intensionality.</strong> To be honest, the only problem arising from the use of guarded recursion
is that computations are modelled intensionally, that is,
two computations that return the same output given the same input are not
necessarily equal if they take a different number of steps to terminate. This is
an issue that has to be solved, once again, by quotienting the monad, which is
another problem entirely.</p>

<p>Nevertheless, these problems also arise in coinductive and metric approaches. At
present, the only extensional model of general recursion I am aware of is based
on domain theory.</p>

<p><strong>Consistency.</strong> Naturally, one might wonder why we need guarded recursion if domain theory
already lets us model all of this extensionally. The answer is that,
while domain theory is extremely powerful for modelling recursion extensionally,
it does not yield a logically consistent model suitable for type theory. As
noted in the introduction, this inconsistency makes domain-theoretic models
ill-suited as foundations for type-theoretic languages, where logical soundness
is essential.</p>]]></content><author><name>Marco Paviotti</name><email>m.paviotti@kent.ac.uk</email></author><category term="semantics" /><category term="categories" /><category term="monads" /><category term="domain-theory" /><category term="recursion" /><summary type="html"><![CDATA[In one of my previous post I showed that any theory featuring general recursion is inconsistent when viewed as a logical system which inevitably leads to the idea that all definable functions in such a theory should be total (or productive).]]></summary></entry><entry><title type="html">On Lax Monoidal Functors</title><link href="https://mpaviotti.github.io/posts/2025/12/Lax-Monoidal-Functors/" rel="alternate" type="text/html" title="On Lax Monoidal Functors" /><published>2025-12-19T21:17:21+00:00</published><updated>2025-12-19T21:17:21+00:00</updated><id>https://mpaviotti.github.io/posts/2025/12/Lax-Monoidal-Functors</id><content type="html" xml:base="https://mpaviotti.github.io/posts/2025/12/Lax-Monoidal-Functors/"><![CDATA[<p>What is the difference between</p>
<ul>
  <li>a lax monoidal functor</li>
  <li>a monoid in a Day-monoidal category</li>
  <li>a morphism of lax-algebras for the free monoid 2-monad, and</li>
  <li>a codistributive law with the tensor product?</li>
</ul>

<p><em>Well, none. Let’s see why.</em></p>

<p>To keep this post as concise as humanly possible I will assume knowledge of (symmetric) <a href="https://ncatlab.org/nlab/show/monoidal+category">monoidal categories</a>, <a href="https://ncatlab.org/nlab/show/Kan+extension">Kan extensions</a> and <a href="https://ncatlab.org/nlab/show/enriched+category">enriched categories</a>.</p>

<p>We show informally the following proposition.</p>

<p><strong>Proposition.</strong>
Let
\((\mathcal{C}, \otimes_{\mathcal{C}}, I_{\mathcal{C}})\)
be a small monoidal closed category enriched in a monoidal closed category 
\((\mathcal{D}, \otimes_{\mathcal{D}}, I_{\mathcal{D}})\)
 and let
\(F : \mathcal{C} \to \mathcal{D}\)
be a functor. The following statements for \(F\) are equivalent:</p>

<ol>
  <li>It is a lax monoidal functor</li>
  <li>It is a monoid in the monoidal category
\(([\mathcal{C}, \mathcal{D}], \otimes_\text{Day}, y(I_{\mathcal{C}}))\)</li>
  <li>It is a homomorphism of pseudo algebras for the free monoid 2-monad</li>
  <li>
    <p>It is an \(\mathbb{N}\)-indexed family of (co)distributive laws for the functor \(F : \mathcal{C} \to \mathcal{D}\)</p>

\[\text{Nat}(\otimes^{n}_{\mathcal{D}} \circ F^{n}, F \circ \otimes^{n}_{\mathcal{C}})\]

    <p>where
\(\otimes^{n}_{\mathcal{C}} : \mathcal{C}^{n} \to \mathcal{C}\) (and similarly for \(\mathcal{D}\))</p>
  </li>
</ol>

<p>Let us assume the hypothesis of the proposition.</p>

<p><strong>Proof(Sketch).</strong> 
(1) \(\Leftrightarrow\) (2).</p>

<p>A <em>lax
monoidal functor</em> is a functor which lax-preserves the monoidal structure of \(\mathcal{C}\); that is, there is a morphism</p>

\[u : I_{\mathcal{D}} \to F I_{\mathcal{C}}\]

<p>and a family of morphisms</p>

\[\circledast_{X,Y} : F X \otimes_{\mathcal{D}} F Y \to F (X \otimes_{\mathcal{C}} Y)\]

<p>indexed by \(X,Y\) and natural therein, subject to some <a href="https://ncatlab.org/nlab/show/monoidal+functor">coherence conditions</a>.</p>
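<p>For programmers: a lax monoidal endofunctor on the category of sets is essentially what Haskell calls an <code class="language-plaintext highlighter-rouge">Applicative</code> functor, in its “monoidal” presentation. Here is a toy Python sketch of the maps \(u\) and \(\circledast\) for a Maybe-like functor (the encoding is mine, not a standard library):</p>

```python
# A toy "Maybe" functor on sets with its lax monoidal structure:
# Nothing is None and Just v is ("just", v). The two maps below are
# u : I -> F I and ⊛ : F X × F Y -> F (X × Y) (encoding is mine).
def unit():
    # u : 1 -> Maybe 1, picking out the "defined" element
    return ("just", ())

def pair(fx, fy):
    # ⊛ : defined exactly when both components are defined
    if fx is None or fy is None:
        return None
    return ("just", (fx[1], fy[1]))
```

The coherence conditions then say that <code class="language-plaintext highlighter-rouge">pair</code> is associative up to re-bracketing of tuples and that <code class="language-plaintext highlighter-rouge">unit()</code> is a two-sided unit for it.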

<p>On the other hand, the <em>Day convolution</em> provides a natural way to define a
monoidal structure on the category of functors.  In other words, the task is to
turn the category of functors \([\mathcal{C}, \mathcal{D}]\) into a monoidal
category by equipping it with a tensor product and a unit. Hence, for two
functors
\(F, G : \mathcal{C} \to \mathcal{D}\)
the Day convolution \(\otimes_\text{Day}\) is defined as follows:</p>

\[\begin{align*}
(F \otimes_\text{Day} G) C &amp; := \int^{X,Y \in \mathcal{C}} \mathcal{C}(X \otimes_{\mathcal{C}} Y, C) \otimes_{\mathcal{D}} F X \otimes_{\mathcal{D}} G Y\\
    &amp; = \text{Lan}_{\otimes_{\mathcal{C}}}(\otimes_{\mathcal{D}} \circ (F \times G))
\end{align*}\]

<p>while the unit of \([\mathcal{C}, \mathcal{D}]\) is given by the Yoneda embedding applied to
the unit \(I_\mathcal{C}\) that is \(y(I_\mathcal{C}) = \mathcal{C}(I_\mathcal{C},-)\).</p>

<p>Now, a monoid in \(([\mathcal{C}, \mathcal{D}], \otimes_\text{Day}, y(I))\) is called a Day-monoid. This is a functor
\(F : \mathcal{C} \to \mathcal{D}\)
together with a unit and multiplication map.</p>

<ul>
  <li>The <strong>unit map</strong>
\(\eta : y(I) \to F\)
is obtained from the unit of the lax monoidal functor (and vice versa) via the enriched Yoneda lemma</li>
</ul>

\[\text{Nat}(y(I), F) \cong \mathcal{D}(I_{\mathcal{D}}, F(I_{\mathcal{C}}))\]

<ul>
  <li>The <strong>multiplication map</strong> 
\(\mu : F \otimes_{\text{Day}} F \to F\)
is obtained from \(\circledast\) (and vice versa) by using the adjunction \(\text{Lan}_J \dashv - \circ J\) as follows</li>
</ul>

\[\text{Nat}(\text{Lan}_{\otimes_\mathcal{C}}(\otimes_\mathcal{D} \circ F \times F), F)
\cong
\text{Nat}(\otimes_\mathcal{D} \circ F \times F, F \circ \otimes_\mathcal{C})\]

<p>It remains to prove that the laws of the unit and multiplication of the monoid  imply
the lax monoidal properties of \(u\) and \(\circledast\) (left as an exercise to the reader).</p>

<p>\((2) \Leftrightarrow (3)\).</p>

<p>This is a rather easy statement which generalises the free monoid construction to 2-categories.</p>

<p>In particular, the cheapest way of turning a set \(A\) into a monoid is to take the set of words over \(A\), namely \(A^*\). This is the free monoid over \(A\), where the empty word is the unit and concatenation is the multiplication of the monoid. The assignment \(A \mapsto A^*\) is a monad on \(\textbf{Set}\), and its Eilenberg-Moore algebras carry exactly the algebraic structure of monoids.<br />
In particular, the category of Eilenberg-Moore algebras for \((-)^*\) is
equivalent to the category of monoids</p>

\[\textbf{Set}^{(-)^*} \simeq \textbf{Mon}\]

<p>Similarly, given a category \(\mathcal{C}\), the cheapest way of turning this category into a monoid (a monoidal category) is to send \(\mathcal{C}\) to the category of finite sequences of objects \((A_1, \dots, A_n)\) and componentwise sequences of morphisms in \(\mathcal{C}\).
In other words, \(T\) is the free monoid 2-monad on \(\textbf{Cat}\), defined as the \(\mathbb{N}\)-indexed coproduct of the powers \(\mathcal{C}^{n}\), that is</p>

\[T\mathcal{C} = \sum_{n : \mathbb{N}} \mathcal{C}^{n}\]

<p>Now, similarly to what happens in the 1-category case, we have the following equivalence</p>

\[\textbf{Cat}^T \simeq 2\text{-Mon}\]

<p>where \(\textbf{Cat}^T\) is the 2-category of algebras for a 2-monad \(T\) and \(T\)-algebra homomorphisms and \(2\)-Mon is the 2-category of monoidal categories and monoidal functors (monoids in \(\textbf{Cat}\)). Hence (pseudo) \(T\)-algebra homomorphisms are (lax) monoidal functors.</p>

<p>\((3 \Leftrightarrow 4)\).</p>

<p>Clearly, if \(T\) is the free monoid 2-monad, an algebra for  \(T\) is a map</p>

\[a : \sum_{n : \mathbb{N}} \mathcal{C}^n \to \mathcal{C}\]

<p>The previous point states that this is a monoidal category where \(A \otimes_\mathcal{C} B := a (A,B)\) and \(I_\mathcal{C} = a()\), thus \(a\) sends \((A_1, \dots, A_n)\) to \(A_1 \otimes_\mathcal{C} \dots \otimes_\mathcal{C} A_n\).</p>

<p>A lax monoidal functor \(F\) is a lax \(T\)-algebra homomorphism; thus, for every \(n\), it comes with a map</p>

\[F(A_1) \otimes_\mathcal{D} \dots \otimes_\mathcal{D} F(A_n) \to F(A_1 \otimes_\mathcal{C} \dots \otimes_\mathcal{C} A_n)\]

<p>which is defined at all \(n\) and \(A_i\) hence it is a (co)distributive law</p>

\[\text{Nat}(\otimes^{n}_{\mathcal{D}} \circ F^{n}, F \circ \otimes^{n}_{\mathcal{C}})\]]]></content><author><name>Marco Paviotti</name><email>m.paviotti@kent.ac.uk</email></author><category term="monoids" /><category term="categories" /><category term="monads" /><summary type="html"><![CDATA[What is the difference between a lax monoidal functor a monoid in a Day-monoidal category a morphism of lax-algebras for the free monoid 2-monad, and a codistributive law with the tensor product?]]></summary></entry><entry><title type="html">Bisimulations, Equality and Traces</title><link href="https://mpaviotti.github.io/posts/2023/10/Bisim-Eq/" rel="alternate" type="text/html" title="Bisimulations, Equality and Traces" /><published>2023-10-09T13:14:21+00:00</published><updated>2023-10-09T13:14:21+00:00</updated><id>https://mpaviotti.github.io/posts/2023/10/bisimulations</id><content type="html" xml:base="https://mpaviotti.github.io/posts/2023/10/Bisim-Eq/"><![CDATA[<p>Strong bisimulation for CCS is the preferred equivalence method in concurrency because it relates fewer programs than trace equality. However, one might expect strong bisimulation and trace equality to be regarded as equivalent; this is the essence behind proof assistants like Isabelle. So what is going on here?</p>

<p>One example of this fact is when considering CCS with the choice operator.  In
this language we can define a process \(P\) and a process \(Q\) as follows</p>

\[P = \text{pay}.(\text{coffee}. 0 + \text{tea}. 0)\]

\[Q = (\text{pay}.\text{coffee}.0 + \text{pay}.\text{tea}. 0)\]

<p>Now the <em>trace semantics</em> of the CCS processes can be defined by a function</p>

\[[\![ \cdot ]\!] : \text{CCS} \to \mathcal{P}_\text{fin}(\text{Str } L)\]

<p>where \(L\) is the finite set of actions and, for a generic set \(A\), the set
\(\text{Str } A = 1 + A \times \text{Str }A\) is the set of possibly finite
streams over a set \(A\).</p>

<p>For the processes above, the semantics of \(P\) is \([\![P]\!] =
\{\text{pay}.\text{coffee}, \text{pay}.\text{tea}\}\) and the semantics of \(Q\)
is \([\![ Q ]\!] = \{\text{pay}.\text{coffee}, \text{pay}.\text{tea}\}\); thus
the trace semantics indicates that these processes should be equal.</p>

<p>However, consider the <em>simulation</em> relation \(P \lesssim Q\), which is stated as</p>

\[P \lesssim Q \Leftrightarrow \forall a, P'. \text{ if } P \xrightarrow{a} P'
\text{ then  } \exists Q'. Q \xrightarrow{a} Q' \text{ s.t. } P' \lesssim Q'\]

<p>Now the <em>bisimulation</em> relation can be defined as \(P \approx Q \Leftrightarrow P
\lesssim Q \text{ and } Q \lesssim P\) (strictly speaking this is mutual similarity,
which is coarser than bisimilarity in general, but it suffices for this example).</p>

<p>The above is a standard example in concurrency theory showing that
bisimulation can distinguish processes that trace semantics regards as equal,
and that is why bisimulations turn out to be the more useful relations for
comparing processes.</p>

<p>Using the example above we can prove that \(Q \lesssim P\). Let’s define
half-evaluated processes as</p>

\[P' = \text{coffee}. 0 + \text{tea}. 0\]

\[P'_{1} = \text{coffee}. 0\]

\[P'_{2} = \text{tea}. 0\]

\[Q_{1} = \text{pay}.\text{coffee}.0\]

\[Q_{2} = \text{pay}.\text{tea}.0\]

\[Q'_{1} = \text{coffee}.0\]

\[Q'_{2} = \text{tea}.0\]

<p>Now, for every transition of \(Q\), we have to show that \(P\) simulates it. The first
one is \(Q \xrightarrow{\text{pay}} Q'_{1}\).  Obviously \(P
\xrightarrow{\text{pay}} P'\), so now we have to show that \(Q'_{1}
\lesssim P'\), which clearly holds since \(P'\) can match the \(\text{coffee}\)
transition. This works similarly if \(Q\) decides to
take the other route and produce tea in the end.</p>

<p>All right, but \(P \lesssim Q\) does not work. Since \(P\) makes a transition \(P \xrightarrow{\text{pay}} P'\), we are forced to select which branch of \(Q\) simulates this behaviour, and no matter which one we choose we get stuck. Say \(Q \xrightarrow{\text{pay}} Q'_{1}\); then we have to show \(P' \lesssim Q'_{1}\), but this does not hold because \(P'\) can make two different transitions while \(Q'_1\) can only make one.</p>
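<p>The argument above can be machine-checked for these finite processes. Here is a minimal sketch, under a hypothetical encoding (names mine) where a finite, acyclic process is just the list of its outgoing transitions:</p>

```haskell
-- A process is the finite list of its outgoing labelled transitions.
newtype Proc = Proc [(String, Proc)]

-- sim p q checks p <~ q: every transition of p must be matched by a
-- transition of q with the same label, with successors again related.
sim :: Proc -> Proc -> Bool
sim (Proc ps) (Proc qs) =
  and [ or [ l == l' && sim p' q' | (l', q') <- qs ] | (l, p') <- ps ]

nil :: Proc
nil = Proc []

-- P = pay.(coffee.0 + tea.0)
p :: Proc
p = Proc [("pay", Proc [("coffee", nil), ("tea", nil)])]

-- Q = pay.coffee.0 + pay.tea.0
q :: Proc
q = Proc [("pay", Proc [("coffee", nil)]), ("pay", Proc [("tea", nil)])]
```

<p>Evaluating <code>sim q p</code> yields <code>True</code> while <code>sim p q</code> yields <code>False</code>, matching the proof in the text.</p>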

<h2 id="corecursion-schemes-and-traces">CoRecursion Schemes and Traces</h2>
<p>Consider now the unfold function, which takes a seed function and produces a trace
by <em>running</em> the seed at each step</p>

<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">unfold</span> <span class="o">::</span> <span class="p">(</span><span class="n">x</span> <span class="o">-&gt;</span> <span class="p">(</span><span class="kt">L</span><span class="p">,</span> <span class="n">x</span><span class="p">))</span> <span class="o">-&gt;</span> <span class="n">x</span>  <span class="o">-&gt;</span> <span class="kt">Str</span> <span class="kt">L</span>
<span class="n">unfold</span> <span class="n">seed</span> <span class="n">x</span> <span class="o">=</span> <span class="kr">let</span> <span class="p">(</span><span class="n">l</span><span class="p">,</span><span class="n">x'</span><span class="p">)</span> <span class="o">=</span> <span class="n">seed</span> <span class="n">x</span> <span class="kr">in</span> <span class="n">l</span> <span class="o">:</span> <span class="n">unfold</span> <span class="n">seed</span> <span class="n">x'</span> </code></pre></figure>
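<p>Specialising <code>Str L</code> to ordinary Haskell lists gives a self-contained version we can actually run, with a hypothetical two-state seed of my own invention:</p>

```haskell
-- unfold over ordinary lists standing in for Str L
unfold :: (x -> (l, x)) -> x -> [l]
unfold seed x = let (lab, x') = seed x in lab : unfold seed x'

-- a toy deterministic LTS alternating between two labels
clock :: Bool -> (String, Bool)
clock b = (if b then "tick" else "tock", not b)
```

<p><code>take 4 (unfold clock True)</code> produces <code>["tick","tock","tick","tock"]</code>: the infinite trace of the two-state machine, observed up to depth four.</p>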

<p>Notice that the seed function \(X \to L \times X\) can be viewed as a Labeled
Transition System (LTS) where the set of states is \(X\) and the function itself
implements the transitions.</p>

<p>It is a very well-known fact that <code class="language-plaintext highlighter-rouge">unfold</code> is a fully abstract map, in the
sense that if we consider the notion of bisimilarity above and set \([\![ \cdot
]\!]\) to be <code class="language-plaintext highlighter-rouge">unfold seed</code> then we have the following theorem</p>

<blockquote>
  <p><strong>Full abstraction</strong> \(\text{ for all } t_{1}, t_{2}, t_{1} \approx t_{2} \Leftrightarrow [\![ t_{1} ]\!] = [\![ t_{2}]\!]\).</p>
</blockquote>

<p>This is also backed by the fact that when programming in proof assistants like
(e.g.) Agda – since coinductive data types are not really final coalgebras –
it is common practice to just  add the following axiom to the type theory</p>

<blockquote>
  <p><strong>Axiom</strong> \(\text{ for all } (s_{1}, s_{2} : \text{Str L}). s_{1} \approx s_{2} \to s_{1} = s_{2}\) .</p>
</blockquote>

<p>Even more so, in some proof assistants like Isabelle coinductive data types are
real final coalgebras and so the above axiom is actually a true fact in the
prover’s logic.</p>

<p>Notice that the other direction is obvious, and thus the axiom implies that
bisimilarity is <em>logically equivalent</em> to equality.</p>

<blockquote>
  <p>So why does bisimulation in the above example not correspond to equality?</p>
</blockquote>

<p>The reason is that the shape of behaviours for CCS+choice is not \(BX = L \times
X\) but rather \(BX = \mathcal{P}_\text{fin}(L \times X)\).</p>

<p>In fact, the <em>seed</em> function describing the LTS of CCS+choice has the following
type</p>

<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"> 
<span class="n">opsem</span> <span class="o">::</span> <span class="kt">CCS</span> <span class="o">-&gt;</span>  <span class="p">[(</span><span class="kt">L</span><span class="p">,</span> <span class="kt">CCS</span><span class="p">)]</span></code></pre></figure>

<p>where we use lists <code class="language-plaintext highlighter-rouge">[-]</code> as a (rough) implementation of finite powersets.</p>

<p>At this point the LTS for CCS+choice can be defined roughly like this</p>

<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="o">...</span>
<span class="n">opsem</span> <span class="p">(</span><span class="n">p</span> <span class="o">+</span> <span class="n">q</span><span class="p">)</span> <span class="o">=</span> <span class="p">[(</span><span class="n">l</span><span class="p">,</span> <span class="n">p'</span><span class="p">)</span> <span class="o">|</span> <span class="p">(</span><span class="n">l</span><span class="p">,</span> <span class="n">p'</span><span class="p">)</span> <span class="o">&lt;-</span> <span class="n">opsem</span> <span class="n">p</span> <span class="p">]</span> <span class="o">++</span> <span class="p">[(</span><span class="n">l</span><span class="p">,</span> <span class="n">q'</span><span class="p">)</span> <span class="o">|</span> <span class="p">(</span><span class="n">l</span><span class="p">,</span> <span class="n">q'</span><span class="p">)</span> <span class="o">&lt;-</span> <span class="n">opsem</span> <span class="n">q</span> <span class="p">]</span>  </code></pre></figure>
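<p>For this branching behaviour the corecursion scheme changes accordingly: the generic unfold now produces finitely branching trees rather than streams. A rough, runnable sketch (names mine, again using lists as a stand-in for finite powersets):</p>

```haskell
-- Trees l, approximating P_fin (l × Trees l) with lists
newtype Tree l = Node [(l, Tree l)] deriving (Eq, Show)

-- generic unfold for the behaviour functor B x = [(l, x)]
unfoldB :: (x -> [(l, x)]) -> x -> Tree l
unfoldB step x = Node [ (lab, unfoldB step x') | (lab, x') <- step x ]
```

<p>For instance, a one-step system branching on labels <code>"a"</code> and <code>"b"</code> unfolds to <code>Node [("a", Node []), ("b", Node [])]</code>.</p>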

<p>And now the unfold on this LTS will yield a fully abstract semantics</p>

<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="n">unfold</span> <span class="n">opsem</span> <span class="o">::</span> <span class="kt">CCS</span> <span class="o">-&gt;</span> <span class="kt">Trees</span> <span class="kt">L</span> </code></pre></figure>

<p>where \(\text{Trees}\; L = \mathcal{P}_\text{fin} (L \times \text{Trees}\; L)\).</p>]]></content><author><name>Marco Paviotti</name><email>m.paviotti@kent.ac.uk</email></author><category term="semantics" /><category term="categories" /><summary type="html"><![CDATA[Strong bisimulation for CCS is the preferred equivalence method in concurrency because it relates fewer programs than trace equality. However, the reality is that strong bisimulation and trace equality ought to be regarded as equivalent. This is the essence behind proof assistants like Isabelle. So what is going on here?]]></summary></entry><entry><title type="html">The mini Yoneda lemma for Type Theorists</title><link href="https://mpaviotti.github.io/posts/2023/09/Yoneda-TT/" rel="alternate" type="text/html" title="The mini Yoneda lemma for Type Theorists" /><published>2023-09-09T13:14:21+00:00</published><updated>2023-09-09T13:14:21+00:00</updated><id>https://mpaviotti.github.io/posts/2023/09/mini-yoneda</id><content type="html" xml:base="https://mpaviotti.github.io/posts/2023/09/Yoneda-TT/"><![CDATA[<p>I have managed to teach the Yoneda lemma to students who knew very little about category theory; here’s how you do it.</p>

<p>Say that you want to do denotational semantics for a simply typed calculus with a unary constructor \(\textsf{R}\) which has the following typing rule</p>

\[\frac{\Gamma \vdash t : A}{\Gamma \vdash \textsf{R}(t) : B}\]

<p>The task is to give a semantic interpretation \([\![ \cdot ]\!]\) for the language by
induction on the typing judgment \(\Gamma \vdash t : A\) such that terms are
interpreted as morphisms \([\![\Gamma ]\!] \xrightarrow{[\![ t ]\!]} [\![ A ]\!]\), assuming
of course that \([\![ \cdot ]\!]\) is also defined separately for contexts and types.</p>

<p>To interpret the rule above we do induction
on the typing judgment. Thus we assume there exists a morphism
\([\![ \Gamma ]\!] \xrightarrow{[\![ t ]\!]} [\![ A ]\!]\) and we construct a morphism
\([\![ \Gamma ]\!] \xrightarrow{[\![ \textsf{R}(t) ]\!] } [\![ B ]\!]\).</p>

<p>For simplicity we drop the semantic brackets, writing, for example,
\(A\) for the interpretation \([\![ A ]\!]\), \(t : \Gamma \to A\) for the
interpretation of \(t\), and so on.</p>

<p>Back to the problem we are trying to solve. It can be quite tricky sometimes to figure out what the semantics of \(\textsf{R}(t)\) should be, since there is some plumbing needed to pass around the
context. A particular instantiation of the Yoneda lemma states that given a
morphism \(\Gamma \xrightarrow{t} A\) and a morphism \(R : A \to B\) there
is a canonical way to construct a morphism \(\Gamma \xrightarrow{R(t)} B\).</p>

<p>To show this we instantiate the contravariant Yoneda lemma by setting \(F =
\mathbb{C}(-, B)\). Then for all objects \(A : \mathbb{C}^{\text{op}}\) we have</p>

\[\mathbb{C}(A, B) \cong \mathbb{C}(-, A) \xrightarrow{\cdot} \mathbb{C}(-, B)\]

<p>Let \(R : A \to B\) be the interpretation of \(\textsf{R}\); then one side of
the isomorphism is \(\phi (\textsf{R},t) = F(t)(\textsf{R}) = \mathbb{C}(t,
B)(\textsf{R})\).  In other words, the interpretation of \(\textsf{R}(t)\) is
simply \(\textsf{R} \circ t\).</p>]]></content><author><name>Marco Paviotti</name><email>m.paviotti@kent.ac.uk</email></author><category term="semantics" /><category term="categories" /><summary type="html"><![CDATA[I have managed to teach the Yoneda lemma to students who knew very little about category theory; here’s how you do it.]]></summary></entry><entry><title type="html">CCCs and the complete models of STLC</title><link href="https://mpaviotti.github.io/posts/2023/03/CCC-STLC/" rel="alternate" type="text/html" title="CCCs and the complete models of STLC" /><published>2023-03-16T13:14:21+00:00</published><updated>2023-03-16T13:14:21+00:00</updated><id>https://mpaviotti.github.io/posts/2023/03/CCC-models</id><content type="html" xml:base="https://mpaviotti.github.io/posts/2023/03/CCC-STLC/"><![CDATA[<p>Cartesian closed categories are not regarded as complete models of the Simply Typed \(\lambda\)-calculus in the traditional sense. Let’s see why.</p>

<p>Assume \(\Lambda_X\) is the set of closed well-typed STLC (Simply Typed \(\lambda\)-calculus) terms.
Clearly, STLC can be interpreted into any Cartesian Closed category (CCC) by defining an interpretation function 
\([\![\cdot]\!] : \Lambda_X \to \mathcal{C}\) such that for any term \(t \in \Lambda_X\) , \([\![t]\!] \in \mathcal{C}(1, [\![\sigma]\!])\) where \(\sigma\) is the type of \(t\). We will only consider well-typed interpretations here. Moreover, it can be proved that the interpretation function is sound and complete.  The completeness statement reads as follows. For all terms \(t_1\) and \(t_2\),</p>

\[t_1 \equiv_{\beta\eta} t_2 \text{ iff } [\![t_1]\!] = [\![t_2]\!]\]

<p>where the \((\Rightarrow)\) direction is soundness whereas \((\Leftarrow)\) is <em>completeness of the interpretation</em>.</p>

<p>This statement is certainly true. If two terms are \(\beta\eta\) equivalent they are equal in the model, i.e. the semantics is agnostic to \(\beta\eta\)-step reductions. Conversely, all equations that hold for any two STLC-denotable terms also hold in the syntax.</p>

<p>However, completeness of a model is a slightly different statement:</p>

\[t_1 =_{\beta\eta} t_2 \text{ iff for all } [\![ \cdot ]\!] : \Lambda_X \to \mathcal{C}, [\![t_1]\!] = [\![t_2]\!]\]

<p>This one states that, having fixed a category \(\mathcal{C}\), \(\beta\eta\)-equivalence between terms holds if and only if the two terms are equal under <strong>every</strong> possible interpretation.</p>

<p>In this sense, CCCs are not <em>complete models</em>. The counterexample is given by a preorder category \(\mathcal{P}\) with CCC structure. A preorder category has at most one morphism (\(\sqsubset\)) between any two objects. If this category has a greatest element \(\top\), binary meets (\(\wedge\)) and Heyting implications (\(\to\)), then \(\mathcal{P}\) is a CCC.</p>

<p>Now the problem is that every (well-typed) interpretation sends two programs of the same type to morphisms with the same domain and codomain, and since the category is thin these two morphisms are always equal.
For example, consider the projection maps out of the product, \(x \wedge x \xrightarrow{\pi_1} x\) and \(x \wedge x \xrightarrow{\pi_2} x\), for the particular case where the codomains of the two coincide. In \(\mathcal{P}\) these are the same map, i.e. \(\pi_1 = \pi_2\).</p>

<p>Now the right-hand side of the completeness theorem is satisfied, since for all well-typed interpretations \([\![\cdot]\!]\) we have \([\![\pi_1]\!] = [\![\pi_2]\!]\) (when the codomains of the two coincide). However, the projections \(\pi_1\) and \(\pi_2\) in the syntax are definitely not \(\beta\eta\)-equivalent.</p>

<p>I will refer the reader to the <a href="https://link.springer.com/chapter/10.1007/BFb0014068">original paper</a> for more details.</p>]]></content><author><name>Marco Paviotti</name><email>m.paviotti@kent.ac.uk</email></author><category term="semantics" /><category term="categories" /><category term="stlc" /><summary type="html"><![CDATA[Cartesian closed categories are not regarded as complete models of the Simply Typed \(\lambda\)-calculus in the traditional sense. Let’s see why.]]></summary></entry><entry><title type="html">The Axiom of Choice in Type Theory</title><link href="https://mpaviotti.github.io/posts/2022/11/Axiom-Choice/" rel="alternate" type="text/html" title="The Axiom of Choice in Type Theory" /><published>2022-11-25T13:14:21+00:00</published><updated>2022-11-25T13:14:21+00:00</updated><id>https://mpaviotti.github.io/posts/2022/11/axiom-of-choice</id><content type="html" xml:base="https://mpaviotti.github.io/posts/2022/11/Axiom-Choice/"><![CDATA[<p>The Axiom of Choice (AC) is an axiom that states that the product of a family of non-empty sets is itself non-empty. This is a rather controversial axiom amongst mathematicians but in type theory this axiom is provable within the logic.</p>

<p>First off, I do not consider myself an expert on set theory, but after having this kind of conversation with mathematicians and computer scientists I have found that <em>there are</em> some
misconceptions around this axiom and the reasons why it is needed.</p>

<p>For example, as you will see, it is indeed true that the axiom of choice is connected with the existential quantifier; it is not true, however, that we cannot pick an element out of the existential because the logic is classical.</p>

<p>In my mind there are two problems: the first is that</p>

<blockquote>
  <p>the existential quantifier does not ensure there exists <strong>one</strong> element with a particular property in the domain of discourse</p>
</blockquote>

<p>and the second is that</p>

<blockquote>
  <p>we would need to create a infinite proof that uses Existential Instantiation for each element of the indexing set</p>
</blockquote>

<p>However, in order to fully understand what is going on we need to be more precise. 
So first let’s begin with what is the axiom of choice.</p>

<h3 id="the-axiom-of-choice-ac">The axiom of choice (AC)</h3>
<p>The original formulation of the AC is the following.</p>

<blockquote>
  <p>Given a set \(X\) and a family of non-empty sets \(\{A_x\}_{x \in X}\) over
\(X\), the infinite product of these sets, namely  \(\Pi_{x \in X}. A_{x}\) is
non-empty</p>
</blockquote>

<p>For the record, the infinite product is defined as follows</p>

\[\Pi_{x \in X}. A_{x} = \{ f : X \to \bigcup_{x \in X} A_{x} \mid f(x) \in A_{x} \}\]

<p>However, this statement is a little bit more packed than we would like it to be. An equivalent statement is skolemization.</p>

<h3 id="skolemization-sk">Skolemization (Sk)</h3>
<p>Skolemization is what allows one to turn an existentially quantified formula into a function. Formally, skolemization is the following statement</p>

<blockquote>
  <p>Given a relation \(R \subseteq X \times Y\), if \(\forall x \in X. \exists y \in Y. R(x,y)\) then \(\exists f \in X \to Y. \forall x \in X. R (x, f(x))\)</p>
</blockquote>

<p>The AC is equivalent to Skolemization. A full discussion of this fact can be
found
<a href="https://mathoverflow.net/questions/191010/when-does-skolemization-require-the-axiom-of-choice">here</a>.</p>

<p>To prove that Sk \(\Rightarrow\) AC, for a family of sets \(\{A_{x}\}_{x \in
X}\), we define the relation \(R(x,y) = y \in A_{x}\).  For the other direction we
assume a relation \(R \subseteq X \times Y\) and construct the family of
sets \(\{A_{x}\}_{x \in X}\) where each \(A_{x} = \{ y \mid y \in Y \text{
and } R(x,y)\}\).</p>

<h3 id="the-existential">The existential</h3>
<p>Set theory is a first-order logic together with a set of axioms (9 of them
exactly including the AC) postulating the existence of certain sets. Besides the
propositional fragment of first-order logic there is also the predicate fragment
formed by <em>universal quantification</em> (\(\forall\)) and <em>existential
quantification</em> (\(\exists\)).</p>

<p>The Existential Instantiation rule states that if we know there exists an \(x\)
that satisfies the property \(P\) and we can construct a proof from a fresh
\(t\) that satisfies that property to a proposition \(R\) then we can obtain
\(R\)</p>

\[\frac{\exists x. P \qquad t, P[t/x]\cdots R }{R}\]

<p>with \(t\) free for \(x\) in \(P\).</p>

<p>So here we have to treat \(t\) carefully in that it is a fresh \(t\) that
satisfies \(P\), but “we do not know what it is!”.</p>

<p>The reason why I put this sentence in quotes is because this is the explanation
that many people would use. However, to me the real reason is that <em>we do not
know how many other elements in the universe exist with such a property</em>. There
is certainly one, but there may be more.</p>

<h3 id="the-problem-with-producing-a-choice-function">The problem with producing a choice function</h3>

<p>To prove Sk we have to assume \(\forall x \in X. \exists y \in Y. R(x, y)\) and then
prove \(\exists f : X \to Y. \forall x \in X. R (x , f (x))\). Here \(f : X \to Y\) really means a relation \(f \subseteq X \times Y\) that is a function, <em>i.e.</em> for every \(x \in X\) there exists exactly one \(y \in Y\) such that \((x,y) \in f\).</p>

<p>Now first we try to construct this relation \(f\). A first naive attempt is to use the axiom of comprehension as follows</p>

\[f = \{(x, y) \mid x \in X \wedge y \in Y \wedge R(x, y)\}\]

<p>The problem is that \(f\) is clearly not a function, since there may be more than one \(y\) per \(x\) in \(R\). Notice that the above statement is very similar to the one where we include the existential</p>

\[f = \{(x, y) \mid x \in X \wedge \exists y'. y = y' \wedge R(x, y)\}\]

<p>But this does not change much from before since we know there exists at least one \(y\) per every \(x\) but we do not know how many. Clearly, we can prove that for all \(x \in X\) we have \(R(x, f(x))\), however, we cannot prove that \(f\) is a function. In particular, that for each \(x \in X\) we have a <strong>unique</strong> \(y \in Y\) we map \(x\) to.</p>

<p>Now the question is, couldn’t we just have picked one \(y\) for each \(x\)?</p>

<p>We could do this if we were able to use Existential Instantiation for each \(x \in
X\). If \(X\) were finite then we could certainly do that: we pick an \(n \in \mathbb{N}\) and assume that \(X = \{x_0, x_1, \dots, x_n \}\).<br />
Now we can construct a set of pairs \((x_i, y_i)_{i\in \{0,\dots,n\}}\) such that every \((x_i, y_i) \in R\) by repeatedly using Existential Instantiation. Once the set is created we can assign \(f\) to it</p>

\[f = \{(x_0,y_0), (x_1,y_1), \dots, (x_n, y_n)\}\]

<p>However, when \(X\) is not finite, we cannot simply <em>write down</em> the set by hand. Instead we have to create a formula and then use set comprehension. However, there is no (open) formula of the form</p>

\[(x_0,y_0) \in R \wedge (x_1,y_1) \in R \wedge \dots \wedge (x_n, y_n) \in R \wedge \dots\]

<p>This is because formulas and proofs in set theory are finite, and the one above is an infinite formula which would need a (potentially) infinite number of applications of the Existential Instantiation rule.</p>

<h3 id="conclusions">Conclusions</h3>
<p>Hopefully this untangles some confusion around the axiom of choice.</p>

<p>On the other hand, AC is derivable in Type Theory simply because we have access to the proof that for every \(x\) there exists a \(y\) such that \(R(x,y)\). And the reason we can pick exactly one such \(y\) is that inhabitants of the dependent product \(\forall\) are already functions.</p>

<p>See the code below.</p>

<figure class="highlight"><pre><code class="language-agda" data-lang="agda">choice : ∀ (A B : Set) → ∀ (R : A →  B → Set) → (∀ (x : A) → Σ B (λ y → R x y)) → Σ (A → B) (λ f → ∀ x → R x (f x))
choice A B R r = (λ x →  proj₁ (r x)) , (λ x → proj₂ (r x)) </code></pre></figure>
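<p>For contrast, here is a proof-irrelevant Haskell shadow of the same Agda term (my own sketch, with the relation's proof component collapsed to plain data): a function into pairs splits into a pair of functions, which is all the type-theoretic AC is doing.</p>

```haskell
-- (forall x. Sigma y. R x y)  becomes  (Sigma f. forall x. R x (f x)),
-- with the witness and the "proof" both erased to ordinary data.
choice :: (a -> (b, r)) -> (a -> b, a -> r)
choice h = (fst . h, snd . h)
```

<p>The witness function and the pointwise evidence are simply the two projections of the assumed function <code>h</code>, exactly mirroring the Agda term above.</p>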

<p>If you have any comment about this please feel free to drop me an email or something; I would be very happy to know more (especially if I said something wrong).</p>

]]></content><author><name>Marco Paviotti</name><email>m.paviotti@kent.ac.uk</email></author><category term="set theory" /><category term="foundations" /><summary type="html"><![CDATA[The Axiom of Choice (AC) is an axiom that states that the product of a family of non-empty sets is itself non-empty. This is a rather controversial axiom amongst mathematicians but in type theory this axiom is provable within the logic.]]></summary></entry><entry><title type="html">Inconsistencies in Cartesian Closed Categories with fixed-points</title><link href="https://mpaviotti.github.io/posts/2022/11/CCC-FixedPoints/" rel="alternate" type="text/html" title="Inconsistencies in Cartesian Closed Categories with fixed-points" /><published>2022-11-10T13:14:21+00:00</published><updated>2022-11-10T13:14:21+00:00</updated><id>https://mpaviotti.github.io/posts/2022/11/fixed-points-CCS</id><content type="html" xml:base="https://mpaviotti.github.io/posts/2022/11/CCC-FixedPoints/"><![CDATA[<p>Any Cartesian Closed Category (CCC) with an initial object and a fixed-point operator is trivial. Essentially this means that in languages like Haskell the empty type is not actually empty as it contains the non-terminating computation. Perhaps this is obvious, but here’s the categorical explanation.</p>

<p>Here the word <em>trivial</em> means that every object \(A\) in the category is isomorphic to the terminal object \(1\).</p>

<p>To prove this we make use of the fixed-point operator, which exists at all types.</p>

<p>We know that for all endomaps \(f : A \to A\) in the category there exists a map
\(\text{fix}_{f} : 1 \to A\) such that \(f \circ \text{fix}_{f} =
\text{fix}_{f}\). Thus, we can use the unique endomap on the initial object,
namely the identity map \(id_{0}: 0 \to 0\), to get a map \(\text{fix}_{id_{0}} :
1 \to 0\). But now, because \(0\) is initial (and \(1\) is terminal), we also
have a unique map into the terminal object, namely \(! : 0 \to 1\). It is easy
to see that \(\text{fix}_{id_{0}}\) and \(!\) are inverses to each other, hence
they form an isomorphism \(0 \cong 1\).
In particular, \(\text{fix}_{id_{0}} \circ ! : 0 \to 0\) is \(id_{0}\) by initiality and
\(! \circ \text{fix}_{id_{0}} : 1 \to 1\) is \(id_{1}\) by finality.</p>

<p>Now we compute as follows. For every object \(A\) in the category  \(1 \cong 0
\cong 0 \times A \cong 1 \times A \cong A\) and the proof is concluded.</p>

<p>This result has been shown to hold also in the case where, instead of the
initial object, we postulate a natural numbers object \(\mathbb{N}\).</p>

<p>A natural question to ask now is:</p>

<blockquote>
  <p>is every model of PCF trivial?</p>
</blockquote>

<p>To answer this question we take as a model of PCF the category of <a href="https://en.wikipedia.org/wiki/Scott_domain">Scott domains</a>.
This category consists of pointed directed complete partial orders
(dCPPO) as objects and continuous functions as arrows (just following <a href="https://www.amazon.co.uk/Domain-Theoretic-Foundations-Functional-Programming-Streicher/dp/9812701427">Thomas Streicher’s
book</a> to avoid any misunderstanding).</p>

<p>Now, we would like to prove that this category is cartesian closed (which we know), has a fixed-point map (which it has) 
and that it has an initial object. However,</p>

<blockquote>
  <p>there is no initial object in the category of Scott domains</p>
</blockquote>

<p>This is because if this category had an initial object \(0\) it would have at least a bottom
element \(\bot_0\). Notice that the subset \(\{\bot_0\}\) is indeed directed and
its supremum \(\bigsqcup \{\bot_0\}\) is \(\bot_0\) itself. Now if we take any
other dCPPO \(X\), a continuous function \(f : 0 \to X\) that maps \(\bot_{0}\) to any element \(x \in X\)
will satisfy the equation</p>

\[f \bigsqcup \{\bot_0\} = \bigsqcup f \{\bot_0\}\]

<p>because, for any \(x \in X\) we choose for \(f(\bot_0)\) (even the bottom element), \(\bigsqcup f \{\bot_0\} = \bigsqcup \{x\} = x\). Hence every choice of \(x\) yields a continuous map \(f : 0 \to X\), so there is no unique map out of \(0\) and \(0\) cannot be initial.</p>

<p>The only way this category could have an initial object is if the arrows in the
category were <em>strict</em>, namely if they preserved \(\bot\) elements; but, as we have
seen, continuous functions do not necessarily preserve them.</p>

<blockquote>
  <p>Is this just a coincidence that Scott’s model is not trivial?</p>
</blockquote>

<p>Not really. Because if it were trivial it would break computational adequacy, which is the statement
that for every pair of well-typed terms in the language \(\Gamma \vdash t : A\)
and \(\Gamma \vdash t' : A\)</p>

<p>if \([\![ t ]\!] = [\![ t' ]\!]\) then \(t \approx t'\)</p>

<p>where \(\approx\) is contextual equivalence of programs.</p>

<p>But if the model were trivial then all pairs of PCF-denotable terms (pairs
of maps into something isomorphic to \(1\)) would be equal (by finality) and
therefore operationally equivalent.</p>

<h3 id="what-does-this-all-mean-for-the-haskell-programmer">What does this all mean for the Haskell programmer?</h3>

<p>Well nothing, because Haskell does not have a formal model.</p>

<p>But let’s say we make a big leap and take the fragment of Haskell consisting of
“inductive data types” and recursion. Now I can craft a program that resembles
what I just said above</p>

<figure class="highlight"><pre><code class="language-haskell" data-lang="haskell"><span class="cp">{-# LANGUAGE GADTs #-}</span>

<span class="kr">data</span> <span class="kt">Empty</span> <span class="kr">where</span>

<span class="kr">data</span> <span class="kt">Unit</span> <span class="o">=</span> <span class="kt">One</span> <span class="nb">()</span>

<span class="n">y</span> <span class="o">::</span> <span class="p">(</span><span class="n">a</span> <span class="o">-&gt;</span> <span class="n">a</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="n">a</span>
<span class="n">y</span> <span class="n">f</span> <span class="o">=</span> <span class="n">f</span> <span class="p">(</span><span class="n">y</span> <span class="n">f</span><span class="p">)</span>

<span class="n">empty</span> <span class="o">::</span> <span class="kt">Empty</span> <span class="o">-&gt;</span> <span class="kt">Empty</span>
<span class="n">empty</span> <span class="n">x</span> <span class="o">=</span> <span class="n">x</span>

<span class="p">(</span><span class="o">===</span><span class="p">)</span> <span class="o">::</span> <span class="n">a</span> <span class="o">-&gt;</span> <span class="n">a</span> <span class="o">-&gt;</span> <span class="n">a</span>
<span class="n">x</span> <span class="o">===</span> <span class="n">y</span> <span class="o">=</span> <span class="n">y</span> <span class="c1">-- used only to write equational-reasoning steps</span>

<span class="n">endoEmpty</span> <span class="o">::</span> <span class="kt">Unit</span> <span class="o">-&gt;</span> <span class="kt">Empty</span>
<span class="n">endoEmpty</span> <span class="o">=</span> <span class="n">y</span> <span class="n">id</span> <span class="o">===</span> <span class="n">id</span> <span class="p">(</span><span class="n">y</span> <span class="n">id</span><span class="p">)</span> <span class="c1">-- by Fixed-point property y f = f (y f)</span></code></pre></figure>

<p>Is this a problem? No, this is not a problem because <code class="language-plaintext highlighter-rouge">y id</code> is the infinite
computation. In other words, it sends the unit element to \(\bot\). But since
Haskell functions need not be strict, I can send the \(\bot\) element in
<code class="language-plaintext highlighter-rouge">Empty</code> to <code class="language-plaintext highlighter-rouge">One ()</code>. So this map is not an isomorphism.</p>

<h3 id="conclusions">Conclusions</h3>
<p>This is probably a very convoluted way of saying</p>
<blockquote>
  <p>There is no initial object (or natural numbers object) in PCF (or other “PCF-like” languages like Haskell)</p>
</blockquote>

<p>this is because <code class="language-plaintext highlighter-rouge">Empty</code> actually contains the bottom element \(\bot\). 
For the same reasons, if we now consider System F with a polymorphic fixed-point operator and define the \(0\) object by setting</p>

\[0 = \forall x . x\]

<p>This object actually has an inhabitant: the non-terminating computation. Thus, it is not the initial object.</p>]]></content><author><name>Marco Paviotti</name><email>m.paviotti@kent.ac.uk</email></author><category term="semantics" /><category term="categories" /><category term="recursion" /><summary type="html"><![CDATA[Any Cartesian Closed Category (CCC) with an initial object and a fixed-point operator is trivial. Essentially this means that in languages like Haskell the empty type is not actually empty as it contains the non-terminating computation. Perhaps this is obvious, but here’s the categorical explanation.]]></summary></entry></feed>