We are looking at a triple of correlations that relate three variables:
In my previous post, we saw these pictures of triples of correlations:
The yellow shape is the space of possible correlations.
On the left, we intersect with the plane $z = 0$ (i.e. we set one of the correlations to zero).
On the right, the plane is $x + y + z = 1$ (i.e. the three correlations sum to one).
The intersection of the yellow shape with the blue plane is all correlations where these extra linear conditions hold. The intersection seems to look different in the two pictures. How can we describe the difference? What are the possible shapes of intersections that can occur?
These questions were studied in the paper Nets of Conics by C.T.C. Wall from 1977. In the paper, Wall gives an algebraic classification, with 26 possibilities for what the intersection can look like:
Where do our two examples fit into Wall’s classification?
To see this, we look at the points on the boundary of the yellow shape. Such a point gives a matrix
$$\begin{pmatrix} 1 & x & y \\ x & 1 & z \\ y & z & 1 \end{pmatrix}$$
with determinant equal to zero. To get Wall’s pictures, the boundary is extended to all points where the determinant vanishes, ignoring the requirements that come from correlations (e.g. we now get points with coordinates outside of the range -1 to 1). In our first example, we get the extended picture:
or, from a different angle
Wall’s pictures plot the intersection of the yellow shape with the blue plane, shading in the correlations. For our first example, we get a shaded circle:
All of Wall’s intersections have cubic (degree three) equations, but here we have a circle, which is degree two. We recover the cubic if we apply a change of coordinates:
In terms of equations, the full cubic polynomial is $w(w^2 - x^2 - y^2)$, obtained by homogenizing the determinant with a new coordinate $w$, which factors as the line $w = 0$ times the conic $w^2 - x^2 - y^2 = 0$.
Table 1 of Wall’s paper tells us we are in type D*. Then, from Wall’s Figure 3 above, we see that our first example is sub-type D*c:
We follow the same recipe to find our second example in Wall’s classification. First, we extend the picture of the yellow shape and the blue plane:
Already we start to see some differences between the two examples. The intersection of the yellow shape with the blue plane gives the picture
The intersection is a cubic curve (in fact an elliptic curve). Its defining polynomial can’t be factored, and the curve has no singular points. This means it is in Wall’s type A, which has four sub-types. Wall’s Figure 2, above, shows that it is sub-type Ab or Ac, since it has a shaded region. We distinguish Ab from Ac using what Wall calls a “preferred point”.
To find the preferred point, we apply a change of coordinates to convert the cubic to the canonical form $y^2 = f(x)$, where $f$ is a cubic (e.g. following the instructions here). In the new coordinates, the cubic is:
The preferred point comes from the rational root of the cubic $f$ in the canonical form: $f$ has a linear factor, whose root gives the preferred point, the black point marked.
Going back to Wall’s pictures, we find that we’re in sub-type Ab:
In his paper, Wall says this classification is “intrinsically interesting, and involves some pleasant geometry”.
In my previous post we saw how the above shape (called the elliptope, or the samosa) shows up when looking at correlations between three variables. The samosa is all points that are a triple of correlations of three variables. For example, $(0.5, 0.5, 0.5)$ is in the samosa, because it is possible to have three variables with all correlations 0.5. But $(1, 1, 0)$ is not, because if two variables have perfect correlation with a third, they must also be perfectly correlated with each other.
Put another way, the samosa is the set of points $(x, y, z)$ where the matrix
$$\begin{pmatrix} 1 & x & y \\ x & 1 & z \\ y & z & 1 \end{pmatrix}$$
has non-negative determinant, for $x$, $y$, $z$ in the range from -1 to 1. If we know two of the three correlations, we can use this description of the samosa to find the possibilities for the third correlation. This gives the line segments in the previous post.
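This description translates directly into a quick numerical membership test. Here is a sketch in Python; the closed form $1 - x^2 - y^2 - z^2 + 2xyz$ comes from expanding the $3 \times 3$ determinant by hand:

```python
def in_samosa(x, y, z):
    """Membership test for the samosa, following the description in the
    post: all three correlations lie in [-1, 1], and the determinant of
    [[1, x, y], [x, 1, z], [y, z, 1]] is non-negative."""
    if not all(-1 <= t <= 1 for t in (x, y, z)):
        return False
    det = 1 - x**2 - y**2 - z**2 + 2 * x * y * z
    return det >= 0

print(in_samosa(0.5, 0.5, 0.5))  # True: all correlations 0.5 are possible
print(in_samosa(1, 1, 0))        # False: two perfect correlations force the third
```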
What if we only know one of the three correlations? Maybe we know that correlation $z$ is zero. What are the possible values of correlations $x$ and $y$?
The space of three correlations intersected with the plane $z = 0$.
We can intersect the plane $z = 0$ with the samosa. The points that lie in the samosa, and on this plane, are all the valid triples of correlations $(x, y, 0)$.
If we knew the three correlations sum to one, we would intersect the samosa with the plane $x + y + z = 1$.
The space of three correlations intersected with the plane $x + y + z = 1$.
The situation of intersecting the samosa with a plane arises in the study of certain statistical models. For example, it can be seen in this figure from “The Handbook of Graphical Models”:
In the top-left picture, the yellow shape is the samosa. It has been intersected with a green plane. The points on the green plane and in the samosa are inside the black curve on the top-right picture.
These pictures are cartoons of graphical models (statistical models that follow the structure of a graph). A Gaussian graphical model arises by intersecting the samosa (or its higher-dimensional analogue) with a particular linear space. Each point in the samosa is now not a collection of correlations, but rather an inverse covariance matrix.
When studying such graphical models, it is helpful to look at the inverses of the matrices. This is where the second row of pictures in the above figure comes in. In the lower left picture, the red shape gives the inverses of matrices of the form
$$\begin{pmatrix} a & x & y \\ x & a & z \\ y & z & a \end{pmatrix}.$$
The red shape is obtained by projecting the inverse matrices using sufficient statistics.
Here is some Macaulay2 code I used to compute the surface. It is also described in Example 5.3 here.
R = QQ[a,x,y,z,s11,s12,s13,s22,s23,s33];
K = matrix{{a,x,y},{x,a,z},{y,z,a}};  -- concentration (inverse covariance) matrix
S = matrix{{s11,s12,s13},{s12,s22,s23},{s13,s23,s33}};  -- covariance matrix
M = K*S - matrix{{1,0,0},{0,1,0},{0,0,1}};  -- impose K*S = identity, i.e. S = K^(-1)
I = minors(1,M);  -- the 1x1 minors are just the entries of M
J = eliminate(I,{a,x,y,z});  -- eliminate the concentration variables
R1 = QQ[s11,s12,s13,s22,s23,s33,t1,t2,t3,t4];
d = minors(3,S); d = substitute(d,R1);  -- impose det(S) = 0
J = substitute(J,R1);
J = J + d + ideal(t1 - s11 - s22 - s33, t2 - s12, t3 - s13, t4 - s23);  -- the sufficient statistics
K = eliminate(J,{s11,s12,s13,s22,s23,s33});  -- project onto the sufficient statistics
dc = decompose K  -- irreducible components of the projection
Let’s imagine a hypothetical situation. There’s an infection going round, and we want to predict the future severity of someone’s illness.
There is a test that offers a good prediction. Let’s say the outcome of the test has a correlation of 0.78 with the patient’s severity of infection. The problem with the test is that it is expensive and time-consuming. But there’s an alternative test, which is much cheaper and faster. We don’t know how well the cheap test correlates with the severity of infection, but we know the correlation between the cheap test and the expensive test is quite high, at 0.89.
We have three related correlations – two known, and one unknown.
What can we say about how well the cheap test correlates with the severity of infection?
We might expect to be able to say something about the unknown correlation. For example, if the expensive test had a correlation of 1 with the severity of infection, and the cheap test also had a correlation of 1 with the expensive test, then everything is perfectly correlated and the cheap test must also have a correlation of 1 with the severity of infection.
But, let’s assume the expensive test only has a correlation of 0.5 with the severity of infection, and that the two tests are also only correlated with correlation 0.5. Now it isn’t clear whether we can say anything about the correlation of the cheap test with the severity of infection.
Let’s go back to the numbers in the original example.
We can organise our correlations into a matrix. We have three variables:
severity of infection
expensive test outcome
cheap test outcome
And we build the matrix:
The $(i, j)$ entry of the matrix gives the correlation between variable $i$ and variable $j$. Each variable has a perfect correlation with itself, so the diagonal entries of the matrix are equal to 1. In addition, the correlation between $i$ and $j$ is the same as the correlation between $j$ and $i$, so the matrix is symmetric.
A matrix of correlations has to be positive semi-definite. This condition will allow us to find the range of possible values for the unknown correlation.
But first, let’s consider the case where all three correlations are unknown. We then have correlation matrix
$$\begin{pmatrix} 1 & x & y \\ x & 1 & z \\ y & z & 1 \end{pmatrix}$$
The region of possible values for the correlations are the values for which the matrix is positive semi-definite. This is a region of 3-dimensional space that looks like this:
It is called the elliptope, a name that dates back at least as far as 1996 (see here) and is also called the samosa, a name that dates back at least as far as 2011 (see here).
We can now fix values for our two known correlations, to see the possible values for the third correlation. The possibilities are all values that keep the triple of correlations inside the samosa.
If we know two correlations are 0.78 and 0.89, the range of possibilities for the third correlation is given by the black line
We can find the upper and lower limit of the third correlation by seeing where the black line intersects the boundary of the samosa. The boundary is defined by setting the determinant of the matrix
$$\begin{pmatrix} 1 & 0.78 & 0.89 \\ 0.78 & 1 & z \\ 0.89 & z & 1 \end{pmatrix}$$
to be zero. We get a quadratic polynomial in $z$ with roots at approximately 0.41 and 0.98. So the third correlation has to lie in the range 0.41 to 0.98.
So for the infection example, although the cheap test has a high correlation with the expensive test, in the worst case it only offers a correlation of 0.41 with the severity of infection.
If the two known correlations had both been 0.5, a similar computation shows that the third correlation has to lie in the range -0.5 to 1. The third correlation could even be negative!
If two correlations take values 0.5, the third correlation could range from -0.5 to 1, the values along the black line
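Both computations can be checked numerically. Setting the determinant to zero and solving the quadratic in the unknown correlation $t$ gives $t = ab \pm \sqrt{(1 - a^2)(1 - b^2)}$ for known correlations $a$ and $b$; a sketch in Python:

```python
from math import sqrt

def third_corr_range(a, b):
    """Given corr(X, Y) = a and corr(Y, Z) = b, return the interval of
    possible values for t = corr(X, Z).  Setting the determinant
    1 - a^2 - b^2 - t^2 + 2abt to zero and solving the quadratic in t
    gives t = ab +/- sqrt((1 - a^2)(1 - b^2))."""
    half_width = sqrt((1 - a**2) * (1 - b**2))
    return a * b - half_width, a * b + half_width

lo, hi = third_corr_range(0.78, 0.89)
print(round(lo, 2), round(hi, 2))   # 0.41 0.98, as in the infection example

lo, hi = third_corr_range(0.5, 0.5)
print(lo, hi)                        # -0.5 1.0
```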
This is the first part of a small series of “correlated” posts that I hope to write about correlations – stay tuned for more!
As Anna said, time has flown since we regularly wrote here, so let’s get going again!
An application I wrote recently included a summary for ‘the general public’. Here’s an excerpt, about some diagram algebras I have been studying.
Consider 2n holes in the ground, and n moles which have to pop out of one hole and make their way to a free hole. If we consider the paths these moles create we get a diagram belonging to some of the diagram algebras I will study: allowing the moles to cross each others’ paths gives us the Brauer algebra, and if we don’t allow them to cross we get the Temperley-Lieb algebra. Diagram algebras have strong connections with physics: replacing our moles with particles gives a physicist’s connection diagram.
Let’s unpack this a little, with some pictures. We have 2n holes, and n moles, who each have a start and an end hole, and create a path. Let’s take n=3.
Here are some moles expertly drawn. They have to move forwards (they can’t move around the back of holes) and make their way to a free one.
In the Brauer case, these paths are allowed to cross:
But in the Temperley-Lieb case, they aren’t:
(There are some more interesting looking paths for this, but our n=3 case doesn’t have so many!)
Of course, (disclaimer) these are only the basic building blocks of these algebras and explained in a very non-mathematical way! (Moles rhymes with holes so why not!)
In maths, we package this information a little differently, with the basic building block (or basis) elements given by matchings (for the Brauer algebra) or planar matchings (for the Temperley-Lieb algebra) of the union of the sets
{-n, …, -1} and {1, …, n}.
Which essentially means we line n of our 2n ‘holes’ up on one side, and n on the other. This allows for operations in the algebra, but that’s a different story. Here are some elements of the Brauer and Temperley-Lieb algebras drawn in this way:
An element of the Brauer algebra Br5
An element of the Temperley-Lieb algebra TL5
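For small n we can count these basis elements directly. The sketch below (in Python; the encoding of the diagram endpoints as points around a circle is my own choice for illustration, not from the post) enumerates all matchings of 6 endpoints, and the planar ones among them:

```python
from itertools import combinations

def matchings(points):
    """All perfect matchings of an even-sized list of points."""
    if not points:
        return [[]]
    first, rest = points[0], points[1:]
    result = []
    for i, p in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for m in matchings(remaining):
            result.append([(first, p)] + m)
    return result

def crossing(p1, p2):
    """Two chords of a circle cross iff exactly one endpoint of the
    second lies strictly between the endpoints of the first."""
    (a, b), (c, d) = sorted(p1), sorted(p2)
    return a < c < b < d or c < a < d < b

# For n = 3, place the 2n endpoints in circular order around a disc:
# n on one side, n on the other, so planar diagrams = non-crossing matchings.
n = 3
all_m = matchings(list(range(2 * n)))
planar = [m for m in all_m
          if not any(crossing(p, q) for p, q in combinations(m, 2))]

print(len(all_m))   # 15 matchings: a basis of the Brauer algebra Br3
print(len(planar))  # 5 = Catalan(3): a basis of the Temperley-Lieb algebra TL3
```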
There are many areas of mathematics in which these algebras, and other diagram algebras, are well-used. For instance, they are very important in representation theory, knot theory and topological quantum field theory, and as I mentioned before they describe connection diagrams in physics.
First of all let me apologise for my lack of recent posts, finishing a PhD and moving countries is exhausting!
I recently gave a learning seminar on knots, and I would like to share some of my discoveries with you in this post.
What is a knot?
We all create knots in our daily lives, by tying our shoelaces or tangling our headphones. But mathematically a knot is the following:
“A knot is an oriented, locally flat embedding of $S^1$ into $S^3$“.
We can think of this as taking a string, tying some sort of knot and then gluing the two ends of the string together! We then add an arrow to ‘orient’ the knot: imagine the string flowing in one direction.
The ‘locally flat’ roughly means that zooming in to any section of our knot, we see what looks like a straight line travelling through space. The fact that we are mapping the circle into the three-sphere can be ignored for now: we can just imagine that $S^3$ is the Euclidean space $\mathbb{R}^3$.
Knot diagrams
We can view a knot in the form of a knot diagram $D$. This is the projection of the knot onto a plane in $\mathbb{R}^3$, where we symbolise which part of our string is behind another part by using under and over crossings. Here are some examples of well known knots and their diagrams:
In 1932 Reidemeister proved that any two diagrams that represent equivalent knots are related by a sequence of Reidemeister moves, alongside smooth deformations that preserve the arcs and crossings. These moves are as follows:
Knot adjectives!
When we define a new type of mathematical object, such as a group, topological space, or knot, it helps to be able to describe specific subsets of those objects, and so we define characteristics that a group/space/knot may have. For instance a group could be cyclic or abelian, a space could be connected or compact, and a knot could be… Below we will introduce some adjectives!
Prime
A prime knot is one that cannot be written as the connect sum of two or more knots, which aren’t the unknot. The connect sum operation takes two knots, cuts them, and then glues them to each other as below. We have to make sure that we do this in a way which agrees with the orientations.
The unknot and the trefoil are both prime knots, whereas the granny knot is the connect sum of a trefoil with itself and the reef knot is the connect sum of the trefoil with its reflection!
Ribbon
A knot is called ribbon if it bounds a self intersecting disc with only ribbon singularities. A ribbon singularity is shown below, where the knot is in red and the disk that it bounds is in blue.
The reef knot is an example of a ribbon knot:
Alternating
An alternating knot is one such that, when you travel along the knot diagram you encounter an ‘over then under then over then under’ pattern in the crossings. The trefoil is an example of an alternating knot: try placing your finger on the trefoil and flowing along the string!
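The ‘over then under’ pattern is easy to check mechanically once the crossing encounters along the diagram are written down. A toy sketch in Python (the 'O'/'U' encoding of a walk along the diagram is a simplification I've chosen for illustration):

```python
def is_alternating(crossings):
    """Check for the over, under, over, under pattern as we travel
    along the diagram.  Each crossing encounter is recorded as 'O'
    (we pass over) or 'U' (we pass under), in the order we meet them."""
    return all(a != b for a, b in zip(crossings, crossings[1:]))

# Travelling along the standard trefoil diagram we meet its three
# crossings twice each, alternating all the way round:
print(is_alternating("OUOUOU"))  # True
print(is_alternating("OUOUUO"))  # False: two unders in a row
```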
That’s all the knotty adjectives I have time for now: I’ll be back soon to explore the relationship between knots and braids!
Before you ask a mathematician if they can visualize the fourth dimension, ask them if they can truly visualize a three-dimensional object, like the boundary of a four-dimensional football. If they tell you it’s easy, and their name isn’t Maryna Viazovska, they’re probably lying.
Making an accurate picture of an object from a high dimensional space is very challenging. In this blog post we’ll see a surprising case where it turns out to be possible. We’ll visualize an interesting seven-dimensional object, which comes from a question in statistics.
Let’s consider the probability that each of the teams in the quarter-finals of the Men’s FIFA 2018 World Cup would win. The teams were (Uruguay, France, Brazil, Belgium, Russia, Croatia, Sweden, England). Today we know the probabilities of the teams winning, in that order, are $(0, 1, 0, 0, 0, 0, 0, 0)$, because France has already won. Back on 3rd July the probabilities (according to FiveThirtyEight) were , and on 7th July the probabilities were .
In a recent project we were studying which probability distributions lie in a particular statistical model. We found out that our statistical model is given by inequalities that the eight probabilities need to satisfy. If we call the probabilities $p_1, \dots, p_8$, the inequalities are:
The probabilities have to sum to 1, so $p_1 + p_2 + \cdots + p_8 = 1$. We want to visualize the part of seven-dimensional space in which the inequalities hold. How can we do it?
The first step is to notice that some combinations of letters do not affect whether the inequalities hold or not. They are:
So we can apply a change of coordinates that removes these three directions, leaving something four-dimensional. Finally, to get something three-dimensional we can assume that the four remaining coordinates lie on the sphere.
We end up with a picture that looks like this:
The part of space that lies inside the statistical model consists of the points outside the blue, green and yellow blobs.
These days, we have an even better way to visualize the statistical model, truly in 3D. It even doubles-up as a handmade toy for children.
Duality relates objects, which seem different at first but turn out to be similar. The concept of duality occurs almost everywhere in maths. If two objects seem different but are actually the same, we can view each object in a “usual” way, and in a “dual” way – the new vantage point is helpful for new understanding of the object. In this blog post we’ll see a pictorial example of a mathematical duality.
How are these two graphs related?
In the first graph, we have five vertices, the five black dots, and six green edges which connect them. For example, the five vertices could represent cities (San Francisco, Oakland, Sausalito, etc.) and the edges could be bridges between them.
In the second graph, the role of the cities and the bridges has swapped. Now the bridges are the vertices, and the edges (or hyperedges) are the cities. For example, we can imagine that the cities are large metropolises and the green vertices are the bridge tolls between one city and the next.
Apart from swapping the role of the vertices and the edges, the information in the two graphs is the same. If we shrink each city down to a dot in the second graph, and grow each bridge toll into a full bridge, we get the first graph. We will see that the graphs are dual to each other.
We represent each graph by a labeled matrix: we label the rows by the vertices and the columns by the edges, and we put a 1 in the matrix whenever the vertex is in the edge. That is, an entry is 1 exactly when the corresponding edge contains the corresponding vertex. The matrix on the left is for the first graph, and the one on the right is for the second graph.
We can see that the information in the two graphs is the same from looking at the two matrices – they are the same matrix, transposed (or flipped). The matrix of a hypergraph is the transpose of the matrix of the dual hypergraph.
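We can watch this duality happen in a few lines of code. The hypergraph below is a small made-up example, not the one pictured in the post:

```python
# A small hypothetical hypergraph: vertices a, b, c and
# edges e1 = {a, b}, e2 = {b, c}, e3 = {a, b, c}.
vertices = ["a", "b", "c"]
edges = {"e1": {"a", "b"}, "e2": {"b", "c"}, "e3": {"a", "b", "c"}}

# Rows indexed by vertices, columns by edges; entry 1 iff vertex is in edge.
matrix = [[1 if v in edges[e] else 0 for e in edges] for v in vertices]

# The dual hypergraph swaps the roles: rows indexed by edges, columns by vertices.
dual_matrix = [[1 if v in edges[e] else 0 for v in vertices] for e in edges]

# The matrix of the dual hypergraph is the transpose of the original.
transpose = [list(row) for row in zip(*matrix)]
print(dual_matrix == transpose)  # True
```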
Mathematicians are always on the look-out for hidden dualities between seemingly different objects, and we are happy when we find them. For example, in a recent project we studied the connection between graphical models, from statistics, and tensor networks, from physics. We showed that the two constructions are the duals of each other, using the hypergraph duality we saw in this example.
ALERT ALERT! Applied topology has taken the world by storm once more. This time techniques from algebraic topology are being applied to model networks of neurons in the brain, in particular with respect to the brain processing information when exposed to a stimulus. Ran Levi, one of the ‘co-senior authors’ of the recent paper published in Frontiers in Computational Neuroscience, is based in Aberdeen and he was kind enough to let me show off their pictures in this post. The paper can be found here.
So what are they studying?
When a brain is exposed to a stimulus, neurons fire seemingly at random. We can detect this firing and create a ‘movie’ to study. The firing rate increases towards peak activity, after which it rapidly decreases. In the case of chemical synapses, synaptic communication flows from one neuron to another and you can view this information by drawing a picture with neurons as dots and possible flows between neurons as lines, as shown below. In this image more recent flows show up as brighter.
Image credit: Blue Brain project. This image shows a depiction of neurons and synaptic connections between them. The more recently a synaptic communication has been fired, the brighter it is depicted in the image.
Numerous studies have been conducted to better understand the pattern of this build up and rapid decrease in neuron spikes and this study contains significant new findings as to how neural networks are built up and decay throughout the process, both at a local and global scale. This new approach could provide substantial insights into how the brain processes and transfers information. The brain is one of the main mysteries of medical science so this is huge! For me the most exciting part of this is that the researchers build their theory through the lens of Algebraic Topology and I will try to explain the main players in their game here.
Topological players: cliques and cavities
The study used a digitally constructed model of a rat’s brain, which reproduced neuron activity from experiments in which the rats were exposed to stimuli. From this model, ‘movies’ of neural activity could be extracted and analysed. The study then compared their findings to real data and found that the same phenomenon occurred.
Neural networks have been previously studied using graphs, in which the neurons are represented by vertices and possible synaptic connections between neurons by edges. This throws away quite a lot of information since during chemical synapses the synaptic communication flows, over a miniscule time period, from one neuron to another. The study takes this into account and uses directed graphs, in which an edge has a direction emulating the synaptic flow. This is the structural graph of the network that they study. They also study functional graphs, which are subgraphs of the structural graph. These contain only the connections that fire within a certain ‘time bin’. You can think of these as synaptic connections that occur in a ‘scene’ of the whole ‘movie’. There is one graph for each scene and this research studies how these graphs change throughout the movie.
The main structural objects discovered and consequently studied in these movies are subgraphs called directed cliques. These are graphs for which every vertex is connected to every other vertex. There is a source neuron from which all edges are directed away, and a sink neuron towards which all edges are directed. In this sense the flow of information has a natural direction. Directed cliques consisting of n neurons are called simplices of dimension (n-1). Certain sub-simplices of a directed clique form their own directed cliques, when the vertices in the sub-simplex contain their own source and sink neurons; these are called sub-cliques. Below are some examples of the directed clique simplices.
Image credit: EPFL. This image shows examples of directed cliques.
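The defining property of a directed clique (all-to-all connections compatible with a single linear order, giving a unique source and sink) can be checked by brute force for small examples. A sketch in Python, with my own encoding rather than the paper's:

```python
from itertools import permutations

def is_directed_clique(vertices, edges):
    """Check whether a directed graph is a directed clique: every pair
    of vertices is joined by one edge, and the vertices can be ordered
    so that every edge points from earlier to later.  The first vertex
    in that order is the source, the last is the sink."""
    n = len(vertices)
    if len(edges) != n * (n - 1) // 2:
        return False  # not all-to-all connected
    for order in permutations(vertices):
        rank = {v: i for i, v in enumerate(order)}
        if all(rank[u] < rank[v] for u, v in edges):
            return True
    return False

# A 3-neuron directed clique (a 2-simplex): 1 is the source, 3 the sink.
print(is_directed_clique([1, 2, 3], [(1, 2), (1, 3), (2, 3)]))  # True
# A directed 3-cycle is not a directed clique: it has no source or sink.
print(is_directed_clique([1, 2, 3], [(1, 2), (2, 3), (3, 1)]))  # False
```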
And the images below show these simplices occurring naturally in the neural network.
Image credit: Frontiers in Computational Neuroscience, ‘Cliques of Neurons Bound into Cavities Provide a Missing Link between Structure and Function’, Figure 1A. This image shows a reconstructed microcircuit produced using the model of neural activity. A 5-neuron clique is shown in red.
Image credit: Frontiers in Computational Neuroscience, ‘Cliques of Neurons Bound into Cavities Provide a Missing Link between Structure and Function’, Figure 1B3. This image shows a zoomed-in depiction of the 5-neuron clique in the image above, with its corresponding simplex on the right.
Image credit: Frontiers in Computational Neuroscience, ‘Cliques of Neurons Bound into Cavities Provide a Missing Link between Structure and Function’, Adaptation of Figure 2C. This image shows a 6-simplex (a directed clique with 7 vertices) on the left and a 7-simplex on the right, with representations of how these cliques appear in the neural network shown in the centre.
The researchers found that over time, simplices of higher and higher dimension were born in abundance, as synaptic communication increased and information flowed between neurons. Then suddenly all cliques vanished: the brain had finished processing the new information. This relates the neural activity to an underlying structure which we can now study in more detail. It is a very local structure: simplices of up to 7 dimensions were detected, a clique of 8 neurons in a microcircuit containing tens of thousands. It was the sheer abundance of this local structure that made it significant, where in this setting local means concerning a small number of vertices in the structural graph.
As well as considering this local structure, the researchers also identified a global structure in the form of cavities. Cavities are formed when cliques share neurons, but not enough neurons to form a larger clique. An example of this sharing is shown below, though please note that this is not yet an example of a cavity. When many cliques together bound a hollow space, this forms a cavity. Cavities represent homology classes, and you can read my post on introducing homology here. An example of a 2 dimensional cavity is also shown below.
An example of simplices sharing neurons.
Image credit: Frontiers in Computational Neuroscience, ‘Cliques of Neurons Bound into Cavities Provide a Missing Link between Structure and Function’, Figure 5A. This image shows an example of a two-dimensional cavity. It is bounded by 2-simplices (triangles), which are directed cliques with 3 neurons.
The graph below shows the formation of cavities over time. The x-axis corresponds to the first Betti number, which gives an indication of the number of 1-dimensional cavities, and the y-axis similarly gives an indication of the number of 3-dimensional cavities, via the third Betti number. The spiral is drawn out over time as indicated by the text specifying milliseconds on the curve. We see that at the beginning there is an increase in the first Betti number, before an increase in the third alongside a decrease in the first, and finally a sharp decrease to no cavities at all. Considering the neural movie, we view this as an initial appearance of many 1-dimensional simplices, creating 1-dimensional cavities. Over time, the number of 2- and 3-dimensional simplices increases, by filling in extra connections between 1-dimensional simplices, so the lower-dimensional cavities are replaced with higher-dimensional ones. When the number of higher-dimensional cavities is maximal, the whole thing collapses. The brain has finished processing the information!
Image credit: Frontiers in Computational Neuroscience, ‘Cliques of Neurons Bound into Cavities Provide a Missing Link between Structure and Function’, Figure 6B
The time dependent formation of the cliques and cavities in this model was interpreted to try and measure both local information flow, influenced by the cliques, and global flow across the whole network, influenced by cavities.
So why is topology important?
These topological players provide a strong mathematical framework for measuring the activity of a neural network, and the process a brain undergoes when exposed to stimuli. The framework works without parameters (for example there is no measurement of distance between neurons in the model) and one can study the local structure by considering cliques, or how they bind together to form a global structure with cavities. By continuing to study the topological properties of these emerging and disappearing structures alongside neuroscientists we could come closer to understanding our own brains! I will leave you with a beautiful artistic impression of what is happening.
Image credit: Blue Brain project. This image shows an artist’s depiction of their interpretation of the results, projected into 3 dimensions. The simplices are represented by the clique-like small structures and the centre is the artist’s depiction of a cavity.
There is a great video of Kathryn Hess (EPFL) speaking about the project, watch it here.
For those of you who want to read more, check out the following blog and news articles (I’m sure there will be more to come and I will try to update the list)
I’m going to a conference next week, and it’s all about braids! So I thought I would write a wee post on combing, a technique which dates back to Artin in the 1940s. In fact the paper where he introduces the concept of combing finishes with the following amusing warning:
“Although it has been proved that every braid can be deformed into a similar normal form the writer is convinced that any attempt to carry this out on a living person would only lead to violent protests and discrimination against mathematics. He would therefore discourage such an experiment.” – Artin 1946
but I really don’t see it as so bad!
Combing is a technique for starting with any braid (see my introductory post on braids here) and ending up with a braid in which first the leftmost strand moves and the others stay put, then the next strand moves while the rest stay put etc etc. It’s much nicer to show this in pictures.
We want to start with any old braid, say this one:
and transform it into a braid where the strands move one at a time, like the following one. I’ve coloured the strands here so you can see that, reading the braid from top to bottom, first the red strand moves (i.e. all crossings involve the red strand, until it is finished), and then the green, and then the blue.
For convenience I’ll only look at braids called pure braids, where each strand starts and ends at the same position. You can easily comb non-pure braids, you just need to add an appropriate twist right at the end to make them finish in the correct positions.
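Whether a braid is pure depends only on its underlying permutation, which we can compute by walking through the braid word. A sketch in Python (the encoding of generators as signed integers is my own convention for illustration):

```python
def braid_permutation(word, n):
    """Underlying permutation of a braid word on n strands.  Each
    generator s_i (written i, with inverses as -i) swaps the strands
    in positions i and i+1; the sign only affects which strand passes
    over, not the permutation."""
    perm = list(range(n))
    for g in word:
        i = abs(g) - 1  # positions i and i+1, 0-indexed
        perm[i], perm[i + 1] = perm[i + 1], perm[i]
    return perm

def is_pure(word, n):
    """A braid is pure if every strand ends where it started."""
    return braid_permutation(word, n) == list(range(n))

print(is_pure([1, 1], 3))  # True: a full twist of strands 1 and 2
print(is_pure([1, 2], 3))  # False: the strands end in shuffled positions
```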
So how do we do this? Consider the first strand; I’ve coloured it red to make it clear. We want all the crossings between red and black strands to happen before (higher up than) any crossing of two black strands. So in this case the crossings circled in yellow are okay, because they happen lower down than any crossing involving the red strand. The crossings circled in blue and green need to be changed.
We can slide some crossings of black strands down past the red and black crossings, as they don’t interfere. Here we can do it with the crossing circled in blue, as shown:
We can start to do it with the crossing circled in green, but we encounter a problem as it won’t simply slide past the red strand crossing below it. Moving this crossing down requires using some of the braid relations (see braid post) to replace a few crossings with an equivalent section in which the red strand moves first, as follows:
Even though this braid looks different from the previous one, they are in fact the same (you can always test this with string!). Now we have a braid in which the first strand moves before any others. Since all the first strand action is now at the top of the braid, we can ignore the first strand altogether, and consider the rest of the braid, as shown below:
We only need to consider the following section now, and again we can put it into a form where only the first strand moves.
In this case using braid relations gives us the following:
And we can now ignore the green strand!
Colouring the first strand in this final section, there are no crossings that don’t involve it:
and we colour the last strand yellow for fun!
Remembering all the pieces we have ignored gives us the full combed braid, where we focus on the leftmost strand until it ‘runs out of moves’ before looking to the next one.
And this is exactly the same as the original braid, which looks a lot messier when coloured:
Why might we want to do this? In some cases it makes mathematical proofs a lot easier. For me, recently I have been focusing only on what the first strand is doing, and so I want a technique to push the other strands down and away!
Making a cup of tea in a hurry is a challenge. I want the tea to be as drinkable (cold) as possible after a short amount of time. Say, 5 minutes. What should I do: should I add milk to the tea at the beginning of the 5 minutes or at the end?
The rule we will use to work this out is Newton’s Law of Cooling. It says “the rate of heat loss of the tea is proportional to the difference in temperature between the tea and its surroundings”.
This means the temperature $T$ of the tea follows the differential equation $\frac{dT}{dt} = -k(T - T_{\text{room}})$, where the constant $k$ is a positive constant of proportionality. The minus sign is there because the tea is warmer than the room – so it is losing heat. Solving this differential equation, we get $T(t) = T_{\text{room}} + (T_0 - T_{\text{room}}) e^{-kt}$, where $T_0$ is the initial temperature of the tea.
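We can sanity-check the closed-form solution against a direct numerical integration of the differential equation (the numbers here are illustrative, not from the post):

```python
from math import exp

# Newton's law of cooling: dT/dt = -k (T - T_room).
k, T_room, T0 = 0.1, 20.0, 90.0  # illustrative values

def closed_form(t):
    """The solution T(t) = T_room + (T0 - T_room) * exp(-k t)."""
    return T_room + (T0 - T_room) * exp(-k * t)

# Integrate the differential equation directly with small Euler steps.
T, dt = T0, 0.001
for _ in range(5000):  # 5 minutes in steps of dt
    T += dt * (-k * (T - T_room))

print(abs(T - closed_form(5)) < 0.01)  # True: the two agree closely
```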
We’ll start by defining some variables, to set the question up mathematically. Most of them we won’t end up needing. Let’s say the tea, straight from the kettle, has temperature $T_0$. The cold milk has temperature $M$. We want to mix tea and milk in the ratio $\lambda : (1 - \lambda)$. The temperature of the surrounding room is $T_{\text{room}}$.
Option 1: Add the milk at the start
We begin by immediately mixing the tea with the milk. This leaves us with a mixture whose temperature is $\lambda T_0 + (1 - \lambda) M$. Now we leave the tea to cool. Its cooling follows the equation $T(t) = T_{\text{room}} + (\lambda T_0 + (1 - \lambda) M - T_{\text{room}}) e^{-kt}$. After five minutes, the temperature is
$$T_{\text{room}} + (\lambda T_0 + (1 - \lambda) M - T_{\text{room}}) e^{-5k}.$$
Option 1
Option 2: Add the milk at the end
For this option, we first leave the tea to cool. Its cooling follows the equation $T(t) = T_{\text{room}} + (T_0 - T_{\text{room}}) e^{-kt}$. After five minutes, it has temperature $T_{\text{room}} + (T_0 - T_{\text{room}}) e^{-5k}$. Then, we add the milk in the specified ratio. The final concoction has temperature
$$\lambda \left( T_{\text{room}} + (T_0 - T_{\text{room}}) e^{-5k} \right) + (1 - \lambda) M.$$
Option 2
So which temperature is lower: the “Option 1” temperature or the “Option 2” temperature?
It turns out that most of the terms in the two expressions cancel out, and the inequality boils down to a comparison of $M$ (from Option 2) with $T_{\text{room}} + (M - T_{\text{room}}) e^{-5k}$ (from Option 1). [What does this quantity represent?] The answer depends on whether $M < T_{\text{room}}$. For our cup of tea, it will be: there’s more tea than milk ($\lambda > \tfrac{1}{2}$) and the milk is colder than the surroundings ($M < T_{\text{room}}$). Hence, since $1 - e^{-5k}$ is positive, we have $M < T_{\text{room}} + (M - T_{\text{room}}) e^{-5k}$, and option 2 wins: add the milk at the end.
But, does it really make a difference? (What’s the point of calculus?)
Well, we could plug in reasonable values for all the letters ($T_0$, $M$, $T_{\text{room}}$, $k$, etc.) and see how different the two expressions are.
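Here is one such computation, with made-up but plausible values: tea at 95°C, milk at 5°C, room at 20°C, $k = 0.3$ per minute, and a 4:1 tea-to-milk ratio:

```python
from math import exp

# Illustrative values, not from the post.
T0, M, R, k, lam, t = 95.0, 5.0, 20.0, 0.3, 0.8, 5.0

def cool(temp, minutes):
    """Temperature after cooling towards the room for the given time."""
    return R + (temp - R) * exp(-k * minutes)

option1 = cool(lam * T0 + (1 - lam) * M, t)  # mix first, then cool
option2 = lam * cool(T0, t) + (1 - lam) * M  # cool first, then mix

print(round(option1, 2), round(option2, 2))  # option 2 comes out cooler
print(option2 < option1)  # True
```

The gap is a couple of degrees here, so the calculus does earn its keep: adding the milk at the end gives a noticeably cooler cup.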
So, why tea with Almond milk?
My co-blogger Rachael is vegan. She inspires me to make my tea each morning with almond milk.
Finally, here’s a picture of an empirical experiment from other people (thenakedscientists) tackling this important question: