Turan’s Theorem

Turan’s theorem gives bounds on the number of edges required in a graph on a fixed number of vertices n to guarantee it contains a complete graph of size r+1. Equivalently, an upper bound on the number of edges in a K_{r+1}-free graph. For some of the applications and proofs, it may be more natural to look instead at the complement graph, for which the theorem becomes a statement about the existence or otherwise of an independent set of size r+1.

Rather than give an expression for the bound immediately, it is more natural to consider the Turan graph T(n,r), the maximal graph on n vertices without a copy of K_{r+1}. This is constructed by dividing the vertices into r classes with ‘as equal size as possible’. That is, some classes have size \lfloor \frac{n}{r}\rfloor and others have size \lfloor \frac{n}{r}\rfloor +1. Then connect any pair of vertices which are not in the same class by an edge. This gives a complete r-partite graph on these classes. Since any collection of r+1 vertices contains at least two in the same class, it can’t contain a K_{r+1}. Note that the complement of the complete r-partite graph is the union of r disjoint complete graphs on the classes.

There are a number of ways to enumerate the edges in T(n,r), and some can get quite complicated quite quickly. After a moderate amount of thought, this is my favourite. Let n=\ell r+k, so T(n,r) has k classes of size (l+1) and (r-k) classes of size l. Pick an ordered pair of vertices uniformly at random. (So picking the same vertices is indeed an option, and is counted twice.) Then the probability they are the same class is

\frac{k}{r}\cdot\frac{\ell+1}{n}+\frac{r-k}{r}\cdot \frac{\ell}{n} = \frac{1}{r}.

So the probability they are in different classes is \frac{r-1}{r}, and we can treat all of the 2n^2 ordered pairs in this way, noting a) that we count everything twice; and b) we know a priori that we don’t have loops, so the fact that we’ve included these in the count doesn’t matter. We end up with the enumeration (1-\frac{1}{r})\frac{n^2}{2} for the edges in T(n,r).

A standard proof

For both proofs, I find it slightly easier to work in the complement graph, where we are aiming for the largest number of edges with an independent set of size (r+1). Suppose we have a graph with the minimal number of vertices such that there’s no independent set of given size. Suppose also that there is an edge joining vertices v and w, such that d(v)> d(w). Then if we change v’s neighbourhood \Gamma(v) so that it becomes the same as \Gamma(w), (that is, we replace v with a copy of w, and maintain the original edge vw), then it is easily checked that we still do not have an independent set of that size, but fewer edges.

Note that by attempting to make the neighbourhoods of connected vertices equal, we are making the graph look more like a union of complete components. We can do a similar trick if we have three vertices u,v,w such that there are edges between u and v and v and w, but not u and w. Then we know the degrees of u,v,w are the same by the previous argument, and so it can again be checked that making \Gamma(u),\Gamma(w) the same as \Gamma(v), and adding the edge uw reduces the number of edges, and maintains the non-existence of the independent set.

The consequence of this is that we’ve shown that the minimum can only be attained when presence of edges is an equivalence relation (ignoring reflexivity). Thus the minimum is only attained for a union of at most r complete graphs. Jensen (or any root-mean-square type inequality) will then confirm that the true minimum is attained when the sizes of the r components are as equal as possible.

A probabilistic proof

The following probabilistic proof is courtesy of Alon and Spencer. The motivation is that in the (equality) case of a union of complete graphs, however we try to build up a maximal independent set, we always succeed. That is, it doesn’t matter how we choose which vertex (unconnected to those we already have) to add next – we will always get a set of size r. This motivates a probabilistic proof, as an argument in expectation will have equality in the equality case, which is always good.

Anyway, we build up an independent set in a graph by choosing uniformly at random a vertex which is not connected to any we have so far, until this set of vertices is empty. It makes sense to settle the randomness at the start, so give the vertices a uniform random labelling on [n], and at each stage, choose the independent vertex with minimal label.

Thus, a vertex v will be chosen for the independent set if, and only if, it has a smaller label than all of its neighbours, that is, with probability \frac{1}{1+d(v)}. So the expected size of the independent set constructed in this fashion is

\sum_{v\in V(G)} \frac{1}{1+d(v)}\ge \frac{V}{1+\bar d} = \frac{V}{1+\frac{2E}{V}}.

One can chase through the expressions to get the bound we want back.

Olympiad example

The reason I was thinking about Turan’s theorem was a problem which the UK IMO squad was discussing. It comes from an American selection test (slightly rephrased): given 100 points in the plane, what is the largest number of pairs of points with \ell_1 distance in (1,2]?

The key step is to think about how large a collection of points can have this property pairwise. It is easy to come up with an example of four points which work, and seemingly impossible to come up with an example with five points. To prove this, I found it easiest to place a point at the origin, then explicitly work with coordinates relative the basis (1,1),(1,-1) for fairly obvious reasons in this metric.

Anyway, once you are convinced that you can’t have five points with this property pairwise, you are ready to convert into a graph-theoretic statement. Vertices correspond to points, and edges link pairs of points whose distance is in (1,2] as required. We know from the previous paragraph that there is no copy of K_5 here, so Turan’s theorem bounds the number of edges, ie the number of suitable pairs.

It also tells us under what sort of circumstances the bound is attained, and from this, it’s natural to split the 100 points into four groups of 25, for example by taking four points which satisfy the condition pairwise (eg a diamond around the origin), and placing each group very near one of the points.

Extensions and other directions

The existence of a complete subgraph is reminiscent of Ramsey theory, which in one case is a symmetric version of Turan’s theorem. In Turan, we are adding enough edges to force a complete subgraph, while in the original form of Ramsey theory, we are asking how large the graph needs to be to ensure that for any edge configuration, either the original graph or the complement graph includes a complete subgraph. It makes a lot more sense to phrase this in terms of colours for the purpose of generalisation.

A natural extension is to ask about finding copies of fixed graphs H other than the complete graph. This is the content of the Erdos-Stone theorem. I’d prefer to say almost nothing rather than be vague, but the key difference is that the bound is asymptotic in the number of vertices rather than exact. Furthermore, the asymptotic proportion of vertices depends on the chromatic number of H, which tells you how many classes r are required to embed H in a (large if necessary) r-partite graph. So it is perhaps unsurprising that the limiting proportions end up matching the proportion of edges in the Turan graphs, namely r-1/r as r varies, which leaves the exact scaling open to further investigation in the case where H is bipartite (hence has chromatic number 2).

Uniform Spanning Trees

For applications to random graphs, the local binomial structure and independence means that the Galton-Watson branching process is a useful structure to consider embedding in the graph. In several previous posts, I have shown how we can set up the so-called exploration process which visits the sites in a component as if the component were actually a tree. The typical degree is O(1), and so in particular small components will be trees with high probability in the limit. In the giant component for a supercritical graph, this is not the case, but it doesn’t matter, as we ignore vertices we have already explored in our exploration process. We can consider the excess edges separately by ‘sprinkling’ them back in once we have the tree-like backbone of all the components. Again, independence is crucial here.

I am now thinking about a new model. We take an Erdos-Renyi process as before, with edges arriving at some fixed rate, but whenever a cycle appears, we immediately delete all the edges that make up the cycle. Thus at all times the system consists of a collection (or forest) of trees on the n vertices. So initially this process will look exactly like the normal E-R process, but as soon as the components start getting large, we start getting excess edges which destroy the cycles and make everything small again. The question to ask is: if we run the process for long enough, roughly how large are all the components? It seems unlikely that the splitting mechanism is so weak that we will get true giant components forming, ie O(n) sizes, so we might guess that, in common with some other split-merge models of this type, we end up with components of size n^{2/3}, as in the critical window for the E-R process.

In any case, the scaling limit process is likely to have components whose sizes grow with n, so we will have a class of trees larger than those we have considered previously, which have typically been O(1). So it’s worth thinking about some ways to generate random trees on a fixed number of vertices.

Conditioned Galton-Watson

Our favourite method of creating trees is inductive. We take a root and connect the root to a number of offspring given by a fixed distribution, and each of these some offspring given by an independent sample from the same distribution and so on. The natural formulation gives no control over the size of the tree. This is a random variable whose distribution depends on the offspring distribution, and which in some circumstances be computed explicitly, for example when the offspring distribution is geometric. In other cases, it is easier to make recourse to generating functions or to a random walk analogue as described in the exploration process discussion.

Of course, there is nothing to stop us conditioning on the total size of the population. This is equivalent to conditioning on the hitting time of -1 for the corresponding random walk, and Donsker’s theorem gives several consequences of a convergence relation towards a rescaled Brownian excursion. Note that there is no a priori labelling for the resulting tree. This will have to be supplied later, with breadth-first and depth-first the most natural choices, which might cause annoyance if you actually want to use it. In particular, it is not obvious, and probably not true unless you are careful, that the distribution is invariant under permuting the labels (having initially assumed 1 is the root etc) which is not ideal if you are embedding into the complete graph.

However, we would like to have some more direct constructions of random trees on n vertices. We now consider perhaps the two best known such methods. These are of particular interest as they are applicable to finding random spanning trees embedded in any graph, rather than just the complete graph.

Uniform Spanning Tree

Given a connected graph, consider the set of all subgraphs which are trees and span the vertex set of the original graph. An element of this set is called a spanning tree. A uniform spanning tree is chosen uniformly at random from the set of spanning trees on the complex graph on n vertices. A famous result of Arthur Cayley says that the number of such spanning trees is n^{n-2}. There are various neat proofs, many of which consider a mild generalisation which gives us a more natural framework for using induction. This might be a suitable subject for a subsequent post.

While there is no objective answer to the question of what is the right model for random trees on n vertices, this is what you get from the Erdos-Renyi process. Formally, conditional on the sizes of the (tree) components, the structures of the tree components are given by UST.

To see why this is the case, observe that when we condition that a component has m vertices and is a tree, we are demanding that it be connected and have m-1 edges. Since the probability of a particular configuration appearing in G(n,p) is a function only of the number of edges in the configuration, it follows that the probability of each spanning tree on the m vertices in question is equal.

Interesting things happen when you do this dynamically. That is, if we have two USTs of sizes m and n at some time t, and condition that the next edge to be added in the process joins them, then the resulting component is not a UST on m+n vertices. To see why, consider the probability of a ‘star’, that is a tree with a single distinguished vertex to which every other vertex is joined. Then the probability that the UST on m vertices is a star is \frac{m}{m^{m-2}}=m^{-(m-3)}. By contrast, it is not possible to obtain a star on m+n vertices by joining a tree on m vertices and a tree on n vertices with an additional edge.

However, I think the UST property is preserved by the cycle deletion mechanism mentioned at the very start of this post. My working has been very much of the back of the envelope variety, but I am fairly convinced that once you have taken a UST and conditioned on the sizes of the smaller trees which result from cycle deletion. My argument is that you might as well fix the cycle to be deleted, then condition on how many vertices are in each of the trees coming off this cycle. Now the choice of each of these trees is clearly uniform among spanning trees on the correct number of vertices.

However, it is my current belief that the combination of these two mechanisms does not give UST-like trees even after conditioning on the sizes at fixed time.