Season 6 Episode 5

DPhil student Alex is a guest on the livestream in this episode to show us how to use probability to solve certain combinatorics (counting) problems.

Geometric Mean

By request, here's some content about the geometric mean.

For positive numbers $a$ and $b$, the geometric mean is $\sqrt{ab}$. More generally, for a collection of $n$ positive numbers, you calculate the geometric mean by first multiplying them all together, and then taking the $n$th root.

Why is it called the geometric mean? Well if you've got a geometric sequence (one where each term is $r$ times the previous term, for some number $r$), and if the first term is $a$ and the third term is $b$, then the second term could be $\sqrt{ab}$. Compare that with an arithmetic sequence (one where each term is $d$ more than the previous term, for some number $d$), where if the first term is $a$ and the third term is $b$, then the second term could be $\frac{1}{2}(a+b)$.

But what's that got to do with geometry? Here's one interpretation. Take a semicircle and choose a point on the curved side. Drop a perpendicular to the diameter; the foot of that perpendicular splits the diameter into two parts. Call the lengths of those two parts $a$ and $b$. So the diameter is $a+b$ and so the radius is $\frac{1}{2}\left(a+b\right)$, the arithmetic mean of $a$ and $b$. What's the length of the line from your point on the curved side down to the diameter, in terms of $a$ and $b$? I'll leave this as a geometry exercise for you (hint: similar triangles), and I'll tell you the answer; the length of the line is $\sqrt{ab}$, the geometric mean of $a$ and $b$.

As a consequence (because that line is at most as long as the diameter, if your point is in the middle of the semicircle), we can see that the geometric mean of $a$ and $b$ must be at most equal to the arithmetic mean of $a$ and $b$. $$ \frac{a+b}{2}\geq \sqrt{ab} $$ with equality if and only if $a=b$.

Exercise: prove this inequality with algebra.

This is called the AM-GM inequality. It's also true for $n$ numbers, if you put the arithmetic mean (add, then divide by $n$) on the left, and put the geometric mean (multiply, then take the $n$th root) on the right. It's a really good inequality.

Application: Find the minimum value of $\displaystyle x+\frac{1}{x}$ for $x>0$.

By the AM-GM inequality with $a=x$ and $b=\frac{1}{x}$, we have $x+\frac{1}{x} \geq 2$, with equality at $x=1$.

Application: Find the minimum value of $\displaystyle x+\frac{1}{\sqrt{x}}$ for $x>0$.

By the AM-GM inequality with $a=x$, $b=\frac{1}{2\sqrt{x}}$, $c=\frac{1}{2\sqrt{x}}$, we have

$$\frac{a+b+c}{3}\geq \sqrt[3]{\frac{x}{2\sqrt{x}2\sqrt{x}}}
\quad\text{so}\quad
x+\frac{1}{\sqrt{x}}\geq 3\sqrt[3]{\frac{1}{4}}.$$

Probabilistic Combinatorics

Alex has written some further reading notes for you. In some of the sections below, there's an outline of a proof for you to complete.

Removing the probability

Recall what we proved on the livestream: in any graph with $m$ edges, there is a cut (a colouring of its vertices with two colours) with value (the number of edges with differently coloured endpoints) at least $\frac{m}{2}$. The proof consisted of three steps:

Colour each vertex with one of the two colours at random.
Prove that the expected value of the cut is $\frac{m}{2}$.
Conclude that there has to be at least one cut with value at least $\frac{m}{2}$.

The purpose of this section is to come up with an "alternative" proof for the theorem, using counting only, and no coin flips. You can do this by following the steps given below.

Call the number of vertices of your graph $n$. Prove that you can colour this graph with two colours in $2^n$ ways.
[This part "de-aggregates" the colouring obtained by a random colouring.]
Now, suppose you look at all the different $2^n$ previously obtained graphs.
1. Fix some edge $e$. Prove that this edge has differently coloured endpoints in $2^{n-1}$ of the graphs.
2. Prove that the sum of all the cuts (taken over all of the $2^n$ graphs) is $m2^{n-1}$.
Conclude that at least one of the $2^n$ graphs has a cut of value at least $\frac{m}{2}$.

Can you tell how this is related to the proof we have done during the live stream? Hint: If you divide the numbers in part 2 by $2^n$, you get two values that were expectations.

Finding a large cut with high probability

During the livestream, Alex mentioned that there is an algorithm that finds a cut with value at least $\frac{m}{2}$ in any graph with $m$ edges. This algorithm is completely deterministic, but a bit hard to understand without more background in probability theory.

The purpose of this section is to prove something almost as satisfying: If we have a graph with $m$ edges, we'll run the following experiment $10(m+1)$ times. Colour the vertices with two colours at random (using a coin, as during the livestream). Then, with probability at least $99.99\%$, at least one of the runs will result in a cut of size at least $\frac{m}{2}$.

This is already more involved than what we have done, but we can break the proof into multiple parts to make it more manageable.

Markov's inequality

Suppose you have a random variable $X$ that takes values from $0,1,\ldots,n$ for some positive integer $n$. Recall that the expression for its expectation was

\[\mathbb{E}[X]=0\cdot\mathbb{P}(X=0)+1\cdot\mathbb{P}(X=1)+2\cdot\mathbb{P}(X=2)+\ldots+n\cdot\mathbb{P}(X=n)\]

Fix some positive integer $k\leq n$. Argue that \[\mathbb{E}[X]\geq k\mathbb{P}(X=k)+(k+1)\mathbb{P}(X=k+1)+\ldots+n\cdot\mathbb{P}(X=n)\]
Tehn using the fact that $P(X\geq k)=\mathbb{P}(X=k)+\mathbb{P}(X=k+1)+\ldots+\mathbb{P}(X=n)$, conclude that: \[\mathbb{P}(X\geq k)\leq \frac{\mathbb{E}[X]}{k}.\]
This is called Markov's inequality.

Bounding the failure probability of one run

Recall that we denoted the value of a (random) cut by $X$. We will say that the run has failed, in the end, $X<\frac{m}{2}$.

Argue why a failed run is one in which actually $X\leq\frac{m-1}{2}$.
By using the fact that $\mathbb{P}\left(X\leq\frac{m-1}{2}\right)+\mathbb{P}\left(X\geq\frac{m+1}{2}\right)=1$ (complementary events), along with Markov's inequality, prove that the failure probability is at most $1-\frac{1}{m+1}$.

Repetition makes perfect

The previous subsection might convey a pessimistic message: we can fail with a probability relatively close to 1. However, the important part is that we can bound it away from 1! After all, if we want to be optimistic, this also means that there is a (small) chance of success, in which case we just need enough runs to almost guarantee a successful run.

Suppose we make $10(m+1)$ independent runs. Argue why the probability that all of the runs fail is at most\[\left(1-\frac{1}{m+1}\right)^{10(m+1)}\]You might want to use (a generalization of) the fact that, for two independent events $E_1$ and $E_2$, $\mathbb{P}(E_1 \text{ and } E_2)=\mathbb{P}(E_1)\cdot\mathbb{P}(E_2)$.
Using the inequality $1-x\leq e^{-x}$ (true for any real number $x$), where $e\approx 2.7\geq 2$, prove that, with probability at least $99.99\%$, at least one of the runs will result in a cut with value at least $\frac{m}{2}$.

Other resources

This is a relatively advanced topic that Oxford students usually do not know much about at the end of their first year. The reason is that one needs a proper probability theory background for almost everything. Even coming up with an algorithm that works almost all the time (c.f. previous section) requires using Markov's inequality, which is not a trivial thing! However, if you want to learn more about this topic, the following might be good places to start:

Probabilistic method on Wikipedia, and see the links there too
Chapter 6 of Probability and Computing: Randomized Algorithms and Probabilistic Analysis by Michael Mitzenmacher, Eli Upfal but note that this is already very advanced.

If you want to get in touch with us about any of the mathematics in the video or the further reading, feel free to email us on oomc [at] maths.ox.ac.uk.