Simple random walks and the ballot problem
0. Introduction
Recently, I worked out [1, Exercise 2.4] which deals with the famous ballot problem and simple symmetric random walks. The problem reads essentially as follows.
Suppose there is a local election with only two candidates $A$ and $B$. We count the number of votes for both candidates. $A$ and $B$ are equally popular, i.e. the probability to vote for either one is equal, $p_A = \frac{1}{2}$ and $p_B = \frac{1}{2}$. There are $2N$ votes in total.
(i.) What is the probability that the first tie of votes occurs first after counting $2n$ votes with $n \leq N$?
(ii.) What is the probability that there is no tie after counting $2n$ votes?
The combinatorial part of the proof contains a very elegant reflection argument that I haven’t seen before. In the following, I present a detailed solution including the reflection argument. The main source I used beside [1] is the excellt exposition [2] by Sven Erick Alm.
1. Notation and auxiliary results
First, we need to determine the underlying probability space. There are only two possible outcomes, namely a vote for candidate $A$ or candidate $B$, which we can decode as $\{1, -1\}$. For $2n$ rounds of voting we therefore have $\Omega = \{-1, 1\}^{2n}$. $\Omega$ is obviously finite for finite $n$, therefore $\mathcal{A} = \mathcal{P}(\Omega)$ as underlying $\sigma$-algebra. Since each vote happens equally likely for each candidate we set $\mathbb{P} = \mathcal{U}_{\Omega}$. All in all, we arrive at the probability space
$$ (\Omega, \mathcal{A}, P) = (\{1, -1\}^{2n}, \mathcal{P}(\{1, -1\}^{2n}), \mathcal{U}_{\{1, -1\}}) $$
Now consider the graph $(A_i)_{i \in \{1, \ldots, n\}}$ where $$ A_i = (i, \sum\limits_{j = 0}^{i} \omega_i) $$
Formalise (i). as event $G_n$ and (ii). as $G_{> n}$. Then, the event $G_n \subset (A_i)_{1\leq i\leq 2n}$ consists of the number of paths that start in $(0, 0)$ and after $2n$ steps arrive in $(2n, 0)$ while inbetween it holds that $y > 0$ for $\{(i, y)\}_{1 \leq i \leq 2n}$. In other words, the graph does not cross the x axis.
To formalise the problem we need to introduce more notation. Let $p = p_A = \frac{1}{2}$ be the probability for a step upwards and $q = p_B = 1 - p = \frac{1}{2}$ the one for one step downwards. A path that starts in $(0, a)$ and after $n$ steps arrives in $(n, b)$, can be characterized as having $h$ steps up and $v$ steps down. These necessarily fulfill the relationship:
\begin{align}
h + v &= n \\
h - v &= b - a
\end{align}
After solving for $v$ and $h$, we obtain $h = \frac{n + (b - a)}{2}$ and $v = \frac{n - (b - a)}{2}$. The number of possibilities to perform $h$ steps up in $n$ steps total, is $\binom{n}{h} = \binom{n}{n - h} = \binom{n}{v}$.
Due to the uniformness assumption, the probability to go from $a$ to $b$ in $n$ steps is equal to $P_{n}(a, b) = \binom{n}{h}p^hq^v$.
Note that $\binom{n}{h}$ does not yet take into account whether we cross the x-axis.
2. Solving the problem
Let $N_{n}(a, b)$ be the number of paths from $a$ to $b$ in $n$ steps, $N^{\neq 0}_{n}(a, b)$ the number of paths from $a$ to $b$ that do not cross the $x$-axis and $N^{0}(a, b)$ the number of paths from $a$ to $b$ that cross the x-axis. It clearly holds that $N_n(a, b) = \binom{n}{h}$ as you can choose $h$ steps up among $n$ steps in total.
But it also holds that
\begin{equation} N_n(a, b) = N_n^0(a, b) + N_n^{\neq 0}(a, b) \end{equation}
and $N_n^0(a, b) = N_n(a, b)$ if $\textnormal{sgn}(a) \neq \textnormal{sgn}(b)$.
Therefore, it is obvious that \begin{equation} P(G_n) = N_{2n}^{\neq 0}(0, 0)\Big(\frac{1}{2}\Big)^{2n} \end{equation}
but on the other hand,
\begin{equation} N^{\neq 0}_{2n}(0, 0) = N^{\neq 0}_{2n - 1}(1, 0) + N^{\neq 0}_{2n - 1}(-1, 0) = 2\cdot N^{\neq 0}_{2n - 1}(1, 0) = 2\cdot N^{\neq 0}_{2n - 2}(1, 1) \end{equation}
since both candidates may be in the lead the whole time, i.e. the graph is either above or below the x-axis and, hence, wanders from $1$ (candidate A) to $0$ or from $-1$ (candidate B) to $0$ in $2n - 1$ steps. Since any path above the $x$-axis always represents the mirror image of a suitable graph below the $x$-axis, the number of paths double.
Eventually, there are exactly as many paths that wander in $2n - 1$ steps from $1$ to $0$ as there are that wander in $2n - 2$ steps from $1$ to $1$. The reason is that $2n - 1$th steps is predetermined as being a step down from $1$ to $0$.
Now, we need to use the mirror principle, the main argument of the proof that really appealed to me: A path that crosses the $x$-axis between the coordinates $(1, 1)$ and $(2n - 2, 1)$, can be represented as a path that starts in $(1, -1)$ and finishes in $(2n - 2, 1)$. We do this by mirroring the graph at the $x$-axis until its first root appears. Hence, the number of paths that do not cross the $x$-axis in $2n - 2$ steps is equal to the difference between all paths between $(1, 1)$ and $(1, 2n - 2)$ and those that start in $(1, 1)$, finish in $(2n - 2, 1)$ and cross the $x$-axis.
\begin{align}
N^{\neq 0}_{2n - 2}(1, 1) &= N_{2n - 2}(1, 1) - N^{0}_{2n - 2}(1, 1) \\
&= N_{2n - 2}(1, 1) - N_{2n - 2}(-1, 1) \\
&= \binom{2n - 2}{h_1} - \binom{2n - 2}{h_2}
\end{align}
With the formula for $h$ above, we obtain $h_1 = \frac{2n - 2 + (1 - 1)}{2} = n - 1$ and $h_2 = \frac{2n - 2 - (1 - (-1))}{2} = n$ In total, we get
\begin{equation} N^{\neq 0}_{2n - 2}(1, 1) = \binom{2n - 2}{n - 1} - \binom{2n - 2}{n} \end{equation}
and
\begin{equation} \boxed{P(G_n) = N_{2n}^{\neq 0}(0, 0)\Big(\frac{1}{2}\Big)^{2n} = 2\cdot N^{\neq 0}_{2n - 2}(1, 1)\Big(\frac{1}{2}\Big)^{2n} = 2^{-2n + 1}\Big(\binom{2n - 2}{n - 1} - \binom{2n - 2}{n}\Big) = u_{n - 1} - u_{n}} \end{equation}
To solve (ii)., [1] suggests to consider the complementary event $G^c_{> n}$. Then, it holds due to the disjointness of $G_n$ for different $n$ (which is clear intuitively) that
\begin{align}
P(G_{> n}) &= P\Big((G^c_{> n})^c\Big) \\
&= 1 - P(G^c_{> n}) \\
&= 1 - P\Big(\bigcup\limits_{k = 1}^n G_k\Big) \\
&= 1 - \sum\limits_{k = 1}^n P(G_k) \\
&= 1 - \sum\limits_{k = 1}^{n} (u_{k - 1} - u_k) \\
&= u_n
\end{align}
and hence,
\begin{equation} \boxed{P(G_{> n}) = u_n} \end{equation}
In the last step, we used the telescope sum together with $u_0 = 1$.
References
[1] H.-O. Georgii (2014): Stochastik. Eine Einführung.
[2] S. E. Alm (2006): Simple random walks.