Simple random walks and the ballot problem

0. Introduction

Recently, I worked out [1, Exercise 2.4] which deals with the famous ballot problem and simple symmetric random walks. The problem reads essentially as follows.

Suppose there is a local election with only two candidates AA and BB. We count the number of votes for both candidates. AA and BB are equally popular, i.e. the probability to vote for either one is equal, pA=12p_A = \frac{1}{2} and pB=12p_B = \frac{1}{2}. There are 2N2N votes in total.

(i.) What is the probability that the first tie of votes occurs first after counting 2n2n votes with nNn \leq N?

(ii.) What is the probability that there is no tie after counting 2n2n votes?

The combinatorial part of the proof contains a very elegant reflection argument that I haven’t seen before. In the following, I present a detailed solution including the reflection argument. The main source I used beside [1] is the excellt exposition [2] by Sven Erick Alm.

1. Notation and auxiliary results

First, we need to determine the underlying probability space. There are only two possible outcomes, namely a vote for candidate AA or candidate BB, which we can decode as {1,1}\{1, -1\}. For 2n2n rounds of voting we therefore have Ω={1,1}2n\Omega = \{-1, 1\}^{2n}. Ω\Omega is obviously finite for finite nn, therefore A=P(Ω)\mathcal{A} = \mathcal{P}(\Omega) as underlying σ\sigma-algebra. Since each vote happens equally likely for each candidate we set P=UΩ\mathbb{P} = \mathcal{U}_{\Omega}. All in all, we arrive at the probability space

(Ω,A,P)=({1,1}2n,P({1,1}2n),U{1,1})(\Omega, \mathcal{A}, P) = (\{1, -1\}^{2n}, \mathcal{P}(\{1, -1\}^{2n}), \mathcal{U}_{\{1, -1\}})

Now consider the graph (Ai)i{1,,n}(A_i)_{i \in \{1, \ldots, n\}} where Ai=(i,j=0iωi)A_i = (i, \sum\limits_{j = 0}^{i} \omega_i)

Formalise (i). as event GnG_n and (ii). as G>nG_{> n}. Then, the event Gn(Ai)1i2nG_n \subset (A_i)_{1\leq i\leq 2n} consists of the number of paths that start in (0,0)(0, 0) and after 2n2n steps arrive in (2n,0)(2n, 0) while inbetween it holds that y>0y > 0 for {(i,y)}1i2n\{(i, y)\}_{1 \leq i \leq 2n}. In other words, the graph does not cross the x axis.

To formalise the problem we need to introduce more notation. Let p=pA=12p = p_A = \frac{1}{2} be the probability for a step upwards and q=pB=1p=12q = p_B = 1 - p = \frac{1}{2} the one for one step downwards. A path that starts in (0,a)(0, a) and after nn steps arrives in (n,b)(n, b), can be characterized as having hh steps up and vv steps down. These necessarily fulfill the relationship:

h+v=nhv=ba

After solving for vv and hh, we obtain h=n+(ba)2h = \frac{n + (b - a)}{2} and v=n(ba)2v = \frac{n - (b - a)}{2}. The number of possibilities to perform hh steps up in nn steps total, is (nh)=(nnh)=(nv)\binom{n}{h} = \binom{n}{n - h} = \binom{n}{v}. Due to the uniformness assumption, the probability to go from aa to bb in nn steps is equal to Pn(a,b)=(nh)phqvP_{n}(a, b) = \binom{n}{h}p^hq^v.
Note that (nh)\binom{n}{h} does not yet take into account whether we cross the x-axis.

2. Solving the problem

Let Nn(a,b)N_{n}(a, b) be the number of paths from aa to bb in nn steps, Nn0(a,b)N^{\neq 0}_{n}(a, b) the number of paths from aa to bb that do not cross the xx-axis and N0(a,b)N^{0}(a, b) the number of paths from aa to bb that cross the x-axis. It clearly holds that Nn(a,b)=(nh)N_n(a, b) = \binom{n}{h} as you can choose hh steps up among nn steps in total.

But it also holds that

Nn(a,b)=Nn0(a,b)+Nn0(a,b)

and Nn0(a,b)=Nn(a,b)N_n^0(a, b) = N_n(a, b) if sgn(a)sgn(b)\textnormal{sgn}(a) \neq \textnormal{sgn}(b).

Therefore, it is obvious that P(Gn)=N2n0(0,0)(12)2n

but on the other hand,

N2n0(0,0)=N2n10(1,0)+N2n10(1,0)=2N2n10(1,0)=2N2n20(1,1)

since both candidates may be in the lead the whole time, i.e. the graph is either above or below the x-axis and, hence, wanders from 11 (candidate A) to 00 or from 1-1 (candidate B) to 00 in 2n12n - 1 steps. Since any path above the xx-axis always represents the mirror image of a suitable graph below the xx-axis, the number of paths double.

Eventually, there are exactly as many paths that wander in 2n12n - 1 steps from 11 to 00 as there are that wander in 2n22n - 2 steps from 11 to 11. The reason is that 2n12n - 1th steps is predetermined as being a step down from 11 to 00.

Now, we need to use the mirror principle, the main argument of the proof that really appealed to me: A path that crosses the xx-axis between the coordinates (1,1)(1, 1) and (2n2,1)(2n - 2, 1), can be represented as a path that starts in (1,1)(1, -1) and finishes in (2n2,1)(2n - 2, 1). We do this by mirroring the graph at the xx-axis until its first root appears. Hence, the number of paths that do not cross the xx-axis in 2n22n - 2 steps is equal to the difference between all paths between (1,1)(1, 1) and (1,2n2)(1, 2n - 2) and those that start in (1,1)(1, 1), finish in (2n2,1)(2n - 2, 1) and cross the xx-axis.

N2n20(1,1)=N2n2(1,1)N2n20(1,1)=N2n2(1,1)N2n2(1,1)=(2n2h1)(2n2h2)

With the formula for hh above, we obtain h1=2n2+(11)2=n1h_1 = \frac{2n - 2 + (1 - 1)}{2} = n - 1 and h2=2n2(1(1))2=nh_2 = \frac{2n - 2 - (1 - (-1))}{2} = n In total, we get

N2n20(1,1)=(2n2n1)(2n2n)

and

P(Gn)=N2n0(0,0)(12)2n=2N2n20(1,1)(12)2n=22n+1((2n2n1)(2n2n))=un1un

To solve (ii)., [1] suggests to consider the complementary event G>ncG^c_{> n}. Then, it holds due to the disjointness of GnG_n for different nn (which is clear intuitively) that

P(G>n)=P((G>nc)c)=1P(G>nc)=1P(k=1nGk)=1k=1nP(Gk)=1k=1n(uk1uk)=un

and hence,

P(G>n)=un

In the last step, we used the telescope sum together with u0=1u_0 = 1.

References

[1] H.-O. Georgii (2014): Stochastik. Eine Einführung.

[2] S. E. Alm (2006): Simple random walks.