Once again, you are a frog in a pond. But this time, the pond has infinitely many lily pads, which are numbered “1,” “2,” “3,” etc. As before, you are currently on pad 2, and your goal is to make it to pad 1, which you would happily stay on forever.
Whenever you are on pad k, you will hop to pad k−1 with probability 1/k, and you will hop to pad k+1 with probability (k−1)/k.
Now, what is the probability that you will ultimately make it to pad 1?
Taking off where we landed from the Classic Fiddler, let's expand the 4 lily pad Markov chain to an n state Markov chain and then see what happens when n \to \infty. Ordering the transient states as pads n-1, n-2, \dots, 2, followed by the absorbing states, pads 1 and n, and skipping right to the canonical form of the transition matrix, we get \hat{T}_n = \begin{pmatrix} 0 & \frac{1}{n-1} & 0 & 0 & \cdots & 0 & 0 & 0 & 0 & \frac{n-2}{n-1} \\ \frac{n-3}{n-2} & 0 & \frac{1}{n-2} & 0 & \cdots & 0 & 0 & 0 & 0 & 0 \\ 0 & \frac{n-4}{n-3} & 0 & \frac{1}{n-3} & \cdots & 0 & 0 & 0 & 0 & 0 \\ \vdots & \vdots & \vdots & \vdots & & \vdots & \vdots & \vdots & \vdots & \vdots \\ 0 & 0 & 0 & 0 & \cdots & \frac{2}{3} & 0 & \frac{1}{3} & 0 & 0 \\ 0 & 0 & 0 & 0 & \cdots & 0 & \frac{1}{2} & 0 & \frac{1}{2} & 0 \\ 0 & 0 & 0 & 0 & \cdots & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & \cdots & 0 & 0 & 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} Q & R \\ 0 & I \end{pmatrix}. Again, we have the fundamental matrix N = (I - Q)^{-1} and P = NR, where here we can focus on the single entry P_{n-2,1}, the probability that the frog starting on pad 2 is eventually absorbed on pad 1. Instead of trying to code up an (n-2) \times (n-2) matrix inversion, note that the first column of R is \frac{1}{2} e_{n-2}, so let's instead rewrite the matrix equation as the tridiagonal system of equations (I - Q) x = \begin{pmatrix} 1 & -\frac{1}{n-1} & 0 & 0 & \cdots & 0 & 0 & 0 \\ -\frac{n-3}{n-2} & 1 & -\frac{1}{n-2} & 0 & \cdots & 0 & 0 & 0 \\ 0 & -\frac{n-4}{n-3} & 1 & -\frac{1}{n-3} & \cdots & 0 & 0 & 0 \\ \vdots & \vdots & \vdots & \vdots & & \vdots & \vdots & \vdots \\ 0 & 0 & 0 & 0 & \cdots & -\frac{2}{3} & 1 & -\frac{1}{3} \\ 0 & 0 & 0 & 0 & \cdots & 0 & -\frac{1}{2} & 1 \end{pmatrix} x = \frac{1}{2} e_{n-2}, and use the Thomas algorithm to find the probability of ending up on pad 1 forever in the case of n lily pads as p_{1,n} = x_{n-2} = P_{n-2,1}. In particular, the main diagonal entries are b_i = 1, i = 1, 2, \dots, n-2, the upper diagonal is c_i = -\frac{1}{n-i}, i = 1, 2, \dots, n-3, and the lower diagonal is a_i = -\frac{n-i-1}{n-i}, i = 2, 3, \dots, n-2. The Thomas algorithm consists of two sweeps: a forward sweep that computes modified coefficients c^\prime \in \mathbb{R}^{n-3} and d^\prime \in \mathbb{R}^{n-2}, and a backward sweep that starts by solving for x_{n-2} and then works backward from i = n-3 down to i = 1. In particular, we define c^\prime_1 = c_1 = -\frac{1}{n-1} and c^\prime_i = \frac{c_i}{1 - a_i c^\prime_{i-1}}, i = 2, \dots, n-3, and d^\prime_1 = d_1 and d^\prime_i = \frac{d_i - a_i d^\prime_{i-1}}{1 - a_i c^\prime_{i-1}}, i = 2, 3, \dots, n-2, and finally x_{n-2} = d^\prime_{n-2} and x_i = d^\prime_i - c^\prime_i x_{i+1}, i = n-3, n-4, \dots, 1. Additionally, since d_i = 0 for i = 1, \dots, n-3, we have d^\prime_i = 0 for i = 1, \dots, n-3, with p_{1,n} = P_{n-2,1} = x_{n-2} = d^\prime_{n-2} = \frac{\frac{1}{2}}{1 - \left( -\frac{1}{2} \right) c^\prime_{n-3}} = \frac{1}{2 + c^\prime_{n-3}}, and since we only care about x_{n-2}, we don't have to do any of the backward sweep.
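Before leaning on the Thomas algorithm, it's worth sanity-checking the tridiagonal setup against a brute-force solve. Here is a short sketch (the function name absorption_prob_dense and the pure-Python Gaussian elimination are my own scaffolding, not part of the puzzle) that builds I - Q densely and solves (I - Q) x = \frac{1}{2} e_{n-2} directly:

```python
def absorption_prob_dense(n):
    """Build (I - Q) for n lily pads and solve (I - Q) x = (1/2) e_{n-2}
    by plain Gaussian elimination; returns x_{n-2} = P_{n-2,1}."""
    m = n - 2  # number of transient states (pads n-1, ..., 2)
    A = [[0.0] * m for _ in range(m)]
    d = [0.0] * m
    for i in range(1, m + 1):  # 1-indexed rows, as in the text
        A[i - 1][i - 1] = 1.0  # b_i = 1
        if i <= m - 1:
            A[i - 1][i] = -1.0 / (n - i)  # c_i = -1/(n-i)
        if i >= 2:
            A[i - 1][i - 2] = -(n - i - 1.0) / (n - i)  # a_i = -(n-i-1)/(n-i)
    d[m - 1] = 0.5  # right-hand side (1/2) e_{n-2}
    # Forward elimination (no pivoting needed for this well-behaved system)
    for col in range(m - 1):
        for row in range(col + 1, m):
            f = A[row][col] / A[col][col]
            if f != 0.0:
                for k in range(col, m):
                    A[row][k] -= f * A[col][k]
                d[row] -= f * d[col]
    # Back substitution
    x = [0.0] * m
    for row in range(m - 1, -1, -1):
        s = sum(A[row][k] * x[k] for k in range(row + 1, m))
        x[row] = (d[row] - s) / A[row][row]
    return x[m - 1]
```

For n = 4 this returns 0.6, and for n = 6 it returns 0.630769..., matching the table of absorption probabilities below.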
So we can simply code up one short function that runs the forward sweep through the c^\prime_i values, i = 1, \dots, n-3, for instance the following Python script:
def prob_zero(N):
    # Probability of ending up on pad 1 forever with N lily pads (N >= 4).
    # Forward sweep of the Thomas algorithm: c'_1 = c_1 = -1/(N-1).
    c = -1.0 / (N - 1)
    # c'_i = c_i / (1 - a_i c'_{i-1}) with c_i = -1/(N-i) and a_i = -1 + 1/(N-i).
    for i in range(2, N - 2):
        c = (-1.0 / (N - i)) / (1.0 - (-1.0 + 1.0 / (N - i)) * c)
    # p_{1,N} = 1 / (2 + c'_{N-3})
    return 0.5 / (1.0 + 0.5 * c)
We can just get the values of the probability of absorption at pad 1 for many, many values of n and empirically see that as n gets large we converge (fairly quickly) to a limit that is disturbingly close to 1 - e^{-1}. So much so that, if we wanted, we could likely call it quits here and claim that the probability of ending up forever on pad 1 in the infinite lily pad case is p_{1,\infty} = 1 - e^{-1}.
| # of Lily Pads | Absorption Prob. |
| --- | --- |
| 4 | 60% |
| 5 | 62.5% |
| 6 | 63.07692307692307% |
| 7 | 63.19018404907976% |
| 8 | 63.20899335717935% |
| 9 | 63.21167883211679% |
| 10 | 63.21201448891889% |
| 11 | 63.21205178374104% |
| 12 | 63.21205551321910% |
| 13 | 63.21205585226252% |
| 14 | 63.21205588051614% |
| 15 | 63.21205588268949% |
| 16 | 63.21205588284473% |
| 17 | 63.21205588285508% |
| 18 | 63.21205588285572% |
| 19 | 63.21205588285577% |
| 20 | 63.21205588285577% |
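We can also attack the infinite pond head-on with a Monte Carlo simulation. The sketch below is my own scaffolding: the cap of 50 pads, the trial count, and the seed are arbitrary choices; once the frog climbs that high, the chance of it ever returning to pad 1 should be negligible, so those runs are counted as never absorbed.

```python
import math
import random

def simulate(trials=100_000, cap=50, seed=12345):
    """Estimate the probability of reaching pad 1 from pad 2 when, on pad k,
    the frog hops down with probability 1/k and up with probability (k-1)/k."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        pad = 2
        while 1 < pad <= cap:
            if rng.random() < 1.0 / pad:
                pad -= 1  # hop down to pad k-1, probability 1/k
            else:
                pad += 1  # hop up to pad k+1, probability (k-1)/k
        if pad == 1:
            hits += 1
    return hits / trials

print(simulate(), 1.0 - math.exp(-1.0))
```

With 100,000 trials the estimate lands within about a tenth of a percentage point of 1 - e^{-1}, consistent with the table above.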
However, perhaps we can do better than just our anecdotal, experimental evidence. Let's start at p_{1,n} = \frac{1}{2+c^\prime_{n-3}(n)}. Using the recursive formula, we get c^\prime_{n-3}(n) = \frac{c_{n-3}}{1 - a_{n-3} c^\prime_{n-4}(n)} = \frac{-\frac{1}{3}}{1 + \frac{2}{3} c^\prime_{n-4}(n)} = -\frac{1}{3 + 2 c^\prime_{n-4}(n)}. Going a step further we get c^\prime_{n-4}(n) = \frac{ -\frac{1}{4} }{1 - \frac{3}{4} c^\prime_{n-5}(n)} = -\frac{1}{4 + 3c^\prime_{n-5}(n)}, which plugging back into the previous equation gives c^\prime_{n-3}(n) = - \frac{1}{3 + 2 \frac{ -1 }{4 + 3 c^\prime_{n-5}(n)} } = - \frac{4 + 3 c^\prime_{n-5}(n)}{10 + 9 c^\prime_{n-5}(n)}. We see that we can define c^\prime_{n-3}(n) as a rational function of c^\prime_{n-k}(n), for each k = 4, \dots, n-1. In particular, let's say that c^\prime_{n-3}(n) = -\frac{ \alpha_k + (\alpha_k - 1) c^\prime_{n-k}(n) }{\beta_k + (\beta_k - 1) c^\prime_{n-k}(n)}, where \alpha_4 = 1 and \beta_4 = 3. From the recursion formula for c^\prime we get that c^\prime_{n-3}(n) = - \frac{\alpha_k + (\alpha_k - 1) \frac{ - 1 }{k + (k-1) c^\prime_{n-k-1}(n)} }{ \beta_k + (\beta_k - 1) \frac{ -1}{ k + (k-1) c^\prime_{n-k-1}(n)}} = -\frac{ k \alpha_k - (\alpha_k - 1) + (k-1) \alpha_k c^\prime_{n-k-1}(n)} { k \beta_k - (\beta_k - 1) + (k-1) \beta_k c^\prime_{n-k-1}(n)}, so we can derive the recursion formulae for \alpha_k and \beta_k as \begin{align*}\alpha_{k+1} &= (k-1) \alpha_k + 1 \\ \beta_{k+1} &= (k-1) \beta_k + 1,\end{align*} for all k \geq 4, with initial conditions \alpha_4 = 1 and \beta_4 = 3.
So, trying to organize the situation a bit, we notice that since c_1^\prime(n) = -\frac{1}{n-1}, the recursion has to stop at k = n-1, where we get c^\prime_{n-3}(n) = - \frac{ \alpha_{n-1} - \frac{\alpha_{n-1} - 1}{n-1} }{ \beta_{n-1} - \frac{\beta_{n-1} - 1}{n-1} } = -\frac{ \alpha_{n-1} (n-2) + 1 }{\beta_{n-1} (n-2) + 1}, where \alpha_{n-1} and \beta_{n-1} can be read from the recursion formulae above. Plugging this back into the absorption probability, we see that \begin{align*}p_{1,n} &= \frac{1}{2 + c_{n-3}^\prime(n)} \\ &= \frac{1}{2 - \frac{ \alpha_{n-1} (n-2) + 1}{ \beta_{n-1} (n-2) + 1} } \\ & = \frac{ \beta_{n-1} (n-2) + 1 }{ (2 \beta_{n-1} - \alpha_{n-1}) (n-2) + 1 } \\ &= \frac{ \beta_{n-1} + O (n^{-1}) }{ (2 \beta_{n-1} - \alpha_{n-1}) + O(n^{-1}) }.\end{align*} Now, let's turn our attention to the growth of \alpha_{n-1} and \beta_{n-1}.
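Since \alpha_k and \beta_k satisfy a clean integer recursion, this closed form is easy to sanity-check against the table above. A quick sketch (the function names are mine, and the formula only applies for n \geq 5, where the recursion actually runs):

```python
def alpha_beta(n):
    """Iterate alpha_{k+1} = (k-1) alpha_k + 1 and beta_{k+1} = (k-1) beta_k + 1
    from alpha_4 = 1, beta_4 = 3 up to alpha_{n-1}, beta_{n-1}."""
    a, b = 1, 3
    for k in range(4, n - 1):
        a, b = (k - 1) * a + 1, (k - 1) * b + 1
    return a, b

def p_closed_form(n):
    # p_{1,n} = (beta_{n-1}(n-2) + 1) / ((2 beta_{n-1} - alpha_{n-1})(n-2) + 1)
    a, b = alpha_beta(n)
    return (b * (n - 2) + 1) / ((2 * b - a) * (n - 2) + 1)
```

For n = 5 this gives 10/16 = 62.5%, and for n = 10 it gives 69281/109601, about 63.21201448...%, agreeing with the corresponding table entries.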
Given the recursion formulae and initial conditions, we can show that in fact \alpha_k = \lfloor (e-2) (k-2)! \rfloor and \beta_k = \lfloor (e-1) (k-2)! \rfloor for all k \geq 4. Clearly, the initial conditions hold since \lfloor (e-2) \cdot (4-2)! \rfloor = \lfloor 1.436\dots \rfloor = 1 and \lfloor (e-1) \cdot (4-2)! \rfloor = \lfloor 3.436\dots \rfloor = 3. Assume that for some j \geq 4 we have \begin{align*}\alpha_j &= \lfloor (e-2) (j-2)! \rfloor \\ \beta_j &= \lfloor (e-1) (j-2)! \rfloor.\end{align*} From the recursion formulae, we get \begin{align*} \alpha_{j+1} &= (j-1) \alpha_j + 1 = (j-1) \lfloor (e-2) (j-2)! \rfloor + 1 \\ \beta_{j+1} &= (j-1) \beta_j + 1 = (j-1) \lfloor (e-1) (j-2)! \rfloor + 1.\end{align*} At this point we will rely on the following lemma (which can be proven using an application of the Taylor remainder theorem): for any integer \ell \geq 2, \sum_{k = \ell}^\infty \frac{1}{k!} \lt \frac{1}{(\ell - 1)!}. Using this lemma, the tail term (j-1)! \sum_{k=j}^\infty \frac{1}{k!} lies strictly between 0 and 1, so it vanishes under the floor and we see that \begin{align*}\lfloor e (j-1)! \rfloor &= \left\lfloor \sum_{k=0}^{j-1} \frac{(j-1)!}{k!} + (j-1)! \sum_{k=j}^\infty \frac{1}{k!} \right\rfloor \\ &= (j-1)! \sum_{k=0}^{j-1} \frac{1}{k!} \\ &= (j-1)! \sum_{k=0}^{j-2} \frac{1}{k!} + 1 \\ &= (j-1) \left( (j-2)! \sum_{k=0}^{j-2} \frac{1}{k!} \right) + 1 \\ &= (j-1) \lfloor e (j-2)! \rfloor + 1.\end{align*} So plugging back into the recursion formulae above, since (j-2)! and 2 (j-2)! are integers (so that \lfloor (e-1) m! \rfloor = \lfloor e \, m! \rfloor - m! and \lfloor (e-2) m! \rfloor = \lfloor e \, m! \rfloor - 2 m!), we confirm the induction step, since \begin{align*} \alpha_{j+1} &= (j-1) \lfloor (e-2) (j-2)! \rfloor + 1 = \lfloor (e-2) (j-1) ! \rfloor \\ \beta_{j+1} &= (j-1) \lfloor (e-1) (j-2)! \rfloor + 1 = \lfloor (e-1) (j-1)! \rfloor,\end{align*} thus proving the formulae \alpha_k = \lfloor (e-2) (k-2)! \rfloor, \beta_k = \lfloor (e-1) (k-2)! \rfloor, \,\, \forall k \geq 4.
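The induction can also be spot-checked numerically. Floating-point e is plenty accurate here for modest k, since the fractional part of e (k-2)! is on the order of 1/(k-1), well away from an integer boundary; the loop bound of 14 is an arbitrary choice of mine.

```python
import math

# Recursion: alpha_{k+1} = (k-1) alpha_k + 1, beta_{k+1} = (k-1) beta_k + 1
alpha, beta = 1, 3  # alpha_4, beta_4
for k in range(4, 15):
    assert alpha == math.floor((math.e - 2) * math.factorial(k - 2))
    assert beta == math.floor((math.e - 1) * math.factorial(k - 2))
    alpha, beta = (k - 1) * alpha + 1, (k - 1) * beta + 1
print("floor formulae hold for k = 4, ..., 14")
```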
Finally, returning our attention to the formula for the absorption probability, and recalling that \alpha_{n-1} = \lfloor (e-2)(n-3)! \rfloor and \beta_{n-1} = \lfloor (e-1)(n-3)! \rfloor, we get \begin{align*}p_{1,n} &= \frac{ \beta_{n-1} + O (n^{-1}) }{ (2 \beta_{n-1} - \alpha_{n-1}) + O(n^{-1})}\\ &= \frac{ \lfloor (e-1) (n-3)! \rfloor + O(n^{-1}) }{ \left( 2 \lfloor (e-1) (n-3)! \rfloor - \lfloor (e-2) (n-3)! \rfloor \right) + O(n^{-1}) } \\ &= \frac{ (n-3)! \sum_{k=1}^{n-3} \frac{1}{k!} + O(n^{-1}) }{ (n-3)! \left( 2 \sum_{k=1}^{n-3} \frac{1}{k!} - \sum_{k=2}^{n-3} \frac{1}{k!} \right) + O(n^{-1})} \\ &= \frac{ \sum_{k=1}^{n-3} \frac{1}{k!} + O\left( \frac{1}{ (n-2)!}\right) }{ \left(2 \sum_{k=1}^{n-3} \frac{1}{k!} - \sum_{k=2}^{n-3} \frac{1}{k!} \right) + O \left(\frac{1}{(n-2)!}\right)},\end{align*} where we used the facts that \lfloor (e-1) m! \rfloor = m! \sum_{k=1}^{m} \frac{1}{k!} and \lfloor (e-2) m! \rfloor = m! \sum_{k=2}^{m} \frac{1}{k!}. So we have that the probability of ending up on pad 1 in the infinite lily pad problem is \begin{align*}p_{1,\infty} = \lim_{n \to \infty} p_{1,n} &= \lim_{n\to \infty} \frac{ \sum_{k=1}^{n-3} \frac{1}{k!} + O\left( \frac{1}{ (n-2)!}\right)}{ \left(2 \sum_{k=1}^{n-3} \frac{1}{k!} - \sum_{k=2}^{n-3} \frac{1}{k!} \right) + O\left( \frac{1}{ (n-2)!}\right)}\\ &= \frac{\sum_{k=1}^\infty \frac{1}{k!}}{ 2 \sum_{k=1}^\infty \frac{1}{k!} - \sum_{k=2}^\infty \frac{1}{k!}} \\ &= \frac{e-1}{2(e-1) - (e-2)} = 1 - e^{-1},\end{align*} as experimentally derived.
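As one last cross-check, running the \alpha/\beta recursion in exact rational arithmetic shows p_{1,n} settling onto 1 - e^{-1} at the factorial rate the error terms above predict. The helper p_exact below is my own sketch, using Python's fractions module to keep everything exact until the final comparison:

```python
import math
from fractions import Fraction

def p_exact(n):
    """p_{1,n} as an exact rational (n >= 5), via the alpha/beta recursion."""
    a, b = 1, 3  # alpha_4, beta_4
    for k in range(4, n - 1):
        a, b = (k - 1) * a + 1, (k - 1) * b + 1
    return Fraction(b * (n - 2) + 1, (2 * b - a) * (n - 2) + 1)

# By n = 20 the exact value and 1 - 1/e agree to double precision.
print(float(p_exact(20)))
print(1 - math.exp(-1))
```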