You and your friends are singing the traditional song, “99 Bottles of Beer.” With each verse, you count down the number of bottles. The first verse contains the lyrics “99 bottles of beer,” the second verse contains the lyrics “98 bottles of beer,” and so on. The last verse contains the lyrics “1 bottle of beer.”
There’s just one problem. When completing any given verse, your group of friends has a tendency to forget which verse they’re on. When this happens, you finish the verse you are currently singing and then go back to the beginning of the song (with 99 bottles) on the next verse.
For each verse, suppose you have a 1 percent chance of forgetting which verse you are currently singing. On average, how many total verses will you sing in the song?
Extra credit: Instead of “99 Bottles of Beer,” suppose you and your friends are singing “N Bottles of Beer,” where N is some very, very large number. And suppose your collective probability of forgetting where you are in the song is 1/N for each verse. If it takes you an average of K verses to finish the song, what value does the ratio of K/N approach?
Let's define some variables. Let's say that B_n is the number of bottles of beer on the wall in the nth verse and let's say that you start with B_1 = N bottles of beer. Then the transition from B_n to B_{n+1} is given by B_{n+1} = \begin{cases} B_n - 1, &\text{with probability $1-p$}\\ N, &\text{with probability $p$}\end{cases} which holds even if B_n = 1. That is, the last verse is "1 bottle of beer on the wall, 1 bottle of beer, take 1 down pass it around, no bottles of beer on the wall," but somehow perhaps having ingested a few too many of the bottles of beer, your crew could still end up singing "1 bottle of beer on the wall, 1 bottle of beer, take 1 down pass it around, N bottles of beer on the wall" and you'd have to make your way all the way back down. So in this setup, we need to also state that the transition with B_{n+1} = 0 if B_n = 0, \forall n. Finally, let's define the expected number of verses as V_n = \mathbb{E}\left[ \min \{ k : B_k = 0 \} \mid B_1 = n\right], for any n = 1, \dots, N.
Now since there are only two possibilities at the end of each verse, we have \begin{align}V_n &= \mathbb{E} \left[ \min \{k : B_k = 0\} \mid B_2 = n-1, B_1 = n \right]
\mathbb{P} \{ B_2 = n-1 \mid B_1 = n \}\\ &\quad\quad + \mathbb{E} \left[ \min \{k : B_k=0\} \mid B_2 = N, B_1 = n \right] \mathbb{P} \left\{ B_2 = N \mid B_1 = n \right\} \\ &= (1 + V_{N-1}) (1-p) + (1 + V_N) p\\ &= (1-p) V_{n-1} + pV_N + 1,\end{align} for all n = 1, \dots, N, with V_0 = 0.
So in particular, we have V_N = (pV_N + 1) + (1-p) V_{N-1}. We can use this as the base case to prove that V_N = (pV_N+1) \sum_{i=0}^{k-1} (1-p)^i + (1-p)^k V_{N-k} for all k = 1, \dots, N. In particular, assume that for some k > 1, then substituting in the recurrence relationship for V_{N-k} we get \begin{align*} V_N &= (pV_N+1) \sum_{i=0}^{k-1} (1-p)^i + (1-p)^k ( (1-p) V_{N-k-1} + pV_N + 1) \\ &= (pV_N+1) \sum_{i=0}^k (1-p)^i + (1-p)^{k+1} V_{N-k-1},\end{align*} that is, the formula is also true for k+1. So, by induction, the formula must hold for all k = 1, \dots, N, including for instance k=N, so that \begin{align*}V_N &= (pV_N+1) \sum_{i=0}^{N-1} (1-p)^i + (1-p)^N V_0 = (pV_N+1) \sum_{i=0}^{N-1} (1-p)^i \\ &= (pV_N+1) \frac{1 - (1-p)^N}{p}.\end{align*}
Solving for V_N we get V_N = \frac{1 - (1-p)^N}{p(1-p)^N}. So in particular if N = 99 and p = 0.01, then we get an average of \frac{1-(0.99)^{99}}{0.01 * (0.99)^{99}} \approx 170.467 verses. However, if we set p = 1/N, then we get K=V_N = N\frac{1-(1-1/N)^N}{(1-1/N)^N}, so the fraction K/N = \frac{1 - (1-1/N)^N}{(1-1/N)^N} = (1-1/N)^{-N} - 1 \to e-1 as N\to \infty.