Ed Nigma's Answer Emporium: R. classicum ...? More like R. catalanii

The R. classicum reproductive behavior leads to some familiar combinatorial behavior involving the Catalan numbers. I'm not totally sure how to get the made up scientific nomenclature of a non-existent bacteria changed ... but I think we should look into it.

First the setup:
R. classicum either splits into two copies (with probability $p$) or dies (with probability $1-p$). If you start with a single R. classicum bacterium, what is the probability that it will lead to an everlasting colony (i.e., the colony will theoretically persist for an infinite amount of time)?

Theorem. The probability that a single R. classicum bacteria leads to an everlasting colony is \[E(p) = \max \left\{ 2 - \frac{1}{p}, 0 \right\}.\] In addition to the colony population at time $t$, $N_t,$ let's additionally catalog the history of a bacteria colony by tracking two quantities: $R_t$ and $D_t$, the number of reproductions/doublings and deaths that have occurred up until time $t.$ For consistency, let's define $R_0=D_0=0.$

Proposition. For each $t \geq 0,$ \[N_t = R_t - D_t + 1.\]
Proof. We will proceed by induction. The base case of $t = 0$ satisfies the equation.
Assume that for some time $t = \tau \geq 0$ that the formula holds, that is, $N_\tau = R_\tau - D_\tau + 1.$
Mechanically we see that $N_{\tau+1}$ is equal to twice the number of bacteria that reproduced at time $t = \tau+1$, so in particular \[R_{\tau+1} = \frac{N_{\tau+1}}{2} + R_\tau.\] Additionally, the remainder of the population at time $t=\tau$ that did not reproduce must have died, we have \[D_{\tau+1} = \left(N_\tau - \frac{N_{\tau+1}}{2}\right) + D_\tau.\] If we subtract the deaths equation from the reproductions, we get \[R_{\tau+1} - D_{\tau+1} = N_{\tau+1} - N_\tau + R_\tau - D_\tau.\] From the inductive hypothesis we have, $N_\tau = R_\tau - D_\tau + 1,$ so we can substitute above and get \[R_{\tau+1} - D_{\tau+1} = N_{\tau+1} - (R_\tau - D_\tau + 1) + R_\tau -D_\tau = N_{\tau+1} - 1,\] or equivalently $N_{\tau+1} = R_{\tau+1} - D_{\tau+1} + 1.$ Thus, by simple induction, the proposition holds for all time steps $t \geq 0.$

The corollary follows immediately.

Corollary. The colony population collapses if and only if there is some step $t > 0$ for which $D_t = R_t + 1.$

Using the corollary, we can then proceed with the proof of the main Theorem.

Proof. Let's calculate $1 - E(p),$ which is the the probability that the R. classicum colony eventually collapses. From the corollary, this means that there is some $t>0$ for which $D_t = R_t + 1.$ We see that the probability of a colony history with, say $k$ reproductions and $k+1$ deaths is $p^k (1-p)^{k+1}.$ Let $C_k$ denote the number of distinct colony histories that end in collapse after $k$ reproductions and $k+1$ deaths.
Then the probability of that the colony eventually collapses is \[1 - E(p) = \sum_{k=0}^\infty C_k p^k (1-p)^{k+1}.\] Now the only thing that remains to show is to find out what $C_k$ is (though I imagine you've already gathered where I'm going with this...). Since a population collapse needs to end it at least one death, the remaining $k$ reproductions and $k$ deaths can be ordered in any way such that the cumulative number of deaths is never more than the cumulative number of reproductions. Thus there is a simple bijection between colony histories with $k$ reproductions and $k+1$ deaths and nonnegative Dyck paths of length $2k$ (for instance set reproductions to ``up'' steps and deaths to ``down'' steps), of which there are \[C_k = \frac{1}{k+1} \binom{2k}{k},\] that is, the $k$th Catalan number, such paths.
Using the generating function for the Catalan numbers, \[c(x) = \sum_{n=0}^\infty C_n x^n = \frac{1 - \sqrt{1-4x}}{2x}\] we get that the probability of eventual colony collapse is \begin{align}1 - E(p) &= \sum_{k=0}^\infty C_k p^k (1-p)^{k+1}\\ &= (1-p) c\Big( p(1-p) \Big)\\ &= (1-p) \frac{ 1 - \sqrt{1 - 4p(1-p)}}{2p(1-p)} \\ &= \frac{1 - \sqrt{1-4p+4p^2}}{2p} = \frac{1 - |1-2p|}{2p}.\end{align} So, for $p > 0.5,$ we have \[E(p) = 1 - \frac{1-|1-2p|}{2p} = 1 - \frac{ 1 - (2p-1) }{2p} = 1 - \frac{2(1-p)}{2p} = 2 - \frac{1}{p};\] whereas for $p \leq 0.5,$ we have \[E(p) = 1 - \frac{1-|1-2p|}{2p} = 1 - \frac{1 - (1-2p)}{2p} = 1 - \frac{2p}{2p} = 0.\] Altogether, the formula $E(p) = \max\{ 2 - \frac{1}{p}, 0 \}$ is proved.

Editor's Note: I actually solved this brute force in Excel by setting up several steps of the process, seeing the when $p=0.8$ that the probability of having a zero population quickly to $0.25$. Varying the input value of $p$ quickly showed that the probability of having a zero population was $\min\{1,(1-p)/p\}$. However, I was convinced that there was a theoretical solution and was pleasantly surprised that it led solidly back to my undergraduate research in enumerative combinatorics with Prof. Ira Gessel at Brandeis University (cf. this sloppy arXiv article).

Editor's Note 2: This seems like a really convoluted way of proving that if the expected population of the colony decreases to zero over time (that is, when $p < 0.5$), that population collapse is almost certain. I suppose it is a roughly serviceable level of convolution for showing almost sure population collapse when the expected value is constant (that is, when $p=0.5$).

Ed Nigma's Answer Emporium

Saturday, June 13, 2020

R. classicum ...? More like R. catalanii

No comments:

Post a Comment