
Monday, April 7, 2025

Striking Ore while digging into Lipkovitz's Hamiltonian puzzle

A teacher is handing out candy to his students. He abides by the following rules:

  • He hands out candy to groups of three students (i.e., “trios”) at a time. Each member of the trio gets one piece of candy.
  • Each unique trio can ask for candy, but that same trio can’t come back for seconds. If students in the trio want more candy, they must return as part of a different trio.
  • When a trio gets candy, the next trio can’t contain any students from that previous trio.

It turns out that every possible trio can get a helping of candy together. What is the smallest class size for which this is possible?

Obviously, I assume that we are not talking about the trivial case of N=3 students, because, ... lame!

So let's assume that there are N>3 students. Then there will be M = \binom{N}{3} = \frac{1}{6}N(N-1)(N-2) possible trios in the class. Let's assume that the set V = \{ v_1, v_2, \dots, v_M \} enumerates all such trios. Based on the teacher's final condition, we can see that if v_i and v_j share a common student, then v_j cannot be the next trio to ask if v_i has just received candy. Let's define the edge set E = \{ (v_i, v_j) \in V \times V \mid j \gt i, v_i \cap v_j = \emptyset \}; that is, if the trio v_j could immediately follow the trio v_i (or vice versa), then there is an edge between v_i and v_j. We will call vertices v_i and v_j adjacent if there is an edge e \in E connecting them. Let's define the graph \mathcal{G} = (V, E). Note that we define the degree of a vertex, denoted \delta(v), to be the number of vertices w \in V that are adjacent to v, that is, \delta(v) = \# \{ w \in V \mid (v,w) \in E \}.

Since it turns out that every possible trio can get a helping of candy together, but based on the second condition each trio can only ask for candy once, there must be some path on \mathcal{G} that visits each vertex exactly once. This type of path is known in graph theory as a Hamiltonian path. A graph is called Hamiltonian if it admits a Hamiltonian path that starts and ends at adjacent vertices, that is, a Hamiltonian cycle. The Norwegian graph theorist Øystein Ore provided the following sufficient condition for a graph to be Hamiltonian based on the degrees of its vertices:

Ore's Theorem: A graph \mathcal{G} = (V, E) with |V| = n is Hamiltonian if for every pair of non-adjacent vertices v, w \in V, \delta(v) + \delta(w) \geq n.

Let's note that for our graph \mathcal{G}, since any vertex v is adjacent to precisely those trios made up of students not in v, we have \delta(v) = \binom{N-3}{3} = \frac{1}{6} (N-3)(N-4)(N-5) = \frac{N^3 -12N^2 + 47N - 60}{6}. Since every vertex has the same degree, an application of Ore's theorem shows that whenever M = \frac{N^3-3N^2+2N}{6} \leq 2 \cdot \frac{N^3 -12N^2 +47N - 60}{6}, or equivalently, whenever N^3 - 21N^2 +92N -120 \geq 0, then \mathcal{G} is Hamiltonian. Since the only real root of the polynomial p(t) = t^3 -21t^2 +92t -120 is at t^* \approx 15.59\dots, we have that whenever N \geq 16, every possible trio can have a helping of candy together.
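As a quick sanity check on the algebra, we can have the computer find the smallest N for which the Ore condition 2\delta(v) \geq M holds (a short sketch; the function name is my own):

```python
from math import comb

def ore_condition_holds(n):
    """Ore's sufficient condition for the trio graph: 2*delta >= |V|,
    since every vertex has the same degree delta = C(n - 3, 3)."""
    num_trios = comb(n, 3)       # M, the number of vertices
    degree = comb(n - 3, 3)      # delta, trios disjoint from a fixed trio
    return 2 * degree >= num_trios

# the condition first holds at N = 16, matching the root t* ≈ 15.59
first_n = next(n for n in range(4, 100) if ore_condition_holds(n))
print(first_n)  # prints 16
```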

Monday, March 31, 2025

Boosting busts more seeds in bigger tourneys

Instead of four teams, now there are 2^6, or 64, seeded from 1 through 64. The power index of each team is equal to 65 minus that team’s seed.

The teams play in a traditional seeded tournament format. That is, in the first round, the sum of opponents’ seeds is 2^6+1, or 65. If the stronger team always advances, then the sum of opponents’ seeds in the second round is 2^5+1, or 33, and so on.

Once again, the underdog in every match gets a power index boost B, where B is some positive non-integer. Depending on the value of B, different teams will win the tournament. Of the 64 teams, how many can never win, regardless of the value of B?

After setting up the 64-team bracket, we can choose various values of B and then compute the winners directly. Since the outcome remains the same for every non-integer B in (\lfloor B \rfloor, \lceil B \rceil), we need only test 64 representative values, e.g., B = b+0.5, for b = 0, 1, \dots, 63. The outcomes of the 64-team tournament bracket are summarized in the table below. We note that there are 27 seeds (namely 7, 13-15, and 25-47) that never win the tournament. The unluckiest among them are 7, 15 and 47: each has a set of B values of measure 3 for which it ends up as the runner-up of the tournament, though it never wins.

B Winning Seed Runner Up
(0,1) 1 2
(1,2) 1 3
(2,3) 3 1
(3,4) 2 1
(4,5) 6 5
(5,6) 5 3
(6,7) 5 3
(7,8) 4 3
(8,9) 12 11
(9,10) 11 9
(10,11) 11 9
(11,12) 10 9
(12,13) 10 9
(13,14) 9 7
(14,15) 9 7
(15,16) 8 7
(16,17) 24 23
(17,18) 23 21
(18,19) 23 21
(19,20) 22 21
(20,21) 22 21
(21,22) 21 19
(22,23) 21 19
(23,24) 20 19
(24,25) 20 19
(25,26) 19 17
(26,27) 19 17
(27,28) 18 17
(28,29) 18 17
(29,30) 17 15
(30,31) 17 15
(31,32) 16 15
(32,33) 48 47
(33,34) 49 47
(34,35) 49 47
(35,36) 50 49
(36,37) 50 49
(37,38) 51 49
(38,39) 51 49
(39,40) 52 51
(40,41) 52 51
(41,42) 53 51
(42,43) 53 51
(43,44) 54 53
(44,45) 54 53
(45,46) 55 53
(46,47) 55 53
(47,48) 56 55
(48,49) 56 55
(49,50) 57 55
(50,51) 57 55
(51,52) 58 57
(52,53) 58 57
(53,54) 59 57
(54,55) 59 57
(55,56) 60 59
(56,57) 60 59
(57,58) 61 59
(58,59) 61 59
(59,60) 62 61
(60,61) 62 61
(61,62) 63 61
(62,63) 63 61
\gt 63 64 63
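The table above can be reproduced with a short simulation (a sketch; `seed_order` and `play` are names of my own choosing):

```python
def seed_order(n):
    """Standard seeded bracket order, so round-one opponents' seeds sum to n + 1."""
    order = [1]
    while len(order) < n:
        m = 2 * len(order)
        order = [s for x in order for s in (x, m + 1 - x)]
    return order

def play(n, B):
    """Return (champion, runner-up); power index = n + 1 - seed, underdog gets +B."""
    def winner(a, b):
        fav, dog = (a, b) if a < b else (b, a)  # lower seed number is the favorite
        return dog if (n + 1 - dog) + B > (n + 1 - fav) else fav
    teams = seed_order(n)
    while len(teams) > 2:
        teams = [winner(a, b) for a, b in zip(teams[::2], teams[1::2])]
    champ = winner(teams[0], teams[1])
    return champ, teams[0] + teams[1] - champ

champions = {play(64, b + 0.5)[0] for b in range(64)}
never_win = sorted(set(range(1, 65)) - champions)
print(len(never_win))  # 27 seeds never win
```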

Boosting will cause the 2 seed to always bust

Once again, there are four teams remaining in a bracket: the 1-seed, the 2-seed, the 3-seed, and the 4-seed. In the first round, the 1-seed faces the 4-seed, while the 2-seed faces the 3-seed. The winners of these two matches then face each other in the regional final.

Also, each team possesses a “power index” equal to 5 minus that team’s seed. In other words:

  • The 1-seed has a power index of 4.
  • The 2-seed has a power index of 3.
  • The 3-seed has a power index of 2.
  • The 4-seed has a power index of 1.

In any given matchup, the team with the greater power index would emerge victorious. However, March Madness fans love to root for the underdog. As a result, the team with the lower power index gets an effective “boost” B, where B is some positive non-integer. For example, B could be 0.5, 133.7, or 2\pi, but not 1 or 42.

As an illustration, consider the matchup between the 2- and 3-seeds. The favored 2-seed has a power index of 3, while the underdog 3-seed has a power index of 2+B. When B is greater than 1, the 3-seed will defeat the 2-seed in an upset.

Depending on the value of B, different teams will win the tournament. Of the four teams, how many can never win, regardless of the value of B?

As shown in the prompt, if B \lt 1 then 2 will beat 3 in the first round. On the other side of the bracket, 1 will beat 4, since 4 \gt 1 + B in this case, so in the final round we have 1 beating 2, since again 4 \gt 3 + B when B \lt 1. So whenever B \lt 1, 1 will win the championship. On the other hand, whenever B \gt 1, 2 will lose to 3 in the first round. Therefore, 2 will never win the championship.

Whenever B \in (1,2), 3 will beat 2 but then lose the final to 1, since 2 + B \lt 4. Whenever B \in (2,3), 3 will beat 1 for the championship, since 2 + B \gt 4. Whenever B \gt 3, 4 will beat 1 in the first round and then go on to beat 3 for the championship. Thus, there are values of B for which 1, 3 and 4 will win the championship. So out of the four remaining teams, only one of them (the 2-seed) will never win.
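A tiny brute-force check of the four-team bracket (a sketch; `champion` is a name I made up) confirms that every team except the 2-seed wins for some B:

```python
def champion(B):
    """Four-team bracket: 1 vs 4 and 2 vs 3, winners meet; power = 5 - seed,
    and the underdog (the higher seed number) gets the boost B."""
    def winner(a, b):
        fav, dog = (a, b) if a < b else (b, a)
        return dog if (5 - dog) + B > (5 - fav) else fav
    return winner(winner(1, 4), winner(2, 3))

# sample one B from each unit interval
print([champion(b + 0.5) for b in range(5)])  # prints [1, 1, 3, 4, 4]
```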

Sunday, March 16, 2025

An Extra Credit \pi day \pi-cnic in \pi-land

Suppose the island of \pi-land, as described above, has a radius of 1 mile. That is, Diametric Beach has a length of 2 miles. Again, you are picking a random point on the island for a picnic. On average, what will be the expected shortest distance to shore?

Here we want to calculate the average, minimum distance to shore E = \frac{1}{A} \int_{-R}^R \int_0^{\sqrt{R^2 - x^2}} \min \{ d_D (x,y), d_S (x,y) \} \,dy \,dx, where d_D and d_S were defined as above in the Classic Fiddler answer. As we saw in the Classic Fiddler answer, the region where d_D \leq d_S is given by \Omega, so we can break the integral above into two sections E = \frac{1}{A} \int_{-R}^R \int_0^{\frac{R^2 - x^2}{2R}} y \,dy\,dx + \frac{1}{A} \int_{-R}^R \int_{\frac{R^2-x^2}{2R}}^{\sqrt{R^2 - x^2}} \left(R - \sqrt{x^2+y^2}\right) \,dy \,dx := E_1 + E_2. The first integral, E_1, is relatively straightforward to do, \begin{align*}E_1 = \frac{1}{A} \int_{-R}^R \int_0^{\frac{R^2 - x^2}{2R}} y \, dy \,dx &= \frac{4}{\pi R^2} \int_0^R \left( \frac{y^2}{2} \right)_{y=0}^{y= \frac{R^2 - x^2}{2R}} \,dx\\ &= \frac{4}{\pi R^2} \int_0^R \frac{1}{2} \left( \frac{R^2 - x^2 }{2R} \right)^2 \,dx \\ &= \frac{4}{\pi R^2} \int_0^R \frac{1}{8R^2} \left( x^4 - 2R^2 x^2 + R^4 \right) \,dx \\ &= \frac{1}{2\pi R^4} \left[ \frac{x^5}{5} - \frac{2R^2x^3}{3} + R^4 x \right]_{x=0}^{x=R} \\ &= \frac{1}{2\pi R^4} \left( \frac{R^5}{5} - \frac{2R^5}{3} + R^5 \right) \\ &= \frac{R}{2\pi} \left( \frac{1}{5} - \frac{2}{3} + 1 \right) \\ &= \frac{R}{2\pi} \frac{3 - 10 + 15}{15} = \frac{4R}{15 \pi} \end{align*}

For the second integral, it will be easier to switch to polar coordinates, then do some trig-substitutions to get a handle on the resulting integral. The curve y = \frac{R^2 - x^2}{2R} is equivalent to the equation x^2 + 2R y - R^2 = 0 which is given by r^2 \cos^2 \theta + 2R r \sin \theta - R^2 = 0 in polar coordinates. Solving the quadratic in terms of r and choosing the positive solution when \sin \theta \geq 0 since \theta \in [0,\pi], we get \begin{align*}r &= \frac{ -2R \sin \theta + \sqrt{ 4R^2 \sin^2 \theta + 4R^2 \cos^2 \theta} }{2 \cos^2 \theta} \\ &= \frac{-2R \sin \theta + 2R}{2 \cos^2 \theta} \\ &= \frac{ R \left( 1 - \sin \theta \right) }{ \cos^2 \theta } \\ &= \frac{R}{1 + \sin \theta},\end{align*} since \cos^2 \theta = 1 - \sin^2 \theta = (1 - \sin \theta) (1 + \sin \theta). Therefore, we have \begin{align*}E_2 &= \frac{1}{A} \int_{-R}^R \int_{\frac{R^2 - x^2}{2R}}^{\sqrt{R^2 - x^2}} \left( R - \sqrt{x^2 + y^2} \right) \,dy \,dx \\ &= \frac{2}{\pi R^2} \int_0^\pi \int_{\frac{R}{1 + \sin \theta}}^R (R - r) r \,dr \, d\theta \\ &= \frac{2}{\pi R^2} \int_0^\pi \left[ \frac{Rr^2}{2} - \frac{r^3}{3} \right]^{r=R}_{r = \frac{R}{1 + \sin \theta}} \, d\theta \\ &= \frac{2}{\pi R^2} \int_0^\pi \left( \frac{R^3}{2} - \frac{R^3}{3} \right) - \left( \frac{R^3}{2(1 + \sin \theta)^2} - \frac{R^3}{3(1 + \sin \theta)^3} \right) \,d\theta \\ &= \frac{R}{3} - \frac{R}{3\pi} \int_0^\pi \frac{3 \sin \theta + 1}{(1 + \sin \theta)^3} \,d\theta = \frac{R}{3} - I.\end{align*} In order to calculate the last integral I, we need to do a trigonometric half-angle substitution with u = \tan \frac{\theta}{2}. Here we see that du = \frac{1}{2} \sec^2 \frac{\theta}{2} \,d\theta = \frac{1}{2} \left(1 + \tan^2 \frac{\theta}{2} \right) \,d\theta = \frac{1 + u^2}{2} \,d\theta and further that \sin \theta = 2 \sin \frac{\theta}{2} \cos \frac{\theta}{2} = 2 \tan \frac{\theta}{2} \cos^2 \frac{\theta}{2} = \frac{2 \tan \frac{\theta}{2}}{1 + \tan^2 \frac{\theta}{2}} = \frac{2u}{1+u^2}. 
We additionally see that when \theta = 0, then u = 0, whereas as \theta \to \pi, then u = \tan \frac{\theta}{2} \to \infty. Therefore we get \begin{align*} I &= \frac{R}{3\pi} \int_0^\pi \frac{3 \sin \theta + 1}{(1 + \sin \theta)^3} \,d\theta \\ &= \frac{R}{3\pi} \int_0^\infty \frac{ 3 \frac{2u}{1+u^2} + 1 }{\left(1 + \frac{2u}{1 + u^2}\right)^3} \frac{2 \,du}{1 + u^2} \\ &= \frac{2R}{3\pi} \int_0^\infty \frac{ (1 + 6u + u^2) ( 1 + u^2 ) }{ (1 + u)^6 } \, du.\end{align*} Making a further substitution of v = 1 + u, we then get \begin{align*} I = \frac{2R}{3\pi} \int_0^\infty \frac{ (1 + 6u + u^2) ( 1 + u^2 ) }{ (1 + u)^6 } \, du &= \frac{2R}{3\pi} \int_1^\infty \frac{ (1 + 6(v-1) + (v-1)^2) ( 1 + (v-1)^2 ) }{v^6} \, dv \\ &= \frac{2R}{3\pi} \int_1^\infty \frac{ (v^2 + 4v - 4)(v^2 - 2v + 2) }{v^6} \,dv \\ &= \frac{2R}{3\pi} \int_1^\infty v^{-2} + 2v^{-3} - 10v^{-4} + 16v^{-5} -8v^{-6} \,dv \\ &= \frac{2R}{3\pi} \left[ -\frac{1}{v} -\frac{1}{v^2} + \frac{10}{3v^3} -\frac{4}{v^4} + \frac{8}{5v^5} \right]_{v=1}^{v=\infty} \\ &= -\frac{2R}{3\pi} \left( -1 -1 + \frac{10}{3} -4 + \frac{8}{5} \right) = \frac{32R}{45\pi} \end{align*} Putting this all together we get E_2 = \frac{R}{3} - \frac{32R}{45\pi} = \frac{R(15\pi - 32)}{45 \pi}.

Therefore, the average minimal distance to the beach on \pi-land if the picnic spot is uniformly randomly chosen over the area of \pi-land is E = E_1 + E_2 = \frac{4R}{15\pi} + \frac{R(15 \pi - 32)}{45 \pi} = \frac{R(15\pi -20)}{45 \pi} = \frac{R(3\pi - 4)}{9\pi}. So in particular, if R = 1, then we get an average distance to the beach of E = \frac{3 \pi - 4}{9\pi} \approx 0.191862272807\dots.
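As a quick Monte Carlo check of this expectation (a sketch, with an arbitrary seed and sample count of my choosing):

```python
import math
import random

rng = random.Random(2025)
R, total, n = 1.0, 0.0, 0
while n < 1_000_000:
    x, y = rng.uniform(-R, R), rng.uniform(0, R)
    if x * x + y * y <= R * R:                 # keep only points on the semi-disk
        total += min(y, R - math.hypot(x, y))  # distance to the nearer beach
        n += 1

estimate = total / n
exact = (3 * math.pi - 4) / (9 * math.pi)      # ≈ 0.19186
print(estimate, exact)
```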

A \pi day \pi-cnic on \pi-land

You are planning a picnic on the remote tropical island of \pi-land. The island’s shape is a perfect semi-disk with two beaches, as illustrated below: Semicircular Beach (along the northern semicircular edge of the disk) and Diametric Beach (along the southern diameter of the disk).

If you pick a random spot on \pi-land for your picnic, what is the probability that it will be closer to Diametric Beach than to Semicircular Beach? (Unlike the illustrative diagram above, assume the beaches have zero width.)

The local \pi-landers typically measure everything from the midpoint of the Diametric Beach, so let's assume that point is the origin of our xy-plane, with Diametric Beach coinciding with the x-axis. In this case, assuming that \pi-land has a radius of R, then the entire area of \pi-land is A = \frac{1}{2} \pi R^2. At any point (x,y) on \pi-land the distance to the Diametric Beach is given by d_D (x,y) = y, while the distance to the Semicircular Beach is d_S (x,y) = R - \sqrt{x^2 + y^2}. So the region of \pi-land that is closer to Diametric Beach than to Semicircular Beach is given by \begin{align*} \Omega &= \left\{ (x,y) \in \mathbb{R}^2_+ \mid x^2 + y^2 \leq R^2, d_D(x,y) \leq d_S(x,y) \right\} \\ &= \left\{ (x,y) \in \mathbb{R}_+^2 \mid y \leq R - \sqrt{x^2+ y^2} \right\} \\ &= \left\{ (x,y) \in \mathbb{R}^2_+ \mid x^2 + y^2 \leq (R-y)^2 \right\} \\ &= \left\{ (x,y) \in \mathbb{R}^2_+ \mid x^2 \leq R^2 -2Ry \right\} \\ &= \left\{ (x,y) \in \mathbb{R}^2_+ \mid y \leq \frac{R^2 - x^2}{2R} \right\} \end{align*}

The area of \Omega is given by the integral A_\Omega = \int_{-R}^R \frac{R^2 - x^2}{2R} \,dx = 2 \left[ \frac{Rx}{2} - \frac{x^3}{6R} \right]^{x=R}_{x=0} = 2 \left( \frac{R^2}{2} - \frac{R^2}{6} \right) = \frac{2R^2}{3}. Therefore, the probability of randomly choosing a \pi-land picnic spot closer to Diametric Beach than to Semicircular Beach, that is, in \Omega, is given by p = \frac{A_\Omega}{A} = \frac{ \frac{2R^2}{3} }{ \frac{\pi R^2}{2} } = \frac{4}{3\pi} \approx 0.424413181578....
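We can double-check this probability numerically by integrating the boundary curve with a simple midpoint rule (a sketch; the step count is arbitrary):

```python
import math

R, n = 1.0, 100_000
h = 2 * R / n
# area under y = (R^2 - x^2) / (2R) for x in [-R, R], via the midpoint rule
area_omega = sum((R * R - (-R + (k + 0.5) * h) ** 2) / (2 * R) for k in range(n)) * h

prob = area_omega / (math.pi * R * R / 2)
print(prob, 4 / (3 * math.pi))  # both ≈ 0.42441
```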

Sunday, March 9, 2025

Well below average domino placement

You are placing many, many dominoes in a straight line, one at a time. However, each time you place a domino, there is a 1 percent chance that you accidentally tip it over, causing a chain reaction that tips over all dominoes you’ve placed. After a chain reaction, you start over again.

If you do this many, many times, what can you expect the median (note: not the average) number of dominoes placed when a chain reaction occurs (including the domino that causes the chain reaction)?

Let's generalize to a generic probability p \gt 0 of accidentally tipping over any individual domino. Let D_p be the random variable representing the total number of dominoes placed (including the domino that you tip over) when each placement independently causes a chain reaction with probability p. Then we have \mathbb{P} \left[ D_p = d \right] = p (1-p)^{d-1}, since in order to tip over the dth domino you must first not have knocked over the first (d-1) dominoes, each with a probability of (1-p), and then knock over the dth domino, with probability p.

So for any d \gt 0, the probability of D_p \leq d is equal to \mathbb{P} \left[ D_p \leq d \right] = \sum_{m=1}^{d} \mathbb{P} \left[ D_p = m \right] = \sum_{m=1}^d p(1-p)^{m-1} = p \cdot \frac{1 - (1-p)^d}{1-(1-p)} = 1 - (1-p)^d. Alternatively, we have \mathbb{P} \left[ D_p \gt d \right] = 1 - \mathbb{P} \left[ D_p \leq d \right] = (1-p)^d. The integer M_p is the median of the distribution of D_p if and only if we have \mathbb{P} \left[ D_p \leq M_p \right] = 1 - (1 - p)^{M_p} \geq \frac{1}{2} and \mathbb{P} \left[ D_p \geq M_p \right] = \mathbb{P} \left[ D_p \gt M_p \right] + \mathbb{P} \left[ D_p = M_p \right] = (1-p)^{M_p-1} \geq \frac{1}{2}. Therefore, combining these two inequalities we need (1-p)^{M_p} \leq \frac{1}{2} \leq (1-p)^{M_p - 1}. Taking natural logarithms throughout we get M_p \ln (1-p) \leq -\ln 2 \leq (M_p - 1) \ln(1-p), or equivalently M_p - 1 \leq -\frac{\ln 2}{\ln (1-p)} \leq M_p, which gives M_p = \Big\lceil -\frac{\ln 2}{\ln (1-p)} \Big\rceil.

Therefore, if p = 0.01, then we get M_{0.01} = \Big\lceil -\frac{\ln 2}{\ln 0.99} \Big\rceil = \lceil 68.9675639365\dots \rceil = 69 as the median number of dominoes that you will have placed when they all get knocked down. For further confirmation we see that \mathbb{P} \left[ D_p \leq 68 \right] = 0.495114111213\dots \lt \frac{1}{2} \lt \mathbb{P} \left[ D_p \leq 69 \right] = 0.500162970101\dots.

Though the note said to explicitly ignore the expectation, we see that \begin{align*}E_p = \mathbb{E} \left[ D_p \right] &= \sum_{m=1}^\infty m \mathbb{P} \left[ D_p = m \right]\\ &= \sum_{m=1}^\infty m p (1-p)^{m-1} \\ &= p \sum_{m=1}^\infty m (1-p)^{m-1} \\ &= p \left. \frac{d}{dt} \left( \frac{1}{1-t} \right) \right|_{t=1-p} \\ &= p \left. \frac{1}{(1-t)^2} \right|_{t = 1-p} \\ &= \frac{p}{(1 - (1-p))^2} = \frac{p}{p^2} = \frac{1}{p}.\end{align*} In particular, we see that for the Classic question, we have E_{0.01} = 100 \gt M_{0.01} = 69.

The Extra Credit question looks at the limit of M_p / E_p = p M_p as p \downarrow 0. Given the formulae above, we see that the limit of the ratio of the median to the average number of dominoes placed is given by \lim_{p \to 0} p M_p = \lim_{p \to 0} p \cdot - \frac{\ln 2}{\ln ( 1- p) } = \lim_{p \to 0} \frac{ p\ln 2}{ - \ln (1-p) } = \lim_{p \to 0} (1 - p) \ln 2 = \ln 2, where the first equal sign comes from the squeeze theorem and the second-to-last equal sign is an application of L'Hôpital's Rule. So not only do we have M_p \lt E_p in the case of p = 0.01, but for all small p \approx 0.
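These formulae are easy to verify numerically (a quick sketch; the helper name is mine):

```python
import math

def median_dominoes(p):
    """Smallest integer M with (1 - p)**M <= 1/2, i.e. ceil(-ln 2 / ln(1 - p))."""
    return math.ceil(-math.log(2) / math.log(1 - p))

M = median_dominoes(0.01)
print(M)                               # prints 69
print(1 - 0.99 ** 68, 1 - 0.99 ** 69)  # the CDF straddles 1/2 at the median

# the ratio p * M_p approaches ln 2 ≈ 0.6931 as p -> 0
print(1e-6 * median_dominoes(1e-6))
```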

Monday, March 3, 2025

Magical Rabbit Pulling

I have a hat with six small toy rabbits: two are orange, two are green, and two purple. I shuffle the rabbits around and randomly draw them out one at a time without replacement (i.e., once I draw a rabbit out, I never put it back in again).

Your job is to guess the color of each rabbit I draw out. For each guess, you know the history of the rabbits I’ve already drawn. So if we’re down to the final rabbit in the hat, you should be able to predict its color with certainty.

Every time you correctly predict the color of the rabbit I draw, you earn a point. If you play optimally (i.e., to maximize how many points you get), how many points can you expect to earn on average?

Let's define the Markov chain on \mathbb{N}^4 with states X = (g, o, p, s), where g is the number of green rabbits, o is the number of orange rabbits, p is the number of purple rabbits and s is the current score. The state X = (g, o, p, s) in this magical rabbit pulling Markov chain is adjacent to the following states: (g-1, o, p, s), (g-1, o, p, s+1), (g, o-1, p, s), (g, o-1, p, s+1), (g, o, p-1, s) and (g,o, p-1, s+1), provided that the resulting quadruple remains in \mathbb{N}^4. Letting \nu = \# \arg\max \{g, o, p\} denote the number of colors that attain the maximum remaining count (the optimal guess is always one of these), the transition probabilities are given by the following: \begin{align*} p( X^{n+1} = (g-1,o,p,s) \mid X^n = (g,o,p,s) ) &= \begin{cases} \frac{g}{g+o+p} \left( 1 - \frac{1}{\nu} \right), & \text{if $g = \max \{ g, o, p \};$}\\ \frac{g}{g+o+p}, &\text{if $g \lt \max \{ g, o, p \}$}\end{cases}\\ p( X^{n+1} = (g-1,o,p,s+1) \mid X^n = (g,o,p,s) ) &= \begin{cases} \frac{g}{g+o+p} \frac{1}{\nu}, & \text{if $g = \max \{ g, o, p \};$}\\ 0, &\text{if $g \lt \max \{ g, o, p \}$}\end{cases} \\ p( X^{n+1} = (g,o-1,p,s) \mid X^n = (g,o,p,s) ) &= \begin{cases} \frac{o}{g+o+p} \left( 1 - \frac{1}{\nu} \right), & \text{if $o = \max \{ g, o, p \};$}\\ \frac{o}{g+o+p}, &\text{if $o \lt \max \{ g, o, p \}$}\end{cases}\\ p( X^{n+1} = (g,o-1,p,s+1) \mid X^n = (g,o,p,s) ) &= \begin{cases} \frac{o}{g+o+p} \frac{1}{\nu}, & \text{if $o = \max \{ g, o, p \};$}\\ 0, &\text{if $o \lt \max \{ g, o, p \}$}\end{cases} \\ p( X^{n+1} = (g,o,p-1,s) \mid X^n = (g,o,p,s) ) &= \begin{cases} \frac{p}{g+o+p} \left( 1 - \frac{1}{\nu} \right), & \text{if $p = \max \{ g, o, p \};$}\\ \frac{p}{g+o+p}, &\text{if $p \lt \max \{ g, o, p \}$}\end{cases}\\ p( X^{n+1} = (g,o,p-1,s+1) \mid X^n = (g,o,p,s) ) &= \begin{cases} \frac{p}{g+o+p} \frac{1}{\nu}, & \text{if $p = \max \{ g, o, p \};$}\\ 0, &\text{if $p \lt \max \{ g, o, p \}$}\end{cases} \end{align*} Here again we only define these probabilities to the extent that the resulting states remain in \mathbb{N}^4. In order to fully specify this as a Markov chain, let's assert that p( X^{n+1} = (0,0,0,s) \mid X^n = (0,0,0,s) ) = 1 for each s \in \mathbb{N} and that all other not explicitly stated transitions have a probability of zero.

Using these transition probabilities we can define the function f : \mathbb{N}^4 \to \mathbb{Q} to be the expected total score once there are no more rabbits left in the hat, conditional on starting at X^0 = (g, o, p, s); that is, f(g, o, p, s) = \mathbb{E} \left[ e_4^T X^\infty \mid X^0 = (g,o,p,s) \right], where \{ X^n \}_{n \in \mathbb{N}} is the Markov chain specified above. Given the transition probabilities, we have the boundary conditions f(0,0,0,s) = s, for all s \in \mathbb{N}, with the recursion formula \begin{align*}f(g,o,p,s) &= I ( g \geq 1 ) \left( p( (g,o,p,s) \to (g-1, o, p, s) ) f(g-1, o, p, s) + p( (g, o, p, s) \to (g-1, o, p, s+1) ) f(g-1,o,p,s+1) \right) \\ & \quad + I ( o \geq 1 ) \left( p( (g,o,p,s) \to (g, o-1, p, s) ) f(g, o-1, p, s) + p( (g, o, p, s) \to (g, o-1, p, s+1) ) f(g,o-1,p,s+1) \right) \\ & \quad + I(p \geq 1) \left( p( (g,o,p,s) \to (g, o, p-1, s) ) f(g, o, p-1, s) + p( (g, o, p, s) \to (g, o, p-1, s+1) ) f(g,o,p-1,s+1) \right) \end{align*}

For the case where we start out with g = o = p = 2 rabbits and s = 0 initial score, we can apply this formula by hand repeatedly and eventually work out an average score of f(2,2,2,0) = \frac{101}{30} = 3.3666\dots if playing optimally.

For the case of starting with g=o=p=10 rabbits and s = 0, there are too many cases to keep track of by hand (at least for me), so we can appeal to a quick recursive function with memoization to make the computation relatively fast. For instance, see the quick Python implementation below. This then gives the average score in the Extra Credit case of f(10,10,10,0) = 13.703132371084621\dots if playing optimally.
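Along these lines, here is one possible memoized implementation. It collapses the score coordinate, since f(g,o,p,s) = s + f(g,o,p,0), and by symmetry of the colors it only tracks the sorted multiset of remaining counts (the function name is my own):

```python
from fractions import Fraction
from functools import lru_cache

@lru_cache(maxsize=None)
def expected_score(counts):
    """Expected additional score f(g, o, p, 0), with counts sorted descending.
    Optimal play guesses a color with the maximal remaining count."""
    total = sum(counts)
    if total == 0:
        return Fraction(0)
    score = Fraction(counts[0], total)  # chance the majority guess is correct
    for i, c in enumerate(counts):
        if c:
            after = tuple(sorted(counts[:i] + (c - 1,) + counts[i + 1:], reverse=True))
            score += Fraction(c, total) * expected_score(after)
    return score

print(expected_score((2, 2, 2)))           # prints 101/30
print(float(expected_score((10, 10, 10)))) # the Extra Credit answer
```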

Sunday, February 23, 2025

Extra Credit LearnedLeague Defensive Efficiency

Now suppose your opponent is equally likely to get one, two, three, four, or five questions correct. As before, you randomly apply the six point values (0, 1, 1, 2, 2, 3) to the six questions. What is the probability that your defensive efficiency will be greater than 50 percent?

In the classic problem we constructed a table for N_2. Here we need to construct similar tables for N_1, N_3, N_4 and N_5. For i=1, there is exactly one way apiece to score 0 or 3 and two ways apiece to score 1 or 2, so we have the following table:

N_1 DE_1 \mathbb{P} \{ N_1 \}
0 1 \frac{1}{6}
1 \frac{2}{3} \frac{1}{3}
2 \frac{1}{3} \frac{1}{3}
3 0 \frac{1}{6}

Similarly, we can construct the other tables as follows:

N_3 DE_3 \mathbb{P} \{ N_3 \}
2 1 \frac{1}{20}
3 \frac{4}{5} \frac{1}{5}
4 \frac{3}{5} \frac{1}{4}
5 \frac{2}{5} \frac{1}{4}
6 \frac{1}{5} \frac{1}{5}
7 0 \frac{1}{20}
N_4 DE_4 \mathbb{P} \{ N_4 \}
4 1 \frac{2}{15}
5 \frac{3}{4} \frac{1}{5}
6 \frac{1}{2} \frac{1}{3}
7 \frac{1}{4} \frac{1}{5}
8 0 \frac{2}{15}
N_5 DE_5 \mathbb{P} \{ N_5 \}
6 1 \frac{1}{6}
7 \frac{2}{3} \frac{1}{3}
8 \frac{1}{3} \frac{1}{3}
9 0 \frac{1}{6}

In particular we see that \mathbb{P} \{ DE_i \gt \frac{1}{2} \} = \begin{cases} \frac{1}{2}, &\text{if $i \in \{1,3,5\}$;}\\ \frac{1}{3}, &\text{if $i \in \{2,4\}$.}\end{cases} So if your opponent answers I questions where I is uniformly distributed on the set \{1, 2,3,4,5\}, then the probability of having a defensive efficiency score greater than 50\% is equal to \begin{align*} \mathbb{P} \{ DE_I \gt \frac{1}{2} \} &= \sum_{i=1}^5 \mathbb{P} \{ DE_i \gt \frac{1}{2} \} \mathbb{P} \{ I = i \} \\ &= \frac{1}{5} \sum_{i=1}^5 \mathbb{P} \{ DE_i \gt \frac{1}{2} \} \\ &= \frac{1}{5} \left( \frac{1}{2} + \frac{1}{3} + \frac{1}{2} + \frac{1}{3} + \frac{1}{2} \right) \\ &= \frac{13}{30} = 43.\bar{3}\%.\end{align*}
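All five tables, and the final answer, can be checked by brute force over the \binom{6}{i} equally likely sets of correct answers (a sketch; the helper name is my own):

```python
from fractions import Fraction
from itertools import combinations

weights = (0, 1, 1, 2, 2, 3)

def p_de_above_half(i):
    """P(DE_i > 1/2) when the opponent answers i of 6 questions correctly
    and the point values are assigned at random."""
    lo = sum(weights[:i])    # minimum possible score (weights are sorted)
    hi = sum(weights[-i:])   # maximum possible score
    outcomes = [sum(c) for c in combinations(weights, i)]
    good = sum(1 for s in outcomes if Fraction(hi - s, hi - lo) > Fraction(1, 2))
    return Fraction(good, len(outcomes))

answer = sum(p_de_above_half(i) for i in range(1, 6)) / 5
print(answer)  # prints 13/30
```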

LearnedLeague Defensive Efficiency

Every day, you and your opponent for the day are presented with the same six trivia questions. You each do your best to answer these, and you assign point values for your opponent, without knowing (until the following day) which questions your opponent answered correctly. You must assign point values of 0, 1, 1, 2, 2, and 3 to the six questions.

Now, when someone answers three questions correctly, like my opponent just hypothetically did, the fewest points they can earn is 0 + 1 + 1 = 2, while the most points they can earn is 2 + 2 + 3 = 7. The fact that they got 4 points wasn’t great (from my perspective), but wasn’t terrible. In LearnedLeague, my defensive efficiency is defined as the maximum possible points allowed minus actual points allowed, divided by the maximum possible points allowed minus the minimum possible points allowed. Here, that was (7−4)/(7−2), which simplified to 3/5, or 60 percent.

By this definition, defensive efficiency ranges somewhere between 0 and 100 percent. (That is, assuming it’s even defined—which it’s not when your opponent gets zero questions right or all six questions right.)

Suppose you know for a fact that your opponent will get two questions right. However, you have absolutely no idea which two questions these are, and so you randomly apply the six point values to the six questions.

What is the probability that your defensive efficiency for the day will be greater than 50 percent?

For each i = 1, 2, 3, 4, 5, let N_i be the random score of your opponent, conditioned on getting exactly i answers correct. Given the set of weights we see that \begin{align*}0 \leq & N_1 \leq 3 \\ 1 \leq & N_2 \leq 5 \\ 2 \leq & N_3 \leq 7 \\ 4 \leq & N_4 \leq 8 \\ 6 \leq & N_5 \leq 9,\end{align*} though the scores are not uniformly distributed. In this case, DE_i(N_i) = \begin{cases} \frac{3 - N_1}{3}, &\text{if $i=1;$}\\ \frac{5 - N_2}{4}, &\text{if $i=2;$}\\ \frac{7-N_3}{5}, &\text{if $i=3;$}\\ \frac{8-N_4}{4}, &\text{if $i=4;$}\\ \frac{9 - N_5}{3}, &\text{if $i=5.$}\end{cases}

The only thing that we still need to do is determine the probabilities for each possible value of N_2.

Since there are two instances of the weight 1, there are two possible outcomes that give a total score of N_2 = 1. Let's call them (0,1_A) and (0, 1_B). Similarly, there are three possible outcomes that give a total score of N_2 = 2, that is (0,2_A), (0,2_B) and (1_A, 1_B). There are five different ways to get an outcome of N_2=3: (0,3), (1_A, 2_A), (1_B, 2_A), (1_A, 2_B), and (1_B, 2_B). There are three ways to get N_2=4: (1_A, 3), (1_B, 3) and (2_A, 2_B). Finally, we have two ways to get N_2=5: (2_A, 3) and (2_B, 3). There are 15 possible outcomes, so we can construct the following table:

N_2 DE_2 \mathbb{P} \{ N_2 \}
1 1 \frac{2}{15}
2 \frac{3}{4} \frac{1}{5}
3 \frac{1}{2} \frac{1}{3}
4 \frac{1}{4} \frac{1}{5}
5 0 \frac{2}{15}

Based on the table above, we see that if your opponent will answer two questions correctly, then the probability of having a defensive efficiency greater than 50\% is \mathbb{P} \{ DE_2 \gt \frac{1}{2} \} = \mathbb{P} \{ N_2 \leq 2 \} = \frac{2}{15} + \frac{1}{5} = \frac{1}{3}.
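Rather than tracking labeled outcomes by hand, we can also enumerate all \binom{6}{2} = 15 pairs of point values directly (a quick sketch):

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

weights = (0, 1, 1, 2, 2, 3)
counts = Counter(sum(pair) for pair in combinations(weights, 2))
dist = {score: Fraction(c, 15) for score, c in sorted(counts.items())}
print(dist)  # matches the table above

prob = dist[1] + dist[2]  # DE_2 > 1/2 requires N_2 <= 2
print(prob)               # prints 1/3
```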

Sunday, February 16, 2025

Owner of a squished heart

You can generate a heart shape by drawing a unit square (i.e., a square with side length 1), and then attaching semicircles (each with radius 1/2) to adjacent edges, as shown in the diagram below:

What is the radius of the smallest circle that contains this heart shape?

Let's define our frame of reference. Let's assume that the center of this unit square is the origin and that as in the figure above, the sides of the square make an angle of \frac{\pi}{4} with respect to the x- and y-axes. In particular, the lowest corner of the unit square is located at (0, -\frac{\sqrt{2}}{2}). The two semi-circles added on the upper sides of the square are centered at (\pm\frac{\sqrt{2}}{4}, \frac{\sqrt{2}}{4}). From symmetry, we will focus on the one in the first quadrant and parameterize its boundary as \left( \frac{\sqrt{2}}{4} + \frac{1}{2} \cos \left( \theta - \frac{\pi}{4} \right), \frac{\sqrt{2}}{4} + \frac{1}{2} \sin \left(\theta - \frac{\pi}{4}\right) \right), for 0 \leq \theta \leq \pi.

Due to symmetry, the smallest enclosing circle should be centered at some point (0,h) on the positive y-axis. Then we can set up the problem in the following way. In order for every point of the semi-circle centered at (\frac{\sqrt{2}}{4}, \frac{\sqrt{2}}{4}) to lie inside a circle centered at (0,h), we would need the radius to be at least \begin{align*}R(h) &= \max_{\theta \in [0,\pi]} \sqrt{ \left( \frac{\sqrt{2}}{4} + \frac{1}{2} \cos \left( \theta - \frac{\pi}{4} \right) \right)^2 + \left( \frac{\sqrt{2}}{4} - h + \frac{1}{2} \sin \left( \theta - \frac{\pi}{4}\right) \right)^2} \\ &= \max_{\theta \in [0,\pi]} \sqrt{ \frac{1}{8} + \frac{\sqrt{2}}{4} \cos \left( \theta - \frac{\pi}{4} \right) + \frac{1}{4} \cos^2 \left( \theta - \frac{\pi}{4} \right) + \left( \frac{\sqrt{2}}{4} - h \right)^2 + \left( \frac{\sqrt{2}}{4} - h \right) \sin \left( \theta - \frac{\pi}{4} \right) + \frac{1}{4} \sin^2 \left( \theta - \frac{\pi}{4} \right) } \\ &= \max_{\theta \in [0,\pi]} \sqrt{ \frac{3}{8} + \left( \frac{\sqrt{2}}{4} - h \right)^2 + \frac{\sqrt{2}}{4} \cos \left( \theta - \frac{\pi}{4} \right) + \left( \frac{\sqrt{2}}{4} - h \right) \sin \left( \theta - \frac{\pi}{4} \right) }.\end{align*} Since square roots are monotonic and several of the terms are constant with respect to \theta, the optimal value of \theta to obtain R(h) will be equivalent to the optimizer of the function f_h(\theta) = \frac{\sqrt{2}}{4} \cos \left( \theta - \frac{\pi}{4} \right) + \left( \frac{\sqrt{2}}{4} - h \right) \sin \left( \theta - \frac{\pi}{4} \right) on \theta \in [0,\pi].
Taking the derivative with respect to \theta and setting equal to zero we see that the only critical value of f_h in that interval satisfies -\frac{\sqrt{2}}{4} \sin \left( \theta^* - \frac{\pi}{4} \right) + \left( \frac{\sqrt{2}}{4} - h \right) \cos \left( \theta^* - \frac{\pi}{4} \right) = 0, or equivalently \tan\left(\theta^* - \frac{\pi}{4}\right) = 1 - 2\sqrt{2}h, which implies that \begin{align*}\cos \left(\theta^* -\frac{\pi}{4} \right) &= \frac{1}{\sqrt{1 + (1 - 2\sqrt{2}h)^2}} \\ \sin \left( \theta^* - \frac{\pi}{4} \right) &= \frac{1-2\sqrt{2}h}{\sqrt{1 + (1 - 2\sqrt{2}h)^2}},\end{align*} since \cos ( \tan^{-1} x ) = \frac{1}{\sqrt{1+x^2}} and \sin ( \tan^{-1} x ) = \frac{x}{\sqrt{1+x^2}}. Plugging this back into f_h, we get \begin{align*}f_h(\theta^*) &= \frac{\sqrt{2}}{4} \frac{1}{\sqrt{1 + (1-2\sqrt{2}h)^2}} + \left(\frac{\sqrt{2}}{4} - h\right) \frac{1 - 2\sqrt{2}h}{\sqrt{1 + (1-2\sqrt{2}h)^2}} \\ &= \frac{ \sqrt{2} }{4} \frac{ 1 + (1-2\sqrt{2}h)^2 }{ \sqrt{ 1+ (1-2\sqrt{2}h)^2}} \\ &= \frac{\sqrt{2}}{4} \sqrt{1 + (1 - 2\sqrt{2}h)^2}.\end{align*} Plugging this value back into R(h) we get R(h) = \sqrt{\frac{3}{8} + \frac{1}{8} (1 - 2\sqrt{2}h)^2 + \frac{\sqrt{2}}{4} \sqrt{1 + (1-2\sqrt{2}h)^2}}.

Now in order for the entire unit square to be contained within the circle, the lowest point (0, -\frac{\sqrt{2}}{2}) must also be at most R(h) away from the center at (0,h). In particular, if we want the heart inscribed within the circle, we should have R(h) equal the distance between (0, -\frac{\sqrt{2}}{2}) and (0,h), that is, h + \frac{\sqrt{2}}{2}. This leads to the equation h + \frac{\sqrt{2}}{2} = \sqrt{ \frac{3}{8} + \frac{1}{8} (1- 2\sqrt{2}h)^2 + \frac{\sqrt{2}}{4} \sqrt{ 1 + (1-2\sqrt{2}h)^2}}. Squaring both sides yields the equivalent equation \left(h + \frac{\sqrt{2}}{2}\right)^2 = \frac{3}{8} + \frac{1}{8} \left( 1 - 2\sqrt{2}h \right)^2 + \frac{\sqrt{2}}{4} \sqrt{ 1 + (1 - 2\sqrt{2}h)^2 } which is again equivalent to \frac{3}{2} \sqrt{2} h = \frac{\sqrt{2}}{4} \sqrt{1 + (1-2\sqrt{2}h)^2 }. Squaring both sides again yields \frac{9}{2} h^2 = \frac{1}{8} \left( 1 + (1 - 2\sqrt{2}h)^2 \right) = h^2 - \frac{\sqrt{2}}{2} h + \frac{1}{4} or equivalently \frac{7}{2} h^2 + \frac{\sqrt{2}}{2} h - \frac{1}{4} = 0. Solving the quadratic in terms of h leaves the only positive solution as h^* = \frac{-\frac{\sqrt{2}}{2} + \sqrt{ \frac{1}{2} - 4 \cdot \frac{7}{2} \cdot -\frac{1}{4}}}{2 \frac{7}{2}} = \frac{ -\frac{\sqrt{2}}{2} + \sqrt{ \frac{1}{2} + \frac{7}{2} }}{7} = \frac{ 2 - \frac{\sqrt{2}}{2} }{7} = \frac{4 - \sqrt{2}}{14}. Therefore, the radius of the smallest circle that contains this heart shape is R(h^*) = \frac{4-\sqrt{2}}{14} + \frac{\sqrt{2}}{2} = \frac{4 - \sqrt{2} + 7 \sqrt{2}}{14} = \frac{4 + 6 \sqrt{2}}{14} = \frac{2 + 3\sqrt{2}}{7} \approx 0.89180581\dots.
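As a numerical sanity check on this derivation, we can confirm that at h^* = \frac{4-\sqrt{2}}{14} the closed form for R(h) agrees with a brute-force maximization over \theta, and that it equals the distance to the square's bottom corner. A small Python sketch, not part of the derivation itself:

```python
import math

def R(h):
    # Closed form for the maximum distance from (0, h) to the semicircular lobe
    t = 1 - 2 * math.sqrt(2) * h
    return math.sqrt(3 / 8 + t * t / 8 + math.sqrt(2) / 4 * math.sqrt(1 + t * t))

def R_brute(h, steps=200_000):
    # Direct maximization over theta in [0, pi] of the distance to the lobe
    return max(
        math.hypot(
            math.sqrt(2) / 4 + 0.5 * math.cos(th - math.pi / 4),
            math.sqrt(2) / 4 - h + 0.5 * math.sin(th - math.pi / 4),
        )
        for th in (k * math.pi / steps for k in range(steps + 1))
    )

h_star = (4 - math.sqrt(2)) / 14
assert abs(R(h_star) - R_brute(h_star)) < 1e-8
# At h*, the enclosing radius equals the distance to the bottom corner (0, -sqrt(2)/2)
assert abs(R(h_star) - (h_star + math.sqrt(2) / 2)) < 1e-12
print(R(h_star))  # ≈ (2 + 3*sqrt(2))/7 ≈ 0.8918058...
```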

Sunday, February 9, 2025

Spinning squares and the balls that bounce in them

Suppose you have a unit square that’s rotating about its center at a constant angular speed, and there’s a moving ball inside. The ball has a constant linear speed, and there’s no friction or gravity. When the ball hits an edge of the square, it simply reflects as though the square is momentarily stationary during the briefest of moments they’re in contact. Also, the ball is not allowed to hit a corner of the square—it would get jammed in that corner, a situation we prefer to avoid.

Suppose the ball travels on a periodic (i.e., repeating) path, and that it only ever makes contact with a single point on the unit square. What is the shortest distance the ball could travel in one loop of this path?

Take a unit square that is rotating at a constant angular speed, \omega, where, without loss of generality, we assume that the sides of the square are perpendicular to the x- and y-axes at time zero. Further assume that for any n \geq 2, there is a ball traveling at some fixed speed v_n in a periodic path with n bounces against the sides of the spinning square. The point of contact traces out a circle of radius \frac{1}{2}, so the path of the ball is some n-gon inscribed in that half-unit circle. Since the ball and the point of contact are each traveling at constant speeds, the lengths of time between collisions, and hence the lengths of the sides of the n-gon, must all be the same. Thus, a ball tracing a periodic path must trace out a regular n-gon inscribed in the half-unit circle centered at the center of the unit square.

The side length of a regular n-gon inscribed in a half-unit circle can be given by the law of cosines \begin{align*}s_n^2 &= \left(\frac{1}{2}\right)^2 + \left(\frac{1}{2}\right)^2 - 2 \left(\frac{1}{2}\right) \left(\frac{1}{2}\right) \cos \frac{2\pi}{n}\\ &=\frac{1 - \cos \frac{2\pi}{n}}{2}\\ &= \sin^2 \frac{\pi}{n}.\end{align*} Therefore the length of this periodic path is \ell_n = n s_n = n \sin \left( \frac{\pi}{n} \right). We see that since \sin x = x - \frac{x^3}{6} + O(x^5) that \ell_n = \pi - \frac{\pi^3}{6n^2} + O(n^{-4}), so \ell_n \uparrow \pi as n \to \infty. So we can get that the smallest such path is \ell_2 = 2, which represents the path that traces the line segment from the midpoints of the top and bottom sides of the square twice (down and back).

If we were to look for the next-shortest path, it would be the equilateral triangle with side length \sin \frac{\pi}{3} = \frac{\sqrt{3}}{2}. The total travel distance of this equilateral triangular path is thus \ell_3 = \frac{3\sqrt{3}}{2} \approx 2.598076211\dots.
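A quick Python check of \ell_n = n \sin \frac{\pi}{n}, confirming that the sequence increases monotonically from \ell_2 = 2 toward \pi:

```python
import math

def path_length(n):
    # Perimeter of a regular n-gon inscribed in a circle of radius 1/2
    return n * math.sin(math.pi / n)

lengths = [path_length(n) for n in range(2, 25)]
assert abs(lengths[0] - 2) < 1e-12                         # n = 2: the doubled segment
assert abs(path_length(3) - 3 * math.sqrt(3) / 2) < 1e-12  # the equilateral triangle
# Strictly increasing and bounded above by pi
assert all(a < b < math.pi for a, b in zip(lengths, lengths[1:]))
```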

Monday, February 3, 2025

Spinning surfaces and the vertical lines that test them

In a more advanced course, you’ve been asked to draw a 3D sketch of the function z = |x| + |y|. As you’re about to do this, you are struck by another bout of dizziness, and your resulting graph is rotated about the origin in a random direction in 3D space. In other words, the central axis of your graph’s “funnel” is equally likely to point from the origin to any point on the surface of the unit sphere.

What is the probability that the resulting graph you produce is in fact a function (i.e., z is a function of x and y)?

At first glance, this looks like a harder problem. There is, after all, one more direction in which to dizzily rotate your surfaces. However, first we note that the rotation that transforms the unit z-vector, e_3 = (0,0,1)^T, to a random point on the unit sphere \hat{u} = ( \cos \theta \sin \vartheta, \sin \theta \sin \vartheta, \cos \vartheta) can be composed of a rotation about the y-axis through an angle \vartheta \in ( 0^\circ, 180^\circ), followed by a rotation about the z-axis through an angle \theta \in (0^\circ, 360^\circ). Now if we are testing with a vertical line parallel to the z-axis, then any rotation about the z-axis will not affect whether the graph passes this test. So we really only need to worry about the rotation about the y-axis through the angle \vartheta. This case then reduces to the two-dimensional case we saw before, though with only half the range. That is, the resulting parametric surface will pass the vertical line test whenever \vartheta \in [0^\circ, 45^\circ) \cup (135^\circ, 180^\circ].

The only thing that we have to be careful of here is assessing the probability. Here we want to measure the surface of the sphere that satisfies \Omega = \{ \vartheta \in [0^\circ, 45^\circ) \cup (135^\circ, 180^\circ], \theta \in [0^\circ, 360^\circ) \}. We can calculate the surface area of the region \Omega by converting to radians and then using the surface area of the shape bounded by the curve r = \sin \vartheta for \vartheta \in [0, \frac{\pi}{4}) \cup (\frac{3\pi}{4}, \pi], that is S = S(\Omega) = 2 \int_0^{\pi / 4} \pi \sin^2 \vartheta \,d\vartheta. Therefore, since the total surface area of the unit sphere is 4\pi, the probability of the rotated parametric surface passing the vertical line test is \begin{align*}q = \frac{S}{4\pi} &= \frac{1}{2} \int_0^{\pi/4} \sin^2 \vartheta \,d\vartheta \\ &= \frac{1}{2} \int_0^{\pi/4} \frac{1 - \cos 2\vartheta}{2} \,d\vartheta \\ &= \left[\frac{\vartheta}{4} - \frac{\sin 2\vartheta}{8} \right]^{\vartheta = \pi/4}_{\vartheta=0} \\ &= \frac{\pi}{16} - \frac{1}{8} = \frac{\pi-2}{16} \approx 7.13495\dots \%.\end{align*}
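The antiderivative evaluation in the final step can be checked numerically. This sketch only verifies the computation \int_0^{\pi/4} \sin^2 \vartheta \,d\vartheta = \frac{\pi}{8} - \frac{1}{4}, taking the surface-measure setup above as given:

```python
import math

# Midpoint-rule approximation of the integral of sin^2 over [0, pi/4]
n = 100_000
h = (math.pi / 4) / n
integral = sum(math.sin((i + 0.5) * h) ** 2 for i in range(n)) * h
assert abs(integral - (math.pi / 8 - 1 / 4)) < 1e-9
# q = (1/2) * integral = (pi - 2)/16
assert abs(integral / 2 - (math.pi - 2) / 16) < 1e-9
```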

Spinning graphs and the vertical lines that test them

You’re taking a math exam, and you’ve been asked to draw the graph of a function. That is, your graph must pass the vertical line test, so that no vertical line intersects your function’s graph more than once.

You decide you’re going to graph the absolute value function, y = |x|, and ace the test.

There’s just one problem. You are dealing with a bout of dizziness, and can’t quite make out the x- and y-axes on the exam in front of you. As a result, your function will be rotated about the origin by a random angle that’s uniformly chosen between 0 and 360 degrees.

What is the probability that the resulting graph you produce is in fact a function (i.e., y is a function of x)?

Let's assume that the graph of y = |x| is rotated about the origin by an angle of \theta \sim U(0^\circ,360^\circ) degrees with respect to the positive x-axis. We can think of this situation as defining a new set of variables \begin{pmatrix} \tilde{x} \\ \tilde{y} \end{pmatrix} = \begin{pmatrix} \cos \frac{\pi\theta}{180} & -\sin \frac{\pi\theta}{180} \\ \sin \frac{\pi\theta}{180} & \cos \frac{\pi\theta}{180} \end{pmatrix} \begin{pmatrix} x \\ |x| \end{pmatrix} = \begin{pmatrix} x \cos \frac{\pi\theta}{180} - |x| \sin \frac{\pi\theta}{180} \\ x \sin \frac{\pi\theta}{180} + |x| \cos \frac{\pi\theta}{180} \end{pmatrix}. So we have \tilde{x} = f(x) and \tilde{y} = g(x) as parametric functions of our previous variable x.

Now we see that if f is one-to-one, then the parametric curve (f(x), g(x)) will pass the vertical line test, since there is only one possible value of x such that \tilde{x} = f(x), then there can only be at most one intersection between any vertical line and the curve.

On the other hand, if f is not one-to-one, then for some x_1 \ne x_2, we have f(x_1) = f(x_2). In this case, since f(x_1) = f(x_2), we have equivalently (x_1 - x_2) \cos \frac{\pi \theta}{180} = ( |x_1| - |x_2| ) \sin \frac{\pi \theta}{180}. Now let's look at the values of g(x_1) and g(x_2). If \sin \frac{\pi\theta}{180} = 0, then f(x) = x \cos \frac{\pi\theta}{180}, which is a one-to-one function, so if f is not one-to-one then we can assume that \sin \frac{\pi \theta}{180} \ne 0. Therefore, we have \begin{align*}g(x_1) - g(x_2) &= (x_1 - x_2) \sin \frac{\pi \theta}{180} + (|x_1| - |x_2| ) \cos \frac{\pi \theta}{180} \\ &= (x_1 - x_2) \sin \frac{\pi \theta}{180} + (|x_1| - |x_2|) \sin \frac{\pi \theta}{180} \frac{ \cos \frac{\pi \theta}{180}}{\sin \frac{\pi\theta}{180}} \\ & = (x_1 - x_2) \sin \frac{\pi \theta}{180} + (x_1 - x_2) \cos \frac{\pi \theta}{180} \frac{\cos \frac{\pi \theta}{180}}{\sin \frac{\pi \theta}{180}} \\ &= (x_1 - x_2) \left( \sin \frac{\pi \theta}{180} + \frac{ \cos^2 \frac{\pi \theta}{180}}{\sin \frac{\pi \theta}{180}} \right) \\ &= \frac{ x_1 - x_2}{\sin \frac{\pi \theta}{180}},\end{align*} so since x_1 \ne x_2, we have g(x_1) \ne g(x_2). Thus, the vertical line \tilde{x} = f(x_1) = f(x_2) would intersect the curve at \tilde{y}_1 = g(x_1) and \tilde{y}_2 = g(x_2), with \tilde{y}_1 \ne \tilde{y}_2. Thus, if f is not one-to-one then the parametric curve will fail the vertical line test. Therefore, the parametric curve will pass the vertical line test if and only if f is one-to-one.

So let's analyze the function f(x) = x \cos \frac{\pi\theta}{180} - |x| \sin \frac{\pi\theta}{180}. We see that f^\prime(x) = \begin{cases} \cos \frac{\pi\theta}{180} + \sin \frac{\pi\theta}{180}, &\text{if $x \lt 0$;} \\ \cos \frac{\pi\theta}{180} - \sin \frac{\pi\theta}{180}, & \text{if $x \gt 0.$} \end{cases} If f is monotonic, then it is obviously one-to-one, so we need only determine for what values of \theta \in (0^\circ, 360^\circ) does \cos \frac{\pi\theta}{180} - \sin \frac{\pi\theta}{180} and \cos \frac{\pi\theta}{180} + \sin \frac{\pi\theta}{180} have the same signs, that is, when does 0 \lt \left(\cos \frac{\pi\theta}{180} - \sin \frac{\pi\theta}{180}\right) \left( \cos \frac{\pi\theta}{180} + \sin \frac{\pi\theta}{180} \right) = \cos^2 \frac{\pi\theta}{180} - \sin^2 \frac{\pi\theta}{180} = \cos \frac{\pi\theta}{90}. So since \cos x \gt 0 on x \in [ 0, \frac{\pi}{2} ) \cup ( \frac{3\pi}{2}, 2\pi], we see that the resulting graph will pass the vertical line test whenever \theta \in [ 0^\circ, 45^\circ ) \cup (135^\circ, 225^\circ) \cup ( 315^\circ, 360^\circ ], which means that the probability of passing the vertical line test if \theta \sim U(0,360) is p = \frac{|45 - 0| + |225 - 135| + |360 - 315|}{360} = \frac{1}{2}.
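We can spot-check the 1/2 answer with a quick Monte Carlo experiment: sample \theta uniformly and test the sign condition \cos \frac{\pi\theta}{90} \gt 0 derived above (a throwaway sketch, seeded for reproducibility):

```python
import math
import random

random.seed(2025)
trials = 100_000
passes = 0
for _ in range(trials):
    theta = math.radians(random.uniform(0, 360))
    # f is one-to-one iff its two one-sided slopes share the same sign,
    # i.e. iff (cos θ - sin θ)(cos θ + sin θ) = cos 2θ > 0
    if (math.cos(theta) - math.sin(theta)) * (math.cos(theta) + math.sin(theta)) > 0:
        passes += 1
assert abs(passes / trials - 0.5) < 0.01
```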

Sunday, January 26, 2025

The Four Lily Pad Problem

You are a frog in a pond with four lily pads, marked “1” through “4.” You are currently on pad 2, and your goal is to make it to pad 1. From any given pad, there are specific probabilities that you’ll jump to another pad:

  • Once on pad 1, you will happily stay there forever.
  • From pad 2, there’s a 1-in-2 chance you’ll hop to pad 1, and a 1-in-2 chance you’ll hop to pad 3.
  • From pad 3, there’s a 1-in-3 chance you’ll hop to pad 2, and a 2-in-3 chance you’ll hop to pad 4.
  • Once on pad 4, you will unhappily stay there forever.

Given that you are starting on pad 2, what is the probability that you will ultimately make it to pad 1 (as opposed to pad 4)?

Let's solve this in two different ways: (a) one directly from the probability distributions and some easy combinatorics; and (b) from the Markov chain properties that we will then expand in the Extra Credit portion.

The most straightforward way of solving this is by looking at all of the possible transitions from pad 2 to pad 1. Obviously in the first jump you have a 1/2 probability of everlasting happiness on pad 1. If not, then you will have to make your way back to pad 2 and try again at some future point. In order to end up back on pad 2 we will need a jump to pad 3 followed by a jump back from pad 3 to pad 2. The probability of this round trip occurring is \frac{1}{2} \times \frac{1}{3} = \frac{1}{6}. So we can only arrive on paradise pad 1 as the result of an odd numbered jump from pad 2 to pad 1, with some number of round trips, say k, between pads 2 and 3 intervening. So the probability of landing on lily pad 1 on the (2k+1)th jump is \frac{1}{2 \cdot 6^k}, meaning that the total probability of ever landing on lily pad 1 having started on lily pad 2 is p = \sum_{k=0}^\infty \frac{1}{2 \cdot 6^k} = \frac{1}{2} \frac{1}{1 - \frac{1}{6}} = \frac{3}{5} = 60 \%.
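This 3/5 answer is easy to corroborate with a direct simulation of the hops (a quick sketch, seeded for reproducibility):

```python
import random

random.seed(7)

def hop_until_absorbed():
    # Start on pad 2 and hop according to the stated probabilities
    pad = 2
    while pad in (2, 3):
        if pad == 2:
            pad = 1 if random.random() < 1 / 2 else 3
        else:
            pad = 2 if random.random() < 1 / 3 else 4
    return pad

trials = 100_000
wins = sum(hop_until_absorbed() == 1 for _ in range(trials))
assert abs(wins / trials - 0.6) < 0.01
```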

Of course another way of looking at the situation is that your location on pads 1, 2, 3, or 4 can be modeled by a Markov chain with transition matrix given by T = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 1/2 & 0 & 1/2 & 0 \\ 0 & 1/3 & 0 & 2/3 \\ 0 & 0 & 0 & 1 \end{pmatrix} , where T_{ij} is the probability of starting on pad i and going to pad j. Obviously, pads 1 and 4 are absorbing, recurrent states of this Markov chain, while pads 2 and 3 are transient. We can then transform our transition matrix by reordering the states to put it into a canonical block form, so that now the rows and columns represent pads 3, 2, 1, and 4, respectively: \hat{T} = \begin{pmatrix} 0 & 1/3 & 0 & 2/3 \\ 1/2 & 0 & 1/2 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} Q & R \\ 0 & I \end{pmatrix}. The fundamental matrix of this Markov chain is N = (I - Q)^{-1} = \begin{pmatrix} 1 & -1/3 \\ -1/2 & 1 \end{pmatrix}^{-1} = \begin{pmatrix} \frac{6}{5} & \frac{2}{5} \\ \frac{3}{5} & \frac{6}{5} \end{pmatrix}. The resulting absorption probabilities are given by P = NR = \begin{pmatrix} \frac{6}{5} & \frac{2}{5} \\ \frac{3}{5} & \frac{6}{5} \end{pmatrix} \begin{pmatrix} 0 & \frac{2}{3} \\ \frac{1}{2} & 0 \end{pmatrix} = \begin{pmatrix} \frac{1}{5} & \frac{4}{5} \\ \frac{3}{5} & \frac{2}{5} \end{pmatrix}, where row i = 1, 2 of P corresponds to starting on transient pad 4-i (that is, pad 3 or pad 2) and column j = 1, 2 corresponds to absorption at pad 1 or pad 4, respectively. So in particular, the probability of starting on pad 2 and ending up living a life of happiness on pad 1 forever is, again, P_{21} = 60\%.
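The fundamental-matrix arithmetic can be reproduced exactly with rational numbers; here is a minimal sketch using Python's fractions module and the 2 \times 2 adjugate formula for the inverse:

```python
from fractions import Fraction as F

# Transient states ordered (pad 3, pad 2); absorbing columns ordered (pad 1, pad 4)
Q = [[F(0), F(1, 3)],
     [F(1, 2), F(0)]]
R = [[F(0), F(2, 3)],
     [F(1, 2), F(0)]]

# N = (I - Q)^{-1} via the 2x2 adjugate formula
a, b = 1 - Q[0][0], -Q[0][1]
c, d = -Q[1][0], 1 - Q[1][1]
det = a * d - b * c
N = [[d / det, -b / det],
     [-c / det, a / det]]
assert N == [[F(6, 5), F(2, 5)], [F(3, 5), F(6, 5)]]

# Absorption probabilities P = N R
P = [[sum(N[i][k] * R[k][j] for k in range(2)) for j in range(2)] for i in range(2)]
assert P == [[F(1, 5), F(4, 5)], [F(3, 5), F(2, 5)]]  # second row = start on pad 2
```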

The Infinite Lily Pad Problem

Once again, you are a frog in a pond. But this time, the pond has infinitely many lily pads, which are numbered “1,” “2,” “3,” etc. As before, you are currently on pad 2, and your goal is to make it to pad 1, which you would happily stay on forever.

Whenever you are on pad k, you will hop to pad k−1 with probability 1/k, and you will hop to pad k+1 with probability (k−1)/k. Now, what is the probability that you will ultimately make it to pad 1?

Picking up where we left off in the Classic Fiddler, let's expand the 4 lily pad Markov chain to an n-state Markov chain and then see what happens as n \to \infty. Skipping right to the canonical form of the transition matrix we get \begin{align*}\hat{T}_n &= \begin{pmatrix} 0 & \frac{1}{n-1} & 0 & 0 & \cdots & 0 & 0 & 0 & 0 & \frac{n-2}{n-1}\\ \frac{n-3}{n-2} & 0 & \frac{1}{n-2} & 0 & \cdots & 0 & 0 & 0 & 0 & 0 \\ 0 & \frac{n-4}{n-3} & 0 & \frac{1}{n-3} & \cdots & 0 & 0 & 0 & 0 & 0 \\ \vdots & \vdots & \vdots & \vdots & \cdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ 0 & 0 & 0 & 0 & \cdots & \frac{2}{3} & 0 & \frac{1}{3} & 0 & 0 \\ 0 & 0 & 0 & 0 & \cdots & 0 & \frac{1}{2} & 0 & \frac{1}{2} & 0 \\ 0 & 0 & 0 & 0 & \cdots & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & \cdots & 0 & 0 & 0 & 0 & 1 \end{pmatrix} \\ &= \begin{pmatrix} Q & R \\ 0 & I \end{pmatrix}.\end{align*} Again, we have the fundamental matrix N = (I - Q)^{-1} and P = NR, where we can focus on the entry P_{n-2,1}. Instead of trying to code up an (n-2) \times (n-2) matrix inversion, let's rewrite the matrix equation as a tridiagonal system of equations (I-Q) x = \begin{pmatrix} 1 & -\frac{1}{n-1} & 0 & 0 & \cdots & 0 & 0 & 0 \\ -\frac{n-3}{n-2} & 1 & -\frac{1}{n-2} & 0 & \cdots & 0 & 0 & 0 \\ 0 & -\frac{n-4}{n-3} & 1 & -\frac{1}{n-3} & \cdots & 0 & 0 & 0 \\ \vdots & \vdots & \vdots & \vdots & \cdots & \vdots & \vdots & \vdots \\ 0 & 0 & 0 & 0 & \cdots & -\frac{2}{3} & 1 & -\frac{1}{3} \\ 0 & 0 & 0 & 0 & \cdots & 0 & -\frac{1}{2} & 1 \\ \end{pmatrix} x = \frac{1}{2} e_{n-2}, and use the Thomas algorithm to find the probability of ending up on pad 1 forever in the case of n lily pads as p_{1,n} = x_{n-2} = P_{n-2,1}. In particular, the main diagonal entries are b_i = 1, i = 1, 2, \dots, n-2. The upper diagonal is c_i = -\frac{1}{n-i}, i = 1, 2, \dots, n-3, and the lower diagonal is a_i = -\frac{n-i-1}{n-i}, i = 2, 3, \dots, n-2.
The Thomas algorithm consists of two sweeps: a forward sweep that computes c^\prime \in \mathbb{R}^{n-3} and d^\prime \in \mathbb{R}^{n-2}, and a backward sweep that starts by solving for x_{n-2} and then works back from i = n-3 down to i = 1. In particular, we define c^\prime_1 = c_1 = -\frac{1}{n-1} and c^\prime_i = \frac{c_i}{1 - a_ic^\prime_{i-1}}, i = 2, \dots, n-3, and d^\prime_1 = d_1 and d^\prime_i = \frac{d_i - a_id^\prime_{i-1}}{1 - a_ic^\prime_{i-1}}, i = 2, 3, \dots, n-2, and finally x_{n-2} = d^\prime_{n-2} and x_i = d^\prime_i - c^\prime_i x_{i+1}, i = n-3, n-4, \dots, 1. Additionally, since d_i = 0 for i = 1, \dots, n-3, we have d^\prime_i = 0 for i = 1, \dots, n-3, with p_{1,n} = P_{n-2,1} = x_{n-2} = d^\prime_{n-2} = \frac{\frac{1}{2}}{1 - \left(-\frac{1}{2}\right) c^\prime_{n-3}} = \frac{1}{2 + c^\prime_{n-3}}, and since we only care about x_{n-2}, we don't have to do any of the backward sweep.

So we can simply code up one function that sweeps through the c^\prime_i values for i = 1, \dots, n-3, for instance the following Python script:

def prob_zero(N):
    """Absorption probability at pad 1 with N lily pads, via the forward sweep."""
    c = -1.0 / (N - 1)                               # c'_1
    for i in range(2, N - 2):                        # c'_i for i = 2, ..., N-3
        c = (-1.0 / (N - i)) / (1.0 + (N - i - 1.0) / (N - i) * c)
    return 1.0 / (2.0 + c)                           # p_{1,N} = 1 / (2 + c'_{N-3})

We can just get the values of the probability of absorption at pad 1 for many, many values of n and empirically see that, as n gets large, we quite quickly get close to \lim_{n \to \infty} p_{1,n} = 63.21205588285577\dots\%, which is disturbingly close to 1 - e^{-1}. So much so that, if we wanted, we could likely call it quits here and claim that the probability of ending up forever on pad 1 in the infinite lily pad case is p_{1,\infty} = 1 - e^{-1}.

# of Lily Pads    Absorption Prob.
4                 60%
5                 62.5%
6                 63.07692307692307%
7                 63.19018404907976%
8                 63.20899335717935%
9                 63.21167883211679%
10                63.21201448891889%
11                63.21205178374104%
12                63.21205551321910%
13                63.21205585226252%
14                63.21205588051614%
15                63.21205588268949%
16                63.21205588284473%
17                63.21205588285508%
18                63.21205588285572%
19                63.21205588285577%
20                63.21205588285577%

However, perhaps we can do better than just our anecdotal, experimental evidence. Let's start at p_{1,n} = \frac{1}{2+c^\prime_{n-3}(n)}. Using the recursive formula, we get c^\prime_{n-3}(n) = \frac{c_{n-3}}{1 - a_{n-3} c^\prime_{n-4}(n)} = \frac{-\frac{1}{3}}{1 + \frac{2}{3} c^\prime_{n-4}(n)} = -\frac{1}{3 + 2 c^\prime_{n-4}(n)}. Going a step further we get c^\prime_{n-4}(n) = \frac{ -\frac{1}{4} }{1 - \frac{3}{4} c^\prime_{n-5}(n)} = -\frac{1}{4 + 3c^\prime_{n-5}(n)}, which plugging back into the previous equation gives c^\prime_{n-3}(n) = - \frac{1}{3 + 2 \frac{ -1 }{4 + 3 c^\prime_{n-5}(n)} } = - \frac{4 + 3 c^\prime_{n-5}(n)}{10 + 9 c^\prime_{n-5}(n)}. We see that we can define c^\prime_{n-3}(n) as a rational function of c^\prime_{n-k}(n), for each k = 4, \dots, n-1. In particular, let's say that c^\prime_{n-3}(n) = -\frac{ \alpha_k + (\alpha_k - 1) c^\prime_{n-k}(n) }{\beta_k + (\beta_k - 1) c^\prime_{n-k}(n)}, where \alpha_4 = 1 and \beta_4 = 3. From the recursion formula for c^\prime we get that c^\prime_{n-3}(n) = - \frac{\alpha_k + (\alpha_k - 1) \frac{ - 1 }{k + (k-1) c^\prime_{n-k-1}(n)} }{ \beta_k + (\beta_k - 1) \frac{ -1}{ k + (k-1) c^\prime_{n-k-1}(n)}} = -\frac{ k \alpha_k - (\alpha_k - 1) + (k-1) \alpha_k c^\prime_{n-k-1}(n)} { k \beta_k - (\beta_k - 1) + (k-1) \beta_k c^\prime_{n-k-1}(n)}, so we can derive the recursion formulae for \alpha_k and \beta_k as \begin{align*}\alpha_{k+1} &= (k-1) \alpha_k + 1 \\ \beta_{k+1} &= (k-1) \beta_k + 1,\end{align*} for all k \geq 4, with initial conditions \alpha_4 = 1 and \beta_4 = 3.

So, trying to organize the situation a bit, we notice that since c_1^\prime(n) = -\frac{1}{n-1}, the recursion must stop at k = n-1, where we get c^\prime_{n-3}(n) = - \frac{ \alpha_{n-1} - \frac{\alpha_{n-1} - 1}{n-1} }{ \beta_{n-1} - \frac{\beta_{n-1} - 1}{n-1} } = -\frac{ \alpha_{n-1} (n-2) + 1 }{\beta_{n-1} (n-2) + 1}, where \alpha_{n-1} and \beta_{n-1} can be read from the recursion formulae above. Plugging this back into the absorption probability, we see that \begin{align*}p_{1,n} &= \frac{1}{2 + c_{n-3}^\prime(n)} \\ &= \frac{1}{2 - \frac{ \alpha_{n-1} (n-2) + 1}{ \beta_{n-1} (n-2) + 1} } \\ & = \frac{ \beta_{n-1} (n-2) + 1 }{ (2 \beta_{n-1} - \alpha_{n-1}) (n-2) + 1 } \\ &= \frac{ \beta_{n-1} + O (n^{-1}) }{ (2 \beta_{n-1} - \alpha_{n-1}) + O(n^{-1}) }.\end{align*} Now, let's turn our attention to the growth of \alpha_{n-1} and \beta_{n-1}.

Given the recursion formulae and initial conditions, we can show that in fact \alpha_k = \lfloor (e-2) (k-2)! \rfloor and \beta_k = \lfloor (e-1) (k-2)! \rfloor for all k \geq 4. Clearly, the initial conditions hold since \lfloor (e-2) \cdot (4-2)! \rfloor = \lfloor 1.436\dots \rfloor = 1 and \lfloor (e-1) \cdot (4-2)! \rfloor = \lfloor 3.436\dots \rfloor = 3. Assume that for some j \geq 4 that \begin{align*}\alpha_j &= \lfloor (e-2) (j-2)! \rfloor \\ \beta_j &= \lfloor (e-1) (j-2)! \rfloor.\end{align*} From the recursion formula, we get \begin{align*} \alpha_{j+1} &= (j-1) \alpha_j + 1 = (j-1) \lfloor (e-2) (j-2)! \rfloor + 1 \\ \beta_{j+1} &= (j-1) \beta_j + 1 = (j-1) \lfloor (e-1) (j-2)! \rfloor + 1.\end{align*} Now at this point we will rely on the following lemma (that can be proven using an application of the Taylor remainder theorem) that states that for any \ell = 2, \dots, \infty, \sum_{k = \ell}^\infty \frac{1}{k!} \lt \frac{1}{(\ell - 1)!}. So using this lemma we see that \begin{align*}\lfloor e (j-1)! \rfloor &= \left\lfloor \sum_{k=0}^{j-1} \frac{(j-1)!}{k!} + (j-1)! \sum_{k=j}^\infty \frac{1}{k!} \right\rfloor \\ &= (j-1)! \sum_{k=0}^{j-1} \frac{1}{k!} \\ &= (j-1)! \sum_{k=0}^{j-2} \frac{1}{k!} + 1 \\ &= (j-1) \left( (j-2)! \sum_{k=0}^{j-2} \frac{1}{k!} \right) + 1 \\ &= (j-1) \lfloor e (j-2)! \rfloor + 1.\end{align*} So plugging back into the recursion formulae above, since j! and 2 \cdot j! are always integers for any j, we confirm the induction step, since \begin{align*} \alpha_{j+1} &= (j-1) \lfloor (e-2) (j-2)! \rfloor + 1 = \lfloor (e-2) (j-1) ! \rfloor \\ \beta_{j+1} &= (j-1) \lfloor (e-1) (j-2)! \rfloor + 1 = \lfloor (e-1) (j-1)! \rfloor,\end{align*} thus proving the formulae \alpha_k = \lfloor (e-2) (k-2)! \rfloor, \beta_k = \lfloor (e-1) (k-2)! \rfloor, \,\, \forall k \geq 4.
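Both the recursions and the closed forms are easy to sanity-check for small k in Python, where floating-point e is plenty accurate since the fractional part of (e-1)(k-2)! stays well away from an integer for moderate k:

```python
import math

alpha, beta = 1, 3  # alpha_4, beta_4
for k in range(4, 13):
    f = math.factorial(k - 2)
    # closed forms: alpha_k = floor((e-2)(k-2)!), beta_k = floor((e-1)(k-2)!)
    assert alpha == math.floor((math.e - 2) * f)
    assert beta == math.floor((math.e - 1) * f)
    # recursions: alpha_{k+1} = (k-1) alpha_k + 1, beta_{k+1} = (k-1) beta_k + 1
    alpha, beta = (k - 1) * alpha + 1, (k - 1) * beta + 1

# and the leading-order ratio already sits near the limit 1 - 1/e
assert abs(beta / (2 * beta - alpha) - (1 - 1 / math.e)) < 1e-6
```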

Finally, returning our attention to the formula for the absorption probability, and recalling that \beta_{n-1} = \lfloor (e-1)(n-3)! \rfloor and \alpha_{n-1} = \lfloor (e-2)(n-3)! \rfloor, we get \begin{align*}p_{1,n} &= \frac{ \beta_{n-1} + O (n^{-1}) }{ (2 \beta_{n-1} - \alpha_{n-1}) + O(n^{-1})}\\ &= \frac{ \lfloor (e-1) (n-3)! \rfloor + O(n^{-1}) }{ 2 \lfloor (e-1) (n-3)! \rfloor - \lfloor (e-2) (n-3)! \rfloor + O(n^{-1}) } \\ &= \frac{ (n-3)! \sum_{k=1}^{n-3} \frac{1}{k!} + O(n^{-1}) }{ (n-3)! \left( 2 \sum_{k=1}^{n-3} \frac{1}{k!} - \sum_{k=2}^{n-3} \frac{1}{k!} \right) + O(n^{-1})} \\ &= \frac{ \sum_{k=1}^{n-3} \frac{1}{k!} + O\left( \frac{1}{ (n-2)!}\right) }{ \left(2 \sum_{k=1}^{n-3} \frac{1}{k!} - \sum_{k=2}^{n-3} \frac{1}{k!} \right) + O \left(\frac{1}{(n-2)!}\right)},\end{align*} so we have that the probability of ending up on pad 1 in the infinite lily pad problem is \begin{align*}p_{1,\infty} = \lim_{n \to \infty} p_{1,n} &= \lim_{n\to \infty} \frac{ \sum_{k=1}^{n-3} \frac{1}{k!} + O\left( \frac{1}{ (n-2)!}\right)}{ \left(2 \sum_{k=1}^{n-3} \frac{1}{k!} - \sum_{k=2}^{n-3} \frac{1}{k!} \right) + O\left( \frac{1}{ (n-2)!}\right)}\\ &= \frac{\sum_{k=1}^\infty \frac{1}{k!}}{ 2 \sum_{k=1}^\infty \frac{1}{k!} - \sum_{k=2}^\infty \frac{1}{k!}} \\ &= \frac{e-1}{2(e-1) - (e-2)} = 1 - e^{-1},\end{align*} as experimentally derived.

Monday, January 20, 2025

Big Trapezoidal Bean Machine

Suppose you have the board below, which starts with a row with six pins (and five slots), followed by a row with five pins (and six slots), and then back to six pins in an alternating pattern. Balls are only allowed to enter through the middle three slots on top.

Your goal is to create a trapezoidal distribution along one of the rows with six slots, which have been highlighted with dotted lines in the diagram above. More precisely, the frequency of balls passing through the six slots from left to right should be x−y, x, x+y, x+y, x, x−y, for some values of x and y with x \geq y.

Suppose the ratio by which you drop balls into the top three middle slots, from left to right, is a : b : c, with a + b + c = 1. Find all ordered triples (a, b, c) that result in a trapezoidal distribution in at least one row with six slots.

As opposed to the Classic puzzle's bean machine, let's go back to assuming that beans have an equal probability of going either to the left or the right. In this case, given that we want to end up with a symmetric probability distribution, we can reasonably deduce that a = c. We can then reduce the dimensionality by also recognizing that a + b + c = 2a + b = 1, that is, b = 1 - 2a.

Let's define \pi_i^{(k)} as the probability of a bean passing through ith slot from the left on the kth dotted line row from the top. We can set up a recurrence formula by going through the various routes from one row to the next. In particular, we get \begin{align*} \pi^{(k+1)}_1 &= \frac{1}{2} \pi_1^{(k)} +\frac{1}{4} \pi_2^{(k)}\\ \pi^{(k+1)}_2 &= \frac{1}{2} \pi_1^{(k)} +\frac{1}{2} \pi_2^{(k)} +\frac{1}{4} \pi_3^{(k)}\\ \pi^{(k+1)}_3 &= \frac{1}{4} \pi_2^{(k)} +\frac{1}{2} \pi_3^{(k)} +\frac{1}{4} \pi_4^{(k)}\\ \pi^{(k+1)}_4 &= \frac{1}{4} \pi_3^{(k)} +\frac{1}{2} \pi_4^{(k)} +\frac{1}{4} \pi_5^{(k)}\\ \pi^{(k+1)}_5 &= \frac{1}{4} \pi_4^{(k)} +\frac{1}{2} \pi_5^{(k)} +\frac{1}{2} \pi_6^{(k)}\\ \pi^{(k+1)}_6 &= \frac{1}{4} \pi_5^{(k)} +\frac{1}{2} \pi_6^{(k)} \end{align*} If we have a trapezoidal distribution with \pi^{(k)}_2 = x, then since \sum_{i=1}^6 \pi^{(k)}_i = 6x = 1, it must mean that we have \pi^{(k)}_2 = \frac{1}{6}.

For the first dotted line, given ordered triples of (a, 1-2a, a), we have \pi^{(1)} = \left(0, \frac{a}{2}, \frac{1-a}{2}, \frac{1-a}{2}, \frac{a}{2}, 0\right). Solving \pi_2^{(1)} = \frac{a}{2} = \frac{1}{6}, we get the ordered triple (\frac{1}{3}, \frac{1}{3}, \frac{1}{3}). For the second dotted line, we have \pi^{(2)} = \left( \frac{a}{8}, \frac{a+1}{8}, \frac{3-2a}{8}, \frac{3-2a}{8}, \frac{a+1}{8}, \frac{a}{8} \right). Here again, solving \pi^{(2)}_2 = \frac{a+1}{8} = \frac{1}{6}, we get the ordered triple (\frac{1}{3}, \frac{1}{3}, \frac{1}{3}). For the third dotted line, we have \pi^{(3)} = \left( \frac{3a+1}{32}, \frac{2a+5}{32}, \frac{10-5a}{32}, \frac{10-5a}{32}, \frac{2a+5}{32}, \frac{3a+1}{32} \right). Solving \pi^{(3)}_2 = \frac{2a+5}{32} = \frac{1}{6}, we get the ordered triple (\frac{1}{6}, \frac{2}{3}, \frac{1}{6}).

Going further, we get \pi^{(4)} = \left(\frac{8a+7}{128}, \frac{5a+22}{128}, \frac{35-13a}{128}, \frac{35-13a}{128}, \frac{5a+22}{128}, \frac{8a+7}{128}\right). Here though, we come to the equation \frac{5a+22}{128} = \frac{1}{6}, which cannot be solved by an a \geq 0, since \frac{22}{128} \gt \frac{1}{6}. So there are no ordered triples that can provide a trapezoidal distribution on the fourth dotted line. Similarly, we get \pi^{(5)} = \left( \frac{21a+36}{512}, \frac{13a+93}{512}, \frac{127-34a}{512}, \frac{127-34a}{512}, \frac{13a+93}{512}, \frac{21a+36}{512}\right), where \frac{93}{512} \gt \frac{1}{6} so again no ordered triple can provide a trapezoidal distribution on the fifth dotted line. Finally, we have \pi^{(6)} = \left( \frac{55a+165}{2048}, \frac{34a+385}{2048}, \frac{474-89a}{2048}, \frac{474-89a}{2048}, \frac{34a+385}{2048}, \frac{55a+165}{2048}\right), where \frac{385}{2048} \gt \frac{1}{6} so again no ordered triple can provide a trapezoidal distribution on the sixth dotted line.
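The row-by-row bookkeeping above can be automated by carrying each slot probability as a linear function c_0 + c_1 a of the drop parameter a (a small sketch using exact rational arithmetic; the helper names are mine, not from the puzzle):

```python
from fractions import Fraction as F

def step(pi):
    # One application of the recurrence between consecutive six-slot rows;
    # each entry is a (c0, c1) pair representing the probability c0 + c1*a
    def lin(*terms):
        return (sum(w * p[0] for w, p in terms), sum(w * p[1] for w, p in terms))
    h, q = F(1, 2), F(1, 4)
    return [
        lin((h, pi[0]), (q, pi[1])),
        lin((h, pi[0]), (h, pi[1]), (q, pi[2])),
        lin((q, pi[1]), (h, pi[2]), (q, pi[3])),
        lin((q, pi[2]), (h, pi[3]), (q, pi[4])),
        lin((q, pi[3]), (h, pi[4]), (h, pi[5])),
        lin((q, pi[4]), (h, pi[5])),
    ]

# First dotted row: (0, a/2, (1-a)/2, (1-a)/2, a/2, 0)
pi = [(F(0), F(0)), (F(0), F(1, 2)), (F(1, 2), F(-1, 2)),
      (F(1, 2), F(-1, 2)), (F(0), F(1, 2)), (F(0), F(0))]

solutions = []
for k in range(1, 7):
    c0, c1 = pi[1]               # pi_2^{(k)} = c0 + c1*a
    a = (F(1, 6) - c0) / c1      # solve pi_2^{(k)} = 1/6
    if 0 <= a <= F(1, 2):        # need a >= 0 and b = 1 - 2a >= 0
        solutions.append((k, a))
    pi = step(pi)

assert solutions == [(1, F(1, 3)), (2, F(1, 3)), (3, F(1, 6))]
```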

Therefore, the only possible ordered triples that result in a trapezoidal distribution are (\frac{1}{3}, \frac{1}{3}, \frac{1}{3}) and (\frac{1}{6}, \frac{2}{3}, \frac{1}{6}).

Big Skewed Bean Machine

Suppose you have a board like the one shown below. The board’s topmost row has three pins (and two slots for a ball to pass through), while the bottommost row has two pins (and three slots for a ball to pass through). The remaining rows alternate between having three pins and two pins. But instead of the 12 rows of pins in the illustrative diagram, suppose the board has many, many rows. And at the very bottom of the board, just below the two bottommost pins, are three buckets, labeled A, B, and C from left to right.

Whenever a ball encounters one of the leftmost pins, it travels down the right side of it to the next row. And whenever a ball encounters one of the rightmost pins, it travels down the left side of it to the next row. But this isn’t your garden variety bean machine. Whenever a ball encounters any of the other pins, it has a 75 percent chance of traveling down the right side of that pin, and a 25 percent chance of traveling down the left side of that pin.

A single ball is about to be dropped into the left slot at the top of the board. What is the probability that the ball ultimately lands in bucket A, the leftmost slot at the bottom?

Let's break down the many, many rows of this bean machine into repetitions of a two-pin row followed by a three-pin row, and zoom in on one of these blocks. Assume that the bean hits the left pin of the two-pin row with probability p (and hence hits the right pin with probability 1-p). Let's further say that the probability that, after it passes through the three-pin row, it leaves down the left slot is q = q(p). Let's work out the relationship between p and q.

As shown in the figure, there are three possible ways that it leaves down the left slot: (a) it hits the left pin in the upper row, goes to the left hitting the leftmost pin in the lower row, then leaves down the left slot; (b) it hits the left pin in the upper row, goes to the right hitting the middle pin in the lower row, then goes to the left, leaving down the left slot; or (c) it hits the right pin in the upper row, goes to the left hitting the middle pin in the lower row, then goes to the left, leaving down the left slot. The total probability of option (a) is p \cdot \frac{1}{4} \cdot 1 = \frac{p}{4}. The total probability of option (b) is p \cdot \frac{3}{4} \cdot \frac{1}{4} = \frac{3p}{16}. The total probability of option (c) is (1-p) \cdot \frac{1}{4} \cdot \frac{1}{4} = \frac{1-p}{16}. Therefore, we see that q = q(p) = \frac{p}{4} + \frac{3p}{16} + \frac{1 - p}{16} = \frac{6p + 1}{16}.

Let's say that the probability of hitting the left pin in the nth repetition of these blocks of consecutive two- and three-pin rows is p_n. Then, since the probability of leaving the nth block through the left slot is the same as the probability of hitting the left pin in the (n+1)th block, we have p_{n+1} = \frac{6p_n + 1}{16}. If we have a total of N of these blocks, then the probability of ending up in A is given by p_A = \frac{p_{N+1}}{4}. Though it hardly matters when N is large, we should note that since we drop the ball into the left slot, we start with p_0 = 1, from which we can derive the exact probability of ending up in A for any N.

For this particular problem, though, where we assume that N \gg 1, we can just take p_\infty = \lim_{n \to \infty} p_n, which we obtain by plugging p_\infty into both sides of the recursion formula, that is, p_\infty = \frac{6 p_\infty + 1}{16}, or equivalently p_\infty = \frac{1}{10}. Therefore, when the number of such blocks grows large, the probability of ending up in A is p_A = \frac{p_\infty}{4} = \frac{1}{40} = 2.5\%.
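Since the map p \mapsto \frac{6p+1}{16} contracts by a factor of 6/16 per block, the convergence to the fixed point is fast, and a few lines of iteration confirm it numerically:

```python
# Iterate p_{n+1} = (6 p_n + 1) / 16 from p_0 = 1 (the ball is dropped in the
# left slot); the map contracts by a factor of 6/16, so 50 steps is plenty.
p = 1.0
for _ in range(50):
    p = (6.0 * p + 1.0) / 16.0

p_A = p / 4.0  # probability of ending up in slot A
```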

Sunday, January 12, 2025

The Squiddler, Extra Credit

There are N contestants about to play Mingle, where N is less than 500. This time, in each round, the number for group size can be anywhere from 1 to 10, with each number equally likely to be called. Also, the numbers called in the rounds are all independent of each other. It appears this version of Mingle will continue until there are no more survivors.

Before the game starts, one of the contestants, Young-il, does some quick computation. He says to his companion, Gi-hun: “Oh no, it’s too bad we only have N people.” (Of course, Young-il doesn’t actually say “N,” but rather the number that N represents.)

Young-il continues: “If we had started with N+1 people instead, I’d expect the game to last about 10 additional rounds, on average, than I’d expect from starting with N people.”

What is the value of N?

In the extra-credit version of Mingle, we again assume that the contestants are all genial, cooperative mathematicians, and that there will be no nefarious and/or inadvertent issues with people failing to form the appropriate groups. In this version it is even easier to believe this, since rather than facing unceremonious death, let's say that all contestants in Extra Credit Mingle get paid in proportion to the number of rounds they last, so that everyone is incentivized for someone to make it to the next round.

The extra credit wrinkle of only ever needing to make groups of size 1 to 10 introduces some very weird behavior. For instance, if we ever had N = 2520 = \textsf{lcm} \{ 1, 2, \dots, 10 \}, then the game would go on for an infinite number of rounds, since no matter what number is called, the contestants will be able to make appropriately sized groups. Thus, while the expected number of rounds in Extra Credit Mingle is finite for N \lt 2520, for any N \geq 2520 the expected number of rounds is infinite.
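A quick check confirms that N = 2520 is absorbing, and that it is the smallest such value (being the least common multiple of the possible group sizes, it is the first N that every call leaves intact):

```python
# A call of group size i leaves i * floor(N / i) survivors. With
# N = 2520 = lcm(1, ..., 10), this equals N for every i in 1..10,
# so the game never shrinks; every smaller N shrinks for some i.
N = 2520
assert all(i * (N // i) == N for i in range(1, 11))
assert all(any(i * (n // i) < n for i in range(1, 11)) for n in range(1, N))
```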

Anyway, let's set up the conditional expectation chain. Let's say that for some N, we know that E_N is the expected number of rounds when starting with N contestants. From the law of total expectation, assuming that we know E_k for all k \lt N, we can set up the following recursive formula: E_N = 1 + \sum_{i=1}^{10} \mathbb{P} \{ i \} E_{ i \lfloor N / i \rfloor } = 1 + \frac{1}{10} \sum_{i=1}^{10} E_{i \lfloor N / i \rfloor}, \,\, \text{for $N \lt 2520$}. Note that whenever i \mid N (which always holds for i = 1), we have i \lfloor N / i \rfloor = N, so E_N appears on both sides of the formula; collecting those self-terms lets us solve for E_N in terms of the strictly smaller E_k. (Note, and last one, sorry, but I just think it's kinda neat ... we see here in the case of N = 2520 that since i \lfloor N / i \rfloor = N for each i = 1, \dots, 10, we get the nonsensical E_{2520} = 1 + E_{2520}.)

Now there are a few ways to think about this. We can turn the recursion formula into a giant system of linear equations and invert the resulting 500 \times 500 banded lower-triangular matrix to solve for the values of E_N, N = 1, \dots, 500. Instead, I opted for setting up a memoized recursive function in a few lines of Python:

exp_num_rounds = {}  # memo: N -> expected number of rounds starting with N contestants

def compute_exp_num_rounds(N):
    if N == 0:
        return 0.0
    if N not in exp_num_rounds:
        # Survivors after a call of group size i: i * floor(N / i)
        transitions = [i * (N // i) for i in range(1, 11)]
        others = sum(compute_exp_num_rounds(t) for t in transitions if t < N)
        num_self = sum(1 for t in transitions if t == N)
        # Solve E_N = 1 + 0.1 * (num_self * E_N + others) for E_N
        exp_num_rounds[N] = (1.0 + 0.1 * others) / (1.0 - 0.1 * num_self)
    return exp_num_rounds[N]

Using this function, I quickly calculated E_N for N = 1, \dots, 500, and in fact for all N up to 2519. From a distance, the plot looks like a relatively straight line with a few barely discernible jumps. If we plot the differences E_{N+1} - E_N versus N, we see that there are only 9 values of N in \{1, \dots, 2519\} for which the expected number of rounds would be more than 9 greater starting with N+1 contestants. Conveniently enough, since there is only one such value below 500, we know that Young-il is upset that they are starting the round of Extra Credit Mingle with N = 359 contestants.
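As a cross-check on the memoized recursion (and to sidestep Python's recursion limit when going all the way to 2519), here is a self-contained bottom-up version that also locates the big jumps directly:

```python
# Bottom-up dynamic program for E_N = 1 + (1/10) * sum_{i=1}^{10} E_{i*floor(N/i)},
# solving algebraically for E_N whenever a transition maps N back to itself.
LIMIT = 2519  # E_N is infinite from N = 2520 onward
E = [0.0] * (LIMIT + 1)
for N in range(1, LIMIT + 1):
    transitions = [i * (N // i) for i in range(1, 11)]
    others = sum(E[t] for t in transitions if t < N)
    num_self = sum(1 for t in transitions if t == N)
    E[N] = (1.0 + 0.1 * others) / (1.0 - 0.1 * num_self)

# Values of N where starting with N+1 contestants adds more than 9 expected rounds
big_jumps = [N for N in range(1, LIMIT) if E[N + 1] - E[N] > 9]
```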

Looking at the values with large jump sizes, N \in \{ 359, 719, 839, 1079, 1259, 1439, 1679, 1799, 2159 \}, we can start to build a theoretical argument for why these values exhibit this behavior. In each case there is exactly one value j \in \{ 2, 3, \dots, 10 \} for which j \not\mid N+1, while nearly every i \in \{ 2, 3, \dots, 10 \} fails to divide N (for instance, 359 is prime, while 360 is divisible by every group size except 7). In a sufficiently handwavy way, this lopsided ratio between the probability of dropping below N after round one (9/10 when N = 359) and the probability of dropping below N+1 (1/10 when N+1 = 360) is what is driving the roughly 10 additional expected rounds.
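The divisibility claim for the sub-500 case is easy to check directly, by counting the group sizes whose call actually eliminates someone:

```python
def drop_count(n):
    # Number of group sizes i in 1..10 whose call strictly shrinks n,
    # i.e., for which i * floor(n / i) < n.
    return sum(1 for i in range(1, 11) if i * (n // i) < n)

# Starting from 359 (a prime), any call other than i = 1 eliminates someone;
# starting from 360, only a call of i = 7 does.
assert drop_count(359) == 9
assert drop_count(360) == 1
```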