Monday, September 8, 2025

High-Low Hijinks Extra Credit

Your friend is playing an epic game of “high-low” and has made it incredibly far, having racked up a huge number of points. Given this information, and only this information, what is the probability that your friend wins the next round of the game?

Let's define this problem as the following probabilistic statement, what is the conditional probability that $N \geq n+1$ given that $N \geq n,$ when $n$ is large? That is, find $$\lambda = \lim_{n \to \infty} \mathbb{P} \left \{ N \geq n+1 \mid N \geq n \right\} = \lim_{n\to \infty} \frac{\mathbb{P} \{ N \geq n+1 \}}{\mathbb{P} \{ N \geq n \} } = \lim_{n\to\infty} \frac{p_{n+1}}{p_n}.$$

One method that we can do is to just compute successive values of $p_n,$ as per the following table:

$n$ $p_n$
1 $\frac{3}{4}$
2 $\frac{13}{24}$
3 $\frac{25}{64}$
4 $\frac{541}{1920}$
5 $\frac{1561}{7680}$
6 $\frac{47293}{322560}$
7 $\frac{36389}{344064}$
8 $\frac{7087261}{92897280}$
9 $\frac{34082521}{619315200}$
10 $\frac{1622632573}{40874803200}$

Even by this point, it seems that the ratio of $p_{n+1} / p_n$ approaches roughly $0.72134752\dots,$ but let's see if we can make sense of this otherwise mysterious limit.

Let's focus on the recursion $h_{n+1}(t) = \int_0^1 I(t,x) h_n(x) \,dx$ that we saw in the Classic problem. Let's assume that we have $h_{n+1} (x) \approx \lambda h_n (x),$ for large values of $n \gg 1.$ Then the recursion formula would settle down to something like $$\lambda h^*(t) = \int_0^1 I(t,x) h^*(x) \,dx = \begin{cases} \int_t^1 h^*(x)\,dx, &\text{if $t \lt \frac{1}{2};$}\\ \int_0^t h^*(x) \,dx, &\text{if $t \gt \frac{1}{2};$}\end{cases}.$$ Differentiating both sides of the equation, we get $$\lambda \frac{d}{dt} h^*(t) = \begin{cases} -h^*(t), &\text{if $t \lt \frac{1}{2};$}\\ h^*(t), &\text{if $t \gt \frac{1}{2}.$}\end{cases}.$$ Let's focus on when $t \gt \frac{1}{2},$ in which case we have $\lambda \frac{d}{dt} h^*(t) = h^*(t),$ so we would have $h^*(t) = C \exp \left( \frac{t}{\lambda} \right)$ for $t \gt \frac{1}{2},$ with $h^*(t) = h^*(1-t) = C \exp \left( \frac{ 1-t}{\lambda} \right)$ for $t \lt \frac{1}{2}.$ In this case, plugging back in, if our recursion formula still holds for $t \gt \frac{1}{2},$ we would need to have \begin{align*}\lambda C e^{t/\lambda} &= \int_0^\frac{1}{2} C \exp \left( \frac{1-t}{\lambda} \right)\,dt + \int_\frac{1}{2}^t C\exp \left(\frac{t}{\lambda}\right) \,dt \\ &= \lambda C \left( e^{1/\lambda} - e^{1/(2\lambda)} \right) + \lambda C \left( e^{t/\lambda} - e^{1/(2\lambda)} \right) \\ &= \lambda C e^{t/\lambda} + \lambda C \left( e^{1/\lambda} - 2 e^{1/(2\lambda)} \right).\end{align*} Therefore, we must have either $\lambda = 0$ or $e^{1/\lambda} - 2 e^{1/(2\lambda)} = 0$ in order tof the recursion formula to hold, which implies that the conditional probability of winning the next round in a long game of high-low is $$\lambda = \lim_{n \to \infty} \frac{p_{n+1}}{p_n} = \frac{1}{2\ln 2} \approx 0.721347520444\dots.$$

High-Low Hijinks

You’re playing a game of “high-low,” which proceeds as follows:

  • First, you are presented with a random number, $x_1$, which is between $0$ and $1.$
  • A new number, $x_2$, is about to be randomly selected between $0$ and $1$, independent of the first number. But before it’s selected, you must guess how $x_2$ will compare to $x_1$. If you think $x_2$ will be greater than $x_1$ you guess “high.” If you think $x_2$ will be less than $x_1$, you guess “low.” If you guess correctly, you earn a point and advance to the next round. Otherwise, the game is over.
  • If you correctly guessed how $x_2$ compared to $x_1$ then another random number, $x_3$, will be selected between $0$ and $1$. This time, you must compare $x_3$ to $x_2$, guessing whether it will be “high” or “low.” If you guess correctly, you earn a point and advance to the next round. Otherwise, the game is over.

You continue playing as many rounds as you can, as long as you keep guessing correctly. You quickly realize that the best strategy is to guess “high” whenever the previous number is less than $0.5,$ and “low” whenever the previous number is greater than $0.5.$ With this strategy, what is the probability you will earn at least two points? That is, what are your chances of correctly comparing $x_2$ to $x_1$ and then also correctly comparing $x_3$ to $x_2$?.

Let's define the indicator function $$I(s,t) = \begin{cases} 1, &\text{if $(t-s)(s-\frac{1}{2}) \leq 0;$} \\ 0, &\text{otherwise,}\end{cases}$$ such that $I(x_1,x_2) = 1$ whenever you successfully earned a point in the first round and $I(x_2,x_3) = 1$ whenever you successfully earned a second point. Let $N = N(x_1, x_2, \dots, x_n, \dots)$ be the random variable that corresponds to your score from playing this game of high-low. We see that based on this setup we have $$\mathbb{P} \{ N \geq n \} = \int_0^1 \int_0^1 \cdots \int_0^1 \prod_{i=1}^n I(x_i, x_{i+1}) \,dx_{n+1} \,dx_n \,dx_{n-1}\, \cdots \,dx_2 \,dx_1.$$ Further, let's define $h_0(t) = 1$ and $$h_1(t) = \int_0^1 I(t, x) \,dx = \begin{cases} \int_t^1 \,dx = 1-t, &\text{if $t \lt \frac{1}{2};$}\\ \int_0^t \,dx = t, &\text{if $t \geq \frac{1}{2};$} \end{cases} = \max \{ t, 1-t \}.$$ Finally, for any $n \in \mathbb{N},$ define $$h_{n+1}(t) = \int_0^1 I(t,x) h_n(x) \,dx.$$

In particular, if we want to see the probability of haveing a score of at least $2$ is \begin{align*}p_2 = \mathbb{P} \{ N \geq 2 \} &= \int_0^1 \int_0^1 \int_0^1 I(x_1,x_2) I(x_2, x_3) \,dx_3\,dx_2\,dx_1 \\ &= \int_0^1 \int_0^1 I(x_1,x_2) \left(\int_0^1 I(x_2, x_3) \,dx_3 \right) \,dx_2\,dx_1 \\ &= \int_0^1 \int_0^1 I(x_1,x_2) h_1(x_2) \,dx_2 \,dx_1\\ &= \int_0^1 h_2(x_1) \,dx_1.\end{align*} If we extrapolate further, which we will need primarily for the Extra Credit problem, we see that $$p_n = \mathbb{P} \{ N \geq n \} = \int_0^1 h_n(t) \,dt, \,\, \forall n \in \mathbb{N}.$$

Returning to the Classic problem here, we have $h_1(t) = \max \{ t, 1-t \}$ and \begin{align*}h_2(t) = \int_0^1 I(t,x) \max \{x, 1-x\} \,dx &= \begin{cases} \int_t^1 \max \{ x, 1 - x\} \,dx, &\text{if $t \lt \frac{1}{2};$} \\ \int_0^t \max \{x, 1-x\}\,dx, &\text{if $t \geq \frac{1}{2};$}\end{cases} \\ &= \begin{cases} \int_t^\frac{1}{2} (1-x) \,dx + \int_\frac{1}{2}^1 x\,dx, &\text{if $t \lt \frac{1}{2};$} \\ \int_0^\frac{1}{2} (1-x) \,dx + \int_\frac{1}{2}^t x \,dx, &\text{if $t \geq \frac{1}{2};$}\end{cases} \\ &= \begin{cases} \frac{3}{4} - t + \frac{t^2}{2}, &\text{if $t \lt \frac{1}{2};$} \\ \frac{1}{4} + \frac{t^2}{2}, &\text{if $t \geq \frac{1}{2};$}\end{cases} \\ &= \max \left\{ \frac{3}{4} -t + \frac{t^2}{2}, \frac{1}{4} + \frac{t^2}{2} \right\}.\end{align*} So we have the probability of earning at least two points is \begin{align*}p_2 = \mathbb{P} \{ N \geq 2 \} &= \int_0^1 h_2(t) \,dt\\ &= \int_0^\frac{1}{2} \left( \frac{3}{4} - t + \frac{t^2}{2} \right) \,dt + \int_\frac{1}{2}^1 \left( \frac{1}{4} + \frac{t^2}{2} \right) \,dt\\ &= \left[ \frac{3}{4}t -\frac{1}{2}t^2 + \frac{1}{6} t^3 \right]^{t=\frac{1}{2}}_{t=0} + \left[ \frac{1}{4} t + \frac{1}{6} t^3 \right]^{t=1}_{t=\frac{1}{2}} \\ &= \left(\frac{3}{8} - \frac{1}{8} + \frac{1}{48}\right) + \left(\frac{1}{4} + \frac{1}{6}\right) - \left( \frac{1}{8} + \frac{1}{48} \right) \\ &= \frac{13}{24}.\end{align*}

Monday, August 25, 2025

Sundown Trail Race with a Mulligan

Now let’s add one more wrinkle. At some point during the race, if you’re unhappy with the loop you’ve just been randomly assigned, you’re granted a “mulligan,” allowing you to get another random assignment. (Note that there’s a $25$ percent chance you’ll be assigned the same loop again.) You don’t have to use your mulligan, but you can’t use it more than once.

As before, the time is $5$:$55$ p.m. You have just completed a loop, and you haven’t used your mulligan yet. With an optimal strategy (i.e., using the mulligan at the right moment, if at all), on average, what score can you expect to earn between $5$:$55$ p.m. and $7$ p.m.?

As we saw in the Classic post, let's use the absorption probabilities matrix $B = \left(B(t,s)\right)_{t \in \mathfrak{T}, s \in \mathfrak{S}},$ which is the probability of starting at transient state $t$ and ending up at the final, absorbing state $s.$ This will give us the expected score of starting at transient state $i$ with no mulligans $$E(i) = \mathbb{E} [ S \mid S_0 = i ] = \sum_{s \in \mathfrak{S}} s B(i,s).$$ In particular, we have the vector $$E = \left(E(t)\right)_{t \in \mathfrak{T}} = \begin{pmatrix} \frac{19933}{4096} \\ \frac{5245}{1024} \\ \frac{1437}{256} \\ \frac{317}{64} \\ \frac{293}{64} \\ \frac{69}{16} \\ \frac{77}{16} \\ \frac{21}{4} \\ \frac{23}{4} \end{pmatrix}.$$

We can only use a single mulligan, but it is enough to change the overall expected score. Let's denote $$\tilde{E}(t) = \tilde{\mathbb{E}}[ S \mid S_0 = s ],$$ as the expected score when starting at state $s$. In particular, let's say that we are currently finishing up our lap to attain the transient score of $t \in \mathfrak{T},$ where we haven't yet used our mulligan, and we are quickly decide for which new laps we want to use our mulligan. Let's say the randomly assigned lap will move us from $t$ to $t^\prime \in \mathfrak{T}.$ If we use our mulligan in this turn then we will end up with an expected score of $E(t);$ whereas if we don't use the mulligan, then we would end up with an expected score of $\tilde{E}(t^\prime),$ since we would still have our mulligan remaining. So the optimal choice if we end up moving from $t$ to $t^\prime \in \mathfrak{T}$ would be to give us $\max \{ E(t), \tilde{E}(t^\prime) \}.$ Similarly, if we are randomly assigned to move from $t$ to some $s \in \mathfrak{S}$, then if we use our mulligan we end up with expected score of $E(t)$ versus a score of $s,$ if we don't. So if we are would move from $t$ to $s \in \mathfrak{S}$ the optimal choice gives us $\max \{E(t), s \}.$ Putting this altogether, we the corresponding transition probabilities from the $Q$ and $R$ matrices given in the Classic problem, we get the recursive formula $$\tilde{E}(t) = \sum_{t^\prime \in \mathfrak{T}} Q(t, t^\prime) \max \{ E(t), \tilde{E}(t^\prime) \} + \sum_{s \in \mathfrak{S}} R(t,s) \max \{ E(t), s \}.$$

So let's go recursing .... \begin{align*} \tilde{E}(5.5) &= \frac{3}{4} \max\{ E(5.5), 5.5 \} + \frac{1}{4} \max \{ E(5.5), 6.5 \} = \frac{3}{4} E(5.5) + \frac{1}{4} 6.5 = \frac{95}{16} \\ \tilde{E}(5) &= \frac{3}{4} \max \{ E(5), 5 \} + \frac{1}{4} \max \{ E(5), 6 \} = \frac{3}{4} E(5) + \frac{1}{4} 6 = \frac{87}{16} \\ \tilde{E}(4.5) &= \frac{1}{4} \max \{ E(4.5), \tilde{E}(5.5) \} + \frac{3}{4} \max \{ E(4.5), 4.5 \} = \frac{1}{4} \tilde{E}(5.5) + \frac{3}{4} E(4.5) = \frac{163}{32} \\ \tilde{E}(4) &= \frac{1}{4} \max \{ E(4), \tilde{E}(5) \} + \frac{3}{4} \max \{ E(4), 4 \} = \frac{1}{4} \tilde{E}(5) + \frac{3}{4} E(4) = \frac{147}{32} \\ \tilde{E}(3.5) &= \frac{1}{4} \max \{ E(3.5), \tilde{E}(4.5) \} + \frac{1}{2} \max \{ E(3.5), 3.5 \} + \frac{1}{4} \max \{ E(3.5), 6.5 \} \\ &\quad\quad = \frac{1}{4} \tilde{E}(4.5) + \frac{1}{2} E(3.5) + \frac{1}{4} 6.5 = \frac{83}{16} \\ \tilde{E}(3) &= \frac{1}{4} \max \{ E(3), \tilde{E}(4) \} + \frac{1}{4} \max \{ E(3), 3 \} + \frac{1}{4} \max \{ E(3), 6 \} + \frac{1}{4} \max \{ E(3), 6.5 \} \\ &\quad\quad = \frac{1}{4} E(3) + \frac{1}{4} E(3) + \frac{1}{4} 6 + \frac{1}{4} 6.5 = \frac{1411}{256}\\ \tilde{E}(2) &= \frac{1}{4} \max \{ E(2), \tilde{E}(3) \} + \frac{1}{4} \max\{ E(2), \tilde{E}(5) \} + \frac{1}{4} \max \{ E(2), \tilde{E}(5.5) \} + \frac{1}{4} \max \{ E(2), 6.5 \} \\ &\quad\quad = \frac{1}{4} E(2) + \frac{1}{4} E(2) + \frac{1}{4} \tilde{E}(5) + \frac{1}{4} 6.5 = \frac{3029}{512}\\ \tilde{E}(1) &= \frac{1}{4} \max \{ E(1), \tilde{E}(2) \} + \frac{1}{4} \max \{ E(1), \tilde{E}(4) \} + \frac{1}{4} \max \{ E(1), \tilde{E}(4.5) \} + \frac{1}{4} \max \{ E(1), \tilde{E}(5.5) \} \\ &\quad\quad = \frac{1}{4} \tilde{E}(2) + \frac{1}{4} E(1) + \frac{1}{4} E(1) + \frac{1}{4} \tilde{E}(5.5) = \frac{5657}{1024} \end{align*}

This leaves us with one more step to show that the total expected score on accrued between $5$:$55$ pm and $7$ pm going at a constant $10$ minute per mile pace with one single mulligan is \begin{align*} \tilde{E}(0) &= \frac{1}{4} \max \{ E(0, \tilde{E}(1) \} + \frac{1}{4} \max \{ E(0), \tilde{E}(3) \} + \frac{1}{4} \max \{ E(0), \tilde{E}(3.5) \} + \frac{1}{4} \max \{ E(0), \tilde{E}(4.5) \} \\ &\quad\quad = \frac{1}{4} \tilde{E}(1) + \frac{1}{4} \tilde{E}(3) + \frac{1}{4} \tilde{E}(3.5) + \frac{1}{4} \tilde{E}(4.5)\\ &\quad\quad = \frac{21921}{4096} = 5.351806640625 \end{align*}

Sundown Trail Race

You’re participating in a trail run that ends at sundown at $7$ p.m. The run consists of four loops: $1$ mile, $3$ miles, $3.5$ miles, and $4.5$ miles. After completing any given loop, you are randomly assigned another loop to run—this next loop could be the same as the previous one you just ran, or it could be one of the other three. Being assigned your next loop doesn’t take a meaningful amount of time; assume all your time is spent running.

Your “score” in the race is the total distance you run among all completed loops. If you’re still out on a loop at $7$ p.m., any completed distance on that loop does not count toward your score!

It is now $5$:$55$ p.m. and you have just completed a loop. So far, you’ve been running $10$-minute miles the whole way. You’ll maintain that pace until $7$ p.m. On average, what score can you expect to earn between $5$:$55$ p.m. and $7$ p.m.?

OK, let's agree on how to parametrize this. There are $65$ minutes left in the race and you are running at a clean $10$ minutes per mile, so let's just divide everything by $10$ minute per mile and deal with things entirely in miles or score space. In this space, the problem looks like this: Your score is currently $0$, you will be assigned a random score increment $\delta S$ whichi if uniformly distributed over the set $\Delta S = \{1 , 3, 3.5, 4.5 \}.$ If $S + \delta S \leq 6.5,$ then your new score is $S+\delta S.$ Otherwise, if $S + \delta S \gt 6.5,$ then your final score will be $S.$ For clarity, we see that the possible final scores are $\mathfrak{S} = \{ 3, 3.5, 4, 4.5, 5, 5.5, 6, 6.5\}.$

With the transformation complete, we can either go through the combinatorial enumeration of all possible paths to get to a final score of, say, $S=3.5$. In that particular case, we see that in order to end with a score of $3.5$ you must first be assigned $3.5$ in your first loop and then be assigned either $3.5$ or $4.5$ in your second lap. This would then have a total probability of $\frac{1}{4} \times \frac{1}{2} = \frac{1}{8}.$ This can definitely be done for each of the other elements in $\mathfrak{S},$ but for the Extra Credit answer we will want to take advantage of modeling this as an absorbing Markov chain .... so let's do that here too.

The total number of states in this Markov Chain is $17$, there are nine (9) transient states $\mathfrak{T} = \{0, 1, 2, 3, 3.5, 4, 4.5, 5, 5.5\}$ and then the eight final, absorbing states in $\mathfrak{S}.$ Note that there two possible states where the current score is, say $3.5$, one in $\mathfrak{T},$ which represents a state that could still have some $\delta S \lt 3.5$ added to it and the race continuing and the final one where the race is over in $\mathfrak{S}.$ If at any time we have to differentiate between the transient and absorbing states that have the same numerical values, we will use $T$ and $A$ superscripts, e.g, $3.5^T \in \mathfrak{T}$ and $3.5^A \in \mathfrak{S}.$ If no superscript is provided, then let's safely assume that we are talking about the final, absorbing state.

OK, now let's get the canonical form of the transition matrix $P = \begin{pmatrix} Q & R \\ 0 & I_8 \end{pmatrix},$ where $Q$ is the $9 \times 9$ matrix representing the transition probabilities entirely within the transient states $\mathfrak{T}$ and $R$ is the $9 \times 8$ matrix representing the transition probabilities that start at transient states in $\mathfrak{T}$ and end in absorbing states $\mathfrak{S}.$ In particular, we have \begin{align*} Q &= \begin{pmatrix} 0 & \frac{1}{4} & 0 & \frac{1}{4} & \frac{1}{4} & 0 & \frac{1}{4} & 0 & 0 \\ 0 & 0 & \frac{1}{4} & 0 & 0 & \frac{1}{4} & \frac{1}{4} & 0 & \frac{1}{4} \\ 0 & 0 & 0 & \frac{1}{4} & 0 & 0 & 0 & \frac{1}{4} & \frac{1}{4} \\ 0 & 0 & 0 & 0 & 0 & \frac{1}{4} & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & \frac{1}{4} & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & \frac{1}{4} & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \frac{1}{4} \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{pmatrix} \\ R & = \begin{pmatrix} 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & \frac{1}{4} \\ \frac{1}{4} & 0 & 0 & 0 & 0 & 0 & \frac{1}{4} & \frac{1}{4} \\ 0 & \frac{1}{2} & 0 & 0 & 0 & 0 & 0 & \frac{1}{4} \\ 0 & 0 & \frac{3}{4} & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & \frac{3}{4} & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & \frac{3}{4} & 0 & \frac{1}{4} & 0 \\ 0 & 0 & 0 & 0 & 0 & \frac{3}{4} & 0 & \frac{1}{4} \end{pmatrix} \end{align*} Based on the matrix $Q,$ we can compute the fundamental matrix $$N = (I_9 - Q)^{-1} = \begin{pmatrix} 1 & \frac{1}{4} & \frac{1}{16} & \frac{17}{64} & \frac{1}{16} & \frac{33}{256} & \frac{3}{8} & \frac{49}{1024} & \frac{11}{64} \\ 0 & 1 & \frac{1}{4} & \frac{1}{16} & 0 & \frac{17}{64} & \frac{1}{16} & \frac{33}{256} & \frac{3}{8} \\ 0 & 0 & 1 & \frac{1}{4} & 0 & \frac{1}{16} & 0 & \frac{17}{64} & \frac{1}{16} \\ 0 & 0 & 0 & 1 & 0 & \frac{1}{4} & 0 & \frac{1}{16} & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 & \frac{1}{4} & 0 & \frac{1}{16} \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & \frac{1}{4} & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & \frac{1}{4} \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix}.$$

Using the fundamental matrix, we can get the absorption probabilities as $$B = NR = \begin{pmatrix} \frac{17}{256} & \frac{1}{8} & \frac{49}{512} & \frac{9}{32} & \frac{147}{4096} & \frac{33}{256} & \frac{321}{4096} & \frac{3}{16} \\ \frac{1}{64} & 0 & \frac{51}{256} & \frac{3}{16} & \frac{99}{1024} & \frac{9}{32} & \frac{49}{1024} & \frac{11}{64} \\ \frac{1}{16} & 0 & \frac{3}{64} & 0 & \frac{51}{256} & \frac{3}{16} & \frac{33}{256} & \frac{3}{8} \\ \frac{1}{4} & 0 & \frac{3}{16} & 0 & \frac{3}{64} & 0 & \frac{17}{64} & \frac{1}{4} \\ 0 & \frac{1}{2} & 0 & \frac{3}{16} & 0 & \frac{3}{64} & 0 & \frac{17}{64} \\ 0 & 0 & \frac{3}{4} & 0 & \frac{3}{16} & 0 & \frac{1}{16} & 0 \\ 0 & 0 & 0 & \frac{3}{4} & 0 & \frac{3}{16} & 0 & \frac{1}{16} \\ 0 & 0 & 0 & 0 & \frac{3}{4} & 0 & \frac{1}{4} & 0 \\ 0 & 0 & 0 & 0 & 0 & \frac{3}{4} & 0 & \frac{1}{4} \end{pmatrix},$$ where $B(t,s)$ is the probability of starting at transient state $t$ and eventually absorbing into the state $s \in \mathfrak{S}.$ For the purpose of the Classic problem, we are most interested in using the first row of $B,$ that is $B_{0,s},$ which is the probability of starting with score of $0$ and ending at score $s \in \mathfrak{S}$. More specifically, the expected score of running at a constant $10$ minute miles from $5$:$55$ pm until $7$ pm is \begin{align*}E(0) = \mathbb{E}[S \mid S_0 = 0] &= \sum_{s \in \mathfrak{S}} s B(0,s)\\ &= 3 \cdot \frac{17}{256} + 3.5 \cdot \frac{1}{8} + 4 \cdot \frac{49}{512} + 4.5 \cdot \frac{9}{32} + 5 \cdot \frac{147}{4096} \\ &\quad\quad\quad+ 5.5 \cdot \frac{33}{256} + 6 \cdot \frac{321}{4096} + 6.5 \cdot \frac{3}{16} \\ &= \frac{19933}{4096} = 4.866455078125.\end{align*}

Monday, August 18, 2025

Probable Free Money

You have the same $\$55$ worth of vouchers from the casino in the same denominations. But this time, you’re not interested in guaranteed winnings. Instead, you set your betting strategy so that you will have at least a $50$ percent chance of winning $W$ dollars or more.

What is the maximum possible value of W? In other words, what is the greatest amount of money you can have at least a 50 percent chance of winning from the outset, with an appropriate strategy? And what is that betting strategy?

Ok, well to explore this space, let's first explore the distribution of optimal strategy from the Classic answer. If we assume that we first bet one on each side then almost surely we end up with $2$ remaining $\$10$ vouchers and the $\$25$ voucher remaining, but with $W = 10.$ At this point the bet is $\$20$ on heads and $\$25$ on tails, so with probability $50\%$ you end up with $2$ $\$10$ vouchers and $W=30$. At this point in this strategy you would bet one $\$10$ voucher on heads and the other on tails, to reach the state of a single $\$10$ voucher and $W=40,$ again with a total probability of $50\%$. At this point you would bet your single $\$10$ voucher over and over again until you lose, which means you can attain $W=40+10n$ with probability $$p_n = \mathbb{P}\{W = 40+10n\} = \frac{1}{2^{n+2}}.$$

On the other hand, there is another possibility when you bet the $2$ $\$10$ vouchers against the single $\$25$ voucher, which is that, with probability $50\%$ you end up with on a single $\$25$ voucher and $W=35.$ In this case, you would just choose to wager your single $\$25$ voucher over and over again until you lose. So we have $W = 35+25n$ with probability $$q_n = \mathbb{P} \{W = 35+25n\} = \frac{1}{2^{n+2}}.$$ Putting these altogether we get the full distribution $$\mathbb{P} \{ W = w \} = \begin{cases} \frac{1}{4^{n+1}}, &\text{if $w=35+50n;$}\\ \frac{1}{2\cdot 4^{n+1}} + \frac{2}{32^{n+1}} = \frac{4\cdot 8^n +2}{32^{n+1}}, &\text{if $w = 60+50n$;} \\ \frac{1}{4 \cdot 2^n}, &\text{if $w = 40+10n$ and $n \mod 5 \not\equiv 2$;}\\ 0, \text{otherwise.}\end{cases}$$ In this case, we see that $$W^* = \max \{ w \in \mathbb{N} \mid \mathbb{P} \{ W \geq w \} = \frac{1}{2} \} = 50.$$ Additionally, we confirm that since $\mathbb{P} \{W \geq 35\} = 1,$ as we surmised in the classic answer.

However, as expected, we can do way ......... better. For instance, even if you bet all of your vouchers on heads each time you have a $50\%$ probability of getting at least $\$55,$ since in that greedy strategy your probability distribution if $$\hat{\mathbb{P}} \{ W = w \} = \begin{cases} \frac{1}{2^n}, &\text{ $w = 55n$;} \\ 0, &\text{otherwise.}\end{cases}$$

However, let's combine the greediness of throwing everything on one side and hoping for the best, with the optimal strategy to ensure a high of a minimum strategy as possible. In this case, you bet all $\$55$ on heads, let's say, in the first round. If you lose, well, that sucks. But if you win then you will be at the state $(3,1,55)$ with probability $50\%.$ Again, since we are looking to maximize the the value of $\max \{ w \in \mathbb{N} \mid \mathbb{P}^* \{ W \geq w \} \geq \frac{1}{2} \}$ and we already spent $50\%$ of the probability mass on the first wager, we see that our problem has now transformed into finding the maximum guaranteed winnings based on starting at the state $(3,1,55),$ since guaranteed winning from this point would still have a greater than or equal to $50\%$ overall probability from $(3,1,0).$ Thankfully, we've already solved this problem in the Classic Fiddler, and see that the greatest amount of money you can have at least a $50$ percent chance of winning from the outset is $$V(3,1,55) = 55+35 = \$90,$$ which employs the strategy of first betting everything on one side, say heads, and then if you win the first bet, then employing the strategy from the Classic Fiddler.

Sunday, August 17, 2025

Guaranteed Free Money

A casino offers you $\$55$ worth of “free play vouchers.” You specifically receive three $\$10$ vouchers and one $\$25$ voucher.

You can play any or all vouchers on either side of an even-money game (think red vs. black in roulette, without those pesky green pockets) as many times as you want (or can). You keep the vouchers wagered on any winning bet and get a corresponding cash amount equal to the vouchers for the win. But you lose the vouchers wagered on any losing bet, with no cash award. Vouchers cannot be split into smaller amounts.

What is the guaranteed minimum amount of money you can surely win, no matter how bad your luck? And what betting strategy always gets you at least that amount?

Let $S = (S_1, S_2, W)$ be the state of the system where you currently have $S_1$ $\$10$ vouchers, $S_2$ $\$25$ vouchers and $W$ winnings in cash. Further define $V(S) = V(S_1,S_2,W)$ to be the guaranteed minimum amount of money you can surely win if you start with state $S = (S_1,S_2,W).$

We can define $V(1,0,W) = V(0,1,W) = W,$ since regardless of whether you only have one voucher left, since you cannot split the vouchers, the fair game could go against your bet and you end up with no more money than you started.

Let's now consider $V(1,1,W).$ If you were to bet one $\$10$ voucher on say heads and one $\$25$ voucher on tails, then with a $50\%$ probability you would win $\$10$ and move to the state $S^\prime = (1,0,W+10)$ and with $50\%$ probability you would win $\$25$ and move to state $\tilde{S} = (0,1,W+25)$. Since $V$ measures the guaranteed minimum winnings we have $$\tilde{V}(1,1,W) = \min \{ V(1,0,W+10), V(0,1,W+25) \} = \min \{ W + 10, W + 25 \} = W+10.$$ If we were to pick another strategy, say only wager one $\$10$ voucher on heads, then we would get $$\bar{V}(1,1,W) = \min \{ V(1,1,W+10), V(0,1,W) \} = W,$$ which is sub-optimal with respect to the strategy of betting say $\$10$ on H and $\$25% on tails. So we see that that optimal strategy gives $$V(1,1,W) = W+10.$$

For $V(2,0,W),$ the optimal guaranteed strategy is to bet one voucher on heads and another on tails, since this gives $\tilde{V}(2,0,W) = V(1,0,W+10) = W+10,$ which is better than if you only wager on one side and get $\hat{V}(2,0,W) = W.$ So we get $$V(2,0,W) = \max \{ \tilde{V}(2,0,W), \hat{V}(2,0,W) \} = W+10.$$ Similarly, we get that $$V(3,0,W) = W+20,$$ by successively wagering one voucher on heads and another on tails, twice.

For $V(2,1,W)$, an optimal guaranteed strategy is to bet, say, two $\$10$ vouchers on heads and the $\$25$ voucher on tails. This gives $$\tilde{V}(2,1,W) = \min \{ V(2,0,W+20), V(0,1,W+25) \} = \min \{ W+30, W+25 \}= W+25.$$ This is better than first betting one $\$10$ on heads and another on tails, and then $$\bar{V}(2,1,W) = V(1,1,W+10) = W+20.$$ Both of these are better than betting only one $\$10$ voucher on heads and the $\$25$ voucher on tails which gives $$\hat{V}(2,1,W) = \min \{ V(2,0,W+10), V(1,1,W+25) \} = \min \{ W +20, W+25 \} = W + 20.$$ So we have $$V(2,1,W) = \max \{ \tilde{V}(2,1,W), \hat{V}(2,1,W), \bar{V}(2,1,W) \} = W+25.$$

Finally, for $V(3,1,W),$ we have an optimal guaranteed strategy to bet one $\$10$ on heads and another on tails, and get $$\tilde{V}(3,1,W) = V(2,1,W+10) = W+35.$$ Other strategies involve betting two $\$10$ vouchers on heads and the $\$25$ voucher on tails, which equivalently gives $$\bar{V}(3,1,W) = \min \{ V(3,0,W+20), V(1,1,W+25) \} = W+35.$$ Sub-optimal strategies includes bets on only one side, or betting only on $\$10$ voucher on heads and the $\$25$ voucher on tails, which given $$\hat{V}(3,1,W) = \min \{ V(3,0,W+10), V(2,1,W+25)\} = W+30.$$ So we have $$V(3,1,W) = \max \{ \tilde{V}(3,1,W), \bar{V}(3,1,W), \hat{V}(3,1,W) \} = W+35.$$ Therefore, the guaranteed minimum amount of money you can surely win is $V(3,1,0) = \$35,$ which you can get by first betting one $\$10$ voucher on each side of the game, then in the second round betting the two remaining $\$10$ vouchers on one side of the game and the $\$25$ voucher on the other.

Monday, August 11, 2025

Coldplay Canoodling - Extra Credit

Now, everyone at the concert spends at least some time canoodling. In particular, each member of a couple wants to spend some fraction of the time canoodling, where this fraction is randomly and uniformly selected between $0$ and $1$. This value is chosen independently for the two members of each couple, and the actual time spent canoodling is the product of these values. For example, if you want to canoodle during half the concert and your partner wants to canoodle during a third of the concert, you will actually canoodle during a sixth of the concert.

Meanwhile, the camera operators love to show canoodling couples. So instead of randomly picking couples to show on the jumbotron, they randomly pick from among the currently canoodling couples. (The time shown on the jumbotron is very short, so a couple’s probability of being selected is proportional to how much time they spend canoodling.)

Looking around the concert, you notice that the kinds of couples who most frequently appear on the jumbotron aren’t constantly canoodling, since there are very few such couples. Indeed, the couples who most frequently appear on the jumbotron spend a particular fraction $C$ of the concert canoodling. What is the value of $C$?

In this Extra Credit problem, we want to find the mode of the following distribution $$f_S(C) = \mathbb{P} \{ \text{shown canoodling}, \text{canoodling ratio} = C\} = k C \mathbb{P} \{ \text{canoodling ratio} = C \},$$ where the final equation comes from the assertin that a couple's probability of being selected is proportional to how much time they spend canoodling.

Let $X, Y \sim U(0,1)$ represent the canoodling propensity of the two members of the couple, then \begin{align*} \mathbb{P} \{ \text{canoodling ratio} = C \} &= \mathbb{P} \{ X \cdot Y = C \} \\ &= \frac{d}{dC} \mathbb{P} \{ X \cdot Y \leq C \} \\ &= \frac{d}{dC} \left( \int_0^1 \int_0^{\min \{ 1, C / x \}} \,dy \, dx \right) \\ &= \frac{d}{dC} \left( \int_0^1 \min \{ 1, \frac{C}{x} \} \,dx \right) \\ &= \frac{d}{dC} \left( \int_0^C 1\,dx + \int_C^1 \frac{C}{x} dx \right) \\ &= \frac{d}{dC} \left( C - C\ln C \right) \\ &= - \ln C.\end{align*} We already have $f_S(C) = - k C \ln C,$ and though it is irrelevant for answering the question, out of an abundance of rigor (??), let's find $k.$ From the law of total probability we get $$1 = \int_0^1 f_S(C) \,dC = k \int_0^1 (- C \ln C) \,dC = \left[ -\frac{C^2}{2} \ln C \right]_{C=0}^{C=1} + \int_0^1 \frac{C^2}{2} \,\frac{dC}{C} = \frac{k}{4},$$ so $k = 4.$

So the distribution is $f_S(C) = -4 C \ln C$ and we want to find the mode of this distribution. Since $$f^\prime_S(C) = -4 \ln C -4 = -4(1 + \ln C)$$ has only one zero in $[0,1]$ and $f_S(0) = f_S(1) = 0,$ we see that the most frequently appearing canoodling ratios on the Jumbotron is $$C^* = \frac{1}{e} \approx 0.367879441171\dots,$$ which gives $$f_S(C) \leq f_S(C^*) = \frac{4}{e}, \,\forall C \in [0,1].$$

Coldplay Canoodling

All the many attendees at a particular Coldplay concert are couples. As the CEO of Astrometrics, Inc., you are in attendance with your romantic partner, who is definitely not the head of HR at Astrometrics, Inc. During the concert, the two of you spend half the time canoodling.

The camera operators love to show people on the jumbotron during the concert, but time is limited and there are many attendees. As a result, the camera operators show just $1$ percent of couples during the concert. Couples are chosen randomly, but never repeat at any given concert.

You and your partner are shy when it comes to public displays of affection. While you don’t mind being shown on the jumbotron, you don’t want to be shown canoodling on the jumbotron. How many Coldplay shows can the two of you expect to attend without having more than a $50$ percent chance of ever being shown canoodling on the jumbotron?

For the classic problem, we first want to find what the probability of not being shown on the screen canoodling during any individual concert would be. Let's say that $p_s = 1\%$ is the probability of being shown on screen during a show and $p_c=50\%$ is the probability of canoodling at any moment during a show. So the probability of not being shown while canoodling at any individual show is $P=1-p_cp_s=99.5\%.$

The probability of not being shown canoodling in any of $N$ consecutive shows is $P^N,$ so the maximum number of shows that can be attended without there being a greater than $50\%$ chance of being shown is \begin{align*}N^* &= \max \{n \in \mathbb{N} \mid P^n \geq \frac{1}{2}\}\\ &= \max \{ n \in \mathbb{N} \mid n \ln P \geq - \ln 2\}\\ &= \left\lfloor -\frac{\ln 2}{\ln P}\right\rfloor = 138\end{align*} Coldplay shows.

Monday, July 28, 2025

Tour de Fiddler 2025 - Extra Credit

Instead of one opponent, now you have two—meaning three riders in total. As luck would have it, the managers for both other riders proclaimed that they’d sprint for the finish as long as their legs were feeling 50 percent or better. Note that your opponents’ feelings are independent of each other.

As the three of you near the finish, your own team manager radios you the following message: “If your legs feel <garbled> percent or better, sprint for the finish!”

You can’t make out what the garbled part of the message is, and you’re too tired to radio back for confirmation. Instead, you somehow muster the energy to randomly, uniformly pick a number between 0 and 100 to fill in the blank from your manager’s message, thereby determining your racing strategy—optimization be damned!

Right before you choose your random strategy and test your legs, what are your chances of winning the stage against both opponents?

Similar to the classic answer, we want to let your legs be $X \sim U(0,1),$ and your two opponents be $Y, Z \sim^{\text{i.i.d.}} U(0,1).$ Let your strategy be $U_a$ such that you will sprint if and only if $X \geq a.$ Here you again will win with $100\%$ probability if you are the only one sprinting. You will never win if you are not sprinting and at least one of opponents is sprinting. Here if no one is sprinting then your win probability drops to $\frac{1}{3}$ since you each have equal probability of winning. It gets a little more complicated here if multiple people are sprinting. Obviously, if only one opponent is sprinting, then your win probability is equivalent to case where both you and your single opponent from the Classic question are sprinting, that is, $\frac{1}{2} \left(p(a) - \frac{1}{2} - \frac{a}{4}\right).$ Leaving only the more complicated problem of what happens when all of the racers are sprinting. So to summarize we have \begin{align*}\tilde{p}(a) &= \mathbb{P} \{ X \geq a, \max\{Y,Z\} \lt \frac{1}{2} \} + 0 \mathbb{P} \{ X \lt a, \max \{Y,Z\} \geq \frac{1}{2} \} \\ &\quad\quad\quad +\frac{1}{3} \mathbb{P} \{ X \lt a, \max \{Y,Z\} \lt \frac{1}{2} \} + 2 \mathbb{P} \{ X \lt a, Y \geq \frac{1}{2}, X \geq Y, Z \lt \frac{1}{2} \} \\ &\quad\quad\quad\quad +\mathbb{P} \{ X \geq a, \min \{Y, Z \} \geq \frac{1}{2}, X \geq \max\{Y, Z\} \} \\ &= \frac{1-a}{4} + 0 + \frac{a}{12} + \frac{1}{8} - \frac{1}{2}\left( \max\{a-\frac{1}{2},0\}\right)^2 + \int_a^1 \left( \int_{\frac{1}{2}}^{\max\{\frac{1}{2},x\}} \,dy \right) \left( \int_{\frac{1}{2}}^{\max\{\frac{1}{2},x\}} \,dz \right) \,dx \\ &= \frac{3}{8} - \frac{a}{6} - \frac{1}{2} \left(\max \{a-\frac{1}{2},0\}\right)^2 + \int_a^1 \left(\max{x-\frac{1}{2},0} \right)^2 \,dx \\ &= \frac{5}{12} - \frac{a}{6} - \frac{1}{2} \left( \max \{ a - \frac{1}{2}, 0 \} \right)^2 + \frac{1}{24} - \frac{1}{3} \left( \max\{a - \frac{1}{2}, 0\}\right)^3 \\ &= \begin{cases} \frac{5}{12} - \frac{a}{6}, &\text{if $0 \leq a \leq \frac{1}{2}$;}\\ \frac{1}{3} + \frac{a}{12} - \frac{a^3}{3}, &\text{if $\frac{1}{2} \lt a \leq 1$.} \end{cases}\end{align*}

Therefore, the probability of winning just before choosing the value of $a$ for your strategy is \begin{align*}\Pi = \int_0^a \tilde{p}(a) \,da &= \int_0^{\frac{1}{2}} \left(\frac{5}{12} - \frac{a}{6}\right) \,da + \int_{\frac{1}{2}}^1 \left(\frac{1}{3} + \frac{a}{12} - \frac{a^3}{3}\right) \,da \\ &= \left[ \frac{5a}{12} - \frac{a^2}{12} \right]_{a=0}^{a= \frac{1}{2}} + \left[ \frac{a}{3} + \frac{a^2}{24} - \frac{a^4}{12} \right]^{a=1}_{a = \frac{1}{2}} \\ &= \left( \frac{5}{24} - \frac{1}{48} \right) - 0 + \left( \frac{1}{3} + \frac{1}{24} - \frac{1}{12} \right) - \left( \frac{1}{6} + \frac{1}{96} - \frac{1}{192} \right) \\ &= \frac{3}{16} + \frac{7}{24} - \frac{33}{192} = \frac{36 + 56 - 33}{192} = \frac{59}{192} = 0.3072916666\dots \end{align*}

Tour de Fiddler 2025

This time around, you and a competitor are approaching the finish of a grueling stage of the Tour de Fiddler. One of you will win the stage, the other will come in second. As you approach the finish, each of you will test the feeling of your legs, which will be somewhere between 0 percent (“I can barely go on!”) and 100 percent (“I can do this all day!”). For the purposes of this puzzle, these values are chosen randomly, uniformly, and independently.

Immediately after feeling your legs, you and your opponent each have a decision to make. Do you maintain your current pace, or do you sprint to the finish? Among those who sprint for the finish, whoever’s legs are feeling the best will win the stage. But if no one sprints for the finish, everyone has an equal chance of winning the stage. In the Tour de Fiddler, you must each decide independently whether to sprint for the finish based on your legs—you don’t have time to react to your opponent’s decision.

Normally, teams at the Tour de Fiddler keep their strategy and tactics close to the vest. But earlier today, your opponent’s manager declared on international television that if your opponent’s legs were feeling 50 percent or better, they’d sprint for the finish.

As you are about to test your legs for the final sprint and see how they feel, what are your chances of winning the stage, assuming an optimal strategy?

Let's explore possible strategies that mirror our opponent. Let $U_a$ represent the strategy that if when you check your legs and sprint whenever our legs are over the level $a \in [0,1].$ Let's assume that your legs are represented by $X \sim U(0,1)$ and your opponent's legs are represented by $Y \sim U(0,1)$ and that $X$ and $Y$ are independent. Note that your win probability is $100\%$ if you sprint but your opponent does not, $0\%$ if you do not sprint but your opponent does, and $50\%$ if neither of you sprint. The only rather complicated calculation is when you both sprint. If you have chosen the strategy $U_a,$ then your win probability is given by \begin{align*}p(a) &= \mathbb{P} \{ X \geq a, Y \lt \frac{1}{2} \} + 0 \mathbb{P} \{ X \lt a, Y \geq \frac{1}{2} \} \\ &\quad\quad\quad\quad +\frac{1}{2} \mathbb{P} \{ X \lt a, Y \lt \frac{1}{2} \} + \mathbb{P} \{ X \geq a, Y \geq \frac{1}{2}, X \geq Y \}\\ &= \frac{1-a}{2} + 0 + \frac{a}{4} + \int_a^1 \int_{\frac{1}{2}}^{\max\{x,\frac{1}{2}\}} \,dy \,dx \\ &= \frac{1}{2} - \frac{a}{4} + \int_a^1 \max \{x - \frac{1}{2}, 0\} \,dx \\ &= \frac{1}{2} - \frac{a}{4} + \int_{\max\{a-\frac{1}{2},0\}}^{\frac{1}{2}} t\,dt \\ &= \frac{1}{2} - \frac{a}{4} + \frac{1}{8} - \frac{1}{2} \left( \max \{ a - \frac{1}{2}, 0 \} \right)^2 \\ &= \begin{cases} \frac{5}{8} - \frac{a}{4}, &\text{if $0 \leq a \leq \frac{1}{2}$;}\\ \frac{1}{2} + \frac{a}{4} - \frac{a^2}{2}, &\text{if $\frac{1}{2} \lt a \leq 1$.}\end{cases}\end{align*} Therefore, the optimal strategy (choice of $a \in [0,1]$) will given the maximal value of $p(a)$, which since $p$ is uniformly decreasing, occurs when $a^* = 0,$ and gives an optimal win probability of $$p^* = \max_{a \in [0,1]} p(a) = p(0) = \frac{5}{8}.$$

Monday, July 21, 2025

Was this what life was like before cell phones? Extra Credit

Instead of three total friends, now suppose there are four total friends (yourself included). As before, you all arrive at random times during the hour and each stay for 15 minutes. On average, what would you expect the maximum number of friends meeting up to be?

OK, well now your other friend Wenceslas, whose arrival time is modeled by $W \sim U(0,1),$ joins in on the "plan" to randomly meet up at the mall. In this case, we can still calculate the maximum number of friends based on $W, X, Y, Z$ to be $$\tilde{N} = \tilde{N}(W,X,Y,Z) = \max_{t \in [0,3/4]} \# \left\{ s \in \{W,X,Y,Z\} \mid t \leq s \leq t + \frac{1}{4} \right\}.$$

Here I will resort to just coding up the function $\tilde{N}$ and throwing a Monte Carlo simulator at the problem. In particular, let's define the following two functions, since we might as well also code up and confirm our earlier function $N$, while we're at it.

import numpy as np
def max_num_friends3(x,y,z):
    sort = np.sort(np.array([x,y,z]))
    d1 = sort[1]-sort[0]
    d2 = sort[2]-sort[1]
    if d1+d2 <= 0.25:
        return 3
    elif min(d1,d2) <= 0.25:
        return 2
    else:
        return 1
    
def max_num_friends4(w,x,y,z):
    sort = np.sort(np.array([w,x,y,z]))
    d1 = sort[1]-sort[0]
    d2 = sort[2]-sort[1]
    d3 = sort[3]-sort[2]
    if d1+d2+d3 <= 0.25:
        return 4
    elif (d1+d2 <=0.25) or (d2+d3 <= 0.25):
        return 3
    elif min(d1,d2,d3) <= 0.25:
        return 2
    else:
        return 1

The following Monte Carlo setup was run, generating $S=100000$ simulations for $\tilde{N}$ and $4S$ simulations for $N$ (since we can take 4 selections of three from among $W, X, Y,$ and $Z$).

sims = 100000
t = []
f = []
x = np.random.uniform(size=(sims,))
y = np.random.uniform(size=(sims,))
z = np.random.uniform(size=(sims,))
u = np.random.uniform(size=(sims,))
for i in range(sims):
    mn4 = max_num_friends4(x[i], y[i], z[i], u[i])
    f.append(mn4)
    mn3 = max_num_friends3(x[i], y[i], z[i])
    t.append(mn3)
    mn3 = max_num_friends3(u[i], y[i], z[i])
    t.append(mn3)
    mn3 = max_num_friends3(x[i], u[i], z[i])
    t.append(mn3)
    mn3 = max_num_friends3(x[i], y[i], u[i])
    t.append(mn3)

The sample average of $f$ was $m_f = 2.4752$ with sample standard deviation of $s_f = 0.59770,$ meaning that the SEM is $0.00189.$ Therefore, when there are a total of four friends randomly showing up at the mall, the 95% confidence interval on the expected maximum number of friends meeting up at one time is $2.4752 \pm 1.96 \cdot 0.00189 = [2.4715, 2.4789].$

For comparison, when there are three friends randomly showing up at the mall, the 95% confidence interval on the expected maximum number of friends meeting up at one time is $[2.029059, 2.032336],$ which contains the theoretical answer of $\mathbb{E}[N] = 65/32 = 2.03125.$

Sunday, July 20, 2025

Was this what life was like before cell phones? I don't remember anymore

You and two friends have arranged to meet at a popular downtown mall between 3 p.m. and 4 p.m. one afternoon. However, you neglected to specify a time within that one-hour window. Therefore, the three of you will be arriving at randomly selected times between 3 p.m. and 4 p.m. Once each of you arrives at the mall, you will be there for exactly 15 minutes. When the 15 minutes are up, you leave.

At some point (or points) during the hour, there will be a maximum number of friends at the mall. This maximum could be one (sad!), two, or three. On average, what would you expect this maximum number of friends to be?

Let's abstract a bit and assume that your buddy Xena, you, and second buddy Zevulon each arrive at random times $X$, $Y$, $Z$ which are i.i.d. $U(0,1).$ The maximum number of friends at any one time between 3 and 4 p.m. is $$N = N(X,Y,Z) = \begin{cases} 3, &\text{if $\max \{ |X-Y|, |X-Z|, |Y-Z| \} \leq \frac{1}{4}$;}\\ 1, &\text{if $\min \{ |X-Y|, |X-Z|, |Y-Z| \} \gt \frac{1}{4}$;}\\ 2, &\text{otherwise.}\end{cases}$$ One method for solving this problem would be to just throw some Monte Carlo simulations at this distribution, to get $\mathbb{E}[N]$; however, let's see if we can't find an analytical answer first.

Let's define $E(x,y) = \mathbb{E} [ N(X,Y,Z) \mid X=x, Y=y ].$ If we understand $E(x,y),$ then we see that $$\mathbb{E}[N] = \int_0^1 \int_0^1 E(x,y) \, dx \, dy.$$ From the symmetry of $N$ we have $$\mathbb{E}[N] = 2 \int_0^1 \int_0^x E(x,y) \, dy\, dx.$$ Now let's study the following cases:

  • $A = \{ (x,y) \in [0,1]^2 \mid 0 \leq x \leq 1, \max\{0,x-\frac{1}{4}\} \leq y \leq x \};$
  • $B = \{ (x,y) \in [0,1]^2 \mid \frac{1}{4} \leq x \leq 1, \max\{0, x - \frac{1}{2} \} \leq y \leq x - \frac{1}{4} \};$
  • $C = \{ (x,y) \in [0,1]^2 \mid \frac{1}{2} \leq x \leq 1, 0 \leq y \leq x - \frac{1}{2}\}.$

In case A, we have that $|x-y| \leq \frac{1}{4}$ already, so at the very least we have $N \geq 2$ regardless of the value of $Z \sim U(0,1)$. Furthermore, whenever $\max{0,x-\frac{1}{4}} \leq Z \leq \min \{ 1, y + \frac{1}{4} \}$ we have $N=3,$ so we must have \begin{align*}E_A(x,y) &= \mathbb{P} \{ N \geq 1 \} + \mathbb{P} \{ N \geq 2\} + \mathbb{P} \{ N \geq 3 \} \\ &= 2 + \min \{ 1, y + \frac{1}{4} \} - \max \{ 0, x - \frac{1}{4} \}.\end{align*} So let \begin{align*}I_A &= \iint_A E_A(x,y) \,dx \,dy \\ &= \int_0^1 \int_{\max \{ 0, x - \frac{1}{4} \}}^x \left( 2 + \min \{ 1, y + \frac{1}{4} \} - \max \{0, x - \frac{1}{4} \} \right) \,dy \,dx\\ &= \frac{33}{64}\end{align*}

In case B, we cannot have $N = 3,$ but depending on the value of $Z \sim U(0,1)$ we can have either $N=1$ or $N=2.$ Furthermore, we see that for $(x,y) \in B$ we have $x -\frac{1}{4} \lt y + \frac{1}{4}$ so we have $N = 2$ whenever $\max\{0, y - \frac{1}{4}\} \leq Z \leq \min \{ 1, x + \frac{1}{4} \},$ so we must have \begin{align*}E_B(x,y) &= \mathbb{P} \{ N \geq 1 \} + \mathbb{P} \{ N \geq 2\} + \mathbb{P} \{ N \geq 3 \} \\ &= 1 + \min \{ 1, x + \frac{1}{4} \} - \max \{ 0, y - \frac{1}{4} \}.\end{align*} So let \begin{align*}I_B &= \iint_B E_B(x,y) \,dx \,dy \\ &= \int_{\frac{1}{4}}^1 \int_{\max \{ 0, x - \frac{1}{2} \}}^{x-\frac{1}{4}} \left( 1 + \min \{ 1, x + \frac{1}{4} \} - \max \{0, y - \frac{1}{4} \} \right) \,dy \,dx\\ &= \frac{53}{192}\end{align*}

In case C, we cannot have $N = 3,$ but depending on the value of $Z \sim U(0,1)$ we can have either $N=1$ or $N=2.$ Furthermore, we see that for $(x,y) \in B$ we have $x -\frac{1}{4} \gt y + \frac{1}{4}$ so we have $N = 2$ whenever $$\max\{0, y - \frac{1}{4}\} \leq Z \leq y + \frac{1}{4},$$ which has probability $\min \{ \frac{1}{2}, y + \frac{1}{4} \}$, or $$x - \frac{1}{4} \leq Z \leq \min \{1, x+ \frac{1}{4}\},$$ which has probability $\min \{ \frac{1}{2}, \frac{5}{4} - x \},$ so we must have \begin{align*}E_C(x,y) &= \mathbb{P} \{ N \geq 1 \} + \mathbb{P} \{ N \geq 2\} + \mathbb{P} \{ N \geq 3 \} \\ &= 1 + \min \{ \frac{1}{2}, y + \frac{1}{4} \} + \min \{ \frac{1}{2}, \frac{5}{4}-x \}.\end{align*} So let \begin{align*}I_C &= \iint_C E_C(x,y) \,dx \,dy \\ &= \int_{\frac{1}{2}}^1 \int_{0}^{x-\frac{1}{2}} \left( 1 + \min \{ \frac{1}{2}, y + \frac{1}{4} \} + \min \{\frac{1}{2}, \frac{5}{4} -x \} \right) \,dy \,dx\\ &= \frac{43}{192}\end{align*}

So we have that the expected nmaximum number of friends is \begin{align*}\mathbb{E}[N] &= 2 \left( I_A + I_B + I_C \right)\\ &= 2 \left( \frac{33}{64} + \frac{53}{192} + \frac{43}{192} \right)\\ &= \frac{65}{32} = 2.03125\end{align*}

Sunday, July 13, 2025

Efficient bowling

If you knock down 100 pins over the course of a game, you are guaranteed to have a score that’s at least 100. But what’s the minimum total number of pins you need to knock down such that you can attain a score of at least 100?

Indeed, you can in fact obtain a score of just 100 by knocking down exactly 100 pins, for instance, by getting a total of 9 pins in each of the first 9 frames, for a score of 81, then doing ever so slightly better and bowling a 9, 1, and 9 in the 10th frame. Note that in the last frame you don't really end up getting any "bonus" points than the number of pins that you knock down (you knocked down a total of 19 pins and your score went up by 19 points).

On the other hand, consecutive strikes gives you the most amount of bonus points, so let's look at how much points you end up with by hitting $n \leq 10$ strikes in a row and then getting all gutterballs ($0$'s) for the rest of the game. With the trailing gutterballs the last strike is score as just 10 points. The second-to-last strike (if applicable) gets only 10 points of bonus from the last strike and then a gutterball, so it's score is 20 points. Any strike before the second to last one receives a full 30 points since it is followed by two trailing strikes. So the score of $n$ consecutive strikes followed by gutterballs is \begin{align*}X_n &= 10 + 20 \min \{ 1, \max \{ n- 1, 0 \} \} + 30 \max \{ n-2, 0 \} \\ &= \begin{cases} 10, &\text{if $n=1$}\\ 30(n-1), &\text{if $n = 2, 3, \dots, 10.$}\end{cases}\end{align*} At this point we can stop for the purpose of the Classic Fiddler, but just for completeness, if we have 11 strikes in a row followed by a gutterball, we have $X_{11} = 290,$ and a perfect 12 strikes yields $X_{12} = 300.$ So we see that we can certainly get more than 100 by getting 5 consecutive strikes, but probably need 4 consecutive strikes plus some other trailing pins to most efficiently get to 100.

Let's assume that our first 4 frames were strikes, and let's assume that we are being efficient and want to take frames 6 through 10 off by devising ever more childish ways of gutterballing or foot faulting or other bowling alley hijinx. In this case the outcome of our total score depends on the values of our two balls in frame 5, say $b_1$ and $b_2.$ In particular, the final score for frame 3 will be $20+b_1,$ while the final score for frame 4 will be $10+b_1+b_2,$ and the score for frame 5 will be $b_1+b_2,$ so the overal final score of your entire round is $$S(b_1,b_2) = 60 + (20+b_1) + (10+b_1+b_2) + (b_1+b_2) = 90 + 3b_1 + 2b_2,$$ since the first two frames will have already been scored at their maximal values of 30, each. Here we want to then solve the integer linear program \begin{align*}\min \,\, & b_1+b_2 \\ \text{s.t.} \,\, & 3b_1 + 2b_2 = 10 \\ & b_1, b_2 \in \{0,1,2,3,4,5,6,7,8,9\}.\end{align*} There are really only two possible feasible integer solutions to choose from $b_1 = b_2 = 2$ and $b_1 = 0$, $b_2 = 5,$ so it is easy to determine that in fact $\hat{b}_1=\hat{b}_2=2$ is the optimal solution so that the minimum number of pins you need to knock down to attain a score of 100 is 44 pins.

Dominant bowling

You and your opponent each play a single game in which you both knock down the same number of pins. However, your scores are quite different.

Your opponent remarks, “Given only the information that we knocked down the same number of pins in our two games, there’s no way the difference between our scores could have been any greater!” What is this difference between your two scores?

As we saw in the Classic Fiddler, there is a way to knock down exactly 100 pins and get a final score of exactly 100 points. As part of that we saw that any pins knocked down in the 10th frame incur no bonus points, and we can attain any number from 0 up to 30 total pins knocked down in the 10th frame.

We will extend the result of getting a score equal to the total number of pins knocked down to all values 0 to 119. For any number less than 90, it should be relatively trivial to find some combination with no strikes or spares, so that there would be no way of getting any bonus points. For any score $S \in \{91, 92, \dots, 111\},$ you will need your extra ball in the 10th frame, so you can for instance get 81 points in the first 9 frames and then score $S-81$ in the 10th frame.

For any score $S \in \{112, 113, \dots, 119\},$ we have to get a bit trickier and layer on consecutive frames of gutterball then spare ($-/$). The fact that the spare is followed by a gutterball means that the spared frame still receives no bonus points. So for instance you can get a score of $S=119$ by $8$ consecutive gutterball then spare frames, followed by gutterball then 9 in the 9th frame, then three strikes in the 10th frame.

The minimum score for hitting all 120 pins down is 130, since if you gutterball then spare the 9th frame, then the first of the three strikes in the 10th frame will be added to the score in the 9th frame. So at a certain point, you must engage accrue some bonus points, but thankfully that is not until you get to 120 pins. So we have established the minimum score per number of pins knocked down as $$m(P) = \begin{cases} P, &\text{if $P \leq 119$;} \\ P+10, &\text{if $P = 120.$}\end{cases}$$

Now it is time to establish the highest score possible to attain given a certain number of pins knocked down. Here again we will look to the Classic Fiddler where we observed that the total score of $n$ consecutive strikes followed by all gutterballs is $$X_n = \begin{cases} 10, &\text{if $n=1$;}\\ 30(n-1), &\text{if $n=2,\dots, 10$;}\\ 180+10n, &\text{if $n=11, 12.$}\end{cases}$$ So just by observation we see that if we only focus on multiples of 10, we get the maximum score per number of pins knocked down as $$M(P) = X_{P/10}$$ whenever $10 \mid P.$ For completeness, we can fill in the gaps and note similar to the solution for the Classic Fiddler that if $20 \leq P \leq 100$ and the next two balls are $b_1$ and $b_2$ that we are trying to find \begin{align*}M(P) = \max \,\, & X_{\lfloor P / 10 \rfloor} + 3b_1 + 2b_2 \\ \text{s.t.}\,\, & b_1 + b_2 = P - 10 \lfloor P / 10 \rfloor \\ & b_1, b_2 \in \{0, 1, 2, 3, 4, 5, 6, 7, 8, 9\}\end{align*}, which has optimal solution $\hat{b}_1 = P - 10 \lfloor P/10 \rfloor$ and $\hat{b}_2 = 0,$ that is, $$M(P) = X_{\lfloor P / 10 \rfloor} + 3\left(P - 10 \lfloor P / 10 \rfloor\right) = 3P-30, \,\, 20 \leq P \leq 100.$$

Now let's go back and clean up the corner cases when $P \lt 20$ and $P \gt 100.$ For $P \leq 10,$ we have $M(P) = P.$ If we have $10 \lt P \lt 20,$ then the optimization problem becomes \begin{align*}M(P) = \max \,\, & 10 + 2b_1 + 2b_2 \\ \text{s.t.}\,\, & b_1 + b_2 = P - 10 \\ & b_1, b_2 \in \{0,1,2,3,4,5,6,7,8,9\},\end{align*} which for all feasible solutions gives $$M(P) = 10 + 2(P-10) = 2P-10, \,\, 10 \lt P \lt 20.$$ Whenever $100 \lt P \lt 110,$ then the optimization problem becomes \begin{align*}M(P) = \max \,\, & 270 + 2b_1 + b_2 \\ \text{s.t.}\,\, & b_1 + b_2 = P - 100 \\ & b_1, b_2 \in \{0,1,2,3,4,5,6,7,8,9, 10\},\end{align*} which for all feasible solutions give $$M(P) = 270 + 2(P-100) = 70 + 2P, \,\, 100 \lt P \lt 110.$$ And finally, it is clear that $$M(P) = 290+(P-110) = 180+P, for 110 \lt P \leq 120.$$ So putting this altogether we get the maximum score per number of pins knocked down is $$M(P) = \begin{cases} P, &\text{if $0\leq P \lt 10;$}\\ 2P-10, &\text{if $10 \leq P \lt 20$;}\\ 3P-30, &\text{if $20 \leq P \lt 100$;}\\ 2P + 70, &\text{if $100 \leq P \lt 110$;} \\ P + 180, &\text{if $110 \leq P \leq 120$.}\end{cases}$$

So having exhaustive determined the minimum and maximum scores for a given number of pins knocked down the only thing left to do is to determine that largest possible margin of victory given that the two bowlers knocked down the same number of pins, that is $$V(P) = M(P) - m(P) = \begin{cases} 0, &\text{if $0 \leq P \lt 10;$} \\ P-10, &\text{if $10 \leq P \lt 20$;} \\ 2P-30, &\text{if $20 \leq P \lt 100;$} \\ P+70, &\text{if $100 \leq P \lt 110;$} \\ 180, &\text{if $110 \leq P \leq 119$;} \\ 170, &\text{if $P = 120.$}\end{cases}$$ Scrutinizing the margin of victory function, we see that the largest possible margin of victory is 180 points, which can be attained $P \in \{110, 111, \dots, 119\}.$

Monday, July 7, 2025

Or maybe the 5,918th one?

In celebration of America’s birthday, let’s count more shapes—not in a board game, but in the American flag:

In particular, consider the centers of the 50 stars depicted on the flag. How many distinct parallelograms can you find whose vertices are all centers of stars? (If two parallelograms are congruent but have different vertices, they should still be counted as distinct.)

Here we will code everything up directly and let the same nice computer that helped us find those extra equilateral triangles hiding in plain sight find all of the parallelograms in the vertices. One note, just as we ignored the degenerate, length $0$ equilateral triangles in the classic Fiddler, let's ignore degenerate parallelograms that have zero area. Let's establish the vertices as $$\mathcal{V} = \{ (x,y) \in \mathbb{N}^2 \mid x+y \mod 2 \equiv 0, x \leq 10, y \leq 8 \},$$ which will give us the distinctive 6-5-6-5-6-5-6-5-6 pattern on the American flag.

We could loop through all possible 4-ples of distinct vertices in $\mathcal{V}$ and test for both being a parallelogram and having nonzero area. However, this would require testing over 5 million possible 4-ples ($=50 \cdot 49 \cdot 48 \cdot 47$). Let's instead take a somewhat more reasonable approach and take triples of points (meaning only about 100,000 tests) and first determine the area of the resulting parallelogram using the determinant formulation. That is the area of a parallelogram with three vertices $X = (x_1,x_2)$, $Y=(y_1, y_2)$, and $Z=(z_1,z_2)$ is given by $$A = \left| \det \begin{pmatrix} x_1 & x_2 & 1 \\ y_1 & y_2 & 1 \\ z_1 & z_2 & 1 \end{pmatrix} \right|.$$

If the resulting area is nonzero, then we can test whether the fourth vertex of a parallelogram with vertices $X$, $Y$, and $Z$ is in $\mathcal{V}.$ To do this, we see that if $X$, $Y$ and $Z$ are three vertices of a parallelogram, then the fourth vertex is given by either $U_1 = X + Y - Z$, $U_2 = X + Z - Y$ or $U_3 = Y + Z - X.$ If $U_i \in \mathcal{V},$ for some $i = 1, 2, 3,$ then we can add the parallelogram with vertices $X$, $Y$, $Z$ and $U_i$ to the list of parallelograms.

Looping through all possible distinct triples in $\mathcal{V},$ we find that there are 5,918 non-degenerate parallelograms that can be made using the centers of the stars on the American flag.

Sunday, July 6, 2025

Are you sure it's not the 126th Fiddler?

Dozo is a strategy game with a rather distinctive board:

The board features 28 holes in which players place markers, with the goal of making an equilateral triangle of any size with one color. How many distinct equilateral triangles can you find whose vertices are the centers of holes on the board? (If two triangles are congruent but have different vertices, they should still be counted as distinct.)

Let me first say that this is one of those Fiddlers that I wish I had not programmed up to try to verify my scratch work. I had a bunch of lovely handwritten work that mistakenly classified the only possible equilateral triangles on the Dozo board as those that had up, down, left or right orientation, with respect to an $xy$-plane that chose one side of the triangular board as the $x$-axis. Examples of up-, down-, left-, and right-oriented equilateral triangles are shown below in red, blue, orange and green, respectively. If I had stuck with the combinatorial work I had done with these up, down, left and right oriented equilateral triangles, then the ultimate answer would have be .... 100, oh the delightful, thoughtful deliberate work to get an answer that aligns with the official tally of how many of these we have ever done.

But alas, to confirm, I outsourced the work to some nice computer by defining the coordinates of the holes on the Dozo board and then running through all triplets of holes to find cases where the distances between each vertex are identical. Let's first define the coordinates. Let the top of the Dozo board, as shown, be the $x$-axis, with the seven holes located at $(-3,0)$, $(-2, 0)$, $\dots$, $(2,0)$, $(3,0).$ This means that the distance between each vertex with have unit length, and means that the second row should then be $(-2.5, -\frac{\sqrt{3}}{2})$, $(-1.5, -\frac{\sqrt{3}}{2})$, $\dots$, $(1.5, -\frac{\sqrt{3}}{2})$, $(2.5, -\frac{\sqrt{3}}{2}),$ and so on and so forth until we get to the final lowest vertex at $(0, -3\sqrt{3}).$

As described above, once all $28$ of the vertices were coded up, the search for all possible equilateral triangles loops through all possible unique triplets $(i, j, k) \in \{1,2,\dots,28\}^3$ with $i \lt j \lt k$ and tests whether $$\|v_i - v_j\|_2 = \|v_j-v_k\|_2 = \|v_i - v_k\|_2.$$ In this case, it yields all 100 of the up, down, left and right oriented equilateral triangles along with 26 additional ones for a total of $126$ equilateral triangles on the Dozo board. Some examples of the equilateral triangles that do not have one of the four orientations given above are shown in the chart below, with the fuschia triangle having side lengths of $\sqrt{7}$, the yellow triangle having side lenghts of $\sqrt{13}$, and the purple triangle having side lengths of $\sqrt{21}.$ There are a total 18 possible equialteral triangles with side lengths of $\sqrt{7}$, 6 with side lengths of $\sqrt{13}$ and 2 with side lenghts $\sqrt{21}$.

Monday, June 30, 2025

Would a comma have hurt anyone? Extra Credit

Having successfully hacked your way through the first keypad, the door opens to reveal a second door with yet another keypad that has six numerical inputs: “I,” “II,” “III,” “IV,” “V,” “VI,” “VII,” and “VIII.”

You were expecting this, which is why your accomplice had handed you a second scroll of paper. You unfurl this one as well, hoping they remembered to add spaces between the numbers.

No such luck. This paper reads: “IIIVIIIVIIIVIII.” That’s 15 characters in total. How many distinct combinations are possible for this second door?

Since there are three V characters, let's iterate on the various possibilities for these characters. Let's write the code word as $$C = R(w_1)R(x_1)R(w_2)R(x_2)R(w_3)R(x_3)R(w_4),$$ where $R(t)$ be the Roman numeral of the number $t,$ where we see that $w_i \in \{1, 2, 3\}$ and $x_i \in \{ 4, 5, 6, 7, 8 \}.$ In order for the our word $C$ to match the code provided, we would need to make sure that if $x_i = 4$ then $x_{i-1} \leq 7,$ for $i = 2, 3.$ Let $$\mathcal{X} = \{ (x_1, x_2, x_3) \in \{4, 5, 6, 7, 8 \}^3 \mid x_i = 4 \Rightarrow x_{i-1} \leq 7, i = 2, 3 \}$$ be the solution set for the values of the V characters.

Let's define the indicator function $$\delta(t) = \begin{cases} 1, &\text{if $t=1$}\\ 0, &\text{if $t=0$.}\end{cases}$$ Additionally let $N(t)$ be the number of possible combinations of numerals that in "I", "II", and "III" that can be found in $R(t).$ In particular, $N(0)=N(1)=1,$ $N(2)=2$ and $N(3)=4.$ For any $(x_1, x_2, x_3) \in \mathcal{X},$ we would also need to have the following conditions be met in order for our word $C$ to match the code provided: \begin{align*}w_1 = w_1(x_1) &= 3 - \delta(x_1-4) \\ w_2 = w_2(x_1,x_2) &= \min \{ 3, 8 - x_1 \} - \delta(x_2 - 4) \\ w_3 = w_3(x_2,x_3) &= \min \{ 3, 8 - x_2 \} - \delta (x_3-4) \\ w_4 = w_4(x_3) &= \min \{3, 8 - x_3 \}\end{align*} Then, the total number of distinct code combinations where the V characters are assigned to the numbers $(x_1,x_2,x_3)$ is given by $N(w_1(x_1))N(w_2(x_1,x_2))N(w_3(x_2,x_3))N(w_4(x_3)).$

Therefore, the total possible number of distinct combinations for the second door is $$\hat{N} = \sum_{(x_1,x_2,x_3) \in \mathcal{X}} N(w_1(x_1))N(w_2(x_1,x_2))N(w_3(x_2,x_3))N(w_4(x_3)) = 4000.$$

Would a comma have hurt anyone?

You are breaking into a vault that contains ancient Roman treasure. The vault is locked, and can be opened via a modern-day keypad. The keypad contains three numerical inputs, which are (of course) expressed using Roman numerals: “I,” “II,” and “III.”

It’s a good thing your accomplice was able to steal the numerical key code to the vault. Earlier in the day, they handed you this code on a scroll of paper. Once at the keypad, you remove the scroll from your pocket and unfurl it. It reads: “IIIIIIIIII.” That’s ten vertical marks, without any clear spacing between them.

With some quick mental arithmetic, you realize the combination to unlock the door could be anywhere from four digits long to 10 digits long. (Or is it IV digits to X digits?) How many distinct combinations are possible? If two combinations use the same numbers but in a different order, they are considered distinct.

Let $n_1$ be the number of "I" numerals, $n_2$ the number of "II" numerals and $n_3$ the number of "III" numerals. Since there are total of ten vertical marks, we must have $n_1 + 2n_2 + 3n_3 = 10.$ If we have a nonnegative solution to this Diophantine equation, then since any order of the different numbers is a distinct combination, then for any solution there are a total of $$\frac{(n_1+n_2+n_3)!}{n_1! n_2! n_3!}$$ possible combinations associated with the solution $(n_1, n_2, n_3).$

We can solve this Diophantine equation by iterating through the possible cases for $n_3 \in \{0,1,2,3\}.$ When $n_3 = 0,$ the remaining equation is $$n_1 + 2n_2 = 10,$$ which has nonnegative solutions $$n_1 = 10 - 2k, n_2 = k, k = 0, 1, 2, 3, 4, 5.$$ When $n_3 = 1,$ the remaining equation is $$n_1 + 2n_2 = 7,$$ which has nonnegative solutions $$n_1 = 7-2k, n_2 = k, k = 0, 1, 2, 3.$$ When $n_3 = 2,$ the remaining equation is $$n_1 + 2n_2 = 4,$$ which has nonnegative solutions $$n_1 = 4-2k, n_2 = k, k = 0, 1, 2.$$ Finally, when $n_3 = 3,$ the remaining equation is $n_1 + 2n_2 = 1,$ which has only a single nonnegative solution $n_1 = 1, n_2 = 0.$ So overall, the solution set is \begin{align*}\mathcal{N} &= \{ (0,2,2), (0,5,0), (1,0,3), (1,3,1), (2,1,2), (2,4,0), (3,2,1), \\ &\quad\quad (4,0,2), (4,3,0), (5,1,1), (6,2,0), (7,0,1), (8,1,0), (10,0,0) \}\end{align*} So, the total number of possible combinations is \begin{align*}N &= \sum_{(n_1,n_2,n_3) \in \mathcal{N}} \frac{(n_1+n_2+n_3)!}{n_1! n_2! n_3!} \\ &= \frac{4!}{0!2!2!} + \frac{5!}{0!5!0!} + \frac{4!}{1!0!3!} + \frac{5!}{1!3!1!} + \frac{5!}{2!1!2!} + \frac{6!}{3!2!1!} \\ & \quad\quad +\frac{6!}{4!0!2!} + \frac{7!}{4!3!0!} + \frac{7!}{5!1!1!} + \frac{8!}{6!2!0!} + \frac{8!}{7!0!1!} + \frac{9!}{8!1!0!} + \frac{10!}{10!0!0!} \\ &= 6+1+4+20+30+15+60+15+35+42+28+8+9+1 \\ &=274\end{align*}

Monday, June 23, 2025

Greedy Mowing Madness

You’re mowing a circular lawn with a radius of $1$ unit. You can mow in straight strips that are $1$ unit wide. The fewest number of passes you would need to mow the entire lawn is two, as shown below. In one pass (shown in blue) you can mow half the circle, and in the second pass (shown in red) you can mow the other half of the circle.

However, instead of minimizing the number of passes, you greedily choose how to orient each pass so it cuts as much of the unmowed grass as possible. A pass doesn’t have to go through the center of the circle and can be in any direction, but must be straight and cannot bend.With this “greedy” approach, how many passes will it take for you to mow the entire lawn?

First let's establish the first greedy mowing pass. Without loss of generality, we can set our circular lawn's center at the origin, and by rotating, we can assume that the first pass will be parallel to the $y$-axis. In this case, the center of the pass will be at the line $x = x_0$ for some $x_0 \in (-1,1)$ and the strip will cover the set $$\Omega_1(x_0) = \{ (x,y) \in B(0,1) \mid |x - x_0| \lt \frac{1}{2} \}.$$ The area of the circular lawn that this pass will mow is $$A_1(x_0) = \int_{\max \{ -1, x_0 - \frac{1}{2} \}}^{\min \{ 1, x_0 + \frac{1}{2} \}} 2 \sqrt{1-x^2} \,dx.$$ Differentiating under the integral sign, we get \begin{align*}A_1^\prime(x_0) &= 2\sqrt{ 1 - \min \{1, x+ \frac{1}{2} \}^2 } \frac{d}{dx} \min \{ 1, x + \frac{1}{2} \} - 2\sqrt{1 - \max \{ -1, x - \frac{1}{2} \}^2 } \frac{d}{dx} \max \{ -1, x - \frac{1}{2} \} \\ &= \begin{cases} 2\sqrt{1 - (x+\frac{1}{2})^2}, &\text{if $x \in (-1, -\frac{1}{2})$;} \\ 2\left( \sqrt{1 - (x+\frac{1}{2})^2} - \sqrt{1 - (x-\frac{1}{2})^2} \right), &\text{if $ x \in (-\frac{1}{2}, \frac{1}{2})$;}\\ -2 \sqrt{1 - (x-\frac{1}{2})^2}, &\text{if $x \in (\frac{1}{2}, 1).$}\end{cases}\end{align*} Solving we for $A^\prime_1(x_0) = 0$ we get the only critical point as $x^*_0 = 0,$ which yields $A_1(0) = \int_{-1/2}^{1/2} 2\sqrt{1-x^2}\,dx = \frac{\pi}{3} + \frac{\sqrt{3}}{2}.$ So let's assume that the first pass is $\Omega_1 = \{ (x, y) \in B(0,1) \mid |x| \lt \frac{1}{2} \}.$

Let's move on to the next greedy pass. Now obviously from the first pass we see that the optimal method is to have some path of the mower that goes through the origin, so without loss of generality, let's assume that the center of the second pass goes through the line $y = mx,$ for some $m \in (-\infty, \infty).$ The second pass is thus the area $$\Omega_2(m) = \{ (x,y) \in B(0,1) \mid | mx - y | \lt \frac{\sqrt{1 + m^2}}{2} \}.$$ The newly mowed area in pass two is the entirety of $A_1(0)$ minus the area of the overlap between the two passes $\Omega_1(0) \cap \Omega_2(m).$ For $m \in (-\frac{\sqrt{3}}{3}, \frac{\sqrt{3}}{3}),$ we see that the overlap is the parallelogram formed by the lines $x = \pm \frac{1}{2},$ $y = mx + \frac{\sqrt{1+m^2}}{2},$ and $y=mx-\frac{\sqrt{1+m^2}}{2},$ that is $O(m) = \sqrt{1+m^2}.$ Therefore, the new area mowed by pass two is $$A_2 (m) = A_1(0) - O(m) = \frac{\pi}{3} + \frac{\sqrt{3}}{2} - \sqrt{1+m^2}.$$ This is maximized when $m^* = 0,$ and when $$A_2^* = A_2(0) = \frac{\pi}{3} + \frac{\sqrt{3}}{2} - 1.$$ We can see that if $|m| \gt \frac{\sqrt{3}}{3},$ the overlapping area of $\Omega_1(0) \cap \Omega_2(m)$ is larger than the area given by the parallelogram when $m = \pm \frac{\sqrt{3}}{3},$ so $A_2(m) \lt A_2( \sqrt{3}/3)$ for all $|m| \gt \frac{\sqrt{3}}{3},$ so we can claim that not only is $m^*=0$ the optimizer within the neighborhood $(-\frac{\sqrt{3}}{3}, \frac{\sqrt{3}}{3})$ but in fact for all $m \in \mathbb{R}.$

So after two passes we are left with the unmowed area $$U = \{ (x,y) \in B(0,1) \mid \max \{ |x|, |y| \} \gt \frac{1}{2} \}.$$ This has four sub-regions, one in each quadrant of the plane. It is easy enough to construct a mowing path that covers at least two of the four sub-regions, but you cannot cover all four sub-regions with an acceptable mowing path, for instance, take the path $$\Omega_3 = \{ (x,y) \in B(0,1) \mid |x + y| \lt \frac{\sqrt{2}}{2} \}.$$ Though there is in fact definitely a better approach that can cover strictly greater than two complete sub-regions, once at least two of the four sub-regions are covered, all remaining unmowed area is coverable within one more pass. Therefore, the greedy approach will take four passes to cover the lawn.

Monday, June 16, 2025

Zeno's Paradoxical 5K - Extra Credit

I still want to run a negative split, but upping my tempo in discrete steps is such a slog. Instead, my next plan is to continuously increase my pace in the following way:

At the beginning of the race, I’ll start with a 24-minute pace. Then, wherever I am on the race course, I’m always running 10 percent faster than I was when I was twice as far from the finish line. Also, my speed should always be increasing in a continuous and smooth fashion. Using this strategy, how long will it take me to complete the 5K? Once again, I’m hoping it’s faster than my steady 23-minute pace, even though I started out slower.

Let's again assume that the race start at $x=0$ and finish at $x=5,$ but we will need to understand some sort of history, since otherwise when I am one step into the race, I would need to know how fast I was going about 2 steps closer than $5$ kilometers before the beginning of the race. So here let's define $x_n = 5(1 - 2^{-n})$ for $n = \mathbf{-1}, 0, 1, 2, \dots$ be breakpoints where we will be interested in defining our speed. Let $v(x)$ denote the velocity you are running (in kmph), when you are at some $x \in (-5,5).$ Again, here we are given that $v(0) = 12.5$ and that at any point $x \in (0,5)$ we have $$v(x) = 1.1 v(2x-5).$$ Since $x_0 = 0,$ and $2x_n-5 = x_{n-1}$ for each $n = 0, 1, \dots,$ we can use the recursive formula to find the values of $$v_n = v(x_n) = 12.5 \cdot 1.1^n, \,\, n = -1, 0, 1, 2, \dots$$

Now let's tackle what smooth means here. The link provided in the prompt to Wolfram suggests that we are talking about some sort of $C^n(-5,5)$ space of functions on $(-5,5)$ that are continuous and have continuous derivatives up to order $n$ for some $n.$ We could of course overachieve and go with $C^\infty(-5,5),$ which we will start with here, but then see if we can relax this for some more Zeno's 5K fun. So without further adieu, let's start with the $C^\infty(-5,5)$ solution:

First, by design and analogy, we are likely to have a velocity function $v$ that blows up to $\infty$ as $x \to 5^-.$ Let's try something from the parametric family $$v_\alpha(x) = \frac{12.5}{(1-\frac{x}{5})^\alpha}$$ for $\alpha \gt 0,$ which satisfy $v_\alpha \in C^\infty(-5,5),$ $v_\alpha(0) = 12.5$ and $v_\alpha^\prime (x) \geq 0$ for all $x \in (-5,5).$ The only thing left to do is to decide which value of $\alpha$ satisfied the recursion formula. Note that $$v_\alpha(2x-5) = \frac{12.5}{\left(1 - \frac{2x-5}{5}\right)^\alpha} = \frac{v_\alpha(x)}{2^\alpha},$$ for all $x \in (0, 5)$ so in order to satisfy the recursion formula $v(x) = 1.1 v(2x-5)$ we must have $$\alpha = \frac{\ln 1.1}{\ln 2}.$$ So we have a $C^\infty (-5,5)$ solution for velocity at position $x \in (0,5),$ $$v^*(x) = 12.5 \left( 1 - \frac{x}{5} \right)^{-\frac{\ln 1.1}{\ln 2}},$$ which gives a completion time for the $C^\infty$ extra credit Zeno's 5K of \begin{align*}T = \int_0^5 \, \frac{dx}{v(x)}& = \int_0^5 \frac{ \left(1 - \frac{x}{5} \right)^{\frac{\ln 1.1}{\ln 2}} }{12.5} \,dx\\ &= \frac{5}{12.5} \int_0^1 u^{\frac{\ln 1.1}{\ln 2}} \,du \\ &= \frac{5}{12.5} \frac{1}{1 + \frac{\ln 1.1}{\ln 2}} = 0.351647262315\dots \,\,\,\text{hours},\end{align*} or $21.0988357389\dots$ minutes.

Obviously, $C^\infty$ qualifies as smooth, but what if we are willing to go with only $C^1(-5,5)$ solutions. Obviously, since $C^\infty \subsetneq C^1,$ we still have the solution $v^*$ from above, but what if we use our points $\mathcal{X} = \{x_{-1}, x_0, x_1, \dots \}$ as knots to define a $C^1$ quadratic spline. Can we still get a solution that is C^1 and satisfies the recursion formula? It turns out, $\dots$ yes!

Lets define the velocity function $u = u(x)$ by spline components $$u_n(x) = a_n + b_nx + c_nx^2$$ for $x_{n-1} \leq x \leq x_n,$ for $n = 0, 1, 2, \dots.$ In order to satisfy the recursion formula, since $2x-5 \in (x_{n-1},x_n)$ for all $x \in (x_n, x_{n+1}),$ we would need \begin{align*}u_{n+1}(x) &= a_{n+1} + b_{n+1} x + c_{n+1} x^2 \\ &= 1.1 u_{n}(2x-5)\\ &= 1.1 \left( a_n + b_n (2x-5) + c_n (4x^2 - 20x + 25) \right)\\ &= 1.1 (a_n - 5 b_n + 25 c_n) + 1.1 (2b_n - 20c_n) x + 4.4 c_n x^2,\end{align*} so we must have the recursion formulae \begin{align*} a_{n+1} &= 1.1 \left( a_n - 5b_n + 25 c_n \right) \\ b_{n+1} &= 2.2 \left( b_n - 10 c_n \right) \\ c_{n+1} &= 4.4 c_n \end{align*} for all $n = 0, 1, \dots.$

Using this recursion formula we can define each quadratic spline component in terms of the parameters $(a_0, b_0, c_0).$ We need to confirm that we can choose appropriate values of $(a_0, b_0, c_0)$ such that the resulting spline $u = u(x) \in C^1(-5,5),$ that $u^\prime(x) \geq 0$ for all $x \in (-5,5)$ and the $u(0) = 12.5.$ Since $x_0 = 0$ is on both the $u_0$ and $u_1$ components of the spline and we want both $u$ and $u^\prime$ continuous at $x_0$ we must have $$u_0(0) = a_0 = u_1(0) = a_1 = 1.1 (a_0 - 5b_0 + 25 c_0) = 12.5$$ and $$u^\prime_0(0) = b_0 = u^\prime_1(0) = b_1 = 2.2 (b_0 - 10c_0).$$ So we can group these conditions into the system of three equations and three unknowns \begin{align*}a_0 &= 12.5 \\ 0.1 a_0 - 5.5 b_0 + 27.5 c_0 &= 0 \\ 1.2 b_0 - 22 c_0 &= 0,\end{align*} which can be solved by $(a_0,b_0,c_0) = (12.5, \frac{5}{16}, \frac{3}{176}).$

From the choice of $(a_0,b_0,c_0)$ we have already established the base case $u_0(x_0) = u_1(x_0) = v_0 = 12.5.$ Suppose that for some $n$ we have $u_n(x_{n-1}) = u_{n-1}(x_{n-1}) = v_{n-1}.$ By construction, we have $u_n(x_n) = 1.1u_{n-1}(2x_n-5) = 1.1u_{n-1}(x_{n-1}) = v_n.$ Similarly, by construction, we have $u_{n+1}(x_n) = 1.1u_n(2x_n-5) = 1.1u_n(x_{n-1}) = v_n.$ Therefore, for each n = 0, 1, \dots, by induction we have $u_n(x_n) = u_{n+1}(x_n) = v_n,$ which proves that $u \in C(-5,5).$ Similarly, we have from the choice of $(a_0,b_0,c_0)$ already established that $u_0^\prime(x_0) = u_1^\prime(x_0).$ Assume that for some $n$ we have $u_n^\prime(x_n) = u_{n+1}^\prime(x_n).$ Since $u_{n+1}(x) = 1.1u_n(2x-5)$ we have $u^\prime_{n+1}(x) = 2.2u_n^\prime(2x-5),$ so in particular $$u^\prime_{n+1}(x_{n+1})=2.2u_n^\prime(2x_{n+1}-5) = 2.2u_n^\prime(x_n).$$ Similarly, we have $u^\prime_{n+2}(x) = 2.2u_{n+1}^\prime (2x-5),$ so in particular $$u^\prime_{n+2}(x_{n+1}) = 2.2u^\prime_{n+1}(2x_{n+1}-5) = 2.2u_{n+1}^\prime(x_n) = 2.2u_n^\prime(x_n) = u^\prime_{n+1}(x_{n+1}),$$ so we conclude that $u_n^\prime(x_n) = u_{n+1}^\prime (x_n),$ for all $n = 0, 1, 2, \dots,$ so $u \in C^1(-5,5).$ From the solution of $(a_0,b_0,c_0)$ we have $u_0^\prime(x) \geq 0$ for all $x \in (x_{-1}, x_0),$ and since $$u^\prime_{n}(x) = 2.2^n u_0(2^nx - 5(2^n-1)) \geq 0$$ for each $n = 1, 2, \dots,$ we have $u^\prime(x) \geq 0$ for all $x \in (-5,5).$ So the $C^1$ monotonic increasing quadratic spline $u$ is another $C^1$ solution to the extra credit Zeno 5K. This quadratic spline solution provides a slightly slower $C^1$ completion time of $$T = \int_0^5 \,\frac{dx}{u(x)} \approx 0.354718139\dots \,\,\text{hours}$$ or $21.28308834\dots$ minutes. Ultimately, it would appear that for any $n$ there is a $C^n$ solution comprised of a monotonically increasing $(n+1)$th order polynomial spline that satisfies the recursion formula of $v(x) = 1.1v(2x-5)$ for all $x \in (0,5).$

Zeno's Paradoxical 5K

I’ve been experimenting with different strategies in $5000$-meter races (known as “5K”s). If I run the distance at a steady pace, I’ll finish in precisely $23$ minutes. However, I tend to find a burst of energy as I near the finish line. Therefore, I’ve tried intentionally running what’s called a “negative split,” meaning I run the second half of the race faster than the first half—as opposed to a “positive split,” where I run a slower second half.

I want to take the concept of a negative split to the next level. My plan for an upcoming race—the “Zeno Paradox 5K”—is to start out with a $24$-minute pace (i.e., running at a speed such that if I ran the whole distance at that speed, I’d finish in $24$ minutes). Halfway through the race by distance (i.e., after $2500$ meters), I’ll increase my speed by $10$ percent. Three-quarters of the way through, I’ll increase by another $10$ percent. If you’re keeping track, that’s now $21$ percent faster than my speed at the start.

I continue in this fashion, upping my tempo by $10%$ every time I’m half the distance to the finish line from my previous change in pace. (Let’s put aside the fact that my speed will have surpassed the speed of light somewhere near the finish line.) Using this strategy, how long will it take me to complete the 5K? I’m really hoping it’s faster than my steady $23$-minute pace, even though I start out slower (at a $24$-minute pace).

Let the race start at $x=0$ and finish at $x=5.$ Let $x_n = 5(1 - 2^{-n})$ for $n = 0, 1, 2, \dots$ be the breakpoints where you are increasing your speed. Let $v(x)$ denote the velocity you are running (in kmph), when you are at some $x \in (0,5).$ Then $$v(x) = 12.5 \cdot 1.1^{\max \{ i : x_i \lt x \} }.$$ The time spent at the velocity $v_n = 12.5 \cdot 1.1^n$ is $$t_n = \frac{ x_{n+1}-x_n }{ v_n } = \frac{ 5 \cdot 2^{-(n+1)} }{ 12.5 \cdot 1.1^n } = \frac{2.2^{-n}}{5},$$ so the total time to complete the 5K in this fashion is $$T = \sum_{n=0}^\infty \frac{2.2^{-n}}{5} = \frac{1}{5} \frac{1}{1- \frac{1}{2.2}} = \frac{2.2}{5 \cdot 1.2} = \frac{11}{30}\,\,\text{ hours},$$ or $22$ minutes.

Monday, June 9, 2025

Overlapping bubble packing

Draw a unit circle (i.e., a circle with radius $1$). Then draw another unit circle whose center is not inside the first one. Then draw a third unit circle whose center is not inside either of the first two.

Keep doing this until you have drawn a total of seven circles. What is the minimum possible area of the region that’s inside at least one of the circles?

Let's work smarter rather than harder and first note that the if we want to pack a circle full of seven non-overlapping unit circles then the enclosing circle's radius would be $3$, with one circle concentric with the enclosing circle and the other $6$ circles arranged with their centers forming a regular hexagon, which even Wikipedia snarkily notes is "Trivially optimal". Of course, we're not here for non-overlapping circles, but let's use this same configuration. Situate a unit circle at the origin, $x_0=(0,0),$ and then $6$ unit circles centered at $x_i = ( \cos \frac{2\pi (i-1)}{6}, \sin \frac{2\pi (i-1)}{6} ),$ for $i = 1, 2, \dots, 6.$ See a Desmos illustration of the configuration below:

First we note that $\| x_i - x_j \|_2 = 1,$ for $i \ne j,$ so none of the $7$ centers of the unit circles are inside any other circle. Next we note that the bounding circle around these $7$ circles has radius $$R = \max \{ \|x\|_2 \mid \min_{i = 0, 1, \dots, 6} \{ \|x - x_i \|_2 \} \leq 1 \} = 2,$$ which is (perhaps again trivially, per Wikipedia) optimal. However, we do not want the minimal bounding radius, but rather the shaded area. Let's use inclusion / exclusion and note that there are six regions between the bounding circle of radius two and the shaded area, so once we know the area of one of these sections, say A^\prime, then the shaded area is $A = 4\pi - 6A^\prime.$ Using symmetry, we can denote the area of one of those sections as $$A^\prime = 2 \int_0^1 \left( \sqrt{4-x^2} - \left(\sqrt{1 - (x-\frac{1}{2})^2} + \frac{\sqrt{3}}{2}\right) \right) \,dx = A_1 - A_2 - \sqrt{3},$$ where $$A_1 = 2 \int_0^1 \sqrt{4 - x^2} \,dx$$ and $$A_2 = 2 \int_0^1 \sqrt{1 - (x-\frac{1}{2})^2} \,dx.$$ Using trig substitution $x = 2 \cos \theta,$ we see that \begin{align*}A_1 = 2\int_0^1 \sqrt{4-x^2} \,dx &= -8 \int_{\pi/2}^{\pi/3} \sin^2 \theta \,d\theta \\ &= 4 \int_{\pi/3}^{\pi/2} 1 - \cos 2\theta \,d\theta \\ &= \frac{2\pi}{3} + 2 \sin \pi - 2\sin \frac{2\pi}{3} = \frac{2\pi}{3} + \sqrt{3}.\end{align*} Similarly, using the trig substitution $x = \frac{1}{2} + \cos \theta,$ we see that \begin{align*}A_2 = 2 \int_0^1 \sqrt{1 - (x-\frac{1}{2})^2} \,dx &= -2 \int_{2\pi/3}^{\pi/3} \sin^2 \theta \,d\theta \\ &= \int_{\pi / 3}^{2\pi/3} 1 - \cos 2 \theta\, d\theta \\ &= \frac{\pi}{3} + \frac{1}{2}\sin \frac{2\pi}{3} - \frac{1}{2}\sin \frac{4\pi}{3} = \frac{\pi}{3} + \frac{\sqrt{3}}{2}.\end{align*}

Therefore, we have $$A^\prime = \left( \frac{2\pi}{3} + \sqrt{3} \right) - \left( \frac{\pi}{3} + \frac{\sqrt{3}}{2} \right) - \sqrt{3} = \frac{\pi}{3} - \frac{\sqrt{3}}{2},$$ so we have the minimal area of the region that's inside at least one of the circles as $$A = 4\pi - 6A^\prime = 2\pi + 3\sqrt{3} \approx 11.4793377299\dots$$