Ed Nigma's Answer Emporium: Sundown Trail Race with a Mulligan

Now let’s add one more wrinkle. At some point during the race, if you’re unhappy with the loop you’ve just been randomly assigned, you’re granted a “mulligan,” allowing you to get another random assignment. (Note that there’s a $25$ percent chance you’ll be assigned the same loop again.) You don’t have to use your mulligan, but you can’t use it more than once.

As before, the time is $5$:$55$ p.m. You have just completed a loop, and you haven’t used your mulligan yet. With an optimal strategy (i.e., using the mulligan at the right moment, if at all), on average, what score can you expect to earn between $5$:$55$ p.m. and $7$ p.m.?

As we saw in the Classic post, let's use the absorption probability matrix $B = \left(B(t,s)\right)_{t \in \mathfrak{T}, s \in \mathfrak{S}},$ whose entry $B(t,s)$ is the probability of starting at transient state $t$ and ending up at the final, absorbing state $s.$ This gives us the expected score when starting at transient state $i$ with no mulligans, $$E(i) = \mathbb{E} [ S \mid S_0 = i ] = \sum_{s \in \mathfrak{S}} s B(i,s).$$ In particular, with the transient states ordered as $\mathfrak{T} = \{0, 1, 2, 3, 3.5, 4, 4.5, 5, 5.5\},$ we have the vector $$E = \left(E(t)\right)_{t \in \mathfrak{T}} = \begin{pmatrix} \frac{19933}{4096} \\ \frac{5245}{1024} \\ \frac{1437}{256} \\ \frac{317}{64} \\ \frac{293}{64} \\ \frac{69}{16} \\ \frac{77}{16} \\ \frac{21}{4} \\ \frac{23}{4} \end{pmatrix}.$$
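As a quick check on this vector, here is a minimal Python sketch that recomputes $E(t)$ by direct recursion rather than through the matrix $B$. It assumes the setup implied by the transitions used in this post: four equally likely loops of $1$, $3$, $3.5$ and $4.5$ miles, a $6.5$ mile budget ($65$ minutes at $10$ minutes per mile), and a race that ends at the current score when the assigned loop cannot be finished in time. The names `LOOPS`, `LIMIT` and `expected_score` are just illustrative.

```python
from fractions import Fraction as F
from functools import lru_cache

# Assumptions read off the transitions in this post, not restated in it:
LOOPS = [F(1), F(3), F(7, 2), F(9, 2)]  # loop lengths in miles, probability 1/4 each
LIMIT = F(13, 2)                        # 65 minutes at 10 minutes per mile

@lru_cache(maxsize=None)
def expected_score(t):
    """E(t): expected final score from score t with no mulligan available."""
    total = F(0)
    for loop in LOOPS:
        nxt = t + loop
        # A loop that cannot be finished by 7 p.m. ends the race at score t.
        total += expected_score(nxt) if nxt <= LIMIT else t
    return total / len(LOOPS)

# Print E(t) as an exact fraction for each transient state.
for t in [F(0), F(1), F(2), F(3), F(7, 2), F(4), F(9, 2), F(5), F(11, 2)]:
    print(float(t), expected_score(t))
```

Using `Fraction` keeps everything exact, so the printed values can be compared entry by entry against the vector above.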

We can only use a single mulligan, but it is enough to change the overall expected score. Let's denote $$\tilde{E}(t) = \tilde{\mathbb{E}}[ S \mid S_0 = t ],$$ as the expected score when starting at state $t$ with the mulligan still in hand. In particular, let's say that we have just finished a lap to attain the transient score of $t \in \mathfrak{T},$ that we haven't yet used our mulligan, and that we must quickly decide for which newly assigned laps we want to use it. Let's say the randomly assigned lap would move us from $t$ to $t^\prime \in \mathfrak{T}.$ If we use our mulligan on this turn, then we get a fresh random assignment from $t$ with no mulligan left, so we end up with an expected score of $E(t);$ whereas if we don't use the mulligan, then we end up with an expected score of $\tilde{E}(t^\prime),$ since we would still have our mulligan remaining. So the optimal choice if we are assigned a move from $t$ to $t^\prime \in \mathfrak{T}$ gives us $\max \{ E(t), \tilde{E}(t^\prime) \}.$ Similarly, if we are randomly assigned to move from $t$ to some $s \in \mathfrak{S},$ then using our mulligan gives an expected score of $E(t),$ versus a final score of $s$ if we don't. So if we would move from $t$ to $s \in \mathfrak{S},$ the optimal choice gives us $\max \{E(t), s \}.$ Putting this all together with the corresponding transition probabilities from the $Q$ and $R$ matrices given in the Classic problem, we get the recursive formula $$\tilde{E}(t) = \sum_{t^\prime \in \mathfrak{T}} Q(t, t^\prime) \max \{ E(t), \tilde{E}(t^\prime) \} + \sum_{s \in \mathfrak{S}} R(t,s) \max \{ E(t), s \}.$$
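The recursion can be sketched in a few lines of Python. This is only an illustration under assumptions read off the transitions in this post: four equally likely loops of $1$, $3$, $3.5$ and $4.5$ miles, a $6.5$ mile budget, and a race that ends at the current score when the assigned loop cannot be finished. Instead of building $Q$ and $R$ explicitly, it loops over the four assignments directly; the function names `no_mulligan` and `with_mulligan` are made up for this sketch.

```python
from fractions import Fraction as F
from functools import lru_cache

# Assumed setup (inferred from the transitions used in this post):
LOOPS = [F(1), F(3), F(7, 2), F(9, 2)]  # miles, probability 1/4 each
LIMIT = F(13, 2)                        # 6.5 miles fit into 65 minutes

@lru_cache(maxsize=None)
def no_mulligan(t):
    """E(t): expected final score from t once the mulligan is gone."""
    total = F(0)
    for loop in LOOPS:
        nxt = t + loop
        total += no_mulligan(nxt) if nxt <= LIMIT else t
    return total / len(LOOPS)

@lru_cache(maxsize=None)
def with_mulligan(t):
    """E-tilde(t): expected final score from t with the mulligan unused."""
    redo = no_mulligan(t)  # value of spending the mulligan right now
    total = F(0)
    for loop in LOOPS:
        nxt = t + loop
        # Value of accepting the assignment: keep running with the mulligan
        # in hand, or stop at score t if the loop cannot be finished.
        keep = with_mulligan(nxt) if nxt <= LIMIT else t
        total += max(redo, keep)
    return total / len(LOOPS)

print(with_mulligan(F(0)), float(with_mulligan(F(0))))
```

Each term `max(redo, keep)` is one summand of the formula above: `redo` plays the role of $E(t)$, and `keep` is either $\tilde{E}(t^\prime)$ or the absorbing score $s$.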

So let's go recursing, working backward from the highest transient score. \begin{align*} \tilde{E}(5.5) &= \frac{3}{4} \max\{ E(5.5), 5.5 \} + \frac{1}{4} \max \{ E(5.5), 6.5 \} = \frac{3}{4} E(5.5) + \frac{1}{4} 6.5 = \frac{95}{16} \\ \tilde{E}(5) &= \frac{3}{4} \max \{ E(5), 5 \} + \frac{1}{4} \max \{ E(5), 6 \} = \frac{3}{4} E(5) + \frac{1}{4} 6 = \frac{87}{16} \\ \tilde{E}(4.5) &= \frac{1}{4} \max \{ E(4.5), \tilde{E}(5.5) \} + \frac{3}{4} \max \{ E(4.5), 4.5 \} = \frac{1}{4} \tilde{E}(5.5) + \frac{3}{4} E(4.5) = \frac{163}{32} \\ \tilde{E}(4) &= \frac{1}{4} \max \{ E(4), \tilde{E}(5) \} + \frac{3}{4} \max \{ E(4), 4 \} = \frac{1}{4} \tilde{E}(5) + \frac{3}{4} E(4) = \frac{147}{32} \\ \tilde{E}(3.5) &= \frac{1}{4} \max \{ E(3.5), \tilde{E}(4.5) \} + \frac{1}{2} \max \{ E(3.5), 3.5 \} + \frac{1}{4} \max \{ E(3.5), 6.5 \} \\ &\quad\quad = \frac{1}{4} \tilde{E}(4.5) + \frac{1}{2} E(3.5) + \frac{1}{4} 6.5 = \frac{83}{16} \\ \tilde{E}(3) &= \frac{1}{4} \max \{ E(3), \tilde{E}(4) \} + \frac{1}{4} \max \{ E(3), 3 \} + \frac{1}{4} \max \{ E(3), 6 \} + \frac{1}{4} \max \{ E(3), 6.5 \} \\ &\quad\quad = \frac{1}{4} E(3) + \frac{1}{4} E(3) + \frac{1}{4} 6 + \frac{1}{4} 6.5 = \frac{717}{128}\\ \tilde{E}(2) &= \frac{1}{4} \max \{ E(2), \tilde{E}(3) \} + \frac{1}{4} \max\{ E(2), \tilde{E}(5) \} + \frac{1}{4} \max \{ E(2), \tilde{E}(5.5) \} + \frac{1}{4} \max \{ E(2), 6.5 \} \\ &\quad\quad = \frac{1}{4} E(2) + \frac{1}{4} E(2) + \frac{1}{4} \tilde{E}(5.5) + \frac{1}{4} 6.5 = \frac{3029}{512}\\ \tilde{E}(1) &= \frac{1}{4} \max \{ E(1), \tilde{E}(2) \} + \frac{1}{4} \max \{ E(1), \tilde{E}(4) \} + \frac{1}{4} \max \{ E(1), \tilde{E}(4.5) \} + \frac{1}{4} \max \{ E(1), \tilde{E}(5.5) \} \\ &\quad\quad = \frac{1}{4} \tilde{E}(2) + \frac{1}{4} E(1) + \frac{1}{4} E(1) + \frac{1}{4} \tilde{E}(5.5) = \frac{5657}{1024} \end{align*}

This leaves us with one more step to show that the total expected score accrued between $5$:$55$ p.m. and $7$ p.m., going at a constant $10$-minute-per-mile pace with a single mulligan, is \begin{align*} \tilde{E}(0) &= \frac{1}{4} \max \{ E(0), \tilde{E}(1) \} + \frac{1}{4} \max \{ E(0), \tilde{E}(3) \} + \frac{1}{4} \max \{ E(0), \tilde{E}(3.5) \} + \frac{1}{4} \max \{ E(0), \tilde{E}(4.5) \} \\ &\quad\quad = \frac{1}{4} \tilde{E}(1) + \frac{1}{4} \tilde{E}(3) + \frac{1}{4} \tilde{E}(3.5) + \frac{1}{4} \tilde{E}(4.5)\\ &\quad\quad = \frac{21921}{4096} = 5.351806640625 \end{align*}
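The expected score alone doesn't say when to pull the trigger, so here is a hedged Python sketch that lists, for each transient score, the assigned loops on which spending the mulligan beats accepting the assignment (that is, the cases where $E(t)$ wins the $\max$). It assumes the same setup inferred from the transitions in this post: equally likely loops of $1$, $3$, $3.5$ and $4.5$ miles and a $6.5$ mile budget. The helper name `mulligan_loops` is hypothetical.

```python
from fractions import Fraction as F
from functools import lru_cache

# Assumed setup (inferred from the transitions used in this post):
LOOPS = [F(1), F(3), F(7, 2), F(9, 2)]  # miles, probability 1/4 each
LIMIT = F(13, 2)                        # 6.5 miles fit into 65 minutes

def accept_value(t, loop, value):
    """Value of accepting `loop` at score t, continuing with `value`."""
    nxt = t + loop
    return value(nxt) if nxt <= LIMIT else t

@lru_cache(maxsize=None)
def no_mulligan(t):
    """E(t): expected final score with no mulligan left."""
    return sum(accept_value(t, loop, no_mulligan) for loop in LOOPS) / len(LOOPS)

@lru_cache(maxsize=None)
def with_mulligan(t):
    """E-tilde(t): expected final score with the mulligan unused."""
    redo = no_mulligan(t)
    return sum(max(redo, accept_value(t, loop, with_mulligan))
               for loop in LOOPS) / len(LOOPS)

def mulligan_loops(t):
    """Loops on which redoing the draw strictly beats accepting it at score t."""
    redo = no_mulligan(t)
    return [loop for loop in LOOPS
            if redo > accept_value(t, loop, with_mulligan)]

for t in [F(0), F(1), F(2), F(3), F(7, 2), F(4), F(9, 2), F(5), F(11, 2)]:
    print(float(t), [float(loop) for loop in mulligan_loops(t)])
```

Under these assumptions the list for $t = 0$ comes out empty, which matches the last display: every $\max$ in $\tilde{E}(0)$ is won by the $\tilde{E}$ term, so the mulligan is never spent on the very first assignment.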

Monday, August 25, 2025
