Monday, October 28, 2024

Conditional candy

It’s Halloween time! While trick-or-treating, you encounter a mysterious house in your neighborhood.

You ring the doorbell, and someone dressed as a mathematician answers. (What does a “mathematician” costume look like? Look in the mirror!) They present you with a giant bag from which to pick candy, and inform you that the bag contains exactly three peanut butter cups (your favorite!), while the rest are individual kernels of candy corn (not your favorite!).

You have absolutely no idea how much candy corn is in the bag—any whole number of kernels (including zero) seems equally possible in this monstrous bag.

You reach in and pull out a candy at random (that is, each piece of candy is equally likely to be picked, whether it’s a peanut butter cup or a kernel of candy corn). You remove your hand from the bag to find that you’ve picked a peanut butter cup. Huzzah!

You reach in again and pull a second candy at random. It’s another peanut butter cup! You reach in one last time and pull a third candy at random. It’s the third peanut butter cup!

At this point, whatever is left in the bag is just candy corn. How many candy corn kernels do you expect to be in the bag?

The probability of drawing three peanut butter cups in a row, conditional on there being $k$ candy corn kernels, is $$\mathbb{P} \{ c = 3 \mid k \} = \frac{ 3 }{k + 3 } \frac{ 2}{k + 2} \frac{1}{k+1} = \frac{6}{(k+3) (k+2) (k+1) }.$$ Using Bayes' theorem, we can retrieve the conditional distribution of the number of candy corn kernels conditional on pulling three peanut butter cups in a row, namely, \begin{align*}\mathbb{P} \{ k \mid c = 3 \} &= \frac{ \mathbb{P} \{ c = 3 \mid k \} \mathbb{P} \{ k \} }{ \mathbb{P} \{ c = 3 \} } \\ &= \frac{ \mathbb{P} \{ c = 3 \mid k \} }{ \sum_{\ell = 0}^\infty \mathbb{P} \{ c = 3 \mid \ell \} } \\ &= \frac{ \frac{6}{(k+1)(k+2)(k+3)} }{ \sum_{\ell=0}^\infty \frac{6}{(\ell+1)(\ell+2)(\ell+3)} }\\ &= \frac{1}{M (k+1)(k+2)(k+3)},\end{align*} where $$M = \sum_{\ell=0}^\infty \frac{1}{(\ell +1)(\ell+2)(\ell+3)}.$$

We can calculate the convergent series $M$ by the method of partial fractions. Let \begin{align*} \frac{1}{(\ell + 1)(\ell + 2)(\ell + 3)} &= \frac{A}{\ell+1} + \frac{B}{\ell+2} + \frac{C}{\ell+3} \\ &= \frac{ A (\ell + 2) (\ell + 3) + B(\ell + 1)(\ell + 3) + C(\ell + 1)(\ell + 2)}{ (\ell + 1) (\ell+2) (\ell+3) } \\ &= \frac{ (A + B + C) \ell^2 + (5A + 4B + 3C) \ell + (6A + 3B + 2C)}{(\ell+1)(\ell+2)(\ell+3)}.\end{align*} So we have the resulting system of linear equations \begin{align*} A + B + C &= 0 \\ 5A + 4B + 3C &= 0 \\ 6A + 3B + 2C & = 1, \end{align*} which has solution $A = C = \frac{1}{2}$ and $B = -1.$ Therefore, $$\frac{1}{(\ell + 1) (\ell + 2)(\ell + 3)} = \frac{1}{2} \frac{1}{\ell+1} - \frac{1}{\ell+2} + \frac{1}{2} \frac{1}{\ell+3},$$ so we have \begin{align*} M &= \sum_{\ell = 0}^\infty \frac{1}{(\ell+1)(\ell+2)(\ell+3)} = \lim_{L\to \infty} \sum_{\ell=0}^L \frac{1}{(\ell+1)(\ell+2)(\ell+3)} \\ &= \lim_{L \to \infty} \sum_{\ell=0}^L \left( \frac{1}{2} \frac{1}{\ell+1} - \frac{1}{\ell+2} + \frac{1}{2} \frac{1}{\ell+3} \right) \\ &= \lim_{L \to \infty} \frac{1}{2} \left( 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{L+1} \right) - \left( \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{L+1} + \frac{1}{L+2} \right) + \frac{1}{2} \left( \frac{1}{3} + \cdots + \frac{1}{L+1} + \frac{1}{L+2} + \frac{1}{L+3} \right) \\ &= \lim_{L\to \infty} 1 \cdot \frac{1}{2} + \frac{1}{2} \cdot \left( \frac{1}{2} - 1 \right) + \left( \frac{1}{2} - 1 + \frac{1}{2} \right) \cdot \left( \frac{1}{3} + \cdots + \frac{1}{L+1} \right) + \frac{1}{L+2} \cdot \left(-1 + \frac{1}{2} \right) + \frac{1}{L+3} \cdot \frac{1}{2} \\ &= \lim_{L \to \infty} \frac{1}{2} - \frac{1}{4} + O(L^{-2}) = \frac{1}{4}.\end{align*}
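As a sanity check on the telescoping argument, here is a minimal Python sketch that compares exact partial sums of $M$ against the closed form $\frac{1}{4} - \frac{1}{2(L+2)(L+3)}$ that the leftover boundary terms collapse to. The function names `partial_sum_M` and `closed_form_M` are my own, purely for illustration.

```python
from fractions import Fraction


def partial_sum_M(L):
    """Exact partial sum of 1/((l+1)(l+2)(l+3)) for l = 0, ..., L."""
    return sum(Fraction(1, (l + 1) * (l + 2) * (l + 3)) for l in range(L + 1))


def closed_form_M(L):
    """Telescoped value of the same partial sum: 1/4 - 1/(2(L+2)(L+3))."""
    return Fraction(1, 4) - Fraction(1, 2 * (L + 2) * (L + 3))


# The two expressions should agree exactly, and both approach 1/4.
for L in (0, 1, 10, 100):
    assert partial_sum_M(L) == closed_form_M(L)
    print(L, float(partial_sum_M(L)))
```

Using `Fraction` keeps the comparison exact, so any slip in the partial fraction coefficients would show up as a failed assertion rather than a rounding-sized discrepancy.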

Therefore, we have $$\mathbb{P} \{ k \mid c = 3 \} = \frac{4}{(k+1)(k+2)(k+3)},$$ for $k = 0, 1, \dots,$ so we can calculate the conditional expectation as $$\mathbb{E} \left[ K \mid c = 3 \right] = \sum_{k=0}^\infty k \mathbb{P} \{ k \mid c = 3 \} = 4 \sum_{k=0}^\infty \frac{k}{(k+1)(k+2)(k+3)}.$$ As before, we can solve this series by the method of partial fractions. Here instead of the earlier system of equations, we now want to solve \begin{align*} A + B + C &= 0 \\ 5A + 4B + 3C &= 1 \\ 6A + 3B + 2C &= 0, \end{align*} which has solution $A = -\frac{1}{2},$ $B = 2,$ $C = - \frac{3}{2}.$ Thus the conditional expected number of candy corn kernels given that I drew the three peanut butter cups is \begin{align*}\mathbb{E} \left[ K \mid c = 3 \right] &= 4 \sum_{k=0}^\infty \frac{k}{(k+1)(k+2)(k+3)}\\ &= \lim_{L \to \infty} 4 \left(-\frac{1}{2} + \left( -\frac{1}{2} + 2 \right) \cdot \frac{1}{2} + O(L^{-1}) \right) = 4 \cdot \frac{1}{4} = 1.\end{align*}
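Since this series converges much more slowly than $M$ does, a quick check is reassuring. The sketch below (helper names are mine) verifies with exact arithmetic that the partial sums of the posterior mean equal $1 + \frac{2}{L+2} - \frac{6}{L+3},$ which is what the same partial fractions give after telescoping, and which tends to $1.$

```python
from fractions import Fraction


def posterior(k):
    """P{k | c = 3} = 4 / ((k+1)(k+2)(k+3))."""
    return Fraction(4, (k + 1) * (k + 2) * (k + 3))


def partial_mean(L):
    """Exact partial sum of k * P{k | c = 3} for k = 0, ..., L."""
    return sum(k * posterior(k) for k in range(L + 1))


def closed_form_mean(L):
    """Telescoped partial sum: 1 + 2/(L+2) - 6/(L+3)."""
    return 1 + Fraction(2, L + 2) - Fraction(6, L + 3)


for L in (0, 1, 10, 500):
    assert partial_mean(L) == closed_form_mean(L)
    print(L, float(partial_mean(L)))
```

The gap to the limit shrinks only like $4/L,$ so a brute-force float sum needs a lot of terms before it looks like $1.$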

Sunday, October 20, 2024

How boring can you get?

I have a large, hemispherical piece of bread with a radius of $1$ foot. I make a bread bowl by boring out a cylindrical hole with radius $r,$ centered at the top of the hemisphere and extending all the way to the flat bottom crust.

What should the radius of my borehole be to maximize the volume of soup my bread bowl can hold?

N.B. I originally pictured this week's Fiddler problem with the hemispherical bread having a flat upper crust. That led me to think it was a very weird setup, where you keep cutting your cylindrical hole downward until you ever so slightly bump up against the curved bottom crust, at which point you stop, since otherwise your bread bowl would have a hole in it and your soup would leak out. This is essentially the inverted logic of the handwavy no-surface-tension argument I make below, so in the end I think the math ends up being roughly the same ....

So anyway, freely choosing the coordinate system that best suits me, assume that your bread fills the space $B = \{ (x,y,z) \mid x^2 + y^2 + z^2 \leq 1, z \geq 0 \}.$ Assume that a cylindrical borehole removes the portion of $B$ that satisfies $x^2 + y^2 \leq r^2,$ for some radius $r \gt 0.$ Ultimately, since we cannot rely on molecular properties of the varying soups that might fill the bread bowl to postulate any additional volume of soup due to surface tension, let's assume that the bread bowl can only be filled up to its upper rim, that is, the cylindrical cavity is given by $C(r) = \{ (x,y,z) \mid x^2 + y^2 \leq r^2, 0 \lt z \leq \sqrt{1-r^2}\}.$ The volume of $C(r)$ is given by $V(r) = \pi r^2 \sqrt{1-r^2}.$

Differentiating $V(r)$ gives $$V^\prime(r) = 2\pi r \sqrt{1-r^2} + \pi r^2 \left( \frac{-r}{\sqrt{1-r^2}} \right) = \frac{\pi r \left( 2(1-r^2) -r^2 \right)}{\sqrt{1-r^2}} = \frac{\pi r (2-3r^2)}{\sqrt{1-r^2}}.$$ Thus $V$ has critical points at $r_1 = 0$, $r_2 = \sqrt{\frac{2}{3}} = \frac{\sqrt{6}}{3},$ and $r_3 = 1.$ Since $V(0)=V(1)=0,$ the maximum possible volume of soup contained in this bread bowl is $$V^* = \frac{2\pi\sqrt{3}}{9} \approx 1.20919957616\dots$$ cubic feet, which occurs when choosing a radius of $$r^* = \frac{\sqrt{6}}{3} \approx 0.816496580928\dots$$ feet.
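To double-check the calculus without trusting the derivative, here is a small Python sketch that maximizes $V(r)$ numerically. It uses a golden-section search, which is my choice of method and assumes only that $V$ has a single peak on $[0,1]$; the function names are illustrative.

```python
import math


def V(r):
    """Volume of the cylindrical cavity in the hemisphere, pi r^2 sqrt(1 - r^2)."""
    return math.pi * r * r * math.sqrt(1.0 - r * r)


def golden_section_max(f, lo, hi, tol=1e-10):
    """Locate the maximizer of a single-peaked function f on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) > f(d):
            b = d  # the peak lies in [a, d]
        else:
            a = c  # the peak lies in [c, b]
    return (a + b) / 2


r_star = golden_section_max(V, 0.0, 1.0)
print(r_star, V(r_star))
print(math.sqrt(6) / 3, 2 * math.pi * math.sqrt(3) / 9)
```

The two printed lines should agree to many digits, matching $r^* = \sqrt{6}/3$ and $V^* = 2\pi\sqrt{3}/9.$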

Instead of a hemisphere, now suppose my bread is a sphere with a radius of $1$ foot. Again, I make a bowl by boring out a cylindrical shape with radius $r,$ extending all the way to (but not through) the curved bottom crust of the bread. The central axis of the hole must pass through the center of the sphere.

What should the radius of my borehole be to maximize the volume of soup my bread bowl can hold?

Again hearkening back to my earlier spatial reasoning struggles, I could not for the life of me understand why this extra credit problem was in any way different from just having double the volume, since you would then stop cutting as soon as you hit the curved bottom crust. Since I take the Axiom of the Benevolent Fiddlermeister as a given, I have to assume that the extra credit problem is somehow different from the regular problem, so we will assume that I have a precision instrument that can bore a perfectly cylindrical hole in the spherical bread until I ever so slightly approach the curved lower crust and then somehow liquefy and extract the remaining bready portions all the way down to the curved lower crust. So in this case the bowl takes the shape $\tilde{C}(r) = \{ (x,y,z) \mid x^2 + y^2 \leq r^2, -\sqrt{1-x^2-y^2} \lt z \leq \sqrt{1-r^2} \}.$ In this case, the volume is \begin{align*}\tilde{V}(r) &= \int_{ x^2 + y^2 \leq r^2 } \int_{-\sqrt{1-x^2-y^2}}^{\sqrt{1-r^2}} \, dz \,dy \,dx\\ &= \int_{x^2 + y^2 \leq r^2} \left( \sqrt{1-r^2} + \sqrt{1-x^2 -y^2}\right) \,dy \,dx \\ &= \int_0^{2\pi} \int_0^r \left( \sqrt{1-r^2} + \sqrt{1 - \rho^2} \right) \rho \,d\rho \, d\theta \\ &= \pi r^2 \sqrt{1-r^2} + 2\pi \int_0^r \rho \sqrt{1-\rho^2} \,d\rho \\ &= \pi r^2 \sqrt{1-r^2} + \frac{2\pi}{3} \left( 1 - (1-r^2)^{3/2} \right).\end{align*} We can write $(1-r^2)^{3/2} = (1-r^2) \sqrt{1-r^2}$ to then factor even further and see that $$\tilde{V}(r) = \pi r^2 \sqrt{1-r^2} + \frac{2\pi}{3} \left( 1 - (1-r^2) \sqrt{1-r^2} \right) = \frac{2\pi}{3} + \frac{\pi \sqrt{1-r^2}}{3} (5r^2 - 2).$$

Differentiating we get \begin{align*}\tilde{V}^\prime (r) &= \frac{10\pi r}{3} \sqrt{1-r^2} + \frac{\pi}{3} (5r^2 - 2) \left( \frac{-r}{\sqrt{1-r^2}} \right)\\ &= \frac{\pi r}{3\sqrt{1-r^2}} \left( 10 (1-r^2) - (5r^2 - 2) \right)\\ &= \frac{\pi r (4-5r^2)}{\sqrt{1-r^2}}.\end{align*} Thus $\tilde{V}$ has critical points at $r_1 = 0,$ $r_2 = \sqrt{\frac{4}{5}} = \frac{2\sqrt{5}}{5},$ and $r_3 = 1.$ Here since we have $\tilde{V}^\prime \gt 0$ when $r \lt \frac{2\sqrt{5}}{5}$ and $\tilde{V}^\prime \lt 0$ when $r \gt \frac{2\sqrt{5}}{5},$ we see that the maximum possible volume of our miraculously scooped out bread bowls with curved lower crusts is $$\tilde{V}^* = \frac{2\pi(5+\sqrt{5})}{15} \approx 3.03103706653\dots$$ cubic feet, which occurs when choosing a radius of $$\tilde{r}^* = \frac{2\sqrt{5}}{5} \approx 0.894427191\dots$$ feet.
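For the skeptical, the following Python sketch checks the closed form for $\tilde{V}(r)$ against a direct midpoint-rule evaluation of the polar integral, and then locates the maximizer on a coarse grid. The grid resolution and the names `V_closed` and `V_numeric` are arbitrary choices of mine.

```python
import math


def V_closed(r):
    """Closed form: 2 pi / 3 + pi sqrt(1 - r^2) (5 r^2 - 2) / 3."""
    return 2 * math.pi / 3 + math.pi * math.sqrt(1 - r * r) * (5 * r * r - 2) / 3


def V_numeric(r, n=20000):
    """Midpoint rule for 2 pi * integral_0^r (sqrt(1-r^2) + sqrt(1-rho^2)) rho d rho."""
    h = r / n
    top = math.sqrt(1 - r * r)  # height of the rim above the equator
    total = 0.0
    for i in range(n):
        rho = (i + 0.5) * h
        total += (top + math.sqrt(1 - rho * rho)) * rho
    return 2 * math.pi * total * h


# The quadrature and the closed form should agree closely.
for r in (0.25, 0.5, 0.9):
    print(r, V_closed(r), V_numeric(r))

# Coarse grid search for the maximizer on [0, 1].
best_r = max((i / 10000 for i in range(10001)), key=V_closed)
print(best_r, V_closed(best_r))
```

The grid search should land next to $2\sqrt{5}/5 \approx 0.8944$ with a volume of about $3.0310$ cubic feet.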

Monday, October 14, 2024

Leading the logarithmic pack

You’re doing a $30$-minute workout on your stationary bike. There’s a live leaderboard that tracks your progress, along with the progress of everyone else who is currently riding, measured in units of energy called kilojoules. Once someone completes their ride, they are removed from the leaderboard.

Suppose many riders are doing the $30$-minute workout right now, and that they all begin at random times. Further suppose that they are burning kilojoules at different constant rates (i.e., everyone is riding at constant power) that are uniformly distributed between $0$ and $200$ Watts.

Halfway through (i.e., $15$ minutes into) your workout, you notice that you’re exactly halfway up the leaderboard. How far up the leaderboard can you expect to be as you’re finishing your workout?

Let's start by determining the distribution of the random other riders' outputs at any one time. At any particular time, say $\tau$, the only riders still on the leaderboard would have started at times $t \in (\tau - 1800, \tau).$ Let's assume that riders join the $30$-minute class uniformly at random, such that the probability that any rider joins between time $t$ and $t+dt$ is proportional to $dt,$ that is, at time $\tau,$ the riders would have already completed $T \sim U(0,1800)$ seconds of the ride. These riders would also have uniformly distributed constant powers, $P \sim U(0,200).$ The total output is $O = P \cdot T,$ whose distribution we can get by directly computing the CDF, for $0 \lt \theta \leq 360000,$ \begin{align*}\mathbb{P} \{ O \leq \theta \} &= \frac{1}{360000}\int_0^{200} \int_0^{1800} \chi \{ p t \leq \theta \} \,dt \,dp \\ &= \frac{1}{360000} \int_0^{200} \int_0^{\min \{ 1800, \theta / p \}} \,dt \,dp \\ &= \frac{1}{360000} \int_0^{200} \min \left\{1800, \frac{\theta}{p} \right\} \,dp \\ &= \frac{1}{360000} \left( \int_0^{\theta/1800} 1800 \,dp + \int_{\theta / 1800}^{200} \frac{\theta}{p} \,dp \right) \\ &= \frac{1}{360000} \left( \theta + \theta \left( \ln 200 - \ln (\theta / 1800) \right) \right) \\ &= \frac{\theta}{360000} \left( 1- \ln \left( \frac{\theta}{360000} \right) \right).\end{align*}
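A quick Monte Carlo experiment can corroborate this CDF. The sketch below draws independent $P \sim U(0,200)$ and $T \sim U(0,1800)$ and compares the empirical fraction with $pt \leq \theta$ against the closed form. The sample size, seed, and function names are arbitrary choices of mine.

```python
import math
import random


def cdf_output(theta):
    """Closed-form P{O <= theta}, valid for 0 < theta <= 360000."""
    u = theta / 360000.0
    return u * (1.0 - math.log(u))


def empirical_cdf(theta, n=200_000, seed=0):
    """Fraction of simulated riders whose output P * T is at most theta."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        p = rng.uniform(0.0, 200.0)   # constant power in Watts
        t = rng.uniform(0.0, 1800.0)  # seconds already ridden
        if p * t <= theta:
            hits += 1
    return hits / n


for theta in (36000, 90000, 180000):
    print(theta, cdf_output(theta), empirical_cdf(theta))
```

With a couple hundred thousand samples, the two columns should agree to roughly two or three decimal places.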

So let's simplify slightly and focus on the function $\Phi(t) = t ( 1 - \ln t ).$ In particular, if at some point I find myself halfway up the leaderboard, then that would mean that my output is $\tilde{O} = 360000 \tilde{\theta},$ where $\Phi(\tilde{\theta}) = \frac{1}{2}.$ Of course, $\Phi$ does not have a neat and tidy inverse function, so we would have to implicitly solve for $\tilde{\theta} = \Phi^{-1}(0.5),$ but more on this later.

So since the distribution of random riders is time invariant, if halfway through my ride I have output $\tilde{O} = 360000 \Phi^{-1}(0.5),$ then I can expect my output at the end of my ride to be $2\tilde{O}.$ In this case, the proportion of riders that I will be ahead of on the leaderboard is \begin{align*}\Phi(2 \Phi^{-1}(0.5)) &= 2\Phi^{-1}(0.5) \left( 1- \ln (2\Phi^{-1}(0.5))\right) \\ &= 2 \Phi^{-1}(0.5) \left( 1- \ln 2 - \ln \Phi^{-1}(0.5) \right) \\ &= -2 \Phi^{-1}(0.5) \ln 2 + 2 \Phi^{-1} (0.5) \left( 1 - \ln \Phi^{-1}(0.5) \right) \\ &= -2 \Phi^{-1}(0.5) \ln 2 + 2 \Phi \left( \Phi^{-1}(0.5) \right) \\ &= 1 - 2 \Phi^{-1}(0.5) \ln 2 = 1 - 2 \tilde{\theta} \ln 2 .\end{align*} So, since we can numerically solve for the inverse to find that $\tilde{\theta} = \Phi^{-1}(0.5) \approx 0.1866823,$ I can expect to be ahead of about $$1-2 \tilde{\theta} \ln 2 \approx 74.1203380189...\%$$ of the riders at the end of my ride.
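Here is one way the numerical inversion could be done; it is a sketch rather than a record of how the value above was actually obtained. Since $\Phi^\prime(t) = -\ln t \gt 0$ on $(0,1),$ $\Phi$ is increasing there and plain bisection works. The names `phi` and `phi_inverse` are mine.

```python
import math


def phi(t):
    """Phi(t) = t (1 - ln t), the leaderboard CDF in rescaled units."""
    return t * (1.0 - math.log(t))


def phi_inverse(target, lo=1e-12, hi=1.0, iters=200):
    """Solve phi(t) = target on (0, 1] by bisection; phi is increasing there."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if phi(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


theta = phi_inverse(0.5)       # rescaled output at the halfway point
final_rank = phi(2.0 * theta)  # share of riders behind me at the finish
print(theta, final_rank, 1.0 - 2.0 * theta * math.log(2.0))
```

The last two printed numbers should coincide, which confirms the identity $\Phi(2\tilde{\theta}) = 1 - 2\tilde{\theta} \ln 2$ numerically.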

As an added bonus problem (though not quite Extra Credit), what’s the highest up the leaderboard you could expect to be $15$ minutes into your workout?

If I am killing it at $200$ Watts for the first $15$ minutes, then I would have an output of $\hat{O} = 200 \cdot 900 = 180000$ joules (that is, $180$ kJ), which would put me ahead of about $$\Phi\left(\frac{\hat{O}}{360000}\right) = \Phi(0.5) = 0.5 (1 + \ln 2) \approx 84.657359028...\%$$ of the riders after $15$ minutes.