Ed Nigma's Answer Emporium: June 2023

This week, let’s look at the middle-square method for generating pseudorandom four-digit numbers. First, we start with a four-digit number, such as $9,876.$ If we square it, we get the eight-digit number $97,535,376.$ Our next pseudorandom number is taken from the middle four digits of that square: $5,353.$ And to get the next pseudorandom number, we can square $5,353,$ which gives us $28,654,609,$ the middle four digits of which are $6,546.$

Of course, many four-digit numbers have a square that’s seven digits rather than eight. And in a sequence of middle-square numbers, you also might encounter smaller numbers whose squares have six or fewer digits. In these cases, you can append zeros to the beginning of the square until you have eight total digits, and once again take the middle four digits.

No matter what initial four-digit number you pick, your sequence of pseudorandom numbers will eventually repeat itself in a loop. If you want the longest sequence of such numbers before any repetition occurs, what starting number should you pick? And how many unique numbers are in the sequence?

Mathematically, the middle square pseduorandom operation can be written as the following recurrence $$x_{n+1} = ( x_n \mod 1000000 ) // 100, \, \forall n \geq 1.$$ That is, first take the remainder when dividing by a million then use the integer divisor when dividing by $100.$ The assertion (which makes sense since there are only no more than $100,000$ possible values to take) is that eventually there will be some $n$ such that $$x_{n+1} \in \{ x_1, \dots, x_n \}.$$ This means that for any initial value of $x_1,$ we can generate the sequence of successive iterations of the middle square recurrence, each time checking the stopping criterion above. While there may certainly have been more elegeant techniques, I brute forced my way through to a solution of using $6,239$ as an inital value, which results in sequence of $111$ unique numbers before any repetitions.

Interestingly, though perhaps owing to a benevalent question-crafter more than anything else, there is only one such initial value that yields a sequence of $111$ unique numbers before repetitions, and all other positive integers $n \lt 111$ have at least $3$ initial values that lead to a sequence of $n$ unique values before repetitions. For instance, there are three fixed points of the middle square recurrence, namely $2,500$, $3,792$ and $7,600,$ where each exhibits $$x_2 = (x_1 \mod 1000000) // 100 = x_1.$$ Similarly, there are $3$ initial values that lead to a sequence of $110$ unique values before repetitions, namely $3,416$, $4,655$, and $9,251.$

Starting a competition with the flip of a coin (say, to determine possession of a ball) is so boring! Instead, let’s give the captain of one team a fair coin and the captain of the other team a fair die. The captain with the coin will flip it at the same time the other captain rolls the die. They continue doing this until the coin is the same (whether heads or tails) for three consecutive flips or the number that comes face-up on the die is the same for two consecutive rolls.

On average, how many coin flips will it take to get three in a row? And how many die rolls will it take to get two in a row?

Extra credit: While the numbers of flips and rolls may often be the same, which team — the team with the coin or the team with the die — is more likely to win the toss/roll? (That is, which is more likely to happen sooner?)

Let's tackle the expectation case first. Let's say that $R$ is the randome number of rolls to get two in a row on a fair die and $F$ is the random number of flips to get three in a row. Let's denote the outcome of the $k$th roll as $r_k$ and similarly the outcome of the $k$th flip as $f_k.$

Assume without loss of generality that the roller's first roll is some $i \in \{1, 2, 3, 4, 5, 6\}.$ Then the second roll will either be $i$, with probability $1/6,$ or some $j \ne i,$ with probability $5/6.$ So we have $$\mathbb{E}[R \mid r_1 = i] = \frac{1}{6} \mathbb{E}[ R \mid r_1 = r_2 = i ] + \frac{5}{6} \mathbb{E}[ R \mid r_1 = i \ne j = r_2 ].$$ In the former case, $R \equiv 2$ if $r_1 = r_2 = i$; alternatively, in the latter case, $R \mid r_1 = i \ne j = r_2$ is equivalent to $1 + R \mid r_1 = j.$ Due to the arbitrariness of the choice of initial roll realization, $\mathbb{E} [ R \mid r_1 = i ] = \mathbb{E} [ R \mid r_1 = j ],$ for all $i, j \in \{ 1, 2, 3, 4, 5, 6 \}$ we have $$\mathbb{E}[ R \mid r_1 = i ] = \frac{1}{6} 2 + \frac{5}{6} ( 1 + \mathbb{E} [ R \mid r_1 = i ] )$$ or equivalently, $$\mathbb{E} [ R \mid r_1 = i ] = 7.$$ So therefore, the expected number of rolls is $$\mathbb{E} [ R ] = \sum_{i=1}^6 \frac{1}{6} \mathbb{E} [ R \mid r_1 = i ] = 7,$$ again due to the arbitrariness of the initial roll realization.

Literally and figuratively on the flip side .... Let's assume without loss of generality that the first flip is $H$. The second flip is either again a $H$ or it will be a $T$, each with $1/2$ probability. So by the law of total expectation we have $$\mathbb{E} [F \mid f_1 = H] = \frac{1}{2} \mathbb{E} [ F \mid f_1 = f_2 = H ] + \frac{1}{2} \mathbb{E} [ F \mid f_1 = H, f_2 = T].$$ As before, we see that $\mathbb{E} [F \mid f_1 = H] = \mathbb{E} [F \mid f_1 = T],$ so we have $\mathbb{E} [ F \mid f_1 = H, f_2 = T ] = 1 + \mathbb{E} [ F \mid f_1 = H ].$ Going one step further if $f_1 = f_2 = H,$ then either $f_3 = H$ and we are done (with $F = 3$), or $f_3 = T,$ where we have $$\mathbb{E} [ F \mid f_1 = f_2 = H, f_3 = T ] = 2 + \mathbb{E}[F \mid f_1 = H].$$ So we have $$\mathbb{E} [F \mid f_1 = H] = \frac{1}{2} \left( \frac{1}{2} 3 + \frac{1}{2} (2 + \mathbb{E} [ F \mid f_1 = H ]) \right) + \frac{1}{2} ( 1 + \mathbb{E} [F \mid f_1 = H ] ),$$ or equivalently, $$\mathbb{E}[ F \mid f_1 = H ] = 7.$$ Once again, since the outcome of the choice of first flip was arbitrary, the expected number of flips is $$\mathbb{E}[F] = \frac{1}{2} \mathbb{E}[ F \mid f_1 = H ] + \frac{1}{2} \mathbb{E} [F \mid f_1 = T] = 7,$$ so the expectation is the same number of rolls and flips are the same.

So it is all well and good that the expected number of flips is the same, but will one side or the other be advantaged by the new method for determining the initial possession? Yes, the flipper will end up with a better odds of winning. Here I will give some hand-wavy justifications, but spinning up some simulations will additionally bear this out. The roller will always consistently have a $5/6$ probability of not winning on each roll. However, the flipper has a more volatile distribution not winning: if she just switched from $H$ to $T$ or vice versa, then the probability of not winning on the next roll is $100%,$ but if there are two in a row it drops to an even $50%$ of winning/not-winning. So as long as the flipper doesn't constantly go into streaks of alternating between H and T one after another, the flipper will have a lower probability of not-losing, or conversely probability of winning in the next round.

Ed Nigma's Answer Emporium

Monday, June 19, 2023

Middle Square Madness

Sunday, June 18, 2023

Laissez le bon temps rouler, or have a flipping good time!