Monday, May 26, 2025

Fiddlish space frequencies

Before getting to rivers, let’s figure out where spaces are likely to appear in the (fictional) Fiddlish language, which includes only three- and four-letter words. These words are separated by spaces, but there is no other punctuation. Suppose a line of Fiddlish text is generated such that each next word has a 50 percent chance of being three letters and a 50 percent chance of being four letters.

Suppose a line has many, many, many words. What is the probability that any given character deep into the line is a space?

Let's suppose that we have $N \gg 1$ words in a line. Then there would be $N-1$ spaces, whereas the total length of the line would be $L = 4F + 3T + N - 1$ characters, where $F$ is the number of four-letter words and $T$ is the number of three-letter words. Since $F + T = N$, we have $L = 4F + 3(N - F) + N - 1 = 4N + F - 1$.

Since $F$ is binomially distributed with frequency $\frac{1}{2}$, the expected value of $F$ is $E[F] = \frac{N}{2}$, so the expected length is $E[L] = \frac{9}{2}N - 1$.
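As a sanity check on the expected length, here is a minimal sketch that sums $4N + F - 1$ over the binomial distribution of $F$ exactly, using rational arithmetic. The function name `expected_length` is just an illustrative choice, not something from the puzzle.

```python
from fractions import Fraction
from math import comb


def expected_length(n):
    """Exact E[L] for a line of n words, where L = 4n + F - 1
    and F ~ Binomial(n, 1/2) counts the four-letter words."""
    total = Fraction(0)
    for f in range(n + 1):
        prob = Fraction(comb(n, f), 2 ** n)  # P(F = f)
        total += prob * (4 * n + f - 1)      # line length when F = f
    return total


if __name__ == "__main__":
    for n in (1, 2, 10, 50):
        # The second column should match the closed form 9n/2 - 1.
        print(n, expected_length(n), Fraction(9 * n, 2) - 1)
```

For every $N$ tried, the exact sum agrees with the closed form $\frac{9}{2}N - 1$.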

Therefore, the expected frequency of spaces in the line, which is equivalent to the probability that any given character deep into the line is a space, is then $$p = \frac{N-1}{E[L]} = \frac{N-1}{\frac{9}{2}N - 1} \to \frac{2}{9} = 22.222\dots\%$$

as $N \to \infty$.
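The limit can also be checked numerically. The following is a rough Monte Carlo sketch, assuming only the setup described above (each word is independently three or four letters with equal probability, and $N$ words are separated by $N - 1$ spaces). The helper name `space_fraction` and its `seed` parameter are hypothetical, chosen for illustration.

```python
import random


def space_fraction(num_words, seed=0):
    """Fraction of characters that are spaces in one random Fiddlish line."""
    rng = random.Random(seed)
    # Count the four-letter words; each word is four letters with probability 1/2.
    four_letter = sum(rng.random() < 0.5 for _ in range(num_words))
    spaces = num_words - 1
    # Total length L = 4F + 3(N - F) + (N - 1) = 4N + F - 1.
    length = 4 * num_words + four_letter - 1
    return spaces / length


if __name__ == "__main__":
    for n in (10, 1_000, 100_000):
        print(n, space_fraction(n))
    print("limit 2/9 =", 2 / 9)
```

For long lines, the simulated fraction should settle close to $\frac{2}{9}$, since the line length concentrates around its expected value as $N$ grows.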
