## Mathematical Recreations


#### Introductory Problem: Solutions

I've attacked this problem numerically and analytically. The numerical work is contained in a spreadsheet in Microsoft Excel format. I've tried to be as transparent as possible about the layout and formulas in the spreadsheet, but I've put together a brief overview here.

Smarties® come in six different colors and there are fifteen candies in each roll.

• Assuming the colors are individually random with equal probability, what is the chance of getting a roll of candy with at least one of each color? (See spreadsheet or analytical solutions, but the numerical answer is about 64.4%)
• With the same assumption, what is the chance of getting a roll of candy with at least two of each color? (See the spreadsheet solution, but the numerical answer is about 7.3%)
• With the same assumption, what is the chance of getting a roll of candy with k colors, 0<k<6? (See spreadsheet or analytical solutions)
• With the same assumption, what is the chance of getting a roll of candy with six consecutive candies, each a different color? (I don't really have a satisfactory analytical solution, but you can see the spreadsheet solution. The numerical answer is about 12.3%)
• With the same assumption, given that you have a roll with six different colors, what is the probability that the roll has six consecutive candies with different colors? (This is a trivial combination of the answers to the first question and the previous question. It is about 12.3%/64.4%, or about 19.1%)
• With the same assumption regarding randomness, given n possible colors for each of m locations, what is the probability of having k different colors in the set? (See the analytical solution below)
• With the same assumption, what is the probability of having at least one set of n consecutive locations where each is a different color? I suspect that this may not be solvable in any closed form. In any case, I don't have a closed-form solution yet. A general solution with linear computational complexity is here, and a method for generating analytical solutions for specific values of n is here.
• In an infinite string of ordered locations, each of which can be one of n different colors, what is the rate at which sequences of n different colors occur? The intent is that a location can only be part of one group of n. Specifically, the answer for n = 2 is not 1/2; it is 1/3, since a string like AABAA would count once, not twice. (I have a closed-form solution for this one below. There are also two similar questions that I came up with. The first one is "What is the probability that a particular sequence of n locations will all have different colors?" The answer to this one is n!/n^n. The other similar question is: "What is the probability that a particular location will be part of at least one sequence of n different colors?" I haven't attempted to solve this one yet, but I think it's doable, probably in closed form. Note that it's a special case of the previous question where m = 2n – 1. Perhaps someone will tackle it.)
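These answers are easy to sanity-check by simulation. Here is a minimal Monte Carlo sketch (the function name and trial count are my own choices, not from the spreadsheet) that estimates the first, second, and fourth answers under the equal-probability assumption:

```python
import random

def smarties_sim(trials=100_000, n=6, m=15, seed=1):
    """Estimate P(all n colors present), P(at least two of each color),
    and P(some n consecutive candies all different) for a roll of m candies."""
    rng = random.Random(seed)
    all_colors = two_each = run_of_n = 0
    for _ in range(trials):
        roll = [rng.randrange(n) for _ in range(m)]
        counts = [roll.count(c) for c in range(n)]
        if min(counts) >= 1:
            all_colors += 1
        if min(counts) >= 2:
            two_each += 1
        # any window of n consecutive candies with n distinct colors?
        if any(len(set(roll[i:i + n])) == n for i in range(m - n + 1)):
            run_of_n += 1
    return all_colors / trials, two_each / trials, run_of_n / trials
```

With a hundred thousand trials the three estimates land near 64.4%, 7.3%, and 12.3%, consistent with the spreadsheet.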

Thanks,
Steve

#### General Solution to P(k) for arbitrary n and m

Here we consider the general case of m items, each of which can be any of n different colors. I'm going to go through the process of "discovering" the solution, then deriving it. This mirrors the way I actually solved the problem.

First, let's consider the probability that all m items are the same color. That is, the case where k = 1. There are n possible colors. The probability that any given item is a specific color is equal to 1/n, so we find:

 (1.1)  P(k = 1)  =  n (1/n)^m  =  n x1,

where x1 is the probability that all of the items are a specific color.

If we now consider the case where k = 2, we observe that there are n(n - 1)/2 different ways to choose two colors. The probability that any given item is one of two specific colors is 2/n, but we don't want to count the cases where all of the items are the same color. For each choice of two colors, there are two possible ways that the items could be one color. So we find:

 (1.2)  P(k = 2)  =  [n(n – 1)/2] [(2/n)^m – 2(1/n)^m]  =  [n(n – 1)/2] x2.

Similarly, for k = 3, we note that, for each choice of three colors, there are three ways to choose two of those colors and three ways to choose one of those colors, so:

(1.3)
 P(k = 3)  =  [n(n – 1)(n – 2)/6] [(3/n)^m – 3x2 – 3x1]  =  [n(n – 1)(n – 2)/6] [(3/n)^m – 3(2/n)^m + 3(1/n)^m]  =  [n(n – 1)(n – 2)/6] x3.

At this point, we can start to notice some patterns. The coefficient for the number of ways to choose k colors out of a possible n colors is, of course, the normal binomial coefficient n!/k!(n – k)!. Furthermore, the coefficients of xj are also the binomial coefficients k!/j!(k – j)!. So we can write the general form:

(1.4)
 P(k)  =  [n!/(k!(n – k)!)] [ (k/n)^m  –  ∑_{j=1}^{k–1} [k!/(j!(k – j)!)] xj ]  =  [n!/(k!(n – k)!)] xk

Still, this is not in closed form. Let's do one more specific case for k = 4:

(1.5)
 P(k = 4)  =  [n!/(4!(n – 4)!)] [(4/n)^m – 4(3/n)^m + 6(2/n)^m – 4(1/n)^m]  =  [n!/(4!(n – 4)!)] x4.

Now we can notice that the coefficients of (j/n)m for each P(k) seem to be binomial coefficients as well, but with alternating sign. The coefficient of (j/n)m seems to be (–)(k – j)[k!/j!(k – j)!]. (Here I've used the notation (–)(k – j) = 1 if k – j is even and = –1 if k – j is odd.) Let's see if we can derive the observed result. (I actually went up to k = 6 before I decided to tackle the derivation.) We've shown that

(1.6)
 xk  =  ∑_{j=1}^{k} (–)^(k–j) [k!/(j!(k – j)!)] (j/n)^m

for k < 5, but we can also substitute Eq. 1.6 (assumed to hold for all j < k) into Eq. 1.4 to get a result for general k. (We expect the result to be Eq. 1.6 again, in which case it will hold for all k. In case it's new to anyone, this technique is called the principle of mathematical induction.) We find

(1.7)
 xk  =  (k/n)^m  –  ∑_{j=1}^{k–1} [k!/(j!(k – j)!)] ∑_{i=1}^{j} (–)^(j–i) [j!/(i!(j – i)!)] (i/n)^m

We want to find the coefficient for each (i/n)m, so we note that the order of the sum can be switched. For i = 1, there is a term for each j greater than or equal to one. Similarly, for arbitrary i, there is a term for each j greater than or equal to i. We rewrite the equation as

(1.8)
 xk  =  (k/n)^m  –  ∑_{i=1}^{k–1} ∑_{j=i}^{k–1} (–)^(j–i) [k!/(j!(k – j)!)] [j!/(i!(j – i)!)] (i/n)^m

Looking at Eq. 1.6, we see that we want to pull out a factor of k!/i!(k – i)!, and we can also cancel out the like terms of j! to get

(1.9)
 xk  =  (k/n)^m  –  ∑_{i=1}^{k–1} [k!/(i!(k – i)!)] ∑_{j=i}^{k–1} (–)^(j–i) [(k – i)!/((k – j)!(j – i)!)] (i/n)^m

The sum from j = i to k – 1 isn't going to simplify, so let's substitute x = k – j to get a sum from one to k – i.

(1.10)
 xk  =  (k/n)^m  –  ∑_{i=1}^{k–1} [k!/(i!(k – i)!)] ∑_{x=1}^{k–i} (–)^(k–i–x) [(k – i)!/(x!(k – i – x)!)] (i/n)^m

Now we can see the light of day. We can be pretty sure that the inner sum simplifies to a sign, and it simply remains to understand why. Now that we've switched the variables, we can notice that the inner sum is over the binomial coefficients of k – i with alternating sign. We know that the sum over all such coefficients must be equal to zero. (If you don't already know this, look at (y – 1)^(k–i) for y = 1.) The inner sum is over all of the binomial coefficients of k – i except for x = 0. The inner sum is therefore simply equal to –(–)^(k–i), and we get

(1.11)
 xk  =  (k/n)^m  +  ∑_{i=1}^{k–1} (–)^(k–i) [k!/(i!(k – i)!)] (i/n)^m,

which simplifies trivially to Eq. 1.6. We find, therefore, that

(1.12)
 P(k)  =  [n!/(k!(n – k)!)] ∑_{j=1}^{k} (–)^(k–j) [k!/(j!(k – j)!)] (j/n)^m
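Eq. 1.12 is easy to evaluate directly. Here is a short sketch (the function name is mine) that also checks that the distribution sums to one and reproduces the 64.4% figure for n = 6, m = 15:

```python
from math import comb

def p_exact_k(k, n, m):
    """Eq. 1.12: probability of exactly k distinct colors among m items,
    each independently one of n equally likely colors."""
    return comb(n, k) * sum(
        (-1) ** (k - j) * comb(k, j) * (j / n) ** m
        for j in range(1, k + 1)
    )
```

For example, p_exact_k(6, 6, 15) comes out to about 0.6442, and summing p_exact_k(k, 6, 15) over k = 1..6 gives 1, as it must.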

#### Method For Determining The Probability Of Finding At Least One Sequence Of n Different Colors In A Series Of m Locations.

The method used in AF5:BA21 of the spreadsheet for n = 6 and m = 15 can be applied to arbitrary n and m. (The computation in the spreadsheet tracks more information than is needed here, since it is keeping track of the number of colors which have ever occurred, in addition to the length of the current sequence.)

We can start by defining some notation. Let pm be the probability that a sequence of n differently colored locations will be found within a series of m ordered locations. If we keep track of state as we step through the locations, then at any given time we will have a current sequence of i locations with another j locations remaining to be examined. We define pij as the probability of finding a sequence of n differently colored locations out of a series of i + j ordered locations, given that the first i locations are known to be differently colored. We are looking for pm = p0, m. We can start by noting that

(2.1)
 p0, m  =  p1, m – 1,
 pi, 0  =  0 for i < n, and
 pn, j  =  1 for all j.

Let's start with the case where we have only one location left. We can see that

(2.2)
 pi, 1  =  0 for i < n – 1,
 pn – 1, 1  =  1 / n.

For all i < n and all j > 0,

(2.3)
 pi, j  =  pi + 1, j – 1 (n – i) / n  +  ∑_{k=1}^{i} pk, j – 1 / n.

This can be used to calculate all pi, j for successively larger values of j for any value of n. The computational complexity is linear in n and in j.
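Here is a minimal sketch of that recursion in code (names are mine), using exact rational arithmetic; it reproduces the spreadsheet's ~12.3% for n = 6, m = 15:

```python
from fractions import Fraction

def p_run(n, m):
    """Eqs. 2.1-2.3: probability that m ordered locations, each one of n
    equally likely colors, contain n consecutive differently colored locations."""
    # p[i] holds p_{i,j} after j steps; p_{i,0} = 0 for i < n, p_{n,j} = 1.
    p = [Fraction(0)] * n + [Fraction(1)]
    for _ in range(m - 1):
        q = p[:]
        for i in range(1, n):
            # Either the new color extends the run, or it matches a color
            # d places back and the run falls back to length d.
            q[i] = p[i + 1] * (n - i) / n + sum(p[k] for k in range(1, i + 1)) / n
        p = q
    return p[1]  # p_m = p_{0,m} = p_{1,m-1}
```

As checks, p_run(2, m) reproduces 1 – (1/2)^(m – 1), p_run(3, 3) = 2/9 = 3!/3^3, and float(p_run(6, 15)) ≈ 0.123.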

#### Method for Finding Analytical Solutions To The Probability Of Finding A Sequence Of n Different Colors In A Series Of m Locations For Specific n.

After examining j locations, you have either found a sequence of n differently colored locations, or you have a current state with a run of i differently colored locations. Think of the state after examining j locations as a vector, where element i is the probability that your current run of differently colored locations has length i.

The initial vector after examining one location is (1, 0, 0, ...). The sum of the elements is one minus the probability that a sequence of n consecutive differently colored locations has been found.

There is an operator which transforms the vector each time a new location is examined. For any specific value of n, we can find eigenvectors x such that x transforms into c x, where c is a constant (called the eigenvalue) which is characteristic of that eigenvector.

By finding the n – 1 different eigenvectors and decomposing (1, 0, 0, ...) into a linear combination of those eigenvectors, we can create a general solution for any specific n.
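As a numeric cross-check of this approach, the operator can be written out as an (n – 1) × (n – 1) matrix acting on the run-length vector and applied repeatedly. A sketch (names are mine), in plain Python:

```python
def transfer_matrix(n):
    """One-location update of the run-length probability vector (states 1..n-1).
    The probability of completing a run of n leaks out of the vector."""
    M = [[0.0] * (n - 1) for _ in range(n - 1)]
    for i in range(1, n):                    # current run of i distinct colors
        if i + 1 <= n - 1:
            M[i][i - 1] = (n - i) / n        # a new color extends the run
        for d in range(1, i + 1):
            M[d - 1][i - 1] += 1 / n         # color seen d back resets run to d
    return M

def p_run_numeric(n, m):
    """1 minus the surviving probability after m - 1 applications of the operator."""
    M = transfer_matrix(n)
    v = [0.0] * (n - 1)
    v[0] = 1.0                               # after one location, run length is 1
    for _ in range(m - 1):
        v = [sum(M[r][c] * v[c] for c in range(n - 1)) for r in range(n - 1)]
    return 1.0 - sum(v)
```

For n = 3 this matrix is exactly the transformation in Eq. 3.2 below, and its eigenvalues are (1 ± √2)/3.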

Let's look at the trivial case where n = 2. Each vector has one element. After examining one location, you will always have a current sequence of one differently colored location, corresponding to a vector of (1). After examining a second location, you will complete a sequence of two different colors half the time. The other half of the time, you choose the same color again, and you have a sequence of one differently colored location, corresponding to a vector of (1/2). After examining three locations, the vector is (1/4), and so on. The eigenvalue is 1/2, and the probability that you've found a sequence of n = 2 different locations after examining m locations is given by

 (3.1)  p  =  1 – (1/2)^(m – 1).

For n = 3, the situation is no longer trivial. Let's write the vector as (x1, x2). When you look at another location, (x1, x2) transforms into

 (3.2) (x1 + x2, 2x1 + x2) / 3.

Our eigenvectors will therefore satisfy

(3.3)
 (x1 + x2) / 3  =  c x1
 (2x1 + x2) / 3  =  c x2

It will be convenient to define g = 3c – 1 (or more generally, g = nc – 1) while we solve the linear equations. We find

(3.4)
 x2  =  g x1
 x1  =  g x2 / 2  =  g^2 x1 / 2.

This can immediately be solved to find that g = ±√2. For the positive square root, we find an eigenvector proportional to (1, √2) with an eigenvalue of (1 + √2)/3. For the negative square root, we find an eigenvector proportional to (1, –√2) with an eigenvalue of (1 – √2)/3.

The vector x after examining one location can be written as

 (3.5) x  =  (1, 0)  =  (1, √2) / 2  +  (1, –√2) / 2.

We know how each eigenvector evolves, so we can immediately write the state after examining j locations:

(3.6)
 xj  =  (1/2) (1, √2) [(1 + √2)/3]^(j – 1)  +  (1/2) (1, –√2) [(1 – √2)/3]^(j – 1)

From here, we can easily find the probability that a sequence of n = 3 differently colored locations has been found after examining m locations:

(3.7)
 p  =  1  –  (3/2) [(1 + √2)/3]^m  –  (3/2) [(1 – √2)/3]^m
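Eq. 3.7 can be checked exhaustively for small m, since there are only 3^m equally likely colorings. A quick sketch (names are mine):

```python
from itertools import product
from math import sqrt

def p_closed_n3(m):
    """Eq. 3.7: chance of 3 consecutive different colors in m locations, n = 3."""
    lam_plus, lam_minus = (1 + sqrt(2)) / 3, (1 - sqrt(2)) / 3
    return 1 - 1.5 * lam_plus**m - 1.5 * lam_minus**m

def p_brute_n3(m, n=3):
    """Fraction of all n^m colorings containing n consecutive different colors."""
    hits = sum(
        any(len(set(s[i:i + n])) == n for i in range(m - n + 1))
        for s in product(range(n), repeat=m)
    )
    return hits / n**m
```

For m = 5, both give 40/81: exactly 120 of the 243 colorings contain three consecutive different colors.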

At this point, you may want to skip to the next section. The basic techniques for doing n > 3 are pretty much the same, but the math gets more complex. That is, the eigenvectors and eigenvalues may be complex numbers. To demonstrate how the system evolves with complex solutions, I'm providing a solution for n = 4. Feel free to skip it unless you are truly intrigued.

Our simultaneous equations for n = 4 are

(3.8)
 x2 + x3  =  g x1
 3x1 + x3  =  g x2
 2x2  =  g x3

From these, we find

(3.9)
 x2  =  g x3 / 2
 x1  =  (g^2 – 2) x3 / 6
 g^3  =  5g + 6.

Solving the cubic equation is tedious, but not hard. Obviously, the real root can be found numerically, but the analytical formula also works:

(3.10)
 g0  =  (3 + √(118/27))^(1/3)  +  (3 – √(118/27))^(1/3)  =  2.689095324

 (3.11)  g^2 + g0 g + 6/g0  =  0.

(3.12)
 g±  =  –g0/2  ±  i √(3g0^2 – 20) / 2  =  –g0/2  ±  i gi/2,

where this equation defines gi = √(3g0^2 – 20). Note that it is not necessary to have any powers of g0 greater than g0^2 or any powers of gi at all, since these can be eliminated using the following identities:

(3.13)
 g0^3  =  5g0 + 6
 gi^2  =  3g0^2 – 20.

We find the following three eigenvectors for n = 4

(3.14)
 x0  =  (g0^2 – 2,  3g0,  6)
 x+  =  (6 – g0^2 – i g0 gi,  –3g0 + 3i gi,  12)
 x–  =  (6 – g0^2 + i g0 gi,  –3g0 – 3i gi,  12)

With a little bit of work, we can express (1, 0, 0) in terms of the three eigenvectors:

(3.15)
 (1, 0, 0)  =  [1/(3g0^2 – 5)] x0  +  [(–gi + 3i g0)/(4gi (3g0^2 – 5))] x+  +  [(–gi – 3i g0)/(4gi (3g0^2 – 5))] x–

We know how each eigenvector evolves, so we can use this information to write the general form for the state after examining j locations

(3.16)
 xj  =  [1/(3g0^2 – 5)] x0 [(g0 + 1)/4]^(j – 1)
  +  [(–gi + 3i g0)/(4gi (3g0^2 – 5))] x+ [(2 – g0 + i gi)/8]^(j – 1)
  +  [(–gi – 3i g0)/(4gi (3g0^2 – 5))] x– [(2 – g0 – i gi)/8]^(j – 1)

The elements of x are always real numbers. (Note that the last two terms are complex conjugates of each other; that is, the magnitude of the real and imaginary parts are the same, but with opposite sign for the imaginary part.) To get the answer to our question, we need to sum the elements of x to see how the sum evolves over time.

Multiplying out and using the identities to simplify, we get

(3.17)
 x1 + x2 + x3  =  [(g0^2 + 3g0 + 4)/(3g0^2 – 5)] [(g0 + 1)/4]^(j – 1)
  +  {[gi (2g0^2 – 3g0 – 9) + i (–9g0^2 + 17g0 + 30)]/(2gi (3g0^2 – 5))} [(2 – g0 + i gi)/8]^(j – 1)
  +  {[gi (2g0^2 – 3g0 – 9) – i (–9g0^2 + 17g0 + 30)]/(2gi (3g0^2 – 5))} [(2 – g0 – i gi)/8]^(j – 1)

Now, finally, we can eliminate the complex notation with the identity e^(it) = cos t + i sin t. Noting that the answer we're looking for is 1 – (x1 + x2 + x3), we can write the probability that a series of m ordered locations has at least one sequence of n = 4 differently colored locations:

(3.18)
 p  =  1  –  [(g0^2 + 3g0 + 4)/(3g0^2 – 5)] [(g0 + 1)/4]^(m – 1)
  –  { [(2g0^2 – 3g0 – 9)/(3g0^2 – 5)] cos[(m – 1)(π – tan⁻¹(gi/(g0 – 2)))]
  +  [(9g0^2 – 17g0 – 30)/(gi (3g0^2 – 5))] sin[(m – 1)(π – tan⁻¹(gi/(g0 – 2)))] }
  ×  [(g0^2 – g0 – 4)/16]^((m – 1)/2)

#### General Solution For The Rate Of Occurrence of Sequences of n Different Colors In an Infinite Series

Here we consider an infinite series of locations, where each location can be one of n different colors. The question at hand is a variation on "How often do sequences of n different colors occur?" Taken at face value, the answer is simply the probability that n consecutive locations will all be different colors. That is, n!/n^n. (There is only one set of n colors, but there are n! different ways to order that set. Each ordering has a 1/n^n chance of occurring.)

This answer counts overlapping strings twice, which is OK, if that's what you mean. For example, if there are six colors, the string ROYGVWOR would count twice—once for ROYGVW and once for YGVWOR. My intent is to count the number of nonoverlapping sequences of n different colors. (There is another variation of the question that I haven't solved: "What is the probability that a given location will be part of one or more sequences of n different colors?")
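The intended counting is easy to simulate: scan the string, track the current run of distinct colors, and whenever the run reaches length n, count it and start over so that no location is reused. A sketch (names are mine) with a seeded generator:

```python
import random

def nonoverlap_rate(n, length=300_000, seed=7):
    """Rate of disjoint length-n all-different runs in a long random string."""
    rng = random.Random(seed)
    count, run = 0, []              # run = current all-different suffix, in order
    for _ in range(length):
        c = rng.randrange(n)
        if c in run:                # repeated color: suffix restarts just past it
            run = run[run.index(c) + 1:]
        run.append(c)
        if len(run) == n:           # completed a run; its locations can't be reused
            count += 1
            run = []
    return count / length
```

For n = 2 this comes out near 1/3, as claimed above, and for n = 3 near 1/7, in line with the closed form derived at the end of this section (which gives x0 = 7 for n = 3).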

Let's look first at the simple case of 2 different colors. Let's assume that a sequence of two different colors has just occurred and we are looking for the next occurrence. When you examine the next location, you will have a current sequence of one location. When you examine the second location, you will have a 50/50 chance of completing a sequence of two different colors. If you don't complete the sequence, then you again have a current sequence of one location.

Let's let x1 be the expectation value of the number of additional locations that you will have to examine, if you currently have one consecutive location which is different. Then we can write

 (4.1) x1  =   1 + x1/2,

since you must examine one more location plus a 50/50 chance of examining zero additional locations or x1 additional locations. In this case, we find x1 = 2. We'll let i be the length of the current sequence, and xi will be the expectation value of the number of additional locations which need to be examined before reaching a state of i = n. The answer we are looking for is x0. (Actually, we are looking for 1/x0, since the problem asked for the rate.) For any value of n, we can observe that

 (4.2) x0  =  1 + x1.

For the case of n = 2, we find x0 = 3. Now let's start exploring the system for other values of n. For any specific value of n, we can create a set of linear equations:

(4.3)
 xi  =  1  +  xi + 1 (n – i) / n  +  ∑_{j=1}^{i} xj / n

We note that xn is equal to zero. That is, when you get a sequence of n different locations, you terminate the search. I wrote out these equations manually (and solved them) up through n = 4. In doing that, I observed that there is a fairly small change from xi – 1 to xi. We can write

(4.3a)
 xi – 1  =  xi  –  xi + 1 (n – i) / n  +  xi (n – i) / n  =  xi (2n – i) / n  –  xi + 1 (n – i) / n,

since all but one term of the summation is common to both equations. This recursion formula can be used to generate all of the xi once we know two consecutive values. Since we know one value for i = n, we can also determine all of the xi in terms of xn – 1. Let's do the first few:

(4.4)
 xn – 2  =  xn – 1 (n + 1) / n
 xn – 3  =  xn – 1 (n^2 + n + 2) / n^2
 xn – 4  =  xn – 1 (n^3 + n^2 + 2n + 6) / n^3
 xn – 5  =  xn – 1 (n^4 + n^3 + 2n^2 + 6n + 24) / n^4

I actually had to do this many to notice that there was a pattern, because I wasn't expecting to see one. (I was hoping that n^2 + n + 2 would turn out to be something factorable.) We can now note that xi and xi + 1 bear some resemblance to each other. Let's assume that xi – 1 has the following form:

 (4.5)  xi – 1  =  xi  +  ci xn – 1 / n^(n – i)

At this point, we can see that this form holds true for all i from n – 4 through n – 1. We can now rewrite the recursion formula above in terms of differences between neighboring xi:

(4.6)
 xi – 1 – xi  =  (xi – xi + 1) (n – i) / n
 ci xn – 1 / n^(n – i)  =  (ci + 1 xn – 1 / n^(n – i – 1)) (n – i) / n

This immediately simplifies to

 (4.7) ci  =  (n – i) ci + 1.

Now (if you hadn't already noticed), we can see that the coefficients are factorials, and ci = (n – i)!. Let's look specifically at x0 – x1:

 (4.8)  x0  –  x1  =  c1 xn – 1 / n^(n – 1)

Since we know that x0 – x1 is equal to one and that c1 is equal to (n – 1)!, we find:

 (4.9)  xn – 1  =  n^(n – 1) / (n – 1)!  =  n^n / n!

and we can write the general form for xi

(4.10)
 xi  =  (n^n / n!) ∑_{j=0}^{n – i – 1} j! / n^j
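As a check, Eq. 4.10 can be evaluated exactly and substituted back into the recursion Eq. 4.3. A sketch in exact arithmetic (names are mine):

```python
from fractions import Fraction
from math import factorial

def x_expected(i, n):
    """Eq. 4.10: expected further locations to examine, given a current run
    of i distinct colors (x_n = 0; x_0 is the mean interval between runs)."""
    return Fraction(n**n, factorial(n)) * sum(
        (Fraction(factorial(j), n**j) for j in range(n - i)), Fraction(0)
    )

def satisfies_recursion(n):
    """Check Eq. 4.3 for i = 0..n-1 using the closed-form values."""
    x = [x_expected(i, n) for i in range(n + 1)]
    return all(
        x[i] == 1 + x[i + 1] * (n - i) / n + sum(x[1:i + 1], Fraction(0)) / n
        for i in range(n)
    )
```

This gives x0 = 3 for n = 2 (as found above) and x0 = 7 for n = 3, and the closed form satisfies the recursion for every n I tried.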

As a side note, xn – 1 = n^n/n!, which is one over the rate at which (potentially overlapping) sequences of n different colors occur. (It is also one over the probability that n random locations will all be different colors.) Why?

Well, if you allow overlapping sequences, then you start looking for the next sequence when you already have n different colors in a row. So when you consider the next possible sequence, it's composed of the last n – 1 locations (which are all different), plus the next location. So the mean number of locations that you have to examine to find the next sequence with all n locations colored differently will be xn – 1. This is necessarily one over the probability that a random sequence of n locations will all be colored differently, since the rate is one over the mean interval.

I could have used this argument to simply state that xn – 1 = n^n/n!, but I felt it was better to derive it and then explain it. After all, I didn't have that kind of clarity of thought until after I found the answer, so I felt it was fair to show how I actually got there.