Monday, November 17, 2008

Combinatorial Analysis for Basic Probability Theory

Introduction

Combinatorial Analysis is concerned with determining the number of logical possibilities of some event without necessarily enumerating each one.

Basic Principle of Counting


If some event can occur in `n_1` different ways, and a following event can occur in `n_2` different ways ..., then the total number of ways the the events occur in the order given is `n_1*n_2*n_3* ...`


example 1


License plates begin with two letters, followed by a digit which may range from 1-9, which are then followed by two more digits each of which may range from 0-9. How many different license plate numbers may there be?


`26*26*9*10*10 = 608400`


example 2


There are `n` coin tosses, each toss may result in one of two outcomes: heads or tails. What are the total number of possible outcomes?


`2*2*2* ... *2 = 2^n` (`n` number of tosses, each having two possible outcomes)


Factorials


A factorial is the product of `n` integers ranging from `1` to `n` inclusively. It is written `n!`.

`n! = 1*2* ... *(n-1)*n`


By definition `1!``=1` and `0!``=1`. Factorials may also be defined recursively: `n! = n*(n-1)!


Binomial Coefficients and the Binomial Theorem


The symbol, `((n),(r))`, where `r` and `n` are positive integers with `r <= n`, is known as a binomial coefficient. This will be shown later to represent the value of the coefficient of the rth term in the binomial expansion of `(a+b)^n`. It is defined as follows:


(1) `((n),(r)) = (n(n-1)(n-2)*...*(n-r+1))/(1*2*3*...*(r-1)r)`


Observe that `(n-r)!` can be written as:

`(n-r)(n-r-1)*...*2*1`

and this expression multiplied by the numerator of (1) completes the sequence of `n!`:


`[n(n-1)(n-2)*...*(n-r+1)](n-r)! = n!`


Thus multiplying the RH side of (1) by `((n-r)!)/((n-r)!)` yields:


(2) `((n),(r)) = (n(n-1)(n-2)*...*(n-r+1))/(1*2*3*...*(r-1)r)*(((n-r)!)/((n-r)!))= (n!)/(r!(n-r)!)`


example and observation


Note that the number of factors in the numerator and denominators are equal:


`((8),(6)) = (8!)/(2!6!) = (8*7*6*5*4*3*2*1)/((2*1)*(6*5*4*3*2*1)) = (8*7)/(2*1)`


`((7),(3)) = (7!)/(4!3!) = (7*6*5*4*3*2*1)/((4*3*2*1)*(3*2*1)) = (7*6*5)/(3*2*1)`


Useful Identity


We can show that


(3) `((n),(n-r)) = ((n),(r))`

by substituting `n-r` for `r` in (1): `((n),(n-r)) = (n(n-1)(n-2)*...*(n-(n-r)+1))/(1*2*3*...*(n-r-1)(n-r))` and simplifying:


(4)`((n),(n-r)) = (n(n-1)(n-2)*...*(r+1))/((n-r)!)`


Similar to the operation performed in (2), multiplying the RH side of (4) by `(r!)/(r!)` yields:


(5)`((n),(n-r)) = (n(n-1)(n-2)*...*(r+1))/((n-r)!)*((r!)/(r!)) = (n!)/(r!(n-r)!)`


Since `(n-r)+r=n`, an important result of this fact is that if the sum of two numbers (say `a` and `b`) is `n`, i.e., `a+b=n`, then


(6) `((n),(a)) = ((n),(b))`


example


`((7),(3))=(7!)/(4!3!) = (7!)/(3!4!) = ((7),(4))`

The Binomial Theorem


The Binomial Theorem is used to calculate the coefficients of each term of a binomial expansion:


(7) `(x+y)^n = sum_(k=0)^n ((n),(k))x^(k)y^(n-k)`


Proof by Induction


Before starting the proof, a useful combinatorial identity is presented without proof:


(8) `((n),(r)) = ((n-1),(r-1)) + ((n-1),(r))     1<=r<=n`


Proof of case for `n` value of theorem is `1`


When `n=1`, (7) reduces to


(9) `x+y=((1),(0)) x^0y^1 + ((1),(1))x^1y^0 = x+y`      `:.`(`remember `0!``=1`)


Proof of case for `n` value of theorem `n-1 => n`


Clearly it is true that


(10) `(x+y)^n = (x+y)(x+y)^(n-1)


Now we will assume that (7) holds for `n-1` and that (10) can be equivalently expressed:


(11) `(x+y)^n = (x+y)sum_(k=0)^(n-1) ((n-1),(k))x^(k)y^(n-1-k)`


We will now show that the product represented by the RHS of (11) is equivalent to the RHS of (7), thus completing the proof.


First we distribute the summation over the `(x+y)` factor:


(12) `(x+y)^n = sum_(k=0)^(n-1) ((n-1),(k))x^(k+1)y^(n-1-k) + sum_(k=0)^(n-1) ((n-1),(k))x^(k)y^(n-k)`


For the first summation term, we let `i=k+1`, which equivalently is `k=i-1`. In the second summation, we let `i=k`. Note that this transformation does not change the number of terms, the values of the coefficients, or the exponent values of any of the factors:


(13) `(x+y)^n = sum_(i=1)^(n) ((n-1),(i-1))x^(i)y^(n-i) + sum_(i=0)^(n-1) ((n-1),(i))x^(i)y^(n-i)`


Next we change the range of both summations to be the same: `sum_(i=1)^(n-1)`, which means that the ending index of the first summation changes from `n` to `n-1` and the starting index of the second summation changes from 0 to 1. The changes do not affect the coefficient values of any of the middle terms, but does drop the end terms `x^n` and `y^n`. These terms are added to the summation to maintain their algebraic equivalence with the expression being operated on by this change. Since the summation operators are now the same, we can apply the distributive property:


(14) `(x+y)^n = x^n + sum_(i=1)^(n-1) (((n-1),(i-1)) + ((n-1),(i)))x^(i)y^(n-i) + y^n`


We can now apply the combinatorial identity introduced by (8):


(15) `(x+y)^n = x^n + sum_(i=1)^(n-1) ((n),(i)) x^(i)y^(n-i) + y^n`


Folding the end terms into the summation, we complete the proof:


(16) `(x+y)^n = sum_(i=0)^(n) ((n),(i)) x^(i)y^(n-i)` `:.`


Pascal's Triangle and Binomial Coefficients






Pascal's triangle has many interesting properties besides the ones I describe below. For a more thorough introduction, visit the Wikipedia article Pascal's triangle.



  • We can assign a row index `n`, beginning with 0, to each row of the triangle. Observe that each row represents the sequence of coefficients for an expansion of a binomial of order `n`. For a given row, the value of the coefficient of the column index `k` is given by `((n),(k))`.


  • There are two useful combinatorial identities which demonstrate that a given coefficient element is determined by the sum of elements directly above it in the preceeding row:



    1. `((n+1),(k)) = ((n),(k-1)) + ((n),(k))`


    2. `((n),(k)) = ((n-1),(k-1)) + ((n-1),(k))`






Permutations


Any arrangement of a collection of `n` objects in a given order is called a permutation of the objects (taken all at the time). Any arrangement of `r<=n` of these objects in a given order is callled an r-permutation or a permutation of the n objects taken r at a time.


example


Given a collection of four letters {a,b,c,d}:

  • bdca, dcba and acdb are permutations from the collection (taken all at the time)

  • bad, adb, cdb and bca are permutations from the collection (taken three at the time)




The following notations denote the same concept - the number of possible permutations of a collection of n objects taken r at the time:


P(n,r), nPr, Pn,r, or (n)r. This number is calculated using the fundemental principle of counting. For instance for P(4,3), we have four possiblities for the first choice, three possiblities for the second choice, and two possiblities for the third choice:


P(4,3) = 4*3*2 = 24


The general case for P(n,r) is


(17) P(n,r) = `n(n-1)(n-2)*...*(n-r+1)`


Multiplying the RHS of (17) by the identity `((n-r)!)/((n-r)!)` yields:


(18) P(n,r) = `(n!)/((n-r)!)`


Permutations where some elements are indistinguishable


Consider all permutations of the letters in the word 'DADDY'. Three of letters ('D') are the same - which means that some permutations are indistinguishable from one another. For a given permutation, for the first 'D' placed, one of three 'D' elements may chosen, for the second choice, one of two remaining 'D' elements may be chosen. Finally there is only one choice for the last one chosen. The important point of course is that no matter which 'D' is chosen for a given permutation, they all appear the same. Often when analyzing this type of problem it helps to think of these elements as having a separate identity using an index, for instance D1, D2, D3. Nevertheless, each separate element appears the same and is said to be indistiguishable in this respect.


When creating a permutation with the above letters, we choose the letter 'D' three times, which can be done in `3.2.1 = 3!` ways. Now observe that there are `6!` ways to permute the letters in 'DADDY'. Thus there are `(6!)/(3!) = 120/6 = 20` distinguishable ways to permute the letters. This leads us to the following theorem:


The number of permutations of `n` objects of which `n_1, n_2, ..., n_r` are alike is:


(19) `(n!)/(n_1!*n_2!* ... *n_r!`    Also see (24).


Combinations


Given a collection of n objects, a combination of n objects taken r at a time is any selection of r of the objects where order is disregarded. In other words, an r-combination of a set of n objects is any subset of those objects of size r.

example


Given the collection of four letters {a,b,c,d}, the r-combinations of size 3 are: abc, abd, acd, bcd. Note that the permutations involving the letters abc are abc, acb, bac, bca, cab, cba. Viewed as combination, these permutations are equivalent.

Notation


The number of combinations of n objects taken r at time may be denoted in several ways, all of them representing the equivalent function: C(n,r), nCr, Cn,r, `C_n^r`.

Recall that the number of permutations P(n,r) = `(n!)/((n-r)!)`. As illustrated by the previous example, we know that a given set of `r` elements, taken `r` at a time, can be arranged (permuted) in `r!` ways. Thus we can 'factor' out the redundant arrangements by dividing P(n,r) by `r!` to arrive at the number of unique combinations:


(20) `C(n,r) = (P(n,r))/(r!) = (n!)/(r!(n-r)!)`


Observe that this is the same formula for determining the binomial coefficient `((n),(r))`. Thus `C(n,r) = ((n),(r))`.


As noted in the introductory paragraph of the section, `C(n,r)` is the number of subsets of size `r` that can be combined from the elements the a set of size `n`. How many subsets can be made from a set `A` of size `n`? Let's be clear that what we are talking about is the power set of `A` - written `tt P(A)`. The cardinality of this set is '2^n'. This can be easily seen using the fundamental principle of counting. There is a binary choice for set membership - in or out. Thus for n elements in a set, there are `2^n` ways this can happen.


Multinomial Theorem


A multinomial is of the form `(x_1 + x_2 + ... + x_(m-1) + x_m)^n`


Let's look at the expansions of `(a+b+c+d)^2` and `(a+b+c+d)^3`:


(21) `(a+b+c+d)^2 = a^2 + b^2 + c^2 + d^2 + 2ab + 2ac + 2ad + 2bc + 2bd + 2cd`

(22) `(a+b+c+d)^3 = a^3 + b^3 + c^3 + d^3 + 3ab^2 + 3ac^2 + 3ad^2 + 3bc^2 + 3bd^2 + 3cd^2 + 3a^2b + 3a^2c + 3a^2d + 3b^2c + 3b^2d + 3c^2d + 6abc + 6abd + 6acd + 6bcd`

Observations



  • For `(a+b+c+d)^2`, the number of unique variables in a term may be 1 or 2.

  • For `(a+b+c+d)^3`, the number of unique variables in a term may be 1, 2, or 3.


Of course, the reason there is never more variables in a term than the integer value of the exponent is because of the number of multiplications involved in producing a term. Not counting the coefficients, a term is produced by some combination of multiplication with the same variable or another one. The reason some coefficient values are greater than one is because in a given multiplication, when the variables are not the same, the multiplication is performed more than once. For example for `(a+b+c+d)^2`, we produce terms `ab` and `ba`, Since order is not important, we just add (collect) them: `ab+ba = 2ab`.


In the case of `(a+b+c+d)^3`, we produce terms by multiplying the result in (21) by `(a+b+c+d)`. Although the terms in the RHS of (21) differ completely with the terms in `(a+b+c+d)`, multiplying still produces the necessary terms, which when collected, result in the coefficients we see.

Groundwork Analysis


We will abstract and generalize the base of a given multinomial by using indexed variables, thus the general form of a multinomial is: `(x_1+x_2+ ... +x_(r-1)+x_r)^n`. When multiplying, a given term will always have the form: `bb C(x_1^(n_1)*x_2^(n_2)* ... *x_(r-1)^(n_(r-1))*x_r^(n_r))`. However, there is a constraint that for a given term, the exponents must sum to the value of `n`. We will write this constraint using this notation:

(23) `n_1+n_2+ ... +n_(r-1)+n_r = n, n >= 0`.


This obviously implies that for multinomials for which `r<n`, a given `n_i` may range from `0` through `n`.


How are exponent values distributed in an expansion? For example we see that for `(a+b+c+d)^3`, `a^1b^1c^1` occurs six times, which involves six different multiplication operations. Thus the coefficient value, in this case six, tells us how many ways this product was produced.

Multinomial Coefficients


We develop here a formula which determines how we distribute `n` distinct items among `r` distinct groups of size `n_i`, where the sum of all `n_i` must equal `n` (see ((23)). This formula has other uses in combinitorial analysis, but for our purpose we can state that `n` is the order of the multinomial and r is the dimensionality or number of variables.


Here we will 'chain' (multiply) an arbitrary number of binomial coefficient functions to arrive at our formula.

We assign an arbitrary value `n_i` for each grouping within the set of `r` distinct groups, while maintaining the constraint that the sum of all `n_i` equals `n`, that is: `sum_(i=1)^r n^i = n`


Now there are `((n),(n_1))` possible choices for the first group.

There are `((n-n_1),(n_2))` possible choices for the second group.

There are `((n-n_1-n_2),(n_3))` possible choices for the third group - and so on.


Using the generalized fundemental principle of counting we get:

`((n),(n_1))((n-n_1),(n_2)) ... ((n-n_1-n_2- ... -n_(r-1)),(n_r)) = ((n!)/((n-n_1)!n_1!))(((n-n_1)!)/((n-n_1-n_2)!n_2!)) ... (((n-n_1-n_2- ... -n_(r-1))!)/(0!n_r!)) = `


(24) `((n!)/(n_1!n_2! ... n_r!))`


possible divisions. Note that this result is the same as (19).

Multinomial Theorem


We can now write and understand our multinomial theorem:


(25) `(a+b+c+d)^n = sum_(n_1,...,n_r) ((n!)/(n_1!n_2! ... n_r!)) x_1^(n_1)*x_2^(n_2)* ... *x_(r-1)^(n_(r-1))*x_r^(n_r)`    where `n_1+ ... +n_r = n`


In English, this theorem states that the expansion is the sum of all nonnegative integer-valued vectors `(n_1, ... , n_r)` such that `n_1+ ... +n_r = n`


How many coefficients are in an expansion?


The answer is given by the following theorem:


There are `((n+r-1),(n))` distinct nonnegative interger valued vectors `(x_1,x_2,...,x_r)` satisfying the eq. `x_1+x_2+...+x_r = n`. Not proved here, see reference.

example


Let's apply the multinomial theorem to the expansion of `(x+y+z)^2`:


`(x+y+z)^2 = ((2),(2 0 0)) x^2y^0z^0 + ((2),(0 2 0)) x^0y^2z^0 + ((2),(0 0 2)) x^0y^0z^2 + ((2),(1 1 0)) x^1y^1z^0 + ((2),(1 0 1)) x^1y^0z^1 + ((2),(0 1 1)) x^0y^1z^1 =`


`x^2+y^2+z^2+2xy+2xz+2yz`


There are `((2+3-1),(2))` = `((4),(2)) = ((4!)/((4-2)!2!)) = 24/4 = 6` coefficients.

References


Ross, Sheldon M.; A First Course in Probability; 1976.

No comments: