




Probability Distribution Functions

Flip a coin. You'll get either heads or tails. Flip it again, and again and again, each time writing down the result. If you flipped the coin 1000 times, chances are that you'd get heads about 500 times and tails about 500 times. This is because each side of the coin has an equal chance of landing face up, so there is a 50% probability of getting heads and a 50% probability of getting tails. If we were to draw a graph of this relationship, with ``heads'' and ``tails'' on the x-axis and the probability on the y-axis, we would have two points, both at 50%.

Let's do basically the same thing by rolling a die. If we roll it 600 times, we'll probably see around 100 1's, 100 2's, 100 3's, 100 4's, 100 5's and 100 6's. Like the coin, this is because each number has an equal probability of being rolled. I tried this, kept track of each number that was rolled, and got the numbers shown in Table 4.2. If we were to graph this information, it would look like Figure 4.2.


Table 4.2: The results of rolling a die 600 times.
Result                    1     2     3     4     5     6
Number of times rolled    111   98    101   94    92    104


Figure 4.2: The results of 600 rolls of a die. This is a plot of the information shown in Table 4.2.
\includegraphics[width=2.75in]{04statistics/graphics/die_pdf}

Let's say that we didn't know that there was an equal probability of rolling each number on the die. How could we find this out experimentally? All we have to do is to take the numbers in Table 4.2 and divide by the number of times we rolled the die. This then tells us the probability (or the chances) of rolling each number. If the probability of rolling a number is 1, then it will be rolled every time. If the probability is 0, then it will never be rolled. If it is 0.5, then the number will be rolled half of the time.
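
If you'd rather let a computer do the division, here is a minimal sketch in Python. The language and the variable names (`counts`, `probabilities`) are just my choices for illustration; the only real data are the numbers from Table 4.2.

```python
# Counts from Table 4.2: how many times each face came up in 600 rolls.
counts = {1: 111, 2: 98, 3: 101, 4: 94, 5: 92, 6: 104}

# Total number of rolls (the six counts add up to 600).
total_rolls = sum(counts.values())

# Estimated probability of each face = times rolled / total rolls.
probabilities = {face: n / total_rolls for face, n in counts.items()}

for face, p in probabilities.items():
    print(f"{face}: {p:.3f}")
```

Every estimate comes out within about 0.02 of the ideal value of 1/6 (roughly 0.167), and the six probabilities add up to 1, as they must.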

Figure 4.3: The calculated probability of rolling each number on the die using the results shown in Table 4.2.
\includegraphics[width=2.75in]{04statistics/graphics/die_pdf_2}

Notice that the numbers didn't work out perfectly in this example, but they did come close. I was expecting to get each number 100 times, but there was a small deviation from this. The more times I roll the die, the closer the results will come to the theoretical expectation. To check this out, I did a second experiment where I rolled the die 60,000 times.
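
You can try the same comparison without wearing out your wrist. What follows is only a sketch, assuming that Python's built-in pseudo-random number generator is a fair enough stand-in for a real die. The function names and the seed value are arbitrary choices of mine, and the numbers it produces will not match the ones I got with a real die.

```python
import random
from collections import Counter


def estimate_probabilities(num_rolls, seed=0):
    """Roll a simulated fair die num_rolls times and return the
    fraction of rolls that landed on each face."""
    rng = random.Random(seed)  # fixed, arbitrary seed so runs are repeatable
    counts = Counter(rng.randint(1, 6) for _ in range(num_rolls))
    return {face: counts[face] / num_rolls for face in range(1, 7)}


def worst_error(probabilities):
    """Largest gap between an estimated probability and the ideal 1/6."""
    return max(abs(p - 1 / 6) for p in probabilities.values())


for num_rolls in (600, 60_000):
    estimate = estimate_probabilities(num_rolls)
    print(num_rolls, "rolls: worst error =", round(worst_error(estimate), 4))
```

The worst error is the number to watch: it tells you how far the least well-behaved face strays from 1/6, and it should generally shrink as the number of rolls goes up.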

Figure 4.4: The calculated probability of rolling each number on the die using the results after 60,000 rolls. Notice that the graph has a rectangular shape.
\includegraphics[width=2.75in]{04statistics/graphics/die_pdf_3}

This graph tells us a number of things. Firstly, we can see that there is a 0 probability of rolling a 7 (this is obvious because there is no ``7'' on a die, so we can never roll and get that result). Secondly, we can see that there is an almost exactly equal probability of rolling the numbers from 1 to 6 inclusive. Finally, if we look at the shape of this graph, we can see that it makes a rectangle. So, we can say that rolling a die results in a rectangular probability density function or RPDF.

It's possible to have different probability density functions. For example, what would happen if we rolled two dice? Let's do it and find out. I rolled a pair of dice 600 times and kept track of the results. These are all listed in Table 4.3.


Table 4.3: The results of rolling two dice 600 times.
Result    Number of times rolled
2         21
3         36
4         55
5         72
6         78
7         87
8         84
9         60
10        50
11        37
12        20


Notice that I only rolled a ``2'' 21 times, but I rolled a ``7'' 87 times. This is because there was only one way that I could roll a 2: by getting two 1's. However, there are six different ways to get a 7. There's 1+6, 2+5, 3+4, 4+3, 5+2 and 6+1. So, it makes sense that I rolled a 7 much more often than a 2, because there are six times as many combinations that result in a 7 as there are that result in a 2. (In only 600 rolls, the 7 actually came up about four times as often as the 2, but the ratio gets closer to six as the number of rolls goes up.)
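
Counting the combinations by hand gets tedious, so here is a short Python sketch that lists every possible pair of dice and counts how many pairs give each total. The variable names are mine, chosen for illustration; the only assumption is that both dice are fair, so that all 36 pairs are equally likely.

```python
from collections import Counter
from itertools import product

# Every (first die, second die) pair: 6 x 6 = 36 equally likely outcomes.
ways = Counter(a + b for a, b in product(range(1, 7), repeat=2))

for total in range(2, 13):
    probability = ways[total] / 36
    expected_in_600 = 600 * probability
    print(f"{total:2d}: {ways[total]} ways, "
          f"p = {probability:.3f}, expected in 600 rolls = {expected_in_600:.1f}")
```

There is 1 way to roll a 2 and there are 6 ways to roll a 7, so in 600 rolls we would expect about 17 2's and 100 7's. The 21 and 87 that I actually rolled are in the right neighbourhood for such a small experiment.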

If I graph the results of the 600 rolls, we get the plot shown in Figure 4.5. Notice that it looks a bit like a triangle.

Figure 4.5: The results of rolling two dice 600 times (this is a graph of the same data shown in Table 4.3).
\includegraphics[width=2.75in]{04statistics/graphics/600_tpdf}

If I do the same thing, but roll the pair of dice 60,000 times instead, we get something like the numbers shown in Table 4.4.


Table 4.4: The results of rolling two dice 60,000 times.
Result    Number of times rolled
2         1657
3         3283
4         4935
5         6663
6         8430
7         9988
8         8368
9         6699
10        4996
11        3336
12        1645


A graph of these numbers is shown in Figure 4.6 and the same graph represented as a probability (instead of the number of times the values were rolled) is shown in Figure 4.7. Notice that, when we roll so many times, the graph really does look like a triangle. Consequently, we call it a triangular probability density function or TPDF.

Figure 4.6: The results of rolling two dice 60,000 times (this is a graph of the data shown in Table 4.4).
\includegraphics[width=2.75in]{04statistics/graphics/60000_tpdf}

Figure 4.7: The calculated probability of rolling each number on the two dice using the results shown in Table 4.4.
\includegraphics[width=2.75in]{04statistics/graphics/60000_tpdf_probability}
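
To see how close the 60,000 rolls come to a perfect triangle, we can compare the measured probabilities with the ideal ones. This is a sketch in Python using the counts from Table 4.4. The helper `ideal_probability` is a name I made up for illustration. It uses the fact that a total has 6 - |total - 7| combinations out of 36, which assumes two fair dice.

```python
# Counts from Table 4.4: totals of two dice over 60,000 rolls.
counts = {2: 1657, 3: 3283, 4: 4935, 5: 6663, 6: 8430, 7: 9988,
          8: 8368, 9: 6699, 10: 4996, 11: 3336, 12: 1645}

# Total number of rolls (the eleven counts add up to 60,000).
total_rolls = sum(counts.values())


def ideal_probability(total):
    """Theoretical probability of a two-dice total: (6 - |total - 7|) / 36."""
    return (6 - abs(total - 7)) / 36


for total, n in counts.items():
    measured = n / total_rolls
    print(f"{total:2d}: measured {measured:.4f}, ideal {ideal_probability(total):.4f}")

# Biggest difference between any measured probability and its ideal value.
largest_gap = max(abs(n / total_rolls - ideal_probability(t))
                  for t, n in counts.items())
print(f"largest gap: {largest_gap:.4f}")
```

Every measured probability lands within about 0.002 of its ideal value, which is why the graph looks so much more like a triangle than the one for 600 rolls.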

Let's look at another probability density function. Let's look at the ages of children in Grade 5. If we were to take all the Grade 5 students in Canada, ask them their age, and make a PDF out of the results, it might look like Figure 4.8.

Figure 4.8: The calculated probability of the age of students in Grade 5 in Canada.
\includegraphics[width=2.75in]{04statistics/graphics/age_pdf}

This is obviously not an RPDF or a TPDF because the result doesn't look like a rectangle or a triangle. In fact, it is what statisticians call a normal distribution, better known as a bell curve. What this tells us is that the probability of a Canadian Grade 5 student being either 10 or 11 years old is higher than the probability of being any other age. It is possible, but less likely, that the student will be 8, 9, 12 or 13 years old. It is extremely unlikely, but also possible, for the student to be 7 or 14 years old.


Geoff Martin 2006-10-15
