
Statistics and probability


Page 1: Statistics and probability

What is Data?

Data is a collection of facts, such as values or measurements.

It can be numbers, words, measurements, observations or even just

descriptions of things.

Qualitative vs Quantitative

Data can be qualitative or quantitative.

Qualitative data is descriptive information (it describes something)

Quantitative data is numerical information (numbers).

And Quantitative data can also be Discrete or Continuous:

Discrete data can only take certain values (like whole numbers)

Continuous data can take any value (within a range)

Put simply: Discrete data is counted, Continuous data is measured

Example: What do we know about Arrow the Dog?


Qualitative:

He is brown and black

He has long hair

He has lots of energy

Quantitative:

Discrete:

He has 4 legs

He has 2 brothers

Continuous:

He weighs 25.5 kg

He is 565 mm tall

To help you remember think "Quantitative is about Quantity"

Collecting

Data can be collected in many ways. The simplest way is direct observation.

Example: you want to find how many cars pass by a certain point on a

road in a 10-minute interval.

So: stand at that point on the road, and count the cars that pass by in

that interval.

You collect data by doing a Survey.

Census or Sample

A Census is when you collect data for every member of the group (the

whole "population").

A Sample is when you collect data just for selected members of the group.

Example: there are 120 people in your local football club.

You can ask everyone (all 120) what their age is. That is a census.

Or you could just choose the people that are there this afternoon. That

is a sample.


A census is accurate, but hard to do. A sample is not as accurate, but may be

good enough, and is a lot easier.

Language

Data or Datum?

The singular form is "datum", so we would say "that datum is very high".

"Data" is the plural so we can say "the data are available", but it is also

a collection of facts, so "the data is available" is fine too.

Discrete and Continuous Data

Data can be Descriptive (like "high" or "fast") or Numerical (numbers).

And Numerical Data can be Discrete or Continuous:

Discrete data is counted, Continuous data is measured

Discrete Data

Discrete Data can only take certain values.

Example: the number of students in a class (you can't have half a

student).

Continuous Data

Continuous Data can take any value (within a range)

Examples:

A person's height: could be any value (within the range of

human heights), not just certain fixed heights,

Time in a race: you could even measure it to fractions of a second,

A dog's weight,

The length of a leaf,

Lots more!

Analog and Digital

Analog: something physical with continuous change.

Digital: made up of numbers.

Arrow Barks!

Let's record him barking:

Arrow's bark is analog. It is actual pressure waves in the air, so it is physical

with continuous change.

Continuous change: changes smoothly ... no sudden breaks.

And the microphone converts that pressure into an electrical signal. It is

still analog (the electricity is physical, and has continuous change).

But when it gets to your computer or phone it gets converted to digits!

Thousands of times a second the analog signal is measured by special electronics ... and is then saved as numbers.

So the "sound" is now "12, 25, 39, 52, 68, 71, 78, 82, 82, 79, 70, 59, ..." (in

fact it would be in binary, so would be something like

"000011000001100100100111...")

It is now digital!

Notice the digital data has sudden jumps up and down ... it

does not change continuously.

It is Discrete Data: that means it can only be certain values (such as 1, 2, 3,

etc).
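As a rough sketch of that measuring step, the Python below samples a smooth wave at regular intervals and rounds each reading to a whole number. The sine wave standing in for the bark, the 0 to 255 range and the function names are all assumptions made for illustration; real recording hardware measures far more often and usually with more digits.

```python
import math

def sample_wave(num_samples, levels=256):
    """Measure one cycle of a sine wave and round each reading to a whole number."""
    samples = []
    for i in range(num_samples):
        analog = math.sin(2 * math.pi * i / num_samples)  # smooth value between -1 and 1
        digital = round((analog + 1) / 2 * (levels - 1))   # whole number from 0 to levels - 1
        samples.append(digital)
    return samples

def to_binary(samples, bits=8):
    """Write each sample as a fixed-width binary string and join them together."""
    return "".join(format(s, f"0{bits}b") for s in samples)

print(sample_wave(8))           # eight discrete readings of the smooth wave
print(to_binary([12, 25, 39]))  # the first three numbers of the "sound" in the text
```

Each reading can only be one of 256 whole numbers, which is why the digital version is Discrete Data.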

Digital data is very easy for computers and phones to use. It can be saved,

shared electronically, sent all over the world quickly and more.

How can we hear Digits?

Easy! The numbers are used to control the size of an electrical signal, which

is analog.

The electricity can be sent to a speaker ... to make sound waves again!

It should sound very much like the original bark (but not perfectly so!)


Digital Pictures

A similar thing happens when you take a picture.

Light (which is analog) gets projected onto a grid of millions of little sensors

inside the camera:

The camera measures the light at each point and produces numbers.

The picture is now digital!

So the "picture" is now "A1DDF9, ADE3FF, B5E7FE, AFE4F8, ...", which

are hexadecimal color numbers (that are used internally in binary, so would

be something like "101000011101110111111001...")

Look really closely at a digital picture ... it is made up of millions of little

squares called "pixels":

Each "pixel" is made using a hexadecimal color number.
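To see how one of those color numbers breaks down, here is a small Python sketch. It assumes the usual convention that the six hexadecimal digits are three pairs giving the red, green and blue amounts; the function names are made up for this example.

```python
def hex_to_rgb(color):
    """Split a 6-digit hexadecimal color into red, green and blue values (0 to 255)."""
    return tuple(int(color[i:i + 2], 16) for i in (0, 2, 4))

def hex_to_binary(color):
    """Write the same color as 24 binary digits."""
    return format(int(color, 16), "024b")

print(hex_to_rgb("A1DDF9"))
print(hex_to_binary("A1DDF9"))
```

The binary form of "A1DDF9" is the same string of 1s and 0s quoted in the text.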

Digital IS Numbers

So digital pictures, music, videos etc are actually stored on your computer or

phone as numbers.

Numbers rule!


Bar Graphs

A Bar Graph (also called Bar Chart) is a graphical display of data using bars

of different heights.

Imagine you just did a survey of your friends to find which kind of movie

they liked best:

Table: Favorite Type of Movie

Comedy Action Romance Drama SciFi

4 5 6 1 4

You could show that on a bar graph like this:

It is a really good way to show relative sizes: it is easy to see which types of

movie are most liked, and which are least liked, at a glance.

You can use bar graphs to show the relative sizes of many things, such as

what type of car people have, how many customers a shop has on different

days and so on.
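If you have no drawing tool handy, a bar graph can be roughed out in plain text. This is only a sketch: the function name is made up, and the bars are drawn sideways with one "#" per vote, using the movie survey numbers from the table.

```python
def text_bar_chart(data):
    """Return one line per category, with a bar of '#' as long as its count."""
    width = max(len(name) for name in data)
    return [f"{name.ljust(width)} | {'#' * count} {count}" for name, count in data.items()]

movies = {"Comedy": 4, "Action": 5, "Romance": 6, "Drama": 1, "SciFi": 4}
for line in text_bar_chart(movies):
    print(line)
```

The longest bar (Romance) and the shortest bar (Drama) stand out at a glance, just as they would in a drawn bar graph.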


Example: Most Popular Fruit

A survey of 145 people revealed their favorite fruit:

Fruit: Apple Orange Banana Kiwifruit Blueberry Grapes

People: 35 30 10 25 40 5

And here is the bar graph:

For that group of people Blueberries are most popular and Grapes are the

least popular.

Example: Student Grades

In a recent test, this many students got the following grades:

Grade: A B C D

Students: 4 12 10 2

And here is the bar graph:


You can create graphs like that using our Data Graphs (Bar, Line and

Pie) page.

Histograms vs Bar Graphs

Bar Graphs are good when your data is in categories (such as "Comedy", "Drama", etc).

But when you have continuous data (such as a person's height) then use a Histogram.

Pie Chart


Pie Chart - A special chart that uses "pie slices" to show relative

sizes of data.

Imagine you just did a survey of your friends to find which kind of movie

they liked best.

Here are the results:

Table: Favorite Type of Movie

Comedy Action Romance Drama SciFi

4 5 6 1 4

You could show that by this pie chart:

It is a really good way to show relative sizes: it is easy to see which movie

types are most liked, and which are least liked, at a glance.

You can create graphs like that using our Data Graphs (Bar, Line and

Pie) page.

Or you can make them yourself ...

How to Make Them Yourself


First, put your data into a table (like above), then add up all the values to

get a total:

Comedy Action Romance Drama SciFi TOTAL

4 5 6 1 4 20

Next, divide each value by the total and multiply by 100 to get a percent:

Comedy        Action        Romance       Drama         SciFi         TOTAL
4             5             6             1             4             20
4/20 = 20%    5/20 = 25%    6/20 = 30%    1/20 = 5%     4/20 = 20%    100%

Now you need to figure out how many degrees for each "pie slice" (correctly

called a sector).

A Full Circle has 360 degrees, so we do this calculation:

Comedy               Action               Romance              Drama                SciFi                TOTAL
4                    5                    6                    1                    4                    20
4/20 = 20%           5/20 = 25%           6/20 = 30%           1/20 = 5%            4/20 = 20%           100%
4/20 × 360° = 72°    5/20 × 360° = 90°    6/20 × 360° = 108°   1/20 × 360° = 18°    4/20 × 360° = 72°    360°
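The two calculations above (percent, then degrees) can be sketched in a few lines of Python. The function name is made up for this example; the numbers are the movie survey results.

```python
def pie_sectors(data):
    """For each category return (percent of total, degrees of the sector)."""
    total = sum(data.values())
    return {name: (100 * value / total, 360 * value / total)
            for name, value in data.items()}

movies = {"Comedy": 4, "Action": 5, "Romance": 6, "Drama": 1, "SciFi": 4}
for name, (percent, degrees) in pie_sectors(movies).items():
    print(name, percent, degrees)
```

A quick check that the percents add to 100 and the degrees add to 360 catches most arithmetic slips.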


Now you are ready to start drawing!

Draw a circle.

Then use your protractor to measure

the degrees of each sector.

Here I show the first sector ...

... you can do the rest!

More Examples

You can use pie charts to show the relative sizes of many things, such as:

what type of car people have,

how many customers a shop has on different days,

how popular different breeds of dogs are, and so on.

Example: Student Grades

Here is how many students got each grade in the recent test:

A B C D

4 12 10 2

And here is the pie chart:


Dot Plots

A Dot Plot is a graphical display of data using dots.

Imagine you just did a survey of your friends to find which kind of movie

they liked best:

Table: Favorite Type of Movie

Comedy Action Romance Drama SciFi

4 5 6 1 4

On a dot plot it looks like this:


It is a really good way to show relative sizes: it is easy to see which types of

movie are most liked, and which are least liked, at a glance. Very similar to

a bar graph.

Here is another example:

Example: Minutes To Eat Breakfast

A survey of "How long does it take you to eat breakfast?" had the following

results:

Minutes: 0 1 2 3 4 5 6 7 8 9 10 11 12

People: 6 2 3 5 2 5 0 0 2 3 7 4 1

Which means that 6 people take 0 minutes to eat breakfast (they probably

had no breakfast!), 2 people say they only spend 1 minute having breakfast,

etc.

And here is the dot plot:

Another version of the dot plot has just one dot for each data point like this:

Example: (continued)

This has the same data as above:

But notice that we need to have lines and numbers on the side so we can see

what the dots mean.


Line Graphs

Line Graph - A graph that shows information that is connected in

some way (such as change over time)

You are learning math facts, and each day you do a short test to see how

good you are. These are the results:

Table: Facts I got Correct

Day 1 Day 2 Day 3 Day 4

3 4 12 15

And here is the same data as a Line Graph:

You seem to be improving!

Scatter Plots

A graph of plotted points that show the

relationship between two sets of data.

In this example, each dot represents one person's

weight versus their height.

(The data is plotted on the graph as "Cartesian (x,y) Coordinates")

Example:

The local ice cream shop keeps track of how much ice cream they sell versus

the temperature on that day. Here are their figures for the last 12 days:

Ice Cream Sales vs Temperature

Temperature °C Ice Cream Sales

14.2° $215

16.4° $325

11.9° $185

15.2° $332

18.5° $406

22.1° $522

19.4° $412

25.1° $614

23.4° $544

18.1° $421

22.6° $445

17.2° $408

And here is the same data as a Scatter Plot:


It is now easy to see that warmer weather leads to more sales, but the

relationship is not perfect.

Line of Best Fit

You can also draw a "Line of Best Fit" (also called a "Trend Line") on your

scatter plot:

Try to have the line as close as possible to all points, and as many points

above the line as below.
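Drawing the line by eye works well on paper. If you would rather calculate one, a common choice is the "least squares" line, sketched below in Python. Note that least squares is a method swapped in for this example (the text above only describes fitting by eye), and the function name is made up. The data is the ice cream shop's 12 days.

```python
def best_fit_line(xs, ys):
    """Least-squares slope and intercept for the line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

temps = [14.2, 16.4, 11.9, 15.2, 18.5, 22.1, 19.4, 25.1, 23.4, 18.1, 22.6, 17.2]
sales = [215, 325, 185, 332, 406, 522, 412, 614, 544, 421, 445, 408]

slope, intercept = best_fit_line(temps, sales)
print(f"sales = {slope:.1f} * temperature + {intercept:.1f}")
```

A positive slope matches what the scatter plot shows: higher temperatures go with higher sales.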

Example: Sea Level Rise


A Scatter Plot of Sea Level Rise:

And here I have drawn on a "Line of Best Fit".

Correlation

When the two sets of data are strongly linked together we say they have

a High Correlation.

The word Correlation is made of Co- (meaning "together"), and Relation

Correlation is Positive when the values increase together, and

Correlation is Negative when one value decreases as the other

increases

Like this:


Negative Correlation

Correlations can be negative, which means there is a correlation but one

value goes down as the other value increases.

Example: Birth Rate vs Income

The birth rate tends to be lower in richer countries.

Below is a scatter plot for about 100 different countries.

Country       Yearly Production per Person    Birth Rate
Madagascar    $800                            5.70
India         $3,100                          2.85
Mexico        $9,600                          2.49
Taiwan        $25,300                         1.57
Norway        $40,000                         1.78

It has a negative correlation (the line slopes down)

Note: I tried to fit a straight line to the data, but maybe a curve would work

better, what do you think?
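One common way to put a number on correlation is Pearson's correlation coefficient, which runs from -1 (perfect negative) through 0 (none) to +1 (perfect positive). The sketch below is an assumption about which measure to use (the text does not name one), the function name is made up, and it uses only the five countries listed in the table rather than all 100 or so.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

production = [800, 3100, 9600, 25300, 40000]  # yearly production per person, in dollars
birth_rate = [5.70, 2.85, 2.49, 1.57, 1.78]

r = pearson_r(production, birth_rate)
print(round(r, 2))  # below zero, so the correlation is negative
```

Pearson's coefficient measures how well a straight line fits, so if a curve suits the data better the number will understate how strongly the two are linked.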


Pictographs

A Pictograph is a way of showing data using images.

Each image stands for a certain number of things.

Example: Apples Sold

Here is a pictograph of how many apples were sold at the local shop

over 4 months:

Note that each picture of an apple means 10 apples (and the half-

apple picture means 5 apples).

So the pictograph is showing:

In January 10 apples were sold

In February 40 apples were sold

In March 25 apples were sold

In April 20 apples were sold

It is a fun and interesting way to show data.

But it is not very accurate: in the example above we can't show just 1 apple

sold, or 2 apples sold etc.
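Here is a small Python sketch of how a count turns into pictures, assuming (as in the example) that one picture means 10 apples and a half picture means 5. The function name and the letters used for the pictures are made up.

```python
def pictograph_row(count, per_picture=10):
    """Return (whole pictures, half pictures), rounding to the nearest half picture."""
    halves = round(count / (per_picture / 2))
    return halves // 2, halves % 2

apples_sold = {"January": 10, "February": 40, "March": 25, "April": 20}
for month, count in apples_sold.items():
    whole, half = pictograph_row(count)
    print(month.ljust(8), "O" * whole + "c" * half)  # O = 10 apples, c = 5 apples

print(pictograph_row(1))  # a single apple is too small to show at this scale
```

The last line shows the accuracy problem: 1 apple rounds down to no picture at all.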

Why don't you try to make your own pictographs? Here are a few ideas:

How much money you have (week by week)

How much exercise you get (each day)

How many hours you watch TV every week

How many sports stories are in each newspaper


Histograms

A Histogram is a graphical display of data using bars of different heights.

It is similar to a Bar Chart, but a histogram groups numbers into ranges.

And you decide what ranges to use!

Example: Dress Shop Survey

You asked customers who bought one of the "Aurora" range of skirts

how old they were.

The ages were from 5 to 25 years old.

You decide to put the results into groups of 5:

The 1 to 5 years old range,

The 6 to 10 years old range,

etc...

So when someone says "I am 17"

you add 1 to the "16-20" range.

And here is the result:

You can see (for example) that there were 30 customers

between 6 and 10 years old
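The grouping step can be sketched in Python like this. The ranges (1-5, 6-10, and so on) follow the example, but the list of ages is invented for illustration and is not the dress shop's real data; the function name is made up too.

```python
from collections import Counter

def age_range(age, width=5):
    """Label of the range an age falls in, using ranges 1-5, 6-10, 11-15, ..."""
    low = (age - 1) // width * width + 1
    return f"{low}-{low + width - 1}"

ages = [5, 7, 8, 12, 17, 17, 19, 22, 25]  # made-up answers from customers
counts = Counter(age_range(age) for age in ages)
for label in sorted(counts, key=lambda text: int(text.split("-")[0])):
    print(label, counts[label])
```

Changing the width changes the ranges, which is the "you decide what ranges to use" part.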

Histograms are a great way to show results of continuous data, such as:

weight

height

how much time


etc.

But if your data is in categories (such as Country or Favorite Movie), then you should use a Bar Chart.

Frequency Histogram

A Frequency Histogram is a special histogram that uses vertical columns to

show frequencies (how many times each score occurs):

Here I have added up how often 1 occurs (2

times), how often 2 occurs (5 times), etc, and

shown them as a histogram.

Frequency Distribution


Frequency

Frequency is how often something occurs.

Example: Sam played football on

Saturday Morning,

Saturday Afternoon

Thursday Afternoon

The frequency was 2 on Saturday, 1 on Thursday and 3 for the whole

week.

Frequency Distribution

By counting frequencies we can make a Frequency Distribution table.

Example: Goals

Sam's team has scored the following numbers of goals in

recent games:

2, 3, 1, 2, 1, 3, 2, 3, 4, 5, 4, 2, 2, 3

Sam put the numbers in order, then added up:

how often 1 occurs (2 times),

how often 2 occurs (5 times),

etc,

and wrote them down as a Frequency Distribution table:

From the table we can see interesting things such as


getting 2 goals happens most frequently

only once did they get 5 goals
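Counting like Sam did is easy to sketch in Python using the standard library's Counter. The function name is made up; the data is the list of goals from the example.

```python
from collections import Counter

def frequency_table(values):
    """Return {value: how often it occurs}, ordered from smallest value to largest."""
    counts = Counter(values)
    return {value: counts[value] for value in sorted(counts)}

goals = [2, 3, 1, 2, 1, 3, 2, 3, 4, 5, 4, 2, 2, 3]
table = frequency_table(goals)
for value, frequency in table.items():
    print(value, frequency)

most_frequent = max(table, key=table.get)  # the value with the highest frequency
print("Most frequent:", most_frequent)
```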

This is the definition:

Frequency Distribution: values and their frequency (how often each value

occurs).

Here is another example:

Example: Newspapers

These are the numbers of newspapers sold at a local shop over the last

10 days:

22, 20, 18, 23, 20, 25, 22, 20, 18, 20

Let us count how many of each number there is:

Papers Sold Frequency

18 2

19 0

20 4

21 0

22 2

23 1

24 0

25 1

It is also possible to group the values. Here they are grouped in 5s:

Papers Sold Frequency

15-19 2

20-24 7

25-29 1
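Grouping can be sketched the same way. This Python example assumes each group starts at a multiple of the group width (15, 20, 25, ...), which matches the table above; the function name is made up.

```python
def grouped_frequency(values, width=5):
    """Count values in groups such as 15-19, 20-24, 25-29."""
    counts = {}
    for value in values:
        low = value // width * width           # start of the group this value falls in
        counts[low] = counts.get(low, 0) + 1
    return {f"{low}-{low + width - 1}": counts[low] for low in sorted(counts)}

papers = [22, 20, 18, 23, 20, 25, 22, 20, 18, 20]
for group, frequency in grouped_frequency(papers).items():
    print(group, frequency)
```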


Graphs

After creating a Frequency Distribution table you might like to make a Bar

Graph or a Pie Chart using the Data Graphs (Bar, Line and Pie) page.


Stem and Leaf Plots

A special table where each data value is split into a "leaf" (usually the last

digit) and a "stem" (the other digits). Like in this example:

Example:

"32" is split into "3" (stem) and "2" (leaf).

The "stem" values are listed down, and the "leaf" values go right (or left)

from the stem values.

The "stem" is used to group the scores and each "leaf" indicates the

individual scores within each group.
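A stem and leaf plot can be sketched in Python as below. It assumes whole-number scores with the last digit as the leaf; the scores themselves are invented for illustration, and the function name is made up.

```python
def stem_and_leaf(values):
    """Group whole numbers by stem (all digits but the last), listing leaves in order."""
    plot = {}
    for value in sorted(values):
        stem, leaf = divmod(value, 10)  # 32 becomes stem 3, leaf 2
        plot.setdefault(stem, []).append(leaf)
    return plot

scores = [32, 15, 27, 38, 21, 15, 30]  # made-up scores
for stem, leaves in stem_and_leaf(scores).items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```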

Cumulative Tables and Graphs

Cumulative

Cumulative means "how much so far".

Think of the word "accumulate" which means to gather together.

To have cumulative totals, just add up the values as you go.

Example: Jamie has earned this much in the last 6 months:

Month Earned

March $120


April $50

May $110

June $100

July $50

August $20

To work out the cumulative totals, just add up as you go.

The first line is easy: the total earned so far is the same as Jamie

earned that month:

Month Earned Cumulative

March $120 $120

But for April, the total earned so far is $120 + $50 = $170 :

Month Earned Cumulative

March $120 $120

April $50 $170

And for May we continue to add up: $170 + $110 = $280

Month Earned Cumulative

March $120 $120

April $50 $170

May $110 $280

Do you see how we add the previous month's cumulative total to this

month's earnings?

Here is the calculation for the rest:

June is $280 + $100 = $380

July is $380 + $50 = $430

August is $430 + $20 = $450

And this is the result:

Month Earned Cumulative

March $120 $120

April $50 $170


May $110 $280

June $100 $380

July $50 $430

August $20 $450

The last cumulative total should match the total of all earnings:

$450 is the last cumulative total ...

... it is also the total of all earnings:

$120+$50+$110+$100+$50+$20 = $450

So we got it right.

So that's how to do it, add up as you go down the list and you will have

cumulative totals.

You could also call it a "Running Total"
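In Python, a running total like Jamie's can be sketched with accumulate from the standard library's itertools module. The variable names are made up; the amounts are from the example.

```python
from itertools import accumulate

earned = {"March": 120, "April": 50, "May": 110, "June": 100, "July": 50, "August": 20}

# accumulate adds each month's earnings to the total so far
cumulative = list(accumulate(earned.values()))

for month, total_so_far in zip(earned, cumulative):
    print(month, earned[month], total_so_far)
```

As in the worked example, the last cumulative total should equal the sum of all the earnings.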

Graphs

You can make cumulative graphs if you want. Just plot each Month's

cumulative total:

Cumulative Bar Graph Cumulative Line Graph

How to Do a Survey


Survey Says ...

Turn on the television, radio or open a

newspaper and you will often see the

results from a survey.

Gathering information is an important way to help people

make decisions about topics of interest.

Surveys can help decide what needs changing, where money

should be spent, what products to purchase, what problems

there might be, or lots of other questions you may have at

any time.

The best part about surveys is that they can be used to

answer any question about any topic.

You can survey people (through questionnaires, opinion polls, etc)

or things (like pollution levels in a river, or traffic flow).

Four Steps

Here are four steps to a successful survey:

Step one: create the questions

Step two: ask the questions

Step three: tally the results

Step four: present the results

Let us look at those steps in more detail ...

Step One: Create the Questions

The first thing to decide is:

What questions do you want answered?


Sometimes these may be simple questions like:

"What is your favorite color?"

Other times the questions may be quite complex

such as:

"Which roads have the worst traffic conditions?"

Simple Surveys

If you are doing a simple survey, you could use tally marks to represent each

person's answer:

Sometimes, it is helpful to be creative in how the people can respond. It

makes it more fun for both you and your respondents (the people answering

the question).

Example: What is your favorite color?

Have them write down their favorite color on a piece of paper and drop

it in a fish bowl.

Then, put all of the pieces of paper into piles and count them.

To help you make a good Questionnaire read our page Survey

Questions.


Step Two: Asking The Questions

Now you have your questions, go out and ask them! But who to ask?

If you survey a small group you can ask everybody (called a Census)

If you want to survey a large group, you may not be able to ask everybody

so you should ask a sample of the population (called a Sample)

If you are sampling you should be careful who you ask.

To be a good sample, each person should be chosen randomly.

If you only ask people who look

friendly, you will only know what

friendly people think!

If you went to the swimming pool and

asked people "Can you swim?" you

will get a biased answer ... maybe

even 100% will say "Yes"

Note: the surveys where people are asked to ring a number

to vote are not very accurate, because only certain types of

people actually ring up!

So be careful not to bias your survey. Try to choose randomly.

Example: You want to know the favorite colors for people at your

school, but don't have the time to ask everyone.

Solution: Choose 50 people at random:

stand at the gate and choose "the next person to arrive" each time

or choose people randomly from a list and then go and find them!

or you could choose every 5th person

Your results will hopefully be nearly as good as if you asked everyone.
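Choosing from a list can be sketched in Python with the standard random module. The school list of 400 students and the function names are made up for illustration.

```python
import random

def simple_random_sample(people, size, seed=None):
    """Pick `size` different people, each equally likely to be chosen."""
    rng = random.Random(seed)  # pass a seed only if you want a repeatable choice
    return rng.sample(people, size)

def every_nth(people, n, start=0):
    """Pick every n-th person from the list, beginning at position `start`."""
    return people[start::n]

school = [f"student {i}" for i in range(1, 401)]  # hypothetical list of 400 students

chosen = simple_random_sample(school, 50)
print(chosen[:5])
print(len(every_nth(school, 5)))  # every 5th person from 400 gives 80 people, not 50
```

Note that "every 5th person" fixes the sample size for you, so pick the step to suit the number of people you want to ask.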


If you choose a person and they do not want to answer, just record "no

answer" on the survey form and mention how many people did not answer in

your report.

After completing a sampling survey you can use the information to make

a prediction as to how the rest of the population would respond.

The more people you have asked, the better your result will be.

Example: nationwide opinion polls often survey only 1,000 to 2,000 people, and the

results are usually close (within a few percentage points) to what asking everyone would give.

Step Three: Tally the Results

Now you have finished asking questions it is time to tally the results.

By "tally" I mean add up. This usually involves lots of paperwork and

computer work (spreadsheets are useful!)

Example: For "favorite colors of my class" you can simply write tally

marks like this (every fifth mark crosses the previous 4 marks, so you

can easily see groups of 5):
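As a plain-text stand-in for hand-drawn tally marks, the Python sketch below writes each group of five as four strokes and a slash. The function name and the "||||/" style are made up for this example; the counts are borrowed from the favorite colors table used elsewhere in this document.

```python
def tally_marks(count):
    """Write a count as tally marks, with each full group of five shown as '||||/'."""
    groups, rest = divmod(count, 5)
    parts = ["||||/"] * groups
    if rest:
        parts.append("|" * rest)
    return " ".join(parts)

colors = {"Yellow": 4, "Red": 5, "Blue": 6, "Green": 1, "Pink": 4}
for color, count in colors.items():
    print(color.ljust(6), tally_marks(count))
```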

Step Four: Presenting the Results

Now you have your results, you will want to show them to other people in

the best possible way.


We have written a special page called Showing the Results of a Survey, but

here is a quick summary:

Tables

Sometimes, you can simply report the information in a table.

A table is a very simple way to show others the results. A table should have

a title, so those looking at it understand what results the table shows:

Table: The Favorite Colors of My Class

Yellow Red Blue Green Pink

4 5 6 1 4

Statistics

You can also summarize the results using statistics, such

as mean or standard deviation

Example: you have lots of information about how long it takes people to

get to school but it may be simpler just to present a summary such as:

Shortest Journey: 3 minutes

Average Journey: 22 minutes

Longest Journey: 58 minutes

Graphs

But nothing makes a report look better than a nice graph or chart.

Use Data Graphs (Bar, Line and Pie) to make them.

Example Survey Question: What is your favorite color?


Have fun asking questions!!!!!

Survey Questions

How to make a good Questionnaire!

The first question is one you should ask yourself:

"What do I hope to learn from asking the questions?"

This defines your objective (the purpose, or why you

are conducting the survey).

Example: you want to clean up the local river. You feel that with some

help and some money you could make it really beautiful again.

You want to survey your local community to find out:

Are other people also worried about the river?

Would they be willing to donate their time or money to help?

Questions


Now you know why you are doing a survey, start writing down the questions

you will ask!

Just write down any questions you think may be useful. Don't worry about

quality at this stage, we will improve your list of questions later.

Example: Questions you could ask for the river survey:

Does pollution worry you?

Do you ever go down to the river?

Can you spare some money to help the river?

Have you noticed the pollution in the river?

Would you be happy to volunteer for river cleanup?

When would you be available to help?

How should we clean up the river?

etc...

You can also ask the person about themselves (not too personal!), such as

approximate age, male or female, etc, so that you know the kind of people

that you have been surveying.

Your Turn: Go ahead and write down the questions for your

own survey!

Types of Questions

A survey question can be:

Open-ended (the person can answer in any way they want), or

Closed-ended (the person chooses from one of several options)

Closed-ended questions are much easier to total up later on, but may stop

people from giving the answer they really want.

Example: "What is your favorite color?"

Open-ended: Someone may answer "dark fuchsia", in which case you

will need to have a category "dark fuchsia" in your results.

Closed-ended: With a choice of only 12 colors your work will be

easier, but they may not be able to pick their exact favorite color.


Look at each of your questions and decide if they should be

open-ended or closed-ended (take the opportunity to rewrite

any questions, too)

Example: "What do you think is the best way to clean up the river?"

Make it Open-ended: the answers won't be easy to put in a table or

graph, but you may get some good ideas, and there may be some good

quotes for your report.

Example: "How often do you visit the river?"

Make it Closed-ended with the following options:

Nearly every day

At least 5 times a year

1 to 4 times a year

Almost never

You will be able to present this data in a neat bar graph.

Question Sequence

It is important that the questions don't "lead" people to the answer

Example: people may say "yes" to donate money if you ask the

questions this way:

Do you love nature?

Will you donate money to help the river?

But probably will say "no" if you ask the questions this way:

Is lack of money a problem for you?

Will you donate money to help the river?

To avoid this kind of thing, try to have your questions go:

from the least sensitive to the most sensitive

from the more general to the more specific

from questions about facts to questions about opinions


Go through your questions and put them in the best

sequence possible

Example: I will ask people how often they visit the river (a fact) before

I ask them what they feel about pollution (an opinion)

I will ask people their general feelings about the environment before I

ask them their feelings about the river.

Neutral Questions

Your questions should also be neutral ... allowing the

person to think their own thoughts about the question.

In the example above we had the question "Do you love nature?" ... that

is a bad question because it is almost forcing the person to say "Yes, of

course."

Try rewording it to be more neutral, for example:

Example: "How important is the natural environment to you?"

Not Important

Some Importance

Very Important

But you can also make statements and see if people agree:


Reword every question to be neutral

Possible Answers

For each "closed-ended" question try to think:

What are the possible answers to this question?

Make sure you have most of the common answers available.

If you are not sure what people might answer, you could always try a small open-ended survey (maybe ask your friends or people in the street) to find common answers.

Trick: try to avoid neutral answers (such as "don't care") because people

may choose them so they don't have to think about the answer!

It is also helpful to have an “other” category in case none of the choices are

satisfactory for the person answering the question.

Example: What is your favorite color?

Red, blue, green, yellow, purple, black, brown, orange, other

Scaled Answers

Sometimes you could have a scale on which they can rate their feelings

about the question.

Have "opposite" words at either end and a scale in between like this:

Examples:

The river is ...

Polluted :_____:_____:_____:_____:_____:_____: Clean

Cleaning up the river is ...


Easy :_____:_____:_____:_____:_____:_____: Difficult

Rated Items

For this type of answer the person gets to rate or rank each option.

Don't have too many items though, as that makes it too hard to answer.

Example: Please rank the following activities from 1 to 5, putting 1 next

to your favorite through to 5 for your least favorite.

___ Fishing

___ Football

___ Golf

___ Shopping

___ Sleeping

Number Answers

You can also just ask for a number

Example: "How many times did you visit the river during the past

year?"

____ times

Look at each "closed-ended" question and choose the best

answer options.

How Will I Gather the Answers?

Try to make life easier by thinking how you will

gather the answers before you ask the questions.

It is important to make the process simple, for both

yourself and those responding.


The Questionnaire

You are going to want a neat form that

makes it easy to answer the questions AND

easy to total up the answers later on.

Type your questions and answer options into a word

processor or spreadsheet, and format it neatly.

Remember to leave plenty of space for open-ended

questions.

How Will I Show the Results?

Go over each of the questions and think how you want the answers to go into

your report:

in a table,

a bar graph,

a pie chart,

or just explained in words.

Make sure each question is set up so you can present the

answers in your chosen style.

Example: you decide to have six options for "How many times do you

visit the river" so the bar graph looks best.

Test It Out


You should test your questionnaire on a few people.

was each question clear and easy to understand?

were they happy with the options?

It is also a good idea to time how long it takes so you will be

able to tell people "this survey only takes 2 minutes" (or

however long it takes). Use our Stopwatch.

Try the questionnaire on some friends.

Take notes of any difficulties your friends have with the

questionnaire, and see what you can do to improve it.

Your Original Objective

Lastly, look back at your original objectives for this survey ...

will the questions really help you find out what you want to know?

are there some questions you can remove? (smaller surveys are

easier!)

This is your last chance to make sure your questionnaire is a

good one!

You Are Done!

Now you have your questions as perfect as you can get them ...

... go out and ask them!

Showing the Results of a Survey


So you have just Conducted a Survey and want to show

your results in the best possible way?

Here are some suggestions:

Tables

Sometimes, you can simply report the information in a table.

A table is a very simple way to show others the results. A table should have

a title, so those looking at it understand what it shows:

Table: The Favorite Colors of My Class

Yellow Red Blue Green Pink

4 5 6 1 4

Statistics

You can also summarize the results using statistics, such

as Mean, Median, Mode, Standard Deviation and Quartiles

Example: you have lots of information about how long it takes people to

get to school but it may be simpler just to present a summary such as:

Shortest Journey: 3 minutes

Average Journey: 22 minutes

Longest Journey: 58 minutes

Graphs

But nothing makes a report look better than a nice graph or chart

There are many different types of graphs. Three of the most

common are:

Line Graph - shows information that is somehow connected (such

as change over time)

Bar Graph – shows relative sizes of different results:

Pie Chart - shows sizes as part of a whole (good for showing

percentages).

You can create graphs like those using our Data Graphs (Bar, Line and

Pie) page

People's Comments

If people have given their opinions or comments in the survey, you can

present the more interesting ones:

Example: In response to the question "How can we best clean up the

river?" we received these interesting replies:

"The government has a special fund for this"

"The local gardening group has seedlings you could plant"

Report

Put it all together into a report, with a nice introduction, and conclusions at

the end, and you are done!

Accuracy and Precision

They mean slightly different things!

Accuracy

Accuracy is how close a measured value is to the actual (true) value.

Precision

Precision is how close the measured values are to each other.

Examples of Precision and Accuracy:

Low Accuracy, High Precision

High Accuracy, Low Precision

High Accuracy, High Precision

So, if you are playing soccer and you always hit the left goal post instead of

scoring, then you are not accurate, but you are precise!

Bias (don't let precision fool you!)

If you measure something several times and all values are close,

they may all be wrong if there is a "Bias"

Bias is a systematic (built-in) error which makes all measurements wrong by

a certain amount.

Examples of Bias

The scales read "1 kg" when there is nothing on them

You always measure your height wearing shoes with thick soles.

A stopwatch that takes half a second to stop when clicked

In each case all measurements will be wrong by the same amount. That is

bias.

Degree of Accuracy

Accuracy depends on the instrument you are measuring with. But as a

general rule:

The degree of accuracy is half a unit each side of the unit of measure

Examples:

If your instrument measures in "1"s

then any value between 6½ and 7½ is measured as "7"

If your instrument measures in "2"s

then any value between 7 and 9 is measured as "8"

(Notice that the arrow points to the same spot, but the measured values are

different!

Read more at Errors in Measurement. )
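
As a rough illustration of the half-a-unit rule, the sketch below works out the span of true values that would all be recorded as one reading. The function name `accuracy_interval` is made up for this example, and it assumes the instrument rounds to the nearest multiple of its unit.

```python
def accuracy_interval(reading, unit):
    """Return the (low, high) span of true values recorded as `reading`.

    Assumes the instrument rounds to the nearest multiple of `unit`,
    so the span is half a unit each side of the reading.
    """
    half = unit / 2
    return reading - half, reading + half


print(accuracy_interval(7, 1))  # instrument measures in "1"s
print(accuracy_interval(8, 2))  # instrument measures in "2"s
```

A coarser unit gives a wider span, which is why the same spot can be reported as "7" on one instrument and "8" on another.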

Activity: Asking Questions

As you walk, or in the car or at home, look around and ask yourself

questions about the world around you.

Write down 5 of those questions that can be answered using numbers.

Examples:

How many trees in the park?

How long would it take to cut the grass along the street?

How much paint would it take to do the whole house?

Which Ice Cream sells the most?

You can use this form:

Question

1

2

3

4

5

Why Do This Activity?

It will improve your awareness and understanding of the world

It will increase your curiosity and

It will improve your "number-sense"

Try To Do This Your Whole Life

It is a good habit to always ask questions about the world.

Activity: Improving Questions

First do the Asking Questions Activity where you are asked to write down 5

real-world questions that can be answered using numbers.

Now we want to take those questions and make them better.

For each question:

Is it possible to answer?

Can we answer it exactly (or close enough)?

Do we know what each part of it means?

Ask "does that depend on ... "

Example: How many trees in the park?

When you start counting the trees you may find lots of tiny ones ...

should they be counted?

Maybe the question could be changed to

How many trees taller than 2 meters are in the park?

Example: How long would it take to cut the grass along the

street?

What are you using to cut the grass: A lawn mower? One you sit on?

Maybe the question could be changed to

How long would it take to cut the grass along the street

using our lawn mower?

Also: what does "along the street" mean? Just the grass alongside the

road? Maybe you need a map!

1 Original Question:

Improved Question:

2 Original Question:

Improved Question:

3 Original Question:

Improved Question:

4 Original Question:

Improved Question:

5 Original Question:

Improved Question:

Probability and Statistics

Finding a Central Value

When you have two or more numbers it is nice to find a value for the

"center".

2 Numbers

With just 2 numbers the answer is easy: go half-way in-between.

Example: what is the central value for 3 and 7?

Answer: Half-way in-between, which is 5.

You can calculate it by adding 3 and 7 and then dividing the result by 2:

(3+7) / 2 = 10/2 = 5

3 or More Numbers

You can use the same idea when you have 3 or more numbers:

Example: what is the central value of 3, 7 and 8?

Answer: You calculate it by adding 3, 7 and 8 and then dividing the results by

3 (because there are 3 numbers):

(3+7+8) / 3 = 18/3 = 6

Notice that we divided by 3 because we had 3 numbers ... very important!

The Mean

So far we have been calculating the Mean (or the Average):

Mean: Add up the numbers and divide by how many numbers.

But sometimes the Mean can let you down:

Example: Birthday Activities

Uncle Bob wants to know the average age at the party, to choose an activity.

There will be 6 kids aged 13, and also 5 babies aged 1.

Add up all the ages, and divide by 11 (because there are 11 numbers):

(13+13+13+13+13+13+1+1+1+1+1) / 11 = 7.5...

The mean age is about 7½, so he gets a Jumping Castle!

The 13 year olds are embarrassed,

and the 1-year olds can't jump!

The Mean was accurate, but in this case it was not useful.

The Median

But you could also use the Median: simply list all numbers in order and

choose the middle one:

Example: Birthday Activities (continued)

List the ages in order:

1, 1, 1, 1, 1, 13, 13, 13, 13, 13, 13

Choose the middle number (the 6th of the 11, marked here with brackets):

1, 1, 1, 1, 1, [13], 13, 13, 13, 13, 13

The Median age is 13 ... so let's have a Disco!

Sometimes there are two middle numbers. Just average them:

Example: What is the Median of 3, 4, 7, 9, 12, 15

There are two numbers in the middle (marked here with brackets):

3, 4, [7, 9], 12, 15

So we average them:

(7+9) / 2 = 16/2 = 8

The Median is 8

The Mode

The Mode is the value that occurs most often:

Example: Birthday Activities (continued)

Group the numbers so we can count them:

1, 1, 1, 1, 1, 13, 13, 13, 13, 13, 13

"13" occurs 6 times, "1" occurs only 5 times, so the mode is 13.

How to remember? Think "mode is most"

But Mode can be tricky: there can sometimes be more than one Mode.

Example: What is the Mode of 3, 4, 4, 5, 6, 6, 7

Well ... 4 occurs twice but 6 also occurs twice.

So both 4 and 6 are modes.

When there are two modes it is called "bimodal", when there are three or

more modes we call it "multimodal".

Conclusion

There are other ways of measuring central values, but Mean, Median and

Mode are the most common.

Use the one that best suits your data. Or better still, use all three!
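
To see all three central values side by side, here is a minimal sketch using Python's standard `statistics` module on the birthday party ages from the examples above. The variable names are only illustrative, and `multimode` needs Python 3.8 or later.

```python
from statistics import mean, median, multimode

# Ages at Uncle Bob's party: six 13-year-olds and five 1-year-olds.
ages = [13] * 6 + [1] * 5

party_mean = mean(ages)        # sum of the ages divided by how many there are
party_median = median(ages)    # middle value of the sorted ages
party_modes = multimode(ages)  # every value tied for "occurs most often"

print(round(party_mean, 1), party_median, party_modes)

# multimode returns all modes, so a bimodal list gives two values.
print(multimode([3, 4, 4, 5, 6, 6, 7]))
```

The mean lands between the two age groups, while the median and mode both point at the larger group of 13-year-olds.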

How to Find the Mean

The mean is just the average of the numbers.

It is easy to calculate: add up all the numbers, then divide by how

many numbers there are.

In other words it is the sum divided by the count.

Example 1: What is the Mean of these numbers?

6, 11, 7

Add the numbers: 6 + 11 + 7 = 24

Divide by how many numbers (there are 3 numbers): 24 / 3 = 8

The Mean is 8

Why Does This Work?

It is because 6, 11 and 7 added together is the same as 3 lots of 8:

It is like you are "flattening out" the numbers

Example 2: Look at these numbers:

3, 7, 5, 13, 20, 23, 39, 23, 40, 23, 14, 12, 56, 23, 29

The sum of these numbers is 330

There are fifteen numbers.

The mean is equal to 330 / 15 = 22

The mean of the above numbers is 22

Negative Numbers

How do you handle negative numbers? Adding a negative number is the same as subtracting the number (without the negative). For example 3 + (-2) = 3 - 2 = 1.

Knowing this, let us try an example:

Example 3: Find the mean of these numbers:

3, -7, 5, 13, -2

The sum of these numbers is 3 - 7 + 5 + 13 - 2 = 12

There are 5 numbers.

The mean is equal to 12 ÷ 5 = 2.4

The mean of the above numbers is 2.4

Here is how to do it in one line:

Mean = (3 − 7 + 5 + 13 − 2) / 5 = 12 / 5 = 2.4

Now have a look at The Mean Machine.
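
The "sum divided by the count" rule can be sketched in a few lines of Python. The function name `mean` is just a choice for this example, and it assumes the list is not empty.

```python
def mean(values):
    """Add up all the numbers, then divide by how many numbers there are."""
    return sum(values) / len(values)


example_1 = [6, 11, 7]
example_2 = [3, 7, 5, 13, 20, 23, 39, 23, 40, 23, 14, 12, 56, 23, 29]
example_3 = [3, -7, 5, 13, -2]  # negative numbers need no special handling

print(mean(example_1))
print(mean(example_2))
print(mean(example_3))
```

Negative values simply reduce the sum, exactly as in Example 3.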

How to Find the Median Value

It's the middle number in a sorted list.

Median Value

The Median is the "middle number" (in a sorted list of numbers).

How to Find the Median Value

To find the Median, place the numbers in value order and find the middle

number.

Example: find the Median of 12, 3 and 5

Put them in order:

3, 5, 12

The middle number is 5, so the median is 5.

Example:

3, 13, 7, 5, 21, 23, 39, 23, 40, 23, 14, 12, 56, 23, 29

When we put those numbers in order we have:

3, 5, 7, 12, 13, 14, 21, 23, 23, 23, 23, 29, 39, 40, 56

There are fifteen numbers. Our middle number will be the eighth number (marked here with brackets):

3, 5, 7, 12, 13, 14, 21, [23], 23, 23, 23, 29, 39, 40, 56

The median value of this set of numbers is 23.

(It doesn't matter that some numbers are the same in the list.)

Two Numbers in the Middle

BUT, when there is an even number of values, things are slightly different.

In that case we need to find the middle pair of numbers, and then find the

value that would be half way between them. This is easily done by adding

them together and dividing by two.

Example:

3, 13, 7, 5, 21, 23, 23, 40, 23, 14, 12, 56, 23, 29

When we put those numbers in order we have:

3, 5, 7, 12, 13, 14, 21, 23, 23, 23, 23, 29, 40, 56

There are now fourteen numbers and so we don't have just one middle number, we have a pair of middle numbers (marked here with brackets):

3, 5, 7, 12, 13, 14, [21, 23], 23, 23, 23, 29, 40, 56

In this example the middle numbers are 21 and 23.

To find the value half-way between them, add them together and divide by 2:

21 + 23 = 44

44 ÷ 2 = 22

So the Median in this example is 22.

(Note that 22 was not in the list of numbers ... but that is OK because half the

numbers in the list are less, and half the numbers are greater.)

Which is the Middle Number?

A quick way to find which is the middle number: count how many

numbers, add 1 then divide by 2

Example: There are 45 numbers

45 plus 1 is 46, then divide by 2 and you get 23

So the median is the 23rd number in the sorted list.

Example: There are 66 numbers

66 plus 1 is 67, then divide by 2 and you get 33.5

33 and a half? That means that the 33rd and 34th numbers in the sorted

list are the two middle numbers.

So to find the median: add the 33rd and 34th numbers together and divide

by 2.
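
The "count, add 1, divide by 2" shortcut can be turned into a small sketch. The function name `median` and its layout are assumptions made for this example, and it expects a non-empty list.

```python
def median(values):
    """Find the median using the (count + 1) / 2 position rule."""
    ordered = sorted(values)
    position = (len(ordered) + 1) / 2  # 1-based position of the middle
    if position.is_integer():
        # Odd count: one middle number, e.g. the 8th of 15.
        return ordered[int(position) - 1]
    # Even count: position ends in .5, so average the two neighbours,
    # e.g. the 33rd and 34th of 66.
    lower = ordered[int(position) - 1]
    upper = ordered[int(position)]
    return (lower + upper) / 2


fifteen = [3, 13, 7, 5, 21, 23, 39, 23, 40, 23, 14, 12, 56, 23, 29]
fourteen = [3, 13, 7, 5, 21, 23, 23, 40, 23, 14, 12, 56, 23, 29]

print(median([12, 3, 5]))
print(median(fifteen))
print(median(fourteen))
```

The subtraction of 1 is only there because Python lists count from 0 while the rule counts from 1.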

How to Find the Mode or Modal Value

The mode is simply the number which appears most often.

Finding the Mode

To find the mode, or modal value, first put the numbers in order, then count

how many of each number.

Example:

3, 7, 5, 13, 20, 23, 39, 23, 40, 23, 14, 12, 56, 23, 29

In order these numbers are:

3, 5, 7, 12, 13, 14, 20, 23, 23, 23, 23, 29, 39, 40, 56

This makes it easy to see which numbers appear most often.

In this case the mode is 23.

Another Example: {19, 8, 29, 35, 19, 28, 15}

Arrange them in order: {8, 15, 19, 19, 28, 29, 35}

19 appears twice, all the rest appear only once, so 19 is the mode.

How to remember? Think "mode is most"

More Than One Mode

You can have more than one mode.

Example: {1, 3, 3, 3, 4, 4, 6, 6, 6, 9}

3 appears three times, as does 6.

So there are two modes: at 3 and 6

Having two modes is called "bimodal".

Having more than two modes is called "multimodal".
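
Counting how often each value appears is easy to sketch with `collections.Counter` from the Python standard library. The helper name `modes` is made up for this example; it returns every value tied for the highest count, so it handles bimodal and multimodal lists.

```python
from collections import Counter


def modes(values):
    """Return all values that appear most often, in ascending order."""
    counts = Counter(values)            # value -> how many times it appears
    highest = max(counts.values())      # the largest count found
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([19, 8, 29, 35, 19, 28, 15]))
print(modes([1, 3, 3, 3, 4, 4, 6, 6, 6, 9]))
```

When every value appears once, this returns the whole list, which matches the remark below that the mode is then not useful.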

Grouping

When all values appear the same number of times the idea of a mode is not

useful. But you could group them to see if one group has more than the

others.

Example: {4, 7, 11, 16, 20, 22, 25, 26, 33}

Each value occurs once, so let us try to group them.

We can try groups of 10:

0-9: 2 values (4 and 7)

10-19: 2 values (11 and 16)

20-29: 4 values (20, 22, 25 and 26)

30-39: 1 value (33)

In groups of 10, the "20s" appear most often, so we could choose 25 as

the mode.

You could use different groupings and get a different answer!
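
The grouping step above can be sketched as follows. The function name `group_counts` is hypothetical, and the sketch assumes non-negative whole numbers and groups that start at multiples of the width.

```python
from collections import Counter


def group_counts(values, width):
    """Count how many values fall in each group of the given width.

    Keys are the lower end of each group (0 for 0-9, 10 for 10-19, ...).
    """
    return Counter((v // width) * width for v in values)


values = [4, 7, 11, 16, 20, 22, 25, 26, 33]
counts = group_counts(values, 10)
modal_start = max(counts, key=counts.get)  # group holding the most values

for start in sorted(counts):
    print(f"{start}-{start + 9}: {counts[start]} values")
print("Modal group starts at", modal_start)
```

Calling `group_counts(values, 5)` instead shows how a different width can change which group comes out on top.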

Activity: Averages Brain-Teaser

Here is a little puzzle about averages. Is it right?

Who is Better at Kicking Goals?

At practice last week:

You scored 2 of 10 shots at goal

Sam scored 3 of 10 shots

Sam is better!

This week:

You scored 53 of 100 shots

Sam scored 6 of 10 shots

Sam is still better.

But let's add up the scores for BOTH weeks:

You scored 55 of 110 shots: that is 50%

Sam scored 9 of 20 shots: that is only 45%

Hang on! YOU are better!

Sam was better last week and this week ... but you are better over

both weeks?

Please explain.

...

Maybe make a table with all the data and do the calculations yourself

Sam You

Last Week

This Week

Both Weeks

....

... read on after you have thought about it ...

...

It is All True

Because you had SO MANY shots at goal this week, and did well at

them, you lifted your two-week average above Sam's.

At practice last week:

You scored 2 of 10 (20%)

Sam scored 3 of 10 (30%)

This week:

You scored 53 of 100 (53%)

Sam scored 6 of 10 (60%)

For BOTH weeks:

You scored 55 of 110 (50%)

Sam scored 9 of 20 (45%)

To be fair, we should really compare the averages only when you and Sam have made roughly the same number of attempts at goal.

If Sam had attempted 100 shots this week, he may have scored 60 out of

100, and his two-week average would have been about 57%, better than

you.

So be careful when comparing two sets of data with widely different counts.
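
If you would like to check the puzzle's numbers yourself, this short sketch recomputes each weekly rate and the two-week rate. The helper names `percent` and `combined` are made up for the example; the data is taken from the puzzle.

```python
def percent(scored, shots):
    """Goals scored as a percentage of shots taken."""
    return 100 * scored / shots


def combined(weeks):
    """Percentage over all weeks: total goals divided by total shots."""
    scored = sum(s for s, _ in weeks)
    shots = sum(n for _, n in weeks)
    return percent(scored, shots)


# (goals scored, shots taken) for last week and this week
you = [(2, 10), (53, 100)]
sam = [(3, 10), (6, 10)]

for label, weeks in (("You", you), ("Sam", sam)):
    weekly = [percent(s, n) for s, n in weeks]
    print(label, "weekly:", weekly, "both weeks:", combined(weeks))
```

The combined figure is weighted by the number of shots, so your 100-shot week dominates your total while Sam's two 10-shot weeks count equally.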

The Mean from a Frequency Table

It is easy to calculate the Mean:

Add up all the numbers, then divide by how many numbers there are.

Example 1: What is the Mean of these numbers?

6, 11, 7

Add the numbers: 6 + 11 + 7 = 24

Divide by how many numbers (there are 3 numbers): 24 ÷ 3 = 8

The Mean is 8

But sometimes you won't have a simple list of numbers, you might have a

frequency table like this (the "frequency" says how often they occur):

Score Frequency

1 2

2 5

3 4

4 2

5 1

(it says that score 1 occurred 2 times, score 2 occurred 5 times, etc)

You could list all the numbers like this:

Mean = (1+1 + 2+2+2+2+2 + 3+3+3+3 + 4+4 + 5) / (how many numbers)

But rather than do lots of adds (like 3+3+3+3) it is often easier to use

multiplication:

Mean = (2×1 + 5×2 + 4×3 + 2×4 + 1×5) / (how many numbers)

And rather than count how many numbers there are, we can add up the

frequencies:

Mean = (2×1 + 5×2 + 4×3 + 2×4 + 1×5) / (2 + 5 + 4 + 2 + 1)

So let's calculate:

Mean = (2 + 10 + 12 + 8 + 5) / 14 = 37 / 14 = 2.64...

And that is how to calculate the mean from a frequency table!

Here is another example:

Example: Parking Spaces per House in Hampton Street

Isabella went up and down the street to find out how many parking

spaces each house had. Here are her results:

Parking Spaces   Frequency

1 15

2 27

3 8

4 5

What is the mean number of Parking Spaces?

Answer:

Mean = (15×1 + 27×2 + 8×3 + 5×4) / (15 + 27 + 8 + 5) = (15 + 54 + 24 + 20) / 55 = 113 / 55 = 2.05...

The Mean is 2.05 (to 2 decimal places)

(much easier than adding all numbers separately!)

Notation

Now you know how to do it, let's do that last example again, but using

formulas.

The symbol Σ (called Sigma) means "sum up"

(read more at Sigma Notation)

So we can say "add up all frequencies" this way:

Σf (where f is frequency)

And we would use it like this:

Σf = 15 + 27 + 8 + 5 = 55

Likewise we can add up "frequency times score" this way:

Σfx (where f is frequency and x is the matching score)

And the formula for calculating the mean from a frequency table is:

x̄ = Σfx / Σf

The x with the bar on top says "the mean of x"

So now we are ready to do our example above, but with correct notation.

Example: Calculate the Mean of this Frequency Table

x f

1 15

2 27

3 8

4 5

And here it is:

x̄ = Σfx / Σf = (15×1 + 27×2 + 8×3 + 5×4) / (15 + 27 + 8 + 5) = 113 / 55 = 2.05...

There you go! You can use sigma notation.

Calculate in the Table

It is often better to do the calculations in the table.

Example: (continued)

From the previous example, calculate f × x in the right-hand column

and then do totals:

x f fx

1 15 15

2 27 54

3 8 24

4 5 20

TOTALS: 55 113

And the Mean is then easy:

Mean = 113 / 55 = 2.05...
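
The same "sum of f × x divided by sum of f" calculation can be sketched in Python. The function name `mean_from_frequencies` and the list-of-pairs layout are assumptions made for this example.

```python
def mean_from_frequencies(table):
    """Mean of a frequency table given as (x, f) pairs.

    Computes sum(f * x) / sum(f), like the fx column and totals row.
    """
    total_fx = sum(x * f for x, f in table)
    total_f = sum(f for _, f in table)
    return total_fx / total_f


scores = [(1, 2), (2, 5), (3, 4), (4, 2), (5, 1)]
parking = [(1, 15), (2, 27), (3, 8), (4, 5)]

print(round(mean_from_frequencies(scores), 2))
print(round(mean_from_frequencies(parking), 2))
```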

Mean, Median and Mode from Grouped Frequencies

Let's start off with some raw data (not a grouped frequency) ...

Example: Alex did a survey of how many games each of 20

friends owned, and got this:

9, 15, 11, 12, 3, 5, 10, 20, 14, 6, 8, 8, 12, 12, 18, 15, 6, 9, 18, 11

To find the Mean, add up all the numbers, then divide by how many numbers

there are:

Mean = (9+15+11+12+3+5+10+20+14+6+8+8+12+12+18+15+6+9+18+11) / 20 = 11.1

To find the Median, place the numbers in value order and find the middle

number (or the mean of the middle two numbers). In this case the mean of

the 10th and 11th values:

3, 5, 6, 6, 8, 8, 9, 9, 10, 11, 11, 12, 12, 12, 14, 15, 15, 18, 18, 20:

Median = (11 + 11) / 2 = 11

To find the Mode, or modal value, place the numbers in value order then

count how many of each number. The Mode is the number which appears

most often (you can have more than one mode):

3, 5, 6, 6, 8, 8, 9, 9, 10, 11, 11, 12, 12, 12, 14, 15, 15, 18, 18, 20:

12 appears three times, more often than the other values, so Mode = 12

Grouped Frequency Table

Now, let's make a Grouped Frequency Table of Alex's data:

Number of games   Frequency

1 - 5     2

6 - 10    7

11 - 15   8

16 - 20   3

(It says that 2 of Alex's friends own somewhere between 1 and 5 games, 7

own between 6 and 10 games, etc)

Oh No!

Suddenly all the original data gets lost (naughty pup!)

Only the Grouped Frequency Table survived ...

... can we help Alex calculate the Mean, Median and Mode from just that

table?

The answer is ... no we can't. Not accurately anyway. But, we can

make estimates.

Estimating the Mean from Grouped Data

So all we have left is:

Number of games   Frequency

1 - 5 2

6 - 10 7

11 - 15 8

16 - 20 3

The groups (1-5, 6-10, etc), also called class intervals, are of width 5

The numbers 1, 6, 11 and 16 are the lower class boundaries

The numbers 5, 10, 15 and 20 are the upper class boundaries

The midpoints are halfway between the lower and upper class

boundaries

So the midpoints are 3, 8, 13 and 18

We can estimate the Mean by using the midpoints.

So, how does this work?

Think about Alex's 7 friends who are in the group 6 - 10: all we know is that

they each have between 6 and 10 games:

Maybe all seven of them have 6 games,

Maybe all seven of them have 10 games,

But it is more likely that there is a spread of numbers: some have 6,

some have 7, and so on

So we take an average: we assume that all seven of them have 8 games (8

is the average of 6 and 10), which is the midpoint of the group.

So, we could make the table in a different way:

Midpoint Frequency

3 2

8 7

13 8

18 3

Now we think "2 people have 3 games, 7 people have 8 games, 8 people

have 13 games and 3 people have 18 games", so we imagine the data looks

like this:

3, 3, 8, 8, 8, 8, 8, 8, 8, 13, 13, 13, 13, 13, 13, 13, 13, 18, 18, 18

Now we can add them all up and divide by 20. This is the quick way to do it:

Midpoint x   Frequency f   fx

3    2   6

8    7   56

13   8   104

18   3   54

Totals:   20   220

So an estimate of the mean number of games is:

Estimated Mean = 220 / 20 = 11

Estimating the Median from Grouped Data

To estimate the Median, let's look at our data again:

Number of games   Frequency

1 - 5 2

6 - 10 7

11 - 15 8

16 - 20 3

The median is the mean of the middle two numbers (the 10th and

11th values) ...

... and they are both in the 11 - 15 group:

We can say "the median group is 11 - 15"

But if we need to estimate a single Median value we can use this formula:

Estimated Median = L + ((n/2) − cfb) / fm × w

where:

L is the lower class boundary of the group containing the median

n is the total number of data

cfb is the cumulative frequency of the groups before the median group

fm is the frequency of the median group

w is the group width

For our example:

L = 11

n = 20

cfb = 2 + 7 = 9

fm = 8

w = 5

Estimated Median = 11 + ((20/2) − 9) / 8 × 5 = 11 + (1/8) × 5 = 11.625

Estimating the Mode from Grouped Data

Again, looking at our data:

Number of games   Frequency

1 - 5 2

6 - 10 7

11 - 15 8

16 - 20 3

We can easily identify the modal group (the group with the highest

frequency), which is 11 - 15

We can say "the modal group is 11 - 15"

But the actual Mode may not even be in that group! Or there may be more

than one mode. Without the raw data we don't really know.

But, we can estimate the Mode using the following formula:

Estimated Mode = L + (fm − fm-1) / ((fm − fm-1) + (fm − fm+1)) × w

where:

L is the lower class boundary of the modal group

fm-1 is the frequency of the group before the modal group

fm is the frequency of the modal group

fm+1 is the frequency of the group after the modal group

w is the group width

In this example:

L = 11

fm-1 = 7

fm = 8

fm+1 = 3

w = 5

Estimated Mode = 11 + (8 − 7) / ((8 − 7) + (8 − 3)) × 5 = 11 + (1/6) × 5 = 11.833...

Our final result is:

Estimated Mean: 11

Estimated Median: 11.625

Estimated Mode: 11.833...

(Compare that with the true Mean, Median and Mode of 11.1, 11 and

12 that we got at the very start.)

And that is how it is done.
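
The three estimates for Alex's grouped table can be reproduced with the sketch below. The function names and the `(low, high, frequency)` layout are assumptions made for this example. Two further assumptions: the median group is taken as the first group whose running total reaches n/2, and a missing neighbour of the modal group is treated as frequency 0.

```python
def estimate_mean(groups):
    """Estimate the mean from (low, high, frequency) groups via midpoints."""
    total_fx = sum((low + high) / 2 * f for low, high, f in groups)
    total_f = sum(f for _, _, f in groups)
    return total_fx / total_f


def estimate_median(groups, width):
    """L + ((n/2) - cfb) / fm * w, using the group that contains n/2."""
    n = sum(f for _, _, f in groups)
    cumulative = 0  # cfb: total frequency of the groups before this one
    for low, _, f in groups:
        if cumulative + f >= n / 2:
            return low + (n / 2 - cumulative) / f * width
        cumulative += f


def estimate_mode(groups, width):
    """L + (fm - fm-1) / ((fm - fm-1) + (fm - fm+1)) * w."""
    freqs = [f for _, _, f in groups]
    m = freqs.index(max(freqs))  # first group with the highest frequency
    fm = freqs[m]
    before = freqs[m - 1] if m > 0 else 0               # assumed 0 if missing
    after = freqs[m + 1] if m + 1 < len(freqs) else 0   # assumed 0 if missing
    low = groups[m][0]
    return low + (fm - before) / ((fm - before) + (fm - after)) * width


games = [(1, 5, 2), (6, 10, 7), (11, 15, 8), (16, 20, 3)]

print(estimate_mean(games))
print(estimate_median(games, 5))
print(round(estimate_mode(games, 5), 3))
```

Here `low` plays the role of L, so for this discrete example the stated lower value of each group is used directly.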

Now let us look at two more special examples, and get some more practice

along the way!

Continuous Data

Data can be Discrete or Continuous:

Discrete data can only take certain values, like our previous example

(games owned)

Continuous data can take any value (within a range), such as length

or weight

Continuous data can be treated in exactly the same way as discrete

data, but with one important difference.

The difference concerns the class boundaries.

Example: You grew fifty baby carrots using special soil. You dig

them up and measure their lengths (to the nearest mm) and group

the results:

Length (mm) Frequency

150 - 154 5

155 - 159 2

160 - 164 6

165 - 169 8

170 - 174 9

175 - 179 11

180 - 184 6

185 - 189 3

Now, what does "155 - 159" mean?

The clue is "to the nearest mm".

A length of 154.5 mm would be rounded up to 155 mm (and placed

in 155 - 159),

Similarly 159.49 mm would be rounded down to 159 mm (and also be

placed in 155 - 159).

So lengths from 154.5 up to (but not including) 159.5 get placed in 155 - 159

And so for continuous data "155 - 159" has two types of numbers at the

beginning and end:

the lower class boundary of 155 and the upper class boundary of 159

the lower class limit of 154.5 and upper class limit of 159.5

Note that the upper class limit of one class interval is the lower class limit of

the next class interval.

So, how does this affect our calculations?

The Mean is not affected

But the Median and Mode now have L = Lower class limit (rather than

Lower class boundary)

Now let's go:

Mean

Length (mm)   Midpoint x   Frequency f   fx

150 - 154 152 5 760

155 - 159 157 2 314

160 - 164 162 6 972

165 - 169 167 8 1336

170 - 174 172 9 1548

175 - 179 177 11 1947

180 - 184 182 6 1092

185 - 189 187 3 561

Totals: 50 8530

Estimated Mean = 8530 / 50 = 170.6 mm

Median

The Median is the mean of the 25th and the 26th length, so is in the 170 - 174 group:

L = 169.5 (the lower class limit of the 170 - 174 group)

n = 50

cfb = 5 + 2 + 6 + 8 = 21

fm = 9

w = 5

Estimated Median = 169.5 + ((50/2) − 21) / 9 × 5 = 169.5 + 2.22... = 171.7 mm (to 1 decimal)

Mode

The Modal group is the one with the highest frequency, which is 175 - 179:

L = 174.5 (the lower class limit of the 175 - 179 group)

fm-1 = 9

fm = 11

fm+1 = 6

w = 5

Estimated Mode = 174.5 + (11 − 9) / ((11 − 9) + (11 − 6)) × 5 = 174.5 + 1.42... = 175.9 mm (to 1 decimal)

Ages

Age is a special case.

When we say "Sarah is 17" she stays "17" up until her eighteenth birthday.

She might be 17 years and 364 days old and still be called "17".

In other words, even though "age" is a continuous variable (time), we treat it

as discrete.

Example: The ages of the 112 people who live on a tropical island

were grouped as follows:

Age Number

0 - 9 20

10 - 19 21

20 - 29 23

30 - 39 16

40 - 49 11

50 - 59 10

60 - 69 7

70 - 79 3

80 - 89 1

A child in the first group 0 - 9 could be almost 10 years old. So the midpoint

for this group is 5 not 4.5

The midpoints are 5, 15, 25, 35, 45, 55, 65, 75 and 85

Similarly, in the calculations of Median and Mode, we will use the class

boundaries 0, 10, 20 etc

Mean

Age   Midpoint x   Number f   fx

0 - 9 5 20 100

10 - 19 15 21 315

20 - 29 25 23 575

30 - 39 35 16 560

40 - 49 45 11 495

50 - 59 55 10 550

60 - 69 65 7 455

70 - 79 75 3 225

80 - 89 85 1 85

Totals: 112 3360

Estimated Mean = 3360 / 112 = 30

Median

The Median is the mean of the ages of the 56th and the 57th people, so is in

the 20 - 29 group:

L = 20 (the lower class boundary of the class interval containing the

median)

n = 112

cfb = 20 + 21 = 41

fm = 23

w = 10

Estimated Median = 20 + ((112/2) − 41) / 23 × 10 = 20 + 6.52... = 26.5 (to 1 decimal)

Mode

The Modal group is the one with the highest frequency, which is 20 - 29:

L = 20 (the lower class boundary of the modal class)

fm-1 = 21

fm = 23

fm+1 = 16

w = 10

Estimated Mode = 20 + (23 − 21) / ((23 − 21) + (23 − 16)) × 10 = 20 + 2.22... = 22.2 (to 1 decimal)

Summary

For grouped data, we cannot find the exact Mean, Median and Mode,

we can only give estimates.

To estimate the Mean use the midpoints of the class intervals.

Estimated Median = L + ((n/2) − cfb) / fm × w

where:

L is the lower class boundary of the group containing the median

n is the total number of data

cfb is the cumulative frequency of the groups before the median group

fm is the frequency of the median group

w is the group width

Estimated Mode = L + (fm − fm-1) / ((fm − fm-1) + (fm − fm+1)) × w

where:

L is the lower class boundary of the modal group

fm-1 is the frequency of the group before the modal group

fm is the frequency of the modal group

fm+1 is the frequency of the group after the modal group

w is the group width

For continuous data use limits (rather than boundaries) for median

and mode
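
As a quick check of the carrot and age examples, here is a sketch that plugs values straight into the estimated median and mode formulas. The function and parameter names are made up for the example; the only thing that changes between the two cases is the value chosen for L.

```python
def estimated_median(L, n, cfb, fm, w):
    """L + ((n/2) - cfb) / fm * w"""
    return L + (n / 2 - cfb) / fm * w


def estimated_mode(L, f_before, fm, f_after, w):
    """L + (fm - f_before) / ((fm - f_before) + (fm - f_after)) * w"""
    return L + (fm - f_before) / ((fm - f_before) + (fm - f_after)) * w


# Carrot lengths (continuous): L is half a unit below the stated lower value.
carrot_median = estimated_median(L=169.5, n=50, cfb=21, fm=9, w=5)
carrot_mode = estimated_mode(L=174.5, f_before=9, fm=11, f_after=6, w=5)

# Ages (treated as discrete): L is the stated lower value of the group.
age_median = estimated_median(L=20, n=112, cfb=41, fm=23, w=10)
age_mode = estimated_mode(L=20, f_before=21, fm=23, f_after=16, w=10)

print(round(carrot_median, 1), round(carrot_mode, 1))
print(round(age_median, 1), round(age_mode, 1))
```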

Weighted Mean

Also called Weighted Average

A mean where some values contribute more than others.

Mean

When we do a simple mean (or average), we give equal weight to each

number.

Here is the mean of 1, 2, 3 and 4:

Add up the numbers, divide by how many numbers:

Mean = (1 + 2 + 3 + 4) / 4 = 10 / 4 = 2.5

Weights

We could think that each of those numbers has a "weight" of ¼ (because

there are 4 numbers):

Mean = ¼ × 1 + ¼ × 2 + ¼ × 3 + ¼ × 4

= 0.25 + 0.5 + 0.75 + 1 = 2.5

Same answer.

Now let's change the weight of 3 to 0.7, and the weights of the other

numbers to 0.1 so the total of the weights is still 1:

Mean = 0.1 × 1 + 0.1 × 2 + 0.7 × 3 + 0.1 × 4

= 0.1 + 0.2 + 2.1 + 0.4 = 2.8

This weighted mean is now a little higher ("pulled" there by the weight of 3).

When some values get more weight than others

the central point (the mean) can change:

Decisions

Weighted means can help with decisions where some things are more

important than others:

Example: Sam wants to buy a new camera, and decides on

the following rating system:

Image Quality 50%

Battery Life 30%

Zoom Range 20%

The Cony camera gets 8 (out of 10) for Image Quality, 6 for Battery

Life and 7 for Zoom Range

The Sanon camera gets 9 for Image Quality, 4 for Battery Life and 6 for

Zoom Range

Which camera is best?

Cony: 0.5 × 8 + 0.3 × 6 + 0.2 × 7 = 4 + 1.8 + 1.4 = 7.2

Sanon: 0.5 × 9 + 0.3 × 4 + 0.2 × 6 = 4.5 + 1.2 + 1.2 = 6.9

Sam decides to buy the Cony.
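
Sam's rating system can be sketched as a small script. The dictionary keys and the helper name `weighted_score` are made up for this example, and it assumes the weights add to 1 as they do here.

```python
# Sam's weights (they add to 1) and each camera's ratings out of 10.
weights = {"image_quality": 0.5, "battery_life": 0.3, "zoom_range": 0.2}
ratings = {
    "Cony": {"image_quality": 8, "battery_life": 6, "zoom_range": 7},
    "Sanon": {"image_quality": 9, "battery_life": 4, "zoom_range": 6},
}


def weighted_score(weights, scores):
    """Multiply each rating by its weight and add the results."""
    return sum(weights[name] * scores[name] for name in weights)


totals = {camera: weighted_score(weights, scores)
          for camera, scores in ratings.items()}
best = max(totals, key=totals.get)  # camera with the highest weighted score

for camera, total in totals.items():
    print(camera, round(total, 1))
print("Best:", best)
```

Changing the weights, for example giving Image Quality 80%, can change which camera wins, which is the whole point of weighting.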

What if the Weights Don't Add to 1?

When the weights don't add to 1, divide by the sum of weights.

Example: Alex usually works 7 days a week, but sometimes

just 1, 2, or 5 days.

The data:

2 weeks Alex worked 1 day each week

14 weeks Alex worked 2 days each week

8 weeks Alex worked 5 days each week

32 weeks Alex worked 7 days each week

What is the mean number of days Alex works per week?

Use "Weeks" as the weighting:

Weeks × Days = 2 × 1 + 14 × 2 + 8 × 5 + 32 × 7

= 2 + 28 + 40 + 224 = 294

Also add up the weeks:

Weeks = 2 + 14 + 8 + 32 = 56

Divide:

Mean = 294 / 56 = 5.25

It looks like this:

But it is often better to use a table to make sure you have all the numbers

correct:

Example (continued):

Have:

the number of weeks is the weight w

and days (the value we want the mean of) is x

Multiply w by x, sum up w and sum up wx:

Weight w   Days x   wx

2 1 2

14 2 28

8 5 40

32 7 224

Σw = 56 Σwx = 294

Note: Σ (Sigma) means "Sum Up"

Divide Σwx by Σw:

Mean = 294 / 56 = 5.25

And that leads us to our formula:

Weighted Mean = Σwx / Σw

In other words: multiply each weight w by its matching value x, sum that all

up, and divide by the sum of weights.
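
The weighted mean formula can be sketched directly. The function name `weighted_mean` and the `(w, x)` pair layout are assumptions made for this example; dividing by the sum of the weights means it also works when the weights do not add to 1.

```python
def weighted_mean(pairs):
    """Weighted mean of (weight, value) pairs: sum(w * x) / sum(w)."""
    total_wx = sum(w * x for w, x in pairs)
    total_w = sum(w for w, _ in pairs)
    return total_wx / total_w


# Alex's work pattern: (number of weeks, days worked in each of those weeks)
alex = [(2, 1), (14, 2), (8, 5), (32, 7)]
print(weighted_mean(alex))

# Weights that already add to 1 give the same answer as before.
print(round(weighted_mean([(0.1, 1), (0.1, 2), (0.7, 3), (0.1, 4)]), 1))
```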

Summary

Weighted Mean: A mean where some values contribute more than

others.

When the weights add to 1: just multiply each weight by the

matching value and sum it all up

Otherwise, multiply each weight w by its matching value x, sum

that all up, and divide by the sum of weights:

Weighted Mean = Σwx / Σw

The Range (Statistics)

The Range is the difference between the lowest and highest values.

Example: In {4, 6, 9, 3, 7} the lowest value is 3, and the highest is 9.

So the range is 9-3 = 6.

It is that simple!

But perhaps too simple ...

The Range Can Be Misleading

The range can sometimes be misleading when there are extremely high or

low values.

Example: In {8, 11, 5, 9, 7, 6, 3616}:

the lowest value is 5,

and the highest is 3616,

So the range is 3616-5 = 3611.

The single value of 3616 makes the range large, but most values are

around 10.

So you may be better off using Interquartile Range or Standard Deviation.
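
A short sketch shows how much one extreme value changes the range. The function name `value_range` is chosen for this example only.

```python
def value_range(values):
    """Difference between the highest and lowest values."""
    return max(values) - min(values)


print(value_range([4, 6, 9, 3, 7]))
print(value_range([8, 11, 5, 9, 7, 6, 3616]))  # one extreme value
print(value_range([8, 11, 5, 9, 7, 6]))        # same list without it
```

Dropping the single value 3616 shrinks the range from 3611 to 6, which is why the range alone can mislead.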

Range of a Function

Range can also mean all the output

values of a function, see Domain,

Range and Codomain.

Quartiles

Quartiles are the values that divide a list of numbers into quarters.

First put the list of numbers in order

Then cut the list into four equal parts

The Quartiles are at the "cuts"

Like this:

Example: 5, 8, 4, 4, 6, 3, 8

Put them in order: 3, 4, 4, 5, 6, 8, 8

Cut the list into quarters:

And the result is:

Quartile 1 (Q1) = 4

Quartile 2 (Q2), which is also the Median, = 5

Quartile 3 (Q3) = 8

Sometimes a "cut" is between two numbers ... the Quartile is the average of

the two numbers.

Example: 1, 3, 3, 4, 5, 6, 6, 7, 8, 8

The numbers are already in order

Cut the list into quarters:

In this case Quartile 2 is half way between 5 and 6:

Q2 = (5+6)/2 = 5.5

And the result is:

Quartile 1 (Q1) = 3

Quartile 2 (Q2) = 5.5

Quartile 3 (Q3) = 7

Interquartile Range

The "Interquartile Range" is from Q1 to Q3:

To calculate it just subtract Quartile 1 from Quartile 3, like this:

Example:

The Interquartile Range is:

Q3 - Q1 = 8 - 4 = 4

Box and Whisker Plot

You can show all the important values in a "Box and Whisker Plot", like this:

A final example covering everything:

Example: Box and Whisker Plot and Interquartile

Range for

4, 17, 7, 14, 18, 12, 3, 16, 10, 4, 4, 11

Put them in order:

3, 4, 4, 4, 7, 10, 11, 12, 14, 16, 17, 18

Cut it into quarters:

3, 4, 4 | 4, 7, 10 | 11, 12, 14 | 16, 17, 18

In this case all the quartiles are between numbers:

Quartile 1 (Q1) = (4+4)/2 = 4

Quartile 2 (Q2) = (10+11)/2 = 10.5

Quartile 3 (Q3) = (14+16)/2 = 15

Also:

The Lowest Value is 3,

The Highest Value is 18

So now we have enough data for the Box and Whisker Plot:

And the Interquartile Range is:

Q3 - Q1 = 15 - 4 = 11
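Here is a minimal Python sketch of the "cut the list" method used in these examples. It assumes the convention the examples follow: Q2 is the median, and Q1 and Q3 are the medians of the lower and upper halves, leaving the middle value out when the count is odd. Other software may use a different quartile convention and give slightly different answers. The function name quartiles is made up for this example.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # With an odd count, the middle value belongs to neither half.
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower), median(ordered), median(upper)


q1, q2, q3 = quartiles([4, 17, 7, 14, 18, 12, 3, 16, 10, 4, 4, 11])
print(q1, q2, q3)  # 4.0 10.5 15.0
print(q3 - q1)     # Interquartile Range: 11.0
```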

Percentiles

Percentile: the value below which a percentage of data falls.

Example: You are the fourth tallest person in a group of 20

80% of people (16 out of 20) are shorter than you:

That means you are at the 80th percentile.

If your height is 1.85m then "1.85m" is the 80th percentile height in

that group.

In Order

The data needs to be in order! So percentiles of height need to be in height

order (sorted by height). If they were percentiles of weight, they would need

to be in weight order.

Deciles

A related idea is Deciles (sounds like decimal and percentile together),

which splits the data into 10% groups:

The 1st decile is the 10th percentile (the value that divides the data

so that 10% is below it)

The 2nd decile is the 20th percentile (the value that divides the

data so that 20% is below it)

etc!

Example: (continued)

You are at the 8th decile (the 80th percentile).

Quartiles

Another related idea is Quartiles, which splits the data into quarters:

Example: 1, 3, 3, 4, 5, 6, 6, 7, 8, 8

The numbers are in order. Cut the list into quarters:

In this case Quartile 2 is half way between 5 and 6:

Q2 = (5+6)/2 = 5.5

And the result is:

Quartile 1 (Q1) = 3

Quartile 2 (Q2) = 5.5

Quartile 3 (Q3) = 7

The Quartiles also divide the data into divisions of 25%, so:

Quartile 1 (Q1) can be called the 25th percentile

Quartile 2 (Q2) can be called the 50th percentile

Quartile 3 (Q3) can be called the 75th percentile

Example: (continued)

For 1, 3, 3, 4, 5, 6, 6, 7, 8, 8:

The 25th percentile = 3

The 50th percentile = 5.5

The 75th percentile = 7

Estimating Percentiles

We can estimate percentiles from a line graph.

Example: Shopping

A total of 10,000 people visited the shopping mall over 12 hours:

Time (hours)    People
0               0
2               350
4               1100
6               2400
8               6500
10              8850
12              10,000

a) Estimate the 30th percentile (when 30% of the visitors

had arrived).

b) Estimate what percentile of visitors had arrived after 11

hours.

First draw a line graph of the data: plot the points and join them with a

smooth curve:

a) The 30th percentile occurs when the visits reach 3,000.

Draw a line horizontally across from 3,000 until you hit the curve, then

draw a line vertically downwards to read off the time on the horizontal

axis:

So the 30th percentile occurs after about 6.5 hours.

b) To estimate the percentile of visits after 11 hours: draw a line vertically up from 11 until you hit the curve, then draw a line horizontally across to read off the number of visitors on the vertical axis:

So the visits at 11 hours were about 9,500, which is the 95th

percentile.
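The same estimates can be roughed out in Python. This sketch assumes straight lines between the plotted points instead of a smooth hand-drawn curve, so its answers differ a little from the graph readings (about 6.3 hours instead of 6.5, and about the 94th percentile instead of the 95th). The helper name interpolate is made up for this example.

```python
times = [0, 2, 4, 6, 8, 10, 12]
people = [0, 350, 1100, 2400, 6500, 8850, 10000]


def interpolate(x, xs, ys):
    """Straight-line estimate of y at x from increasing xs and matching ys."""
    points = list(zip(xs, ys))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("x is outside the range of the data")


# a) time at which 30% of 10,000 visitors (3,000 people) had arrived
time_30th = interpolate(3000, people, times)

# b) visitors after 11 hours, as a percentage of the total
visitors_at_11 = interpolate(11, times, people)
percentile_at_11 = 100 * visitors_at_11 / people[-1]

print(round(time_30th, 1), round(percentile_at_11, 1))
```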

Mean Deviation

The mean of the distances of each value from their mean.

Yes, we use "mean" twice: Find the mean ... use it to work out distances ...

then find the mean of those!

Three steps:

1. Find the mean of all values

2. Find the distance of each value from that mean (subtract the mean

from each value, ignore minus signs)

3. Then find the mean of those distances

Like this:

Example: the Mean Deviation of 3, 6, 6, 7, 8, 11, 15, 16

Step 1: Find the mean:

Mean = (3 + 6 + 6 + 7 + 8 + 11 + 15 + 16) / 8 = 72 / 8 = 9

Step 2: Find the distance of each value from that mean:

Value    Distance from 9
3        6
6        3
6        3
7        2
8        1
11       2
15       6
16       7

Which looks like this:

Step 3. Find the mean of those distances:

Mean Deviation = (6 + 3 + 3 + 2 + 1 + 2 + 6 + 7) / 8 = 30 / 8 = 3.75

So, the mean = 9, and the mean deviation = 3.75

It tells us how far, on average, all values are from the middle.

In that example the values are, on average, 3.75 away from the middle.

For deviation just think distance

Formula

The formula is:

Mean Deviation = Σ|x - μ| / N

Let's learn more about those symbols!

Firstly:

μ is the mean (in our example μ = 9)

x is each value (such as 3 or 16)

N is the number of values (in our example N = 8)

Absolute Deviation

Each distance we calculated is called an Absolute Deviation, because it is

the Absolute Value of the deviation (how far from the mean).

To show "Absolute Value" we put "|" marks either side like

this: |-3| = 3

For any value x:

Absolute Deviation = |x - μ|

From our example, the value 16 has Absolute Deviation = |x - μ| = |16 -

9| = |7| = 7

And now let's add them all up ...

Sigma

The symbol for "Sum Up" is Σ (called Sigma Notation), so we have:

Sum of Absolute Deviations = Σ|x - μ|

Divide by how many values N and we have:

Mean Deviation = Σ|x - μ| / N

Let's do our example again, using the proper symbols:

Example: the Mean Deviation of 3, 6, 6, 7, 8, 11, 15, 16

Step 1: Find the mean:

μ = (3 + 6 + 6 + 7 + 8 + 11 + 15 + 16) / 8 = 72 / 8 = 9

Step 2: Find the Absolute Deviations:

x |x - μ|

3 6

6 3

6 3

7 2

8 1

11 2

15 6

16 7

Σ|x - μ| = 30

Step 3. Find the Mean Deviation:

Mean Deviation = Σ|x - μ| / N = 30 / 8 = 3.75

What Does It "Mean" ?

Mean Deviation tells us how far, on average, all values are from the middle.

Here is an example (using the same data as on the Standard

Deviation page):

Example: You and your friends have just measured the

heights of your dogs (in millimeters):

The heights (at the shoulders) are: 600mm, 470mm, 170mm, 430mm

and 300mm.

Step 1: Find the mean:

μ = (600 + 470 + 170 + 430 + 300) / 5 = 1970 / 5 = 394

Step 2: Find the Absolute Deviations:

x |x - μ|

600 206

470 76

170 224

430 36

300 94

Σ|x - μ| = 636

Step 3. Find the Mean Deviation:

Mean Deviation = Σ|x - μ| / N = 636 / 5 = 127.2

So, on average, the dogs' heights are 127.2 mm from the mean.

(Compare that with the Standard Deviation of 147 mm)
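The three steps translate directly into a short Python sketch (the function name mean_deviation is just for illustration), checked here against both worked examples:

```python
def mean_deviation(values):
    """Mean of the absolute distances of each value from the mean."""
    mu = sum(values) / len(values)                # Step 1: the mean
    distances = [abs(x - mu) for x in values]     # Step 2: |x - mu|
    return sum(distances) / len(values)           # Step 3: mean of distances


print(mean_deviation([3, 6, 6, 7, 8, 11, 15, 16]))  # 3.75
print(mean_deviation([600, 470, 170, 430, 300]))    # 127.2
```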

A Useful Check

The sum of the deviations on one side of the mean should equal the sum of the deviations on the other side.

From our first example:

Example: 3, 6, 6, 7, 8, 11, 15, 16

The deviations are:

6 + 3 + 3 + 2 + 1 = 2 + 6 + 7

15 = 15

Likewise:

Example: Dogs

Deviations left of mean: 224 + 94 = 318

Deviations right of mean: 206 + 76 + 36 = 318

If they are not equal ... you may have made a msitake!

Standard Deviation and Variance

Deviation just means how far from the normal

Standard Deviation

The Standard Deviation is a measure of how spread out numbers are.

Its symbol is σ (the Greek letter sigma)

The formula is easy: it is the square root of the Variance. So now you ask,

"What is the Variance?"

Variance

The Variance is defined as:

The average of the squared differences from the Mean.

To calculate the variance follow these steps:

Work out the Mean (the simple average of the numbers)

Then for each number: subtract the Mean and square the

result (the squared difference).

Then work out the average of those squared differences. (Why

Square?)

Example

You and your friends have just measured the heights of your dogs (in

millimeters):

The heights (at the shoulders) are: 600mm, 470mm, 170mm, 430mm and

300mm.

Find out the Mean, the Variance, and the Standard Deviation.

Your first step is to find the Mean:

Answer:

Mean = (600 + 470 + 170 + 430 + 300) / 5 = 1970 / 5 = 394

so the mean (average) height is 394 mm. Let's plot this on the chart:

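Now, we calculate each dog's difference from the Mean:

206, 76, -224, 36, -94

To calculate the Variance, take each difference, square it, and then average the result:

Variance = (206² + 76² + (-224)² + 36² + (-94)²) / 5

= (42,436 + 5,776 + 50,176 + 1,296 + 8,836) / 5

= 108,520 / 5

So, the Variance is 21,704.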

And the Standard Deviation is just the square root of Variance, so:

Standard Deviation: σ = √21,704 = 147.32... = 147 (to the

nearest mm)
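As a quick check, here is the dog example worked step by step in Python. It is only a sketch of the population calculation described above; the variable names are made up for this example.

```python
from math import sqrt

heights = [600, 470, 170, 430, 300]  # dog heights in mm

mean = sum(heights) / len(heights)                 # 394.0
squared_diffs = [(h - mean) ** 2 for h in heights]
variance = sum(squared_diffs) / len(heights)       # average squared difference
std_dev = sqrt(variance)

print(mean, variance, round(std_dev))  # 394.0 21704.0 147
```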

And the good thing about the Standard Deviation is that it is useful. Now we

can show which heights are within one Standard Deviation (147mm) of the

Mean:

So, using the Standard Deviation we have a "standard" way of knowing what

is normal, and what is extra large or extra small.

Rottweilers are tall dogs. And Dachshunds are a bit short ... but don't tell

them!

But ... there is a small change with Sample Data

Our example was for a Population (the 5 dogs were the only dogs we were

interested in).

But if the data is a Sample (a selection taken from a bigger Population),

then the calculation changes!

When you have "N" data values that are:

The Population: divide by N when calculating Variance (like

we did)

A Sample: divide by N-1 when calculating Variance

All other calculations stay the same, including how we calculated the mean.

Example: if our 5 dogs were just a sample of a bigger population of

dogs, we would divide by 4 instead of 5 like this:

Sample Variance = 108,520 / 4 = 27,130

Sample Standard Deviation = √27,130 = 164.71... = 165 (to the nearest mm)

Think of it as a "correction" when your data is only a sample.
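Python's standard statistics module has both versions built in, which makes the difference easy to see: pvariance and pstdev divide by N, while variance and stdev divide by N-1. A small sketch with the dog heights:

```python
import statistics

heights = [600, 470, 170, 430, 300]  # dog heights in mm

population_variance = statistics.pvariance(heights)  # divides by N
sample_variance = statistics.variance(heights)       # divides by N - 1

population_sd = statistics.pstdev(heights)
sample_sd = statistics.stdev(heights)

print(population_variance, sample_variance)            # 21704 27130
print(round(population_sd, 2), round(sample_sd, 2))    # 147.32 164.71
```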

Formulas

Here are the two formulas, explained at Standard Deviation Formulas if you

want to know more:

The "Population Standard Deviation":

σ = √[ (1/N) × Σ(xi - μ)² ]

The "Sample Standard Deviation":

s = √[ (1/(N-1)) × Σ(xi - x̄)² ]

Looks complicated, but the important change is to

divide by N-1 (instead of N) when calculating a Sample Variance.

*Footnote: Why square the differences?

If we just added up the differences from the mean ... the negatives would

cancel the positives:

(4 + 4 - 4 - 4) / 4 = 0

So that won't work. How about we use absolute values?

(|4| + |4| + |-4| + |-4|) / 4 = (4 + 4 + 4 + 4) / 4 = 4

That looks good (and is the Mean Deviation), but what about this case:

(|7| + |1| + |-6| + |-2|) / 4 = (7 + 1 + 6 + 2) / 4 = 4

Oh No! It also gives a value of 4, Even though the differences are more

spread out!

So let us try squaring each difference (and taking the square root at the

end):

√[ (4² + 4² + 4² + 4²) / 4 ] = √(64 / 4) = 4

√[ (7² + 1² + 6² + 2²) / 4 ] = √(90 / 4) = 4.74...

That is nice! The Standard Deviation is bigger when the differences are more

spread out ... just what we want!
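The footnote's comparison can be reproduced with a few lines of Python. This is an illustrative sketch only; the helper names mean_abs and root_mean_square are made up for this example.

```python
from math import sqrt


def mean_abs(diffs):
    """Average of the absolute differences."""
    return sum(abs(d) for d in diffs) / len(diffs)


def root_mean_square(diffs):
    """Square each difference, average, then take the square root."""
    return sqrt(sum(d * d for d in diffs) / len(diffs))


even = [4, 4, -4, -4]
spread = [7, 1, -6, -2]

print(mean_abs(even), mean_abs(spread))        # 4.0 4.0 (cannot tell them apart)
print(root_mean_square(even))                  # 4.0
print(round(root_mean_square(spread), 2))      # 4.74 (bigger, as hoped)
```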

In fact this method is a similar idea to distance between points, just applied

in a different way.

And it is easier to use algebra on squares and square roots than absolute

values, which makes the standard deviation easy to use in other areas of

mathematics.

Standard Deviation Formulas

Deviation just means how far from the normal

Standard Deviation

The Standard Deviation is a measure of how spread out numbers are.

You might like to read this simpler page on Standard Deviation first.

But here we explain the formulas.

The symbol for Standard Deviation is σ (the Greek letter sigma).

This is the formula for Standard Deviation:

σ = √[ (1/N) × Σ(xi - μ)² ]

Say what? Please explain!

OK. Let us explain it step by step.

Say you have a bunch of numbers like 9, 2, 5, 4, 12, 7, 8, 11.

To calculate the standard deviation of those numbers:

1. Work out the Mean (the simple average of the numbers)

2. Then for each number: subtract the Mean and square the

result

3. Then work out the mean of those squared differences.

4. Take the square root of that and you are done!

The formula actually says all of that, and I will show you how.

The Formula Explained

First, let us have some example values to work on:

Example: Sam has 20 Rose Bushes.

The number of flowers on each bush is

9, 2, 5, 4, 12, 7, 8, 11, 9, 3, 7, 4, 12, 5, 4, 10, 9, 6, 9, 4

Work out the Standard Deviation.

Step 1. Work out the mean

In the formula above μ (the Greek letter "mu") is the mean of all our values ...

Example: 9, 2, 5, 4, 12, 7, 8, 11, 9, 3, 7, 4, 12, 5, 4, 10, 9,

6, 9, 4

The mean is:

(9+2+5+4+12+7+8+11+9+3+7+4+12+5+4+10+9+6+9+4) / 20 =

140/20 = 7

So:

μ = 7

Step 2. Then for each number: subtract the Mean and square the

result

This is the part of the formula that says:

(xi - μ)²

So what is xi? They are the individual x values 9, 2, 5, 4, 12, 7, etc...

In other words x1 = 9, x2 = 2, x3 = 5, etc.

So it says "for each value, subtract the mean and square the result", like this

Example (continued):

(9 - 7)² = (2)² = 4

(2 - 7)² = (-5)² = 25

(5 - 7)² = (-2)² = 4

(4 - 7)² = (-3)² = 9

(12 - 7)² = (5)² = 25

(7 - 7)² = (0)² = 0

(8 - 7)² = (1)² = 1

... etc ...

Step 3. Then work out the mean of those squared differences.

To work out the mean, add up all the values then divide by how many.

First add up all the values from the previous step.

But how do we say "add them all up" in mathematics? We use "Sigma": Σ

The handy Sigma Notation says to sum up as many terms as we want:

Sigma Notation

We want to add up all the values from 1 to N, where N=20 in our case

because there are 20 values:

Example (continued):

Σ(xi - 7)², for i = 1 to 20

Which means: Sum all values from (x1 - 7)² to (xN - 7)²

We already calculated (x1 - 7)² = 4 etc. in the previous step, so just sum them up:

= 4+25+4+9+25+0+1+16+4+16+0+9+25+4+9+9+4+1+4+9

= 178

But that isn't the mean yet, we need to divide by how many, which is

simply done by multiplying by "1/N":

Example (continued):

Mean of squared differences = (1/20) × 178 = 8.9

(Note: this value is called the "Variance")

Step 4. Take the square root of that and you are done!

Example (concluded):

σ = √(8.9) = 2.983...

DONE!
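The four steps can be followed in Python, one line per step. This is a sketch for the rose bush example only; the variable names are made up here.

```python
from math import sqrt

flowers = [9, 2, 5, 4, 12, 7, 8, 11, 9, 3, 7, 4, 12, 5, 4, 10, 9, 6, 9, 4]
n = len(flowers)                                  # N = 20

mu = sum(flowers) / n                             # Step 1: the mean
squared_diffs = [(x - mu) ** 2 for x in flowers]  # Step 2: (xi - mu) squared
variance = sum(squared_diffs) / n                 # Step 3: mean of those
sigma = sqrt(variance)                            # Step 4: square root

print(mu, sum(squared_diffs), variance, round(sigma, 3))  # 7.0 178.0 8.9 2.983
```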

Sample Standard Deviation

But wait, there is more ...

... sometimes your data is only a sample of the whole population.

Example: Sam has 20 rose bushes, but what if Sam only

counted the flowers on 6 of them?

The "population" is all 20 rose bushes,

and the "sample" is the 6 he counted. Let us say they were:

9, 2, 5, 4, 12, 7

We can still estimate the Standard Deviation.

But when you use the sample as an estimate of the whole population,

the Standard Deviation formula changes to this:

The formula for Sample Standard Deviation:

s = √[ (1/(N-1)) × Σ(xi - x̄)² ]

The important change is "N-1" instead of "N" (which is called

"Bessel's correction").

The symbols also change to reflect that we are working on a sample instead

of the whole population:

The mean is now x̄ (for sample mean) instead of μ (the population mean),

And the answer is s (for Sample Standard Deviation) instead of σ.

But that does not affect the calculations. Only N-1 instead of N changes

the calculations.

OK, let us now calculate the Sample Standard Deviation:

Step 1. Work out the mean

Example 2: Using sampled values 9, 2, 5, 4, 12, 7

The mean is (9+2+5+4+12+7) / 6 = 39/6 = 6.5

So:

x̄ = 6.5

Step 2. Then for each number: subtract the Mean and square the

result

Example 2 (continued):

(9 - 6.5)² = (2.5)² = 6.25

(2 - 6.5)² = (-4.5)² = 20.25

(5 - 6.5)² = (-1.5)² = 2.25

(4 - 6.5)² = (-2.5)² = 6.25

(12 - 6.5)² = (5.5)² = 30.25

(7 - 6.5)² = (0.5)² = 0.25

Step 3. Then work out the mean of those squared differences.

To work out the mean, add up all the values then divide by how many.

But hang on ... we are calculating the Sample Standard Deviation, so

instead of dividing by how many (N), we are going to divide by N-1

Example 2 (continued):

Sum = 6.25 + 20.25 + 2.25 + 6.25 + 30.25 + 0.25 = 65.5

Divide by N-1: (1/5) × 65.5 = 13.1

(This value is called the "Sample Variance")

Step 4. Take the square root of that and you are done!

Example 2 (concluded):

s = √(13.1) = 3.619...

DONE!

Comparing

When we used the whole population we got: Mean = 7, Standard Deviation

= 2.983...

When we used the sample we got: Sample Mean = 6.5, Sample Standard

Deviation = 3.619...

Our Sample Mean was wrong by 7%, and our Sample Standard Deviation

was wrong by 21%.
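The comparison can be reproduced with Python's standard statistics module, where pstdev divides by N and stdev divides by N-1. A small sketch using the rose bush numbers (variable names are made up for this example):

```python
import statistics

population = [9, 2, 5, 4, 12, 7, 8, 11, 9, 3, 7, 4, 12, 5, 4, 10, 9, 6, 9, 4]
sample = [9, 2, 5, 4, 12, 7]  # the 6 bushes Sam counted

pop_mean = statistics.mean(population)
pop_sd = statistics.pstdev(population)   # divide by N
sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)     # divide by N - 1

# How far off the sample estimates are, as a percentage of the true values
mean_error = 100 * abs(sample_mean - pop_mean) / pop_mean
sd_error = 100 * abs(sample_sd - pop_sd) / pop_sd

print(sample_mean, round(sample_sd, 3))      # 6.5 3.619
print(round(mean_error), round(sd_error))    # 7 21
```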

Why Would We Take a Sample?

Mostly because it is easier and cheaper.

Imagine you want to know what the whole country thinks ... you can't

ask millions of people, so instead you ask maybe 1,000 people.

There is a nice quote (supposed to be by Samuel Johnson):

"You don't have to eat the whole ox to know that the meat is tough."

This is the essential idea of sampling. To find out information about the

population (such as mean and standard deviation), we do not need to look

at all members of the population; we only need a sample.

But when we take a sample, we lose some accuracy.

Summary

The Population Standard Deviation:

σ = √[ (1/N) × Σ(xi - μ)² ]

The Sample Standard Deviation:

s = √[ (1/(N-1)) × Σ(xi - x̄)² ]

Univariate and Bivariate Data

Univariate: one variable, Bivariate: two variables

Univariate means "one variable" (one type of data)

Example: Travel Time (minutes): 15, 29, 8, 42, 35, 21, 18, 42, 26

The variable is Travel Time

We can do lots of things with univariate data:

Find a central value using mean, median and mode

Find how spread out it is using range, quartiles and standard deviation

Make plots like Bar Graphs, Pie Charts and Histograms

Bivariate means "two variables", in other words there are two types of data

With bivariate data you have two sets of related data that you want

to compare:

Example:

An ice cream shop keeps track of how much ice cream they sell versus

the temperature on that day.

The two variables are Ice Cream Sales and Temperature.

Here are their figures for the last 12 days:

Ice Cream Sales vs Temperature

Temperature °C Ice Cream Sales

14.2° $215

16.4° $325

11.9° $185

15.2° $332

18.5° $406

22.1° $522

19.4° $412

25.1° $614

23.4° $544

18.1° $421

22.6° $445

17.2° $408

And here is the same data as a Scatter Plot:

It is now easy to see that warmer weather leads to more sales, but

the relationship is not perfect.

So with bivariate data we are interested in comparing the two sets of data and finding any relationships.

We can use Tables, Scatter Plots, Correlation, Line of Best Fit, and plain old

common sense.

Scatter Plots

A graph of plotted points that show the

relationship between two sets of data.

In this example, each dot represents one person's

weight versus their height.

(The data is plotted on the graph as "Cartesian

(x,y) Coordinates")

Example:

The local ice cream shop keeps track of how much ice cream they sell versus

the temperature on that day. Here are their figures for the last 12 days:

Ice Cream Sales vs Temperature

Temperature °C Ice Cream Sales

14.2° $215

16.4° $325

11.9° $185

15.2° $332

18.5° $406

22.1° $522

19.4° $412

25.1° $614

23.4° $544

18.1° $421

22.6° $445

17.2° $408

And here is the same data as a Scatter Plot:

It is now easy to see that warmer weather leads to more sales, but the

relationship is not perfect.

Line of Best Fit

You can also draw a "Line of Best Fit" (also called a "Trend Line") on your

scatter plot:

Try to have the line as close as possible to all points, and as many points

above the line as below.

Example: Sea Level Rise

A Scatter Plot of Sea

Level Rise:

And here I have drawn

on a "Line of Best Fit".

Correlation

When the two sets of data are strongly linked together we say they have

a High Correlation.

The word Correlation is made of Co- (meaning "together"), and Relation

Correlation is Positive when the values increase together, and

Correlation is Negative when one value decreases as the other

increases

Like this:

Negative Correlation

Correlations can be negative, which means there is a correlation but one

value goes down as the other value increases.

Example : Birth Rate vs Income

The birth rate tends to be lower in richer

countries.

Below is a scatter plot for about 100 different

countries.

Country       Yearly Production per Person    Birth Rate
Madagascar    $800                            5.70
India         $3,100                          2.85
Mexico        $9,600                          2.49
Taiwan        $25,300                         1.57
Norway        $40,000                         1.78

It has a negative correlation (the line slopes down)

Note: I tried to fit a straight line to the data, but maybe a curve would work

better, what do you think?

Outliers

"Outliers" are values that "lie outside" the other values.

When we collect data, sometimes

there are values that are "far away"

from the main group of data ... what

do we do with them?

Example: Long Jump

A new coach has been working with the Long Jump team this month,

and the athletes' performance has changed. Augustus can now jump

0.15m further, June and Carol can jump 0.06m further.

Here are all the results:

Augustus: +0.15m

Tom: +0.11m

June: +0.06m

Carol: +0.06m

Bob: + 0.12m

Sam: -0.56m

Oh no! Sam got worse.

Here are the results on the number line:

The mean is:

(0.15+0.11+0.06+0.06+0.12-0.56) / 6 = -0.06 / 6 = -0.01m

So, on average the performance went DOWN.

The coach is obviously useless ... right?

Sam's result is an "Outlier" ... what if we remove Sam's result?

Example: Long Jump (continued)

Let us try the results WITHOUT Sam:

Mean = (0.15+0.11+0.06+0.06+0.12)/5 = 0.50/5 = 0.10m

Hey, the coach looks much better now!

But is that fair? Can we just get rid of values we don't like?

What To Do?

You need to think "why is that value over there?"

It may be quite normal to have high or low values

People can be short or tall

Some days there is no rain, other days there can be a downpour

Athletes can perform better or worse on different days

Or there may be an unusual reason for extreme data

Example: Long Jump (continued)

We find out that Sam was feeling sick that day. Not the coach's fault at

all.

So it is a good idea in this case to remove Sam's result.

When you remove outliers YOU are influencing the data, it is no longer

"pure", so you shouldn't just get rid of the outliers without a good reason!

And when you do get rid of them, explain what you are doing and why.

Mean, Median and Mode

We saw how outliers affect the mean, but what about the median or mode?

Example: Long Jump (continued)

The median ("middle" value):

including Sam is: 0.085

without Sam is: 0.11 (went up a little)

The mode (the most common value):

including Sam is: 0.06

without Sam is: 0.06 (stayed the same)

The mode and median didn't change very much.

They also stayed around where most of the data is.

So it seems that outliers have the biggest effect on the mean, and not so

much on the median or mode.

Hint: calculate the median and mode when you have outliers.
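To see the effect for yourself, here is a small Python sketch using the standard statistics module and the long jump results (the dictionary layout is just one convenient way to hold them):

```python
from statistics import mean, median, mode

changes = {
    "Augustus": 0.15, "Tom": 0.11, "June": 0.06,
    "Carol": 0.06, "Bob": 0.12, "Sam": -0.56,
}

with_sam = list(changes.values())
without_sam = [v for name, v in changes.items() if name != "Sam"]

for label, data in (("with Sam", with_sam), ("without Sam", without_sam)):
    # Rounded only to hide tiny floating-point noise in the printout
    print(label, round(mean(data), 2), round(median(data), 3), mode(data))
```

The mean swings from negative to positive when Sam is removed, while the median moves only slightly and the mode does not move at all.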

Correlation

When two sets of data are strongly linked together we say they have a High

Correlation.

The word Correlation is made of Co- (meaning "together"), and Relation

Correlation is Positive when the values increase together, and

Correlation is Negative when one value decreases as the other

increases

Like this:

Correlation can have a value:

1 is a perfect positive correlation

0 is no correlation (the values don't seem linked at all)

-1 is a perfect negative correlation

The value shows how good the correlation is (not how steep the line is),

and if it is positive or negative.

Example: Ice Cream Sales

The local ice cream shop keeps track of how much ice cream they sell versus

the temperature on that day, here are their figures for the last 12 days:

Ice Cream Sales vs Temperature

Temperature °C Ice Cream Sales

14.2° $215

16.4° $325

11.9° $185

15.2° $332

18.5° $406

22.1° $522

19.4° $412

25.1° $614

23.4° $544

18.1° $421

22.6° $445

17.2° $408

And here is the same data as a Scatter Plot:

You can easily see that warmer weather leads to more sales, the relationship

is good but not perfect.

In fact the correlation is 0.9575 ... see at the end how I calculated

it.

Correlation Is Not Good at Curves

The correlation calculation only works well for relationships that follow a

straight line.

Our Ice Cream Example: there has been a heat wave!

It gets so hot that people aren't going near the shop, and sales start

dropping.

Here is the latest graph:

The correlation is now 0: "No Correlation" ... !

The calculated value of correlation is 0 (trust me, I worked it out),

which says there is "no correlation".

But we can see the data follows a nice curve that reaches a peak

around 25° C. But the correlation calculation is not "smart" enough to

see this.

Moral of the story: make a Scatter Plot, and look at it!

You may see more than the correlation value says.

Correlation Is Not Causation

"Correlation Is Not Causation" ... by that I mean: when there is a correlation

it does not mean that one thing causes the other

Example: Sunglasses vs Ice Cream

Our Ice Cream shop finds how many sunglasses were sold by a big

store for each day and compares them to their ice cream sales:

The correlation between Sunglasses and Ice Cream sales is

high

Does this mean that sunglasses make people want ice cream?

How To Calculate

How did I calculate the value 0.9575 at the top?

I used "Pearson's Correlation". There is software that can calculate it for you,

such as the CORREL() function in Excel or OpenOffice Calc ...

... but here is how to calculate it yourself:

Let us call the two sets of data "x" and "y" (in our case Temperature is x and

Ice Cream Sales is y):

Step 1: Find the mean of x, and the mean of y

Step 2: Subtract the mean of x from every x value (call them "a"), do

the same for y (call them "b")

Step 3: Calculate: a × b, a² and b² for every value

Step 4: Sum up a × b, sum up a² and sum up b²

Step 5: Divide the sum of a × b by the square root of [(sum of a²) × (sum of b²)]

Here is how I calculated the first Ice Cream example (values rounded to 1 or

0 decimal places):

As a formula it is:

Correlation = Σ(x - x̄)(y - ȳ) / √[ Σ(x - x̄)² × Σ(y - ȳ)² ]

Where:

Σ is Sigma, the symbol for "sum up"

(x - x̄) is each x-value minus the mean of x (called "a" above)

(y - ȳ) is each y-value minus the mean of y (called "b" above)

You probably won't have to calculate it like that, but at least you know it is

not "magic", but simply a routine set of calculations.
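The five steps can also be written as a short Python sketch. The function name pearson_correlation is made up for this example, and the data is the ice cream table from earlier; unlike a hand calculation, nothing is rounded along the way.

```python
from math import sqrt


def pearson_correlation(xs, ys):
    """Pearson's correlation, following Steps 1 to 5."""
    mean_x = sum(xs) / len(xs)                     # Step 1
    mean_y = sum(ys) / len(ys)
    a = [x - mean_x for x in xs]                   # Step 2
    b = [y - mean_y for y in ys]
    sum_ab = sum(ai * bi for ai, bi in zip(a, b))  # Steps 3 and 4
    sum_a2 = sum(ai ** 2 for ai in a)
    sum_b2 = sum(bi ** 2 for bi in b)
    return sum_ab / sqrt(sum_a2 * sum_b2)          # Step 5


temperature = [14.2, 16.4, 11.9, 15.2, 18.5, 22.1,
               19.4, 25.1, 23.4, 18.1, 22.6, 17.2]
sales = [215, 325, 185, 332, 406, 522, 412, 614, 544, 421, 445, 408]

r = pearson_correlation(temperature, sales)
print(round(r, 4))  # about 0.9575
```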

Approximate Values

There are also approximate ways to calculate a correlation coefficient, such

as "Spearman's rank correlation coefficient", but I prefer using a spreadsheet

like above.

Probability

How likely something is to happen.

Many events can't be predicted with total certainty. The best we can say is

how likely they are to happen, using the idea of probability.

Tossing a Coin

When a coin is tossed, there are two possible outcomes:

heads (H) or

tails (T)

We say that the probability of the coin landing H is ½.

And the probability of the coin landing T is ½.

Throwing Dice

When a single die is thrown, there are six possible

outcomes: 1, 2, 3, 4, 5, 6.

The probability of any one of them is 1/6.

Probability

In general:

Probability of an event happening = Number of ways it can happen / Total number of outcomes

Example: the chances of rolling a "4" with a die

Number of ways it can happen: 1 (there is only 1 face with a "4" on

it)

Total number of outcomes: 6 (there are 6 faces altogether)

So the probability = 1/6

Example: there are 5 marbles in a bag: 4 are blue, and 1 is

red. What is the probability that a blue marble will be

picked?

Number of ways it can happen: 4 (there are 4 blues)

Total number of outcomes: 5 (there are 5 marbles in total)

So the probability = 4/5 = 0.8
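Both examples can be checked in Python. This sketch uses the standard fractions module so the answers stay as exact fractions; the function name probability is made up for this example, and it assumes every outcome is equally likely.

```python
from fractions import Fraction


def probability(ways, total_outcomes):
    """Number of ways it can happen divided by total number of outcomes."""
    return Fraction(ways, total_outcomes)


rolling_a_four = probability(1, 6)  # one face out of six
blue_marble = probability(4, 5)     # four blue marbles out of five

print(rolling_a_four)                   # 1/6
print(blue_marble, float(blue_marble))  # 4/5 0.8
```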

Probability Line

You can show probability on a Probability Line:

Probability is always between 0 and 1

Probability is Just a Guide

Probability does not tell us exactly what will happen, it is just a guide

Example: toss a coin 100 times, how many Heads will come

up?

Probability says that heads have a ½ chance, so we would expect 50

Heads.

But when you actually try it out you might get 48 heads, or 55 heads ...

or anything really, but in most cases it will be a number near 50.

Learn more at Probability Index.

Words

Some words have special meaning in Probability:

Experiment or Trial: an action where the result is uncertain.

Tossing a coin, throwing dice, seeing what pizza people choose are all

examples of experiments.

Sample Space: all the possible outcomes of an experiment

Example: choosing a card from a deck

There are 52 cards in a deck (not including Jokers)

So the Sample Space is all 52 possible cards: {Ace of Hearts, 2 of

Hearts, etc... }

The Sample Space is made up of Sample Points:

Sample Point: just one of the possible outcomes

Example: Deck of Cards

the 5 of Clubs is a sample point

the King of Hearts is a sample point

"King" is not a sample point. As there are 4 Kings that is 4 different

sample points.

Event: a single result of an experiment

Example Events:

Getting a Tail when tossing a coin is an event

Rolling a "5" is an event.

An event can include one or more possible outcomes:

Choosing a "King" from a deck of cards (any of the 4 Kings) is an event

Rolling an "even number" (2, 4 or 6) is also an event

The Sample Space is all possible

outcomes.

A Sample Point is just one possible

outcome.

And an Event can be one or more of

the possible outcomes.

Hey, let's use those words, so you get used to them:

Example: Alex decides to see how many times a "double" would come up when throwing 2 dice.

Each time Alex throws the 2 dice is an Experiment.

It is an Experiment because the result is uncertain.

The Event Alex is looking for is a "double", where both dice have the

same number. It is made up of these 6 Sample Points:

{1,1} {2,2} {3,3} {4,4} {5,5} and {6,6}

The Sample Space is all possible outcomes (36 Sample Points):

{1,1} {1,2} {1,3} {1,4} ... {6,3} {6,4} {6,5} {6,6}

These are Alex's Results:

Experiment   Is it a Double?
{3,4}        No
{5,1}        No
{2,2}        Yes
{6,3}        No
...          ...

After 100 Experiments, Alex had 19 "double" Events ... is that close

to what you would expect?
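One way to answer that question is to count the Sample Points directly. Here is a small Python sketch (the names `sample_space` and `doubles` are just chosen for illustration) that lists all 36 outcomes, picks out the doubles, and works out how many doubles to expect in 100 throws:

```python
from fractions import Fraction
from itertools import product

# Sample Space: every ordered pair (first die, second die)
sample_space = list(product(range(1, 7), repeat=2))

# Event "double": both dice show the same number
doubles = [point for point in sample_space if point[0] == point[1]]

p_double = Fraction(len(doubles), len(sample_space))
expected_in_100 = float(100 * p_double)

print(len(sample_space))           # 36 sample points
print(doubles)                     # the 6 doubles
print(p_double)                    # 1/6
print(round(expected_in_100, 1))   # 16.7
```

So theory suggests about 17 doubles in 100 throws, and Alex's 19 is close to that.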

Probability Line

Probability is the chance that something will happen. It can be shown on a

line.

The probability of an event occurring is somewhere between impossible and

certain.

As well as words we can use numbers (such as fractions or decimals) to show

the probability of something happening:

Impossible is zero

Certain is one.

Here are some fractions on the probability line:

We can also show the chance that something will happen:

a) The sun will rise tomorrow.

b) I will not have to learn mathematics at school.

c) If I flip a coin it will land heads up.

d) Choosing a red ball from a sack with 1 red ball and 3 green balls

Between 0 and 1

The probability of an event will not be less than 0.

This is because 0 is impossible (sure that something will not

happen).

The probability of an event will not be more than 1.

This is because 1 is certain (sure that something will happen).

The Basic Counting Principle

When there are m ways to do one thing,

and n ways to do another,

then there are m×n ways of doing both.

Example: you have 3 shirts and 4 pants.

That means 3×4=12 different outfits.

Example: There are 6 flavors of ice-cream, and 3 different cones.

That means 6×3=18 different single-scoop ice-creams you could order.

It also works when you have more than 2 choices:

Example: You are buying a new car.

There are 2 body styles:

sedan or hatchback

There are 5 colors available:

There are 3 models: GL (standard model),

SS (sports model with bigger engine)

SL (luxury model with leather seats)

How many total choices?

You can see in this "tree" diagram:

You can count the choices, or just do the simple calculation:

Total Choices = 2 × 5 × 3 = 30
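If you would rather let a computer do the counting, here is a minimal Python sketch. The five color names are placeholders (the text only says there are 5 colors), so treat them as an assumption:

```python
from itertools import product

body_styles = ["sedan", "hatchback"]
# Placeholder color names: the text only tells us there are 5 of them
colors = ["black", "white", "red", "blue", "silver"]
models = ["GL", "SS", "SL"]

# Every combination of one body style, one color and one model
all_choices = list(product(body_styles, colors, models))

print(len(all_choices))                               # 30
print(len(body_styles) * len(colors) * len(models))   # 30
```

Listing the combinations and multiplying the counts give the same answer, which is the Basic Counting Principle at work.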

Independent or Dependent?

But it only works when all choices are independent of each other.

If one choice affects another choice (i.e. depends on another choice), then a

simple multiplication is not right.

Example: You are buying a new car ... but ...

the salesman says "You can't choose black for the hatchback" ...

well then things change!

You now have only 27 choices.

Because your choices are not independent of each other.

But you can still make your life easier with this calculation:

Choices = 5×3 + 4×3 = 15 + 12 = 27
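Here is a short Python sketch of the same idea: list every combination, then throw away the ones the salesman rules out. Only "black" comes from the example, the other color names are placeholders:

```python
from itertools import product

body_styles = ["sedan", "hatchback"]
# Only "black" is named in the example; the other colors are placeholders
colors = ["black", "white", "red", "blue", "silver"]
models = ["GL", "SS", "SL"]

# Keep every combination except a black hatchback
allowed = [
    (body, color, model)
    for body, color, model in product(body_styles, colors, models)
    if not (body == "hatchback" and color == "black")
]

print(len(allowed))    # 27
print(5 * 3 + 4 * 3)   # 27: sedan choices plus hatchback choices
```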

Relative Frequency

How often something happens divided by the total number of outcomes observed.

Example: if your team has won 9 games from a total of 12 games

played:

the Frequency of winning is 9

the Relative Frequency of winning is 9/12 = 75%

All the Relative Frequencies add up to 1 (except for any rounding error).

Example: Travel Survey

92 people were asked how they got to work:

35 used a car

42 took public transport

8 rode a bicycle

7 walked

The Relative Frequencies (to 2 decimal places) are:

Car: 35/92 = 0.38

Public Transport: 42/92 = 0.46

Bicycle: 8/92 = 0.09

Walking: 7/92 = 0.08

0.38+0.46+0.09+0.08 = 1.01

(It would be exactly 1 if we had used perfect accuracy.)
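The Travel Survey numbers can be checked with a few lines of Python. This is only a sketch, and the dictionary name `survey` is made up for the example:

```python
survey = {"Car": 35, "Public Transport": 42, "Bicycle": 8, "Walking": 7}
total = sum(survey.values())   # 92 people

# Relative Frequency = Frequency / total
relative = {mode: count / total for mode, count in survey.items()}
rounded = {mode: round(value, 2) for mode, value in relative.items()}

for mode, value in rounded.items():
    print(f"{mode}: {value:.2f}")

print(round(sum(rounded.values()), 2))     # 1.01 (rounding error)
print(round(sum(relative.values()), 10))   # 1.0 (before rounding)
```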

Try it for yourself:

Activity: An Experiment with a Die

You will need:

A single die

Interesting point

Many people think that one of these cubes is called "a dice". But no!

The plural is dice, but the singular is die. (i.e. 1 die, 2 dice.)

The common die has six faces:

We usually call the faces 1, 2, 3, 4, 5 and 6.

High, Low, and Most Likely

Before we start, let's think about what might happen.

Question: If you roll a die:

1. What is the least possible score?

2. What is the greatest possible score?

3. What do you think is the most likely score?

The first two questions are quite easy to answer:

1. The least possible score must be 1

2. The greatest possible score must be 6

3. The most likely score is ... ???

Are they all just as likely? Or will some happen more often?

Let us see which is most likely ...

The Experiment

Throw a die 60 times,

record the scores in a tally table.

You can record the results in this table using tally marks:

Score   Tally   Frequency
1
2
3
4
5
6
Total Frequency = 60

OK, Go!

... ...

Finished ...?

Now draw a bar graph to illustrate

your results.

You can fill in this one:

Or you can use Data Graphs (Bar,

Line and Pie)

then print it out.

You may get something like this:

Are the bars all the same height?

If not ... why not?

60 Throws

OK, why did I ask you to make 60 throws? Well, only 6 throws would not

give you good results, 600 throws would have been too hard, so I chose 60,

which is 10 lots of 6.

So we should expect 10 of each number, like this:

Those are the theoretical values,

as opposed to the experimental ones you got from your experiment!

How do those theoretical results compare with your experimental

results?

This graph and your graph should be similar, but they are not likely to be

exactly the same, as your experiment relied on chance, and the number of

times you did it was fairly small.

If you did the experiment a very large number of times, you would get

results much closer to the theoretical ones.
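If you don't fancy throwing a die thousands of times, a computer can pretend to do it for you. Here is a minimal Python sketch, assuming a fair die; the function name `throw_die` and the seed are just choices made for this example:

```python
import random
from collections import Counter

def throw_die(times, rng):
    """Return a Counter of scores from throwing one fair die `times` times."""
    return Counter(rng.randint(1, 6) for _ in range(times))

rng = random.Random(1)  # fixed seed so the run can be repeated
for times in (60, 60_000):
    counts = throw_die(times, rng)
    print(f"{times} throws (theory: {times // 6} of each)")
    for score in range(1, 7):
        print(f"  {score}: {counts[score]}")
```

With 60 throws the counts usually wander quite a bit around 10, while with 60,000 throws each count should sit close to 10,000.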

Questions

Which face came up most often? ____

Which face came up least often? ____

Do you think you would get the same results if you did this

again? Yes / No

An experiment gives results.

When done again it may give different results!

So it is important to know when results are good quality, or

just random.

Probability

On the page Probability you will find a formula:

Probability of an event happening = (Number of ways it can happen) / (Total number of outcomes)

Example: Probability of a 2

We know there are 6 possible outcomes.

And there is only 1 way to get a 2.

So the probability of getting 2 is:

Probability of a 2 = 1/6

Doing that for each score gets us:

Score Probability

1 1/6

2 1/6

3 1/6

4 1/6

5 1/6

6 1/6

Total = 1

The sum of all the probabilities is 1

For any experiment:

The sum of the probabilities of all possible outcomes is always equal to 1

Activity: An Experiment with Dice

We will be throwing two dice and adding the scores ...

You will need:

Two dice

Interesting point

Many people think that one of these cubes is called "a dice". But no!

The plural is dice, but the singular is die: i.e. 1 die, 2 dice.

The common die has six faces:

We usually call the faces 1, 2, 3, 4, 5 and 6.

We Will Be Throwing Two Dice and Adding the Scores ...

Example: if one die shows 2 and the other die shows 6, then the total

score would be 2 + 6 = 8

Question: Can you get a total of 8 any other way?

What about 6 + 2 = 8 (the other way around), is that a different

way?

Yes! Because the two dice are different.

Example: imagine one die is colored red and the other is

colored blue.

There are two possibilities:

So 2 + 6 and 6 + 2 are different.

And you can get 8 with other numbers, such as 3 + 5 = 8 and 4 +

4 = 8

High, Low, and Most Likely

Before we start, let's think about what might happen.

Question: If you throw 2 dice together and add the two scores:

1. What is the least possible total score?

2. What is the greatest possible total score?

3. What do you think is the most likely total score?

The first two questions are quite easy to answer:

1. The least possible total score must be 1 + 1 = 2

2. The greatest possible total score must be 6 + 6 = 12

3. The most likely total score is ... ???

Are they all just as likely? Or will some happen more often?

To help answer the third question let us try an experiment.

The Experiment

Throw two dice together 108 times,

add the scores together each time,

record the scores in a tally table.

Why 108? That seems a strange number to choose. I will explain later.

You can record the results in this table using tally marks:

Added Scores   Tally   Frequency
2
3
4
5
6
7
8
9
10
11
12
Total Frequency = 108

OK, Go!

...

...

Finished ...?

Now draw a bar graph

to show your results.

Or you can use Data

Graphs (Bar, Line and

Pie) then print it out.

You may get something

like this:

Are the bars all about the same height?

If not ... why not?

So Why Did We Get That Shape?

The explanation is simple:

There is only one way to get a total of 2 (1 + 1),

but there are six ways of getting a total of 7 (1 + 6, 2 + 5, 3

+ 4, 4 + 3, 5 + 2 and 6 + 1)

Here is a table of all possible outcomes, and the totals. The six ways of getting 7 run along the diagonal from bottom left to top right.

                         Score on One Die
                         1    2    3    4    5    6
Score on the        1    2    3    4    5    6    7
Other Die           2    3    4    5    6    7    8
                    3    4    5    6    7    8    9
                    4    5    6    7    8    9   10
                    5    6    7    8    9   10   11
                    6    7    8    9   10   11   12

You can see there is only 1 way to get 2, there are 2 ways to get 3, and so

on.

Let us count the ways of getting each total and put them in a table:

Total Score   Number of Ways to Get Score
2             1
3             2
4             3
5             4
6             5
7             6
8             5
9             4
10            3
11            2
12            1
Total = 36

Can you see the Symmetry in this table?

2 and 12 have the same number of ways = 1 each

3 and 11 have the same number of ways = 2 each

4 and 10 have the same number of ways = 3 each

5 and 9 have the same number of ways = 4 each

6 and 8 have the same number of ways = 5 each
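You can also get the computer to count the ways for you. This Python sketch (with `ways` as a made-up name) goes through all 36 outcomes and counts how many give each total:

```python
from collections import Counter
from itertools import product

# Count how many of the 36 ordered outcomes give each total
ways = Counter(a + b for a, b in product(range(1, 7), repeat=2))

for total in range(2, 13):
    print(total, ways[total])

print(sum(ways.values()))   # 36 outcomes altogether

# Symmetry: total t and total 14 - t have the same number of ways
print(all(ways[t] == ways[14 - t] for t in range(2, 13)))   # True
```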

108 Throws

OK, why 108 throws? Well, only 36 throws would not give good results, 360

throws would be good but take a long time, so 108 (which is 3 lots of

36) seemed just right.

So let's multiply all these numbers by 3 to match our total of 108:

Total Score   Expected Frequency (Ways × 3)
2             3
3             6
4             9
5             12
6             15
7             18
8             15
9             12
10            9
11            6
12            3
Total = 108

Those are the theoretical values, as opposed to the experimental ones

you got from your experiment.

The theoretical values look like this in a bar graph:

How do these theoretical results compare with your experimental

results?

This graph and your graph should be quite similar, but they are not likely to

be exactly the same, as your experiment relied on chance, and the number

of times you did it was fairly small.

If you did the experiment a very large number of times, you should get

results much closer to the theoretical ones.
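To see theory and experiment side by side without all that throwing, here is a minimal Python simulation. It assumes two fair dice, and the function name `throw_two_dice` and the seed are just choices made for this sketch:

```python
import random
from collections import Counter
from itertools import product

# Theoretical number of ways to get each total (out of 36)
ways = Counter(a + b for a, b in product(range(1, 7), repeat=2))

def throw_two_dice(times, rng):
    """Return a Counter of totals from throwing two fair dice `times` times."""
    return Counter(rng.randint(1, 6) + rng.randint(1, 6) for _ in range(times))

rng = random.Random(7)  # fixed seed so the run can be repeated
throws = 108
results = throw_two_dice(throws, rng)

print("Total  Theory  Experiment")
for total in range(2, 13):
    theory = ways[total] * throws // 36
    print(f"{total:>5}  {theory:>6}  {results[total]:>10}")
```

Try changing `throws` to a much bigger multiple of 36 and watch the Experiment column move closer to the Theory column.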

And, by the way, we've now answered the question from near the beginning

of the experiment:

What is the most likely total score?

7 has the highest bar, so 7 is the most likely total score.

Hey, is that why people talk about Lucky 7 ... ?

Probability

On the page Probability you will find a formula:

Probability of an event happening = (Number of ways it can happen) / (Total number of outcomes)

Example: Probability of a total of 2

We know there are 36 possible outcomes.

And there is only 1 way to get a total score of 2.

So the probability of getting 2 is:

Probability of a total of 2 = 1/36

Doing that for each score gets us:

Total Score   Probability
2             1/36
3             2/36
4             3/36
5             4/36
6             5/36
7             6/36
8             5/36
9             4/36
10            3/36
11            2/36
12            1/36
Total = 1

(Note: I didn't simplify the fractions)

The sum of all the probabilities is 1

For any experiment:

The sum of the probabilities of all possible outcomes is always equal to 1

Activity: Dropping a Coin onto a Grid

A few hundred years ago people enjoyed betting on coins tossed on to the

floor ... would they cross a line or not?

A man called "Buffon" (see "Buffon's Needle") started thinking about this and

worked out how to calculate the probability.

Now it is your turn to have a go!

You will need:

A small round coin,

such as a US penny, a 1c Euro or 5 Rupee.

A sheet of paper with a grid of 30 mm squares.

Steps

Measure the diameter of your coin: ____ mm

a US Penny is 19mm, a 1c Euro is 16.25mm, a Rs 5 is

23mm

Also measure the spacing of your grid (it may not print at

exactly 30mm): ____ mm

Put your sheet of paper on a flat surface such as a table top or

the floor.

From a height of about 5cm, drop the coin onto the paper and

record whether it lands:

A: Completely inside a square (not touching any grid

lines)

B: Crosses one or more lines

The exact height from which you drop the coin is not important, but don't

drop it so close to the paper that you are cheating!

If the coin rolls completely off the paper, then do not count that turn.

100 Times

Now we will drop the coin 100 times, but first ...

... what percentage do you think will land A, or B?

Make a guess (estimate) before you begin the experiment:

Your Guess for "A" (%):

Your Guess for "B" (%):

OK let's begin.

Drop the coin 100 times and record A (does not touch a line) or B (touches a

line) using Tally Marks:

Coin lands   Tally   Frequency   Percentage
A
B
Totals:              100         100%

Now draw a Bar Graph to illustrate your results. You can create one at Data

Graphs (Bar, Line and Pie).

Are the bars the same height?

Did you expect them to be?

How does the result compare with your guess?

We Can Calculate What It Should Be ...

Here are some positions for the coin to land so it does not quite touch one

of the lines:

Place your coin on your grid (like above), and then put a mark on the paper

where the center of the coin is (just a rough estimate will do).

See how the coin's center is one radius r away from a

line.

(Read about a Circle's Radius and Diameter.)

Make lots of "center marks" then draw a box connecting them all like below:

d = Coin's diameter (2 × r)

When a coin's center is within the yellow box it won't touch any line.

The yellow box is smaller than the grid by two radiuses (= one diameter) of

the coin.

So what are the areas?

The area of the grid square is 30 × 30 = 900 mm²

The area of the yellow box is (30-d) × (30-d) = (30-d)² mm²

The above calculation was for a 30 mm grid, but we can use S for grid size:

The area of the grid square is S × S = S² mm²

The area of the yellow box is (S-d)² mm²

Example: A 1c Euro (d=16.25 mm) on a 29mm grid (S=29

mm):

Grid Square = 29² = 841 mm²

Yellow Box = (29-16.25)² = 12.75² = 162.56 mm² (to 2 decimal places)

So you should expect the coin to land not crossing a line of the grid approximately:

"A" = 162.56 / 841 = 19.3% of the time

And "B" = 100% - 19.3% = 80.7%
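The same calculation can be written as a small Python function, and a computer can also "drop" the coin for you. This is only a sketch: it assumes the coin's center is equally likely to land anywhere in a grid square and that the coin is smaller than the square (d < S). The names `p_inside` and `simulate` are made up for the example:

```python
import random

def p_inside(S, d):
    """Theoretical chance that a coin of diameter d lands inside a square of side S."""
    return ((S - d) / S) ** 2

def simulate(S, d, drops, rng):
    """Drop the coin's center uniformly over one grid square and count case A."""
    r = d / 2
    inside = 0
    for _ in range(drops):
        x, y = rng.uniform(0, S), rng.uniform(0, S)
        # Case A: the center is at least one radius from every edge
        if r <= x <= S - r and r <= y <= S - r:
            inside += 1
    return inside / drops

S, d = 29, 16.25  # the 1c Euro example
print(f"Theory A: {p_inside(S, d):.1%}")
print(f"Simulated A: {simulate(S, d, 100_000, random.Random(3)):.1%}")
```

Put in your own S and d to check your hand calculation.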

Now do the calculations for your own grid size and coin size.

Grid Spacing S (mm):

Diameter of Coin d (mm):

Area of Grid Square = S² (mm²):

Area of Yellow Box = (S-d)² (mm²):

"A" (%):

"B" (%):

How do these theoretical results compare with your experimental

results?

It won't be exact (because it is a random thing) but it may be close.

Different Sizes of Coin

Try repeating the experiment using a different sized coin.

First calculate the theoretical value ... how does this affect the values

for A and B?

Then do the experiment to see how close it gets.

What You Have Done

You have (hopefully) had fun running an experiment.

You have done some geometry, and had some experience calculating areas

and probabilities.

And you have seen the relationship between theory and reality.

Activity: Buffon's Needle

A few hundred years ago people enjoyed betting on coins tossed on to the

floor ... would they cross a line or not?

A man called "Buffon" started thinking about this and worked out

the probability. It is called "Buffon's Needle" in his honor.

Now it is your turn to have a go!

You will need:

A match, with the head cut off.

It must be shorter than 50 mm.

(You can use a needle, but be careful!)

A sheet of paper with lines 50 mm apart.

Steps

Measure the spacing of your lines (it may not print at exactly

50mm): ____ mm

Measure the length of your match (must be less than the line

spacing): ____ mm

Make sure your sheet of paper is on a flat surface such as a

table top or the floor.

From a height of about 5cm, drop the match onto the paper

and record whether it lands:

A: Not touching a line

B: Touching or crossing a line

The exact height from which you drop the match is not important, but don't

drop it so close to the paper that you are cheating!

If the match rolls completely off the paper, then do not count that turn.

100 Times

Now we will drop the match 100 times, but first ...

... what percentage do you think will land A, or B?

Make a guess (estimate) before you begin the experiment:

Your Guess for "A" (%):

Your Guess for "B" (%):

OK let's begin.

Drop the match 100 times and record A (does not touch a grid line)

or B (touches or crosses a grid line) using Tally Marks:

Match lands    Tally   Frequency   Percentage
A (no touch)
B (crosses)
Totals:                100         100%

Now draw a Bar Graph to illustrate your results. You can create one at Data

Graphs (Bar, Line and Pie).

Are the bars the same height?

Did you expect them to be?

How does the result compare with your guess?

We Can Calculate What It Should Be ...

Buffon used the results from his experiment with a needle to estimate the

value of π (Pi). He worked out this formula:

π ≈ 2L / (xp)

Where

L is the length of the needle (or match in our case)

x is the line spacing (50 mm for us)

p is the proportion of needles crossing a line (case B)

But today we are going to "change the subject" of the formula to work out

the "p" (the proportion of B):

Start with: π ≈ 2L/(xp)

multiply both sides by p: πp ≈ 2L/x

divide both sides by π: p ≈ 2L/(πx)

And we get:

p ≈ 2L/(πx)

Example: John had a match of length 36 mm, and a 50 mm

line spacing.

So John has:

L = 36

x = 50

Substituting these values into the formula, John got:

p ≈ (2 × 36) / (π × 50) = 0.458... ≈ 0.46

So John should expect the match to cross the line (case B) 46 times out

of 100
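Here is a minimal Python sketch of John's calculation, together with a simulated "drop". It assumes the match's center and angle are both completely random, and that the match is no longer than the line spacing. The names `p_cross` and `simulate` are made up for the example:

```python
import math
import random

def p_cross(L, x):
    """Theoretical chance that a needle of length L crosses lines spaced x apart (L <= x)."""
    return 2 * L / (math.pi * x)

def simulate(L, x, drops, rng):
    """Drop the needle `drops` times and return the proportion that cross a line."""
    crosses = 0
    for _ in range(drops):
        center = rng.uniform(0, x / 2)        # distance from center to nearest line
        angle = rng.uniform(0, math.pi / 2)   # angle between needle and the lines
        if center <= (L / 2) * math.sin(angle):
            crosses += 1
    return crosses / drops

L, x = 36, 50  # John's match and line spacing
print(round(p_cross(L, x), 2))   # 0.46
print(simulate(L, x, 100_000, random.Random(5)))
```

The simulated proportion will not match the formula exactly, but with this many drops it should be close.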

Fill in the following table using your own results:

Length of match "L" (mm):

Line Spacing "x" (mm):

Estimate for p (= 2L/(πx)):

How close were you?

It won't be exact (because it is a random thing) but it may be close.

Different Size of Match

Try repeating the experiment using a different sized match (but not larger than the line spacing!)

Did you get better or worse results?

What You Have Done

You have (hopefully) had fun running an experiment.

You have had some experience with calculations.

And you have seen the relationship between theory and reality.

Random Words

Probability and English ... what a mix!

Random Letters

You would think it was easy to create random words ... just pick letters

randomly and put them together, and voila! a random word.

Well, here are 20 words made that way:

tldkl oewkx dmwol vuptg hvwjk naqid avypr zwtip

zgnzs bvdhd

muyfd ighgd xhlng oyecn vjnsl ssjrx gxald tukxj

rvfoq yxzxq

It turns out that the words are not only nonsense, but quite hard to

pronounce!

(Try saying "tldkl" or "oewkx")

You see, the probability of making a real word this way is very low ... you would have to try lots of random combinations before getting lucky.

Why? Well, English has around 200,000 words (228,000 in the Oxford

English Dictionary, including many words no longer used) ... but how many

different words can be made with just 5 letters?

26 × 26 × 26 × 26 × 26 = 11,881,376 possible 5 letter words!

And that is just the 5 letter words ...

Let us guess that there are 40,000 words in English that have 5 letters. So

the probability of making a real word just randomly would be:

40,000 / 11,881,376 = 0.003, or about 0.3% chance

So real words are rare. And we can see that putting random letters

together is very unlikely to produce a real word.
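Those two numbers are easy to check in Python. Remember that the 40,000 is only the guess made above, not a measured figure:

```python
possible = 26 ** 5     # every 5-letter combination of 26 letters
real_words = 40_000    # the guess made in the text, not a measured figure

print(f"{possible:,}")                   # 11,881,376
print(round(real_words / possible, 3))   # 0.003
```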

Vowels

We can improve our success by insisting that a word have at least one vowel,

since nearly every word in English has one (except fly, by and a few others).

Like this:

ectot gjaqv kuifg vzicu zspsu pdidb wqdis uerrs

ucgej okimw

fnevz ewxko ljgew aglgo jpfoq dcytu uwkcj dzioy

wekdx xuybk

This is a great improvement. More words can be pronounced.

But there are still lots of strange words like "zspsu" and "xuybk"

Letter Frequency

So, our next improvement is to use fewer of the letters like j, x, z and q

and more of the letters like e, t and s.

In fact the frequency of letters in the English Language is well known.

Here is how many times you would expect to see a letter in every 1,000

letters:

a   b   c   d   e   f   g   h   i   j   k   l   m   n   o   p   q   r   s   t   u   v   w   x   y   z
82  15  28  42  127 22  20  61  70  2   8   40  24  67  75  19  1   60  63  90  27  10  24  2   20  1

Can you see that "e" is common, but "z" is rare?

"e" is likely to occur 127 times in every 1,000, or as a ratio 127/1000 = 0.127 (= 12.7%)

"z" is likely to occur only 1 time in every 1,000, or as a ratio 1/1000 = 0.001 (= 0.1%)

So, by selecting letters based on that frequency (a bit like rolling a 1,000-sided die that has 82 a's, 15 b's ... and only one z), we

can get output like this:

elnao etgov segty laast aessn siuon oenha eaoas

ncoot ctwka

dmswo dpuoh eewis ebdni laarm syucs idvos lhina

igahh soyie

Still no real words, but some are close. And most of them can be

pronounced. (Great names if you are writing a science fiction novel!)

Try For Yourself!

You can try all three methods here ... see if you can get lucky and find a real

word:
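Here is a minimal Python sketch of the three methods. The function names are made up for this example, and the "at least one vowel" method is done by simply trying again until a vowel turns up, which is one possible way to do it (an assumption, as the text does not say how). The letter frequencies are the per-1,000 figures from the table above:

```python
import random
import string

LETTERS = string.ascii_lowercase
VOWELS = "aeiou"
# Occurrences per 1,000 letters, in order a to z
FREQUENCY = [82, 15, 28, 42, 127, 22, 20, 61, 70, 2, 8, 40, 24,
             67, 75, 19, 1, 60, 63, 90, 27, 10, 24, 2, 20, 1]

def random_letters(rng, length=5):
    """Method 1: every letter equally likely."""
    return "".join(rng.choice(LETTERS) for _ in range(length))

def with_vowel(rng, length=5):
    """Method 2: keep trying until the word has at least one vowel."""
    while True:
        word = random_letters(rng, length)
        if any(letter in VOWELS for letter in word):
            return word

def by_frequency(rng, length=5):
    """Method 3: pick letters in proportion to their frequency in English."""
    return "".join(rng.choices(LETTERS, weights=FREQUENCY, k=length))

rng = random.Random()
for method in (random_letters, with_vowel, by_frequency):
    print(method.__name__, [method(rng) for _ in range(5)])
```

Each run gives different words, so run it a few times and see if you get lucky.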

Probability: Complement

Complement of an Event: All outcomes that are NOT the event.

When the event is Heads, the complement is Tails

When the event is {Monday, Wednesday} the complement

is {Tuesday, Thursday, Friday, Saturday, Sunday}

When the event is {Hearts} the complement is {Spades,

Clubs, Diamonds, Jokers}

So the Complement of an event is all the other outcomes (not the ones you

want).

And together the Event and its Complement make all possible outcomes.

Probability

Probability of an event happening = (Number of ways it can happen) / (Total number of outcomes)

Example: the chances of rolling a "4" with a die

Number of ways it can happen: 1 (there is only 1 face with a "4" on it)

Total number of outcomes: 6 (there are 6 faces altogether)

So the probability = 1/6

The probability of an event is shown using "P":

P(A) means "Probability of Event A"

The complement is shown by a little ' mark such as A' (or sometimes Aᶜ or Ā):

P(A') means "Probability of the complement of Event A"

The two probabilities always add to 1

P(A) + P(A') = 1

Example: Rolling a "5" or "6"

Event A: {5, 6}

Number of ways it can happen: 2

Total number of outcomes: 6

P(A) = 2/6 = 1/3

The Complement of Event A is {1, 2, 3, 4}

Number of ways it can happen: 4

Total number of outcomes: 6

P(A') = 4/6 = 2/3

Let us add them:

P(A) + P(A') = 1/3 + 2/3 = 3/3 = 1

Yep, that makes 1

It makes sense, right? Event A plus all outcomes that are not Event A make

up all possible outcomes.

Why is the Complement Useful?

It is sometimes easier to work out the complement first.

Example. Throw two dice. What is the probability the two scores

are different?

Different scores are like getting a 2 and 3, or a 6 and 1. It is quite a long

list:

A = { (1,2), (1,3), (1,4), (1,5), (1,6),

(2,1), (2,3), (2,4), ... etc ! }

But the complement (which is when the two scores are the same) is only 6

outcomes:

A' = { (1,1), (2,2), (3,3), (4,4), (5,5), (6,6) }

And the probability is easy to work out:

P(A') = 6/36 = 1/6

Knowing that P(A) and P(A') together make 1, we can calculate:

P(A) = 1 - P(A') = 1 - 1/6 = 5/6

So in this case it's easier to work out P(A') first, then find P(A)
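You can check this by listing all 36 outcomes in Python. This sketch (with made-up names `same` and `different`) works out P(A) both ways, through the complement and by counting the long list directly:

```python
from fractions import Fraction
from itertools import product

sample_space = list(product(range(1, 7), repeat=2))

same = [p for p in sample_space if p[0] == p[1]]        # A', the complement
different = [p for p in sample_space if p[0] != p[1]]   # A

p_same = Fraction(len(same), len(sample_space))
p_different = 1 - p_same   # using P(A) = 1 - P(A')

print(p_same)        # 1/6
print(p_different)   # 5/6
print(p_different == Fraction(len(different), len(sample_space)))   # True
```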

Probability: Types of Events

Events can be Independent, Mutually Exclusive or Conditional !

Life is full of random events!

You need to get a "feel" for them to be a smart and successful person.

The toss of a coin, throw of a die and lottery draws are all examples of

random events.

Events

When we say "Event" we mean one (or more) outcomes.

Example Events:

Getting a Tail when tossing a coin is an event

Rolling a "5" is an event.

An event can include several outcomes:

Choosing a "King" from a deck of cards (any of the 4 Kings) is also an

event

Rolling an "even number" (2, 4 or 6) is an event

Independent Events

Events can be "Independent", meaning each event is not affected by any

other events.

This is an important idea! A coin does not "know" that it came up heads

before ... each toss of a coin is a perfect isolated thing.

Example: You toss a coin three times and it comes up "Heads" each

time ... what is the chance that the next toss will also be a "Head"?

The chance is simply 1/2, or 50%, just like ANY OTHER toss of

the coin.

What it did in the past will not affect the current toss!

Some people think "it is overdue for a Tail", but really truly the next toss of

the coin is totally independent of any previous tosses.

Saying "a Tail is due", or "just one more go, my luck is due" is

called The Gambler's Fallacy

(Learn more at Independent Events.)

Dependent Events

But some events can be "dependent" ... which means they can be affected

by previous events ...

Example: Drawing 2 Cards from a Deck

After taking one card from the deck there are fewer cards available, so

the probabilities change!

Let's say you are interested in the chances of getting a King.

For the 1st card the chance of drawing a King is 4 out of 52

But for the 2nd card:

If the 1st card was a King, then the 2nd card is less likely to be a King,

as only 3 of the 51 cards left are Kings.

If the 1st card was not a King, then the 2nd card is slightly more likely

to be a King, as 4 of the 51 cards left are Kings.

This is because you are removing cards from the deck.

With Replacement: When you put each card back after drawing it the chances

don't change, as the events are independent.

Without Replacement: The chances will change, and the events

are dependent.
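Here are the King numbers worked out in Python with exact fractions. It also multiplies the chances along the "King then King" path to get the chance of two Kings in a row, which is a standard next step rather than something worked out in the text above. The variable names are just for illustration:

```python
from fractions import Fraction

kings, deck = 4, 52

p_first_king = Fraction(kings, deck)
p_second_king_after_king = Fraction(kings - 1, deck - 1)   # 3 of 51
p_second_king_after_other = Fraction(kings, deck - 1)      # 4 of 51

# Without Replacement: multiply along the "King then King" path
p_two_kings = p_first_king * p_second_king_after_king
# With Replacement the second draw is again 4 of 52
p_two_kings_replaced = p_first_king * p_first_king

print(p_first_king, p_second_king_after_king, p_second_king_after_other)
print(p_two_kings)            # 1/221
print(p_two_kings_replaced)   # 1/169
```

Notice how putting the card back (1/169) gives a slightly better chance of two Kings than keeping it out (1/221).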

You can learn more about this at Dependent Events: Conditional Probability

Tree Diagrams

When you have Dependent Events it helps to make a "Tree Diagram"

Example: Soccer Game

You are off to soccer, and love being the Goalkeeper, but that depends on

who is the Coach today:

with Coach Sam your probability of being Goalkeeper is 0.5

with Coach Alex your probability of being Goalkeeper is 0.3

Sam is Coach more often ... about 6 of every 10 games (a probability

of 0.6).

Let's build the Tree Diagram!

Start with the Coaches. We know 0.6 for Sam, so it must be 0.4 for

Alex (the probabilities must add to 1):

Then fill out the branches for Sam (0.5 Yes and 0.5 No), and then for

Alex (0.3 Yes and 0.7 No):
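The finished tree can also be written out in Python. This sketch multiplies along each "Yes" branch and then adds the branches together, which is the usual way to read a tree diagram (the dictionary names are made up for the example):

```python
p_coach = {"Sam": 0.6, "Alex": 0.4}
p_goalkeeper_given_coach = {"Sam": 0.5, "Alex": 0.3}

# Multiply along each branch, then add the "Yes" branches
branches = {
    coach: p_coach[coach] * p_goalkeeper_given_coach[coach]
    for coach in p_coach
}
p_goalkeeper = sum(branches.values())

for coach, p in branches.items():
    print(f"{coach} and Goalkeeper: {p:.2f}")
print(f"Goalkeeper overall: {p_goalkeeper:.2f}")
```

So, under these numbers, you have about a 0.42 chance of being Goalkeeper today.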

Now it is neatly laid out we could calculate probabilities (read more at

"Tree Diagrams").

Mutually Exclusive

Mutually Exclusive means you can't get both events at the same

time.

It is either one or the other, but not both

Examples:

Turning left or right are Mutually Exclusive (you can't do both at the

same time)

Heads and Tails are Mutually Exclusive

Kings and Aces are Mutually Exclusive

What isn't Mutually Exclusive

Kings and Hearts are not Mutually Exclusive, because you can have a

King of Hearts!

Like here:

Aces and Kings are

Mutually Exclusive

Hearts and Kings are

not Mutually Exclusive

Read more at Mutually Exclusive Events

Probability: Independent Events

Life is full of random events!

You need to get a "feel" for them to be a smart and successful person.

The toss of a coin, throwing dice and lottery draws are all examples of

random events.

Sometimes an event can affect the next event.

Example: taking colored marbles from a bag: as you take each marble

there are fewer marbles left in the bag, so the probabilities change.

We call those Dependent Events, because what happens depends on what

happened before (learn more about this at Conditional probability).

But otherwise they are Independent Events ...

Independent Events

Independent Events are not affected by previous events.

This is an important idea!

A coin does not "know" it came up heads before ...

.... each toss of a coin is a perfect isolated thing.

Example: You toss a coin and it comes up "Heads" three

times ... what is the chance that the next toss will also be a

"Head"?

The chance is simply ½ (or 0.5) just like ANY toss of the coin.

What it did in the past will not affect the current toss!

Some people think "it is overdue for a Tail", but really truly the next toss of

the coin is totally independent of any previous tosses.

Saying "a Tail is due", or "just one more go, my luck is due" is

called The Gambler's Fallacy

Of course your luck may change, because each toss of the coin has an equal

chance.

Probability of Independent Events

"Probability" (or "Chance") is how likely something is to happen.

So how do we calculate probability?

Probability of an event happening = Number of ways it can happen

Total number of outcomes

Example: what is the probability of getting a "Head" when

tossing a coin?

Number of ways it can happen: 1 (Head)

Total number of outcomes: 2 (Head and Tail)

So the probability = 1/2 = 0.5

Example: what is the probability of getting a "5" or "6" when

rolling a die?

Number of ways it can happen: 2 ("5" and "6")

Total number of outcomes: 6 ("1", "2", "3", "4", "5" and "6")

So the probability = 2/6 = 1/3 = 0.333...

Ways of Showing Probability

Probability goes from 0 (impossible) to 1 (certain):

It is often shown as a decimal or fraction.

Example: the probability of getting a "Head" when tossing a coin:

As a decimal: 0.5

As a fraction: 1/2

As a percentage: 50%

Or sometimes like this: 1-in-2

Two or More Events

You can calculate the chances of two or more independent events

by multiplying the chances.

Example: Probability of 3 Heads in a Row

For each toss of a coin a "Head" has a probability of 0.5:

And so the chance of getting 3 Heads in a row is 0.5 × 0.5 × 0.5 = 0.125

So each toss of a coin has a ½ chance of being Heads, but lots of Heads in

a row is unlikely.

Example: Why is it unlikely to get, say, 7 heads in a row,

when each toss of a coin has a ½ chance of being Heads?

Because you are asking two different questions:

Question 1: What is the probability of 7 heads in a row?

Answer: ½×½×½×½×½×½×½ = 0.0078125 (less than 1%).

Question 2: Given that you have just got 6 heads in a row, what is

the probability that the next toss is also a head?

Answer: ½, as the previous tosses don't affect the next toss.
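The two questions can be put side by side in a few lines of Python. This is a small sketch assuming a fair coin and independent tosses, and the function name `p_heads_in_a_row` is made up for the example:

```python
def p_heads_in_a_row(n, p_head=0.5):
    """Chance of n Heads in a row from independent tosses."""
    return p_head ** n

print(p_heads_in_a_row(3))   # 0.125
print(p_heads_in_a_row(7))   # 0.0078125
# Question 2: the next toss on its own is still a single toss
print(p_heads_in_a_row(1))   # 0.5
```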

You can have a play with the Quincunx to see how lots of independent effects

can still have a pattern.

Notation

We use "P" to mean "Probability Of",

So, for Independent Events:

P(A and B) = P(A) × P(B)

Probability of A and B equals the probability of A times the probability of B

Example: you are going to a concert, and your friend says it

is some time on the weekend between 4 and 12, but won't

say more.

What are the chances it is on Sunday between 10 and 12?

Day: there are two days on the weekend, so P(Sunday) = 0.5

Time: between 4 and 12 is 8 hours, but you want between 10 and 12

which is only 2 hours:

P(Your Time) = 2/8 = 0.25

And:

P(Sunday and Your Time) = P(Sunday) × P(Your Time) = 0.5 × 0.25

= 0.125

Or a 12.5% chance

Another Example

Imagine there are two groups:

A member of each group gets randomly chosen for the winners circle,

then one of those gets randomly chosen to get the big money prize:

What is your chance of winning the big prize?

there is a 1/5 chance of going to the winners circle

and a 1/2 chance of winning the big prize

So you have a 1/5 chance followed by a 1/2 chance ... which makes a 1/10

chance overall:

(1/5) × (1/2) = 1/(5 × 2) = 1/10

Or you can calculate using decimals (1/5 is 0.2, and 1/2 is 0.5):

0.2 x 0.5 = 0.1

So your chance of winning the big money is 0.1 (which is the same as 1/10).

Coincidence!

Many "Coincidences" are, in fact, likely.

Example: you are in a room with 30 people, and find that

Zach and Anna celebrate their birthday on the same day.

Would you say "wow, how strange", or "that seems reasonable, with so

many people here"?

In fact there is a 70% chance that would happen ... so it is likely.

Why is the chance so high?

Because you are comparing everyone to everyone else (not just one to many).

And with 30 people that is 435 comparisons

(Read Shared Birthdays to find out more.)
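Where does 435 come from? It is the number of different pairs you can make from 30 people. A quick sketch (the helper name count_pairs is just for this illustration):

```python
from math import comb

def count_pairs(people):
    """Number of different pairs that can be formed from a group."""
    return comb(people, 2)   # same as people * (people - 1) / 2

print(count_pairs(30))  # 435
print(count_pairs(4))   # 6
```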

Example: Snap!

Did you ever say something the same as someone else, at the same

time too?

Wow, how amazing!

But you were probably sharing an experience (movie, journey,

whatever) and so your thoughts would be similar.

And there are only so many ways of saying something ...

... so it is like the card game "Snap!" ...

... if you speak enough words together, they will eventually match up.

So, maybe not so amazing, just simple chance at work.

Can you think of other cases where a "coincidence" was simply a likely thing?

Conclusion

Probability is: (Number of ways it can happen) / (Total number

of outcomes)

Dependent Events (such as removing marbles from a bag) are

affected by previous events

Independent events (such as a coin toss) are not affected by

previous events

You can calculate the probability of 2 or

more Independent events by multiplying

Not all coincidences are really unlikely (when you think about

them).

Conditional Probability

How to handle Dependent Events

Life is full of random events! You need to get a "feel" for them to be a smart

and successful person.

Independent Events

Events can be "Independent", meaning each event is not affected by any

other events.

Example: Tossing a coin.

Each toss of a coin is a perfect isolated thing.

What it did in the past will not affect the current toss.

The chance is simply 1-in-2, or 50%, just like ANY toss of the coin.

So each toss is an Independent Event.

Dependent Events

But events can also be "dependent" ... which means they can be affected

by previous events ...

Example: Marbles in a Bag

2 blue and 3 red marbles are in a bag.

What are the chances of getting a blue marble?

The chance is 2 in 5

But after taking one out you change the chances!

So the next time:

if you got a red marble before, then the chance of a blue marble next

is 2 in 4

if you got a blue marble before, then the chance of a blue marble next

is 1 in 4

See how the chances change each time? Each event depends on what

happened in the previous event, and is called dependent.

That is the kind of thing we will be looking at here.

"Replacement"

Note: if you had replaced the marbles in the bag each time, then the

chances would not have changed and the events would be independent:

With Replacement: the events are Independent (the chances

don't change)

Without Replacement: the events are Dependent (the chances

change)

Tree Diagram

A Tree Diagram is a wonderful way to picture what is going on, so let's build

one for our marbles example.

There is a 2/5 chance of pulling out a Blue marble, and a 3/5 chance for Red:

We can even go one step further and see what happens when we select a

second marble:

If a blue marble was selected first there is now a 1/4 chance of getting a blue

marble and a 3/4 chance of getting a red marble.

If a red marble was selected first there is now a 2/4 chance of getting a blue

marble and a 2/4 chance of getting a red marble.

Now we can answer questions like "What are the chances of drawing 2

blue marbles?"

Answer: it is a 2/5 chance followed by a 1/4 chance:

(2/5) × (1/4) = 2/20 = 1/10

Did you see how we multiplied the chances? And got 1/10 as a result.

The chances of drawing 2 blue marbles is 1/10
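We can double-check that answer by listing every possible way of drawing two marbles, one after the other, without replacement. This is a small sketch; the variable names are just for illustration.

```python
from fractions import Fraction
from itertools import permutations

bag = ["blue", "blue", "red", "red", "red"]

# Every ordered pair of two different marbles (by position in the bag)
draws = list(permutations(range(len(bag)), 2))
both_blue = [d for d in draws if bag[d[0]] == "blue" and bag[d[1]] == "blue"]

p_both_blue = Fraction(len(both_blue), len(draws))
print(len(draws), len(both_blue))   # 20 2
print(p_both_blue)                  # 1/10

# Same answer by multiplying along the branches of the tree
print(Fraction(2, 5) * Fraction(1, 4))  # 1/10
```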

Notation

We love notation in mathematics! It means we can then use the power of

algebra to play around with the ideas. So here is the notation for probability:

P(A) means "Probability Of Event A"

In our marbles example Event A is "get a Blue Marble first" with a probability

of 2/5:

P(A) = 2/5

And Event B is "get a Blue Marble second" ... but for that we have 2 choices:

If we got a Blue Marble first the chance is now 1/4

If we got a Red Marble first the chance is now 2/4

So we have to say which one we want, and use the symbol "|" to mean

"given":

P(B|A) means "Event B given Event A"

In other words, event A has already happened, now what is the chance of

event B?

P(B|A) is also called the "Conditional Probability" of B given A.

And in our case:

P(B|A) = 1/4

So the probability of getting 2 blue marbles is:

P(A) x P(B|A) = (2/5) x (1/4) = 1/10

And we write it as:

P(A and B) = P(A) x P(B|A)

"Probability of event A and event B equals

the probability of event A times the probability of event B given event A"

Let's do the next example using only notation:

Example: Drawing 2 Kings from a Deck

Event A is drawing a King first, and Event B is drawing a King second.

For the first card the chance of drawing a King is 4 out of 52

P(A) = 4/52

But after removing a King from the deck, the 2nd card drawn is less likely

to be a King (only 3 of the 51 cards left are Kings):

P(B|A) = 3/51

And so:

P(A and B) = P(A) x P(B|A) = (4/52) x (3/51) = 12/2652

= 1/221

So the chance of getting 2 Kings is 1 in 221, or about 0.5%
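Here is the same calculation as a short sketch, with a second way of counting as a cross-check (choosing 2 of the 4 Kings out of all ways to choose 2 cards from 52). The variable names are made up for this illustration.

```python
from fractions import Fraction
from math import comb

p_a = Fraction(4, 52)          # P(A): King first
p_b_given_a = Fraction(3, 51)  # P(B|A): King second, given a King first

p_two_kings = p_a * p_b_given_a
print(p_two_kings)                         # 1/221
print(round(float(p_two_kings) * 100, 2))  # 0.45 (percent)

# Cross-check by counting pairs of cards
print(Fraction(comb(4, 2), comb(52, 2)))   # 1/221
```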

Finding Hidden Data

Using Algebra we can also "change the subject" of the formula, like this:

Start with: P(A and B) = P(A) x P(B|A)

Swap sides: P(A) x P(B|A) = P(A and B)

Divide by P(A): P(B|A) = P(A and B) / P(A)

And we have another useful formula:

"The probability of event B given event A equals

the probability of event A and event B divided by the probability of event

A"

Example: Ice Cream

70% of your friends like Chocolate, and 35% like Chocolate AND like

Strawberry.

What percent of those who like Chocolate also like Strawberry?

P(Strawberry|Chocolate) = P(Chocolate and Strawberry) /

P(Chocolate)

0.35 / 0.7 = 50%

50% of your friends who like Chocolate also like Strawberry
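The rearranged formula is easy to turn into a tiny helper. This is a sketch only, and the function name conditional is an assumption for this illustration.

```python
from fractions import Fraction

def conditional(p_a_and_b, p_a):
    """P(B|A) = P(A and B) / P(A). Only makes sense when P(A) is not 0."""
    if p_a == 0:
        raise ValueError("P(A) must not be zero")
    return p_a_and_b / p_a

# Ice cream: P(Strawberry|Chocolate)
print(conditional(Fraction(35, 100), Fraction(70, 100)))  # 1/2

# Kings: P(A and B) = 1/221 and P(A) = 4/52 give back P(B|A) = 3/51
print(conditional(Fraction(1, 221), Fraction(4, 52)) == Fraction(3, 51))  # True
```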

Big Example: Soccer Game

You are off to soccer, and want to be the Goalkeeper, but that depends who

is the Coach today:

with Coach Sam the probability of being Goalkeeper is 0.5

with Coach Alex the probability of being Goalkeeper is 0.3

Sam is Coach more often ... about 6 out of every 10 games (a probability

of 0.6).

So, what is the probability you will be a Goalkeeper today?

Let's build a tree diagram. First we show the two possible coaches: Sam or

Alex:

The probability of getting Sam is 0.6, so the probability of Alex must be 0.4

(together the probability is 1)

Now, if you get Sam, there is 0.5 probability of being Goalie (and 0.5 of not

being Goalie):

If you get Alex, there is 0.3 probability of being Goalie (and 0.7 not):

The tree diagram is complete, now let's calculate the overall probabilities.

Remember that:

P(A and B) = P(A) x P(B|A)

Here is how to do it for the "Sam, Yes" branch:

0.6 × 0.5 = 0.3

(When we take the 0.6 chance of Sam being coach and include the 0.5

chance that Sam will let you be Goalkeeper we end up with a 0.3 chance.)

But we are not done yet! We haven't included Alex as Coach:

A 0.4 chance of Alex as Coach, followed by a 0.3 chance gives 0.12

And the two "Yes" branches of the tree together make:

0.3 + 0.12 = 0.42 probability of being a Goalkeeper today

(That is a 42% chance)

Check

One final step: complete the calculations and make sure they add to 1:

0.3 + 0.3 + 0.12 + 0.28 = 1

Yes, they add to 1, so that looks right.
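The whole tree can be written out as a small sketch: multiply along each branch, add up the "Yes" branches, and check that all four branches add to 1. The dictionary names are just for this illustration, and fractions are used so the sums come out exact.

```python
from fractions import Fraction

coach_probability = {"Sam": Fraction(6, 10), "Alex": Fraction(4, 10)}
goalie_given_coach = {"Sam": Fraction(5, 10), "Alex": Fraction(3, 10)}

branches = {}
for coach, p_coach in coach_probability.items():
    p_yes = goalie_given_coach[coach]
    branches[(coach, "Yes")] = p_coach * p_yes        # P(A and B) = P(A) x P(B|A)
    branches[(coach, "No")] = p_coach * (1 - p_yes)

p_goalkeeper = sum(p for (coach, answer), p in branches.items() if answer == "Yes")
print(float(p_goalkeeper))     # 0.42
print(sum(branches.values()))  # 1
```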

Friends and Random Numbers

Here is another quite different example of Conditional Probability.

4 friends (Alex, Billy, Chris and Dusty) each choose a random

number between 1 and 5. What is the chance that any of them

chose the same number?

Let's add our friends one at a time ...

First, what is the chance that Alex and Billy have the same

number?

Billy compares his number to Alex's number. There is a 1 in 5 chance of a

match.

As a tree diagram:

Note: "Yes" and "No" together make 1

(1/5 + 4/5 = 5/5 = 1)

Now, let's include Chris ...

But there are now two cases to consider:

If Alex and Billy did match, then Chris has only one number to

compare to.

But if Alex and Billy did not match then Chris has two numbers to

compare to.

And we get this:

For the top line (Alex and Billy did match) we already have a match (a

chance of 1/5).

But for the "Alex and Billy did not match" there is now a 2/5 chance of

Chris matching (because Chris gets to match his number against both Alex

and Billy).

And we can work out the combined chance by multiplying the chances it

took to get there:

Following the "No, Yes" path ... there is a 4/5 chance of No, followed

by a 2/5 chance of Yes:

(4/5) × (2/5) = 8/25

Following the "No, No" path ... there is a 4/5 chance of No, followed

by a 3/5 chance of No:

(4/5) × (3/5) = 12/25

Also notice that when you add all chances together you still get 1 (a good

check that we haven't made a mistake):

(5/25) + (8/25) + (12/25) = 25/25 = 1

Now what happens when we include Dusty?

It is the same idea, just more of it:

OK, that is all 4 friends, and the "Yes" chances together make 101/125:

Answer: 101/125

But notice something interesting ... if we had followed the "No" path we

could have skipped all the other calculations and made our life easier:

The chances of not matching are:

(4/5) × (3/5) × (2/5) = 24/125

So the chances of matching are:

1 - (24/125) = 101/125

(And we didn't really need a tree diagram for that!)
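With only 5 × 5 × 5 × 5 = 625 possible sets of choices, we can also just list them all and count the ones with a match. A brute-force sketch (the function name prob_any_match is made up for this illustration, and it assumes every number is equally likely):

```python
from fractions import Fraction
from itertools import product

def prob_any_match(people, choices):
    """Chance that at least two people pick the same number."""
    outcomes = list(product(range(1, choices + 1), repeat=people))
    with_match = sum(1 for picks in outcomes if len(set(picks)) < people)
    return Fraction(with_match, len(outcomes))

print(prob_any_match(2, 5))  # 1/5
print(prob_any_match(3, 5))  # 13/25  (that is 5/25 + 8/25)
print(prob_any_match(4, 5))  # 101/125

# The "No" shortcut gives the same answer
print(1 - Fraction(4, 5) * Fraction(3, 5) * Fraction(2, 5))  # 101/125
```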

And that is a popular trick in probability:

It is often easier to work out the "No" case

(This idea is shown in more detail at Shared Birthdays.)

Probability Tree Diagrams

Calculating probabilities can be hard: sometimes you add them, sometimes

you multiply them, and often it is hard to figure out what to do ... tree

diagrams to the rescue!

Here is a tree diagram for the toss of a coin:

There are two "branches" (Heads and Tails)

The probability of each branch is written on

the branch

The outcome is written at the end of the

branch

We can extend the tree diagram to two tosses of a coin:

How do you calculate the overall probabilities?

You multiply probabilities along the branches

You add probabilities down columns

Now we can see such things as:

The probability of "Head, Head" is 0.5×0.5 = 0.25

All probabilities add to 1.0 (which is always a good check)

The probability of getting at least one Head from two tosses is

0.25+0.25+0.25 = 0.75

... and more
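The two-toss tree is small enough to build in code. This sketch multiplies along each branch to get the probability of every path, then adds the paths that contain a Head. The names are only for illustration and a fair coin is assumed.

```python
from fractions import Fraction
from itertools import product

p = {"H": Fraction(1, 2), "T": Fraction(1, 2)}

# Multiply along the branches: one entry per path through the tree
paths = {}
for first, second in product("HT", repeat=2):
    paths[first + second] = p[first] * p[second]

print(paths["HH"])           # 1/4
print(sum(paths.values()))   # 1

# Add down the column: every path with at least one Head
p_at_least_one_head = sum(prob for path, prob in paths.items() if "H" in path)
print(p_at_least_one_head)   # 3/4
```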

That was a simple example using independent events (each toss of a coin is

independent of the previous toss), but tree diagrams are really wonderful for

figuring out dependent events (where an event depends on what happens

in the previous event) like this example:

Example: Soccer Game

You are off to soccer, and love being the Goalkeeper, but that depends who

is the Coach today:

with Coach Sam the probability of being Goalkeeper is 0.5

with Coach Alex the probability of being Goalkeeper is 0.3

Sam is Coach more often ... about 6 out of every 10 games (a probability

of 0.6).

So, what is the probability you will be a Goalkeeper today?

Let's build the tree diagram. First we show the two possible coaches: Sam or

Alex:

The probability of getting Sam is 0.6, so the probability of Alex must be 0.4

(together the probability is 1)

Now, if you get Sam, there is 0.5 probability of being Goalie (and 0.5 of not

being Goalie):

If you get Alex, there is 0.3 probability of being Goalie (and 0.7 not):

The tree diagram is complete, now let's calculate the overall probabilities.

This is done by multiplying each probability along the "branches" of the tree.

Here is how to do it for the "Sam, Yes" branch:

0.6 × 0.5 = 0.3

(When we take the 0.6 chance of Sam being coach and include the 0.5

chance that Sam will let you be Goalkeeper we end up with a 0.3 chance.)

But we are not done yet! We haven't included Alex as Coach:

A 0.4 chance of Alex as Coach, followed by a 0.3 chance gives 0.12.

Now we add the column:

0.3 + 0.12 = 0.42 probability of being a Goalkeeper today

(That is a 42% chance)

Check

One final step: complete the calculations and make sure they add to 1:

0.3 + 0.3 + 0.12 + 0.28 = 1

Yes, it all adds up.

Conclusion

So there you go, when in doubt draw a tree diagram, multiply along the

branches and add the columns. Make sure all probabilities add to 1 and you

are good to go.

Mutually Exclusive Events

Mutually Exclusive: can't happen at the same time.

Examples:

Turning left and turning right are Mutually Exclusive (you can't do both

at the same time)

Tossing a coin: Heads and Tails are Mutually Exclusive

Cards: Kings and Aces are Mutually Exclusive

What is not Mutually Exclusive:

Turning left and scratching your head can happen at the same time

Kings and Hearts, because you can have a King of Hearts!

Like here:

Aces and Kings are

Mutually Exclusive

(can't be both)

Hearts and Kings are

not Mutually Exclusive

(can be both)

Probability

Let's look at the probabilities of Mutually Exclusive events. But first, a

definition:

Probability of an event happening = (Number of ways it can happen) / (Total number of outcomes)

Example: there are 4 Kings in a deck of 52 cards. What is the

probability of picking a King?

Number of ways it can happen: 4 (there are 4 Kings)

Total number of outcomes: 52 (there are 52 cards in total)

So the probability = 4/52 = 1/13

Mutually Exclusive

When two events (call them "A" and "B") are Mutually Exclusive it

is impossible for them to happen together:

P(A and B) = 0

"The probability of A and B together equals 0 (impossible)"

But the probability of A or B is the sum of the individual probabilities:

P(A or B) = P(A) + P(B)

"The probability of A or B equals the probability of A plus the probability of

B"

Example: A Deck of Cards

In a Deck of 52 Cards:

the probability of a King is 1/13, so P(King)=1/13

the probability of an Ace is also 1/13, so P(Ace)=1/13

When we combine those two Events:

The probability of a card being a King and an Ace is 0 (Impossible)

The probability of a card being a King or an Ace is (1/13) + (1/13)

= 2/13

Which is written like this:

P(King and Ace) = 0

P(King or Ace) = (1/13) + (1/13) = 2/13

Special Notation

Instead of "and" you will often see the symbol ∩ (which is the "Intersection"

symbol used in Venn Diagrams)

Instead of "or" you will often see the symbol ∪ (the "Union" symbol)

Example: Scoring Goals

If the probability of:

scoring no goals (Event "A") is 20%

scoring exactly 1 goal (Event "B") is 15%

Then:

The probability of scoring no goals and 1 goal is 0 (Impossible)

The probability of scoring no goals or 1 goal is 20% + 15% = 35%

Which is written:

P(A ∩ B) = 0

P(A ∪ B) = 20% + 15% = 35%

Remembering

To help you remember, think:

"Or has more ... than And"

∪ is like a cup which holds more than ∩

Not Mutually Exclusive

Now let's see what happens when events are not Mutually Exclusive.

Example: Hearts and Kings

Hearts and Kings together is only the King of Hearts:

But Hearts or Kings is:

all the Hearts (13 of them)

all the Kings (4 of them)

But that counts the King of Hearts twice!

So we correct our answer, by subtracting the extra "and" part:

16 Cards = 13 Hearts + 4 Kings - the 1 extra King of Hearts

Count them to make sure this works!

As a formula this is:

P(A or B) = P(A) + P(B) - P(A and B)

"The probability of A or B equals the probability of A plus the probability of

B

minus the probability of A and B"

Here is the same formula, but using ∪ and ∩:

P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
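Python sets have their own "or" (|) and "and" (&), which match ∪ and ∩ nicely. Here is a sketch that builds a 52-card deck and checks both the Hearts and Kings count and the formula. The names deck, hearts, kings, aces and p are just for this illustration.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["Hearts", "Diamonds", "Clubs", "Spades"]
deck = list(product(ranks, suits))

hearts = {card for card in deck if card[1] == "Hearts"}
kings = {card for card in deck if card[0] == "K"}
aces = {card for card in deck if card[0] == "A"}

def p(event):
    """Probability of picking a card that belongs to the event."""
    return Fraction(len(event), len(deck))

print(len(hearts | kings))   # 16
print(p(hearts | kings) == p(hearts) + p(kings) - p(hearts & kings))  # True

# Mutually Exclusive: Kings and Aces never overlap
print(p(kings & aces), p(kings | aces))  # 0 2/13
```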

A Final Example

16 people study French, 21 study Spanish and there are 30

altogether. Work out the probabilities!

This is definitely a case of not Mutually Exclusive (you can study French AND

Spanish).

Let's say b is how many study both languages:

people studying French Only must be 16-b

people studying Spanish Only must be 21-b

And we get:

And we know there are 30 people, so:

(16-b) + b + (21-b) = 30

37 - b = 30

b = 7

And we can put in the correct numbers:

So we know all this now:

P(French) = 16/30

P(Spanish) = 21/30

P(French Only) = 9/30

P(Spanish Only) = 14/30

P(French or Spanish) = 30/30 = 1

P(French and Spanish) = 7/30

Lastly, let's check with our formula:

P(A or B) = P(A) + P(B) - P(A and B)

Put the values in:

30/30 = 16/30 + 21/30 - 7/30

Yes, it works!
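The same working as a short sketch. It assumes, as the example does, that all 30 people study at least one of the two languages; the variable names are just for illustration.

```python
from fractions import Fraction

total, french, spanish = 30, 16, 21

# (16 - b) + b + (21 - b) = 30  gives  b = 16 + 21 - 30
both = french + spanish - total
print(both)  # 7

p_french = Fraction(french, total)
p_spanish = Fraction(spanish, total)
p_both = Fraction(both, total)

p_french_only = Fraction(french - both, total)
p_spanish_only = Fraction(spanish - both, total)
p_either = p_french + p_spanish - p_both   # P(A or B) = P(A) + P(B) - P(A and B)
print(p_either)  # 1
```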

Summary:

Mutually Exclusive

A and B together is impossible: P(A and B) = 0

A or B is the sum of A and B: P(A or B) = P(A) + P(B)

Not Mutually Exclusive

A or B is the sum of A and B minus A and B: P(A or B) =

P(A) + P(B) - P(A and B)

False Positives and False Negatives

Test Says "Yes" ... or does it?

When you have a test that can say "Yes" or "No" (such as a medical test),

you have to think:

It could be wrong when it says "Yes".

It could be wrong when it says "No".

Wrong?

It is like being told you did something when

you didn't!

Or you didn't do it when you really did.

There are special names for this, called "False Positive" and "False

Negative":

                     They say you did      They say you didn't
You really did       They are right!       "False Negative"
You really didn't    "False Positive"      They are right!

Here are some examples of "false positives" and "false negatives":

Airport Security: a "false positive" is when ordinary items

such as keys or coins get mistaken for weapons (machine

goes "beep")

Quality Control: a "false positive" is when a good quality

item gets rejected, and a "false negative" is when a poor

quality item gets accepted

Antivirus software: a "false positive" is when a normal file is

thought to be a virus

Medical screening: low-cost tests given to a large group can

give many false positives (saying you have a disease when

you don't), and then ask you to get more accurate tests.

But many people don't understand the true numbers behind "Yes" or "No",

like in this example:

Example: Allergy or Not?

Hunter says she is itchy. There is a test for Allergy to Cats, but

this test is not always right:

For people that really do have the allergy, the test says

"Yes" 80% of the time

For people that do not have the allergy, the test says

"Yes" 10% of the time ("false positive")

Here it is in a table:

                 Test says "Yes"          Test says "No"
Have allergy     80%                      20% "False Negative"
Don't have it    10% "False Positive"     90%

Question: If 1% of the population have the allergy, and Hunter's

test says "Yes", what are the chances that Hunter really has the

allergy?

Do you think 75%? Or maybe 50%?

A test similar to this was given to Doctors and most guessed around

75% ...

... but they were very wrong!

(Source: "Probabilistic reasoning in clinical medicine: Problems and

opportunities" by David M. Eddy 1982, which this example is based on)

There are two good ways to work this out: "Imagine a 1000" and "Tree

Diagrams".

Try Imagining A Thousand People

When trying to understand questions like this, just imagine a large group

(say 1000) and play with the numbers:

Of 1000 people, only 10 really have the allergy (1% of 1000 is

10)

The test is 80% right for people who have the allergy, so it

will get 8 of those 10 right.

But 990 do not have the allergy, and the test will say "Yes" to

10% of them,

which is 99 people it says "Yes" to wrongly (false positive)

So out of 1000 people the test says "Yes" to (8+99) = 107

people

As a table:

                 1% have it    Test says "Yes"    Test says "No"
Have allergy     10            8                  2
Don't have it    990           99                 891
Total            1000          107                893

So 107 people get a "Yes" but only 8 of those really have the allergy:

8 / 107 = about 7%

So, even though Hunter's test said "Yes", it is still only 7% likely that

Hunter has a Cat Allergy.
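The "imagine a thousand people" working can be written as a small function, so you can play with other numbers too. This is only a sketch: the function name chance_really_yes and its parameter names are made up for this illustration.

```python
from fractions import Fraction

def chance_really_yes(prior, true_positive_rate, false_positive_rate):
    """Chance of really having the condition, given the test said "Yes"."""
    true_yes = prior * true_positive_rate            # have it, test says "Yes"
    false_yes = (1 - prior) * false_positive_rate    # don't have it, test says "Yes"
    return true_yes / (true_yes + false_yes)

answer = chance_really_yes(
    prior=Fraction(1, 100),                 # 1% of the population have the allergy
    true_positive_rate=Fraction(80, 100),   # "Yes" for 80% of those who have it
    false_positive_rate=Fraction(10, 100),  # "Yes" for 10% of those who don't
)
print(answer)                         # 8/107
print(round(float(answer) * 100, 1))  # 7.5 (percent)
```

Try a bigger prior, such as 20%, to see how much more a "Yes" means when the allergy is common.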

As A Tree

Drawing a tree diagram can really help:

First of all, let's check that all the percentages add up:

0.8% + 0.2% + 9.9% + 89.1% = 100% (good!)

And the two "Yes" answers add up to 0.8% + 9.9% = 10.7%, but only 0.8%

are correct.

0.8/10.7 = about 7% (same answer as above)

Conclusion

When dealing with false positives and false negatives (or other tricky

probability questions) it pays to:

Imagine you have 1,000 (of whatever)

Or make a tree diagram

Shared Birthdays

This is a great puzzle, and you get to learn a lot about probability along the

way ...

There are 30 people in a room ... what is the chance that any

two of them celebrate their birthday on the same day? Assume

365 days in a year.

Some people think "there are 30 people, and 365 days, so 30/365 sounds

about right, and 30/365 = 0.08..."

But no!

The probability is much higher. It is actually likely there are people who

share a birthday in that room.

Because you should compare everyone to everyone

else.

And with 30 people that is 435 comparisons.

But you also have to be careful not to over-count the

chances.

I will show you how to do it ... starting with a smaller example:

Friends and Random Numbers

4 friends (Alex, Billy, Chris and Dusty) each choose a random

number between 1 and 5. What is the chance that any of them

chose the same number?

We will add our friends one at a time ...

First, what is the chance that Alex and Billy have the same

number?

Billy compares his number to Alex's number. There is a 1 in 5 chance of a

match.

As a tree diagram:

Note: "Yes" and "No" together make 1

(1/5 + 4/5 = 5/5 = 1)

Now, let's include Chris ...

But there are now two cases to consider (called "Conditional Probability"):

If Alex and Billy did match, then Chris has only one number to

compare to.

But if Alex and Billy did not match then Chris has two numbers to

compare to.

And we get this:

For the top line (Alex and Billy did match) we already have a match (a

chance of 1/5).

But for the "Alex and Billy did not match" there is a 2/5 chance of Chris

matching (against both Alex and Billy).

And we can work out the combined chance by multiplying the chances it

took to get there:

Following the "No, Yes" path ... there is a 4/5 chance of No, followed

by a 2/5 chance of Yes:

(4/5) × (2/5) = 8/25

Following the "No, No" path ... there is a 4/5 chance of No, followed

by a 3/5 chance of No:

(4/5) × (3/5) = 12/25

Also notice that adding all chances together is 1 (a good check that we

haven't made a mistake):

(5/25) + (8/25) + (12/25) = 25/25 = 1

Now what happens when we include Dusty?

It is the same idea, just more of it:

OK, that is all 4 friends, and the "Yes" chances together make 101/125:

Answer: 101/125

But notice something interesting ... if we had followed the "No" path we

could have skipped all the other calculations and made our life easier:

The chances of not matching are:

(4/5) × (3/5) × (2/5) = 24/125

So the chances of matching are:

1 - (24/125) = 101/125

(And we didn't really need a tree diagram for that!)

And that is a popular trick in probability:

It is often easier to work out the "No" case

Example: what are the chances that with 6 people any of

them celebrate their Birthday in the same month? (Assume

equal months)

The "no match" case for:

2 people is 11/12

3 people is (11/12) × (10/12)

4 people is (11/12) × (10/12) × (9/12)

5 people is (11/12) × (10/12) × (9/12) × (8/12)

6 people is (11/12) × (10/12) × (9/12) × (8/12) × (7/12)

So the chance of not matching is:

(11/12) × (10/12) × (9/12) × (8/12) × (7/12) = 0.22...

Flip that around and we get the chance of matching:

1 - 0.22... = 0.78...

So, there is a 78% chance of any of them celebrating their Birthday in

the same month

And now we can try calculating the "Shared Birthday" question we started

with:

There are 30 people in a room ... what is the chance that any

two of them celebrate their birthday on the same day? Assume

365 days in a year.

It is just like the previous example! But bigger and more numbers:

The chance of not matching:

364/365 × 363/365 × 362/365 × ... × 336/365 = 0.294...

(I did that calculation in a spreadsheet, but there are also mathematical

shortcuts)

And the probability of matching is 1- 0.294... :

The probability of sharing a birthday = 1 - 0.294... = 0.706...

Or a 70.6% chance, which is likely!

In fact the probability for 23 people is about 50%.

And for 57 people it is 99% (almost certain!)
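Instead of a spreadsheet, the "no match" trick fits in a short loop. A sketch, assuming every day (or month) is equally likely; the function name prob_shared is made up for this illustration.

```python
def prob_shared(people, days=365):
    """Chance that at least two people share a day, assuming all days are equally likely."""
    p_no_match = 1.0
    for k in range(people):
        # person number k+1 must avoid the k days already taken
        p_no_match *= (days - k) / days
    return 1 - p_no_match

print(round(prob_shared(6, days=12), 2))  # 0.78 (same month, 6 people)
print(round(prob_shared(30), 3))          # 0.706
print(round(prob_shared(23), 3))          # 0.507
print(round(prob_shared(57), 2))          # 0.99
```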

So, next time you are in a room with a group of people why not find out if

there are any shared birthdays?

Footnote: In real life birthdays are not evenly spread out ... more babies are

born in Spring. Also Hospitals prefer to work on weekdays, not weekends, so

there are more births early in the week. And then there are leap years. But

you get the idea.

Combinations and Permutations

What's the Difference?

In English we use the word "combination" loosely, without thinking if

the order of things is important. In other words:

"My fruit salad is a combination of apples, grapes and

bananas" We don't care what order the fruits are in, they could also

be "bananas, grapes and apples" or "grapes, apples and bananas", it's

the same fruit salad.

"The combination to the safe was 472". Now we do care about

the order. "724" would not work, nor would "247". It has to be

exactly 4-7-2.

So, in Mathematics we use more precise language:

If the order doesn't matter, it is a Combination.

If the order does matter it is a Permutation.
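Python's standard itertools module uses the same two words, which makes the difference easy to see. A small sketch using the fruit salad and the safe from the examples above:

```python
from itertools import combinations, permutations

fruits = ["apples", "grapes", "bananas"]

# Order doesn't matter: there is only one fruit salad using all three fruits
salads = list(combinations(fruits, 3))
print(len(salads))  # 1

# Order does matter: the digits 4, 7 and 2 can be arranged in 6 different ways
codes = ["".join(p) for p in permutations("472")]
print(len(codes))       # 6
print("472" in codes)   # True
print(sorted(codes))    # ['247', '274', '427', '472', '724', '742']
```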

Combinations and Permutations Calculator

Find out how many different ways you can choose items.

For an in-depth explanation of the formulas please visit Combinations and

Permutations.

Power Users!

You can now add "Rules" that will reduce the List:

The "has" rule which says that certain items must be included (for

the entry to be included).

Example: has 2,a,b,c means that an entry must have at least two

of the letters a, b and c.

The "no" rule which means that some items from the list must not

occur together.

Example: no 2,a,b,c means that an entry must not have two or

more of the letters a, b and c.

The "pattern" rule is used to impose some kind of pattern to each

entry.

Example: pattern c,* means that the letter c must be first

(anything else can follow)

(You can discuss these rules at the forum.)

Rules In Detail

The "has" Rule

The word "has" followed by a space and a number. Then a comma and a list

of items separated by commas.

The number says how many (minimum) from the list are needed for that

result to be allowed.

Example: has 1,a,b,c

Will allow if there is an a, or b, or c, or a and b, or a and c, or b and

c, or all three a,b and c.

In other words, it insists there be an a or b or c in the result.

So {a,e,f} is accepted, but {d,e,f} is rejected.

Example: has 2,a,b,c

Will allow if there is an a and b, or a and c, or b and c, or all

three a,b and c.

In other words, it insists there be at least 2 of a or b or c in the result.

So {a,b,f} is accepted, but {a,e,f} is rejected.

The "no" Rule

The word "no" followed by a space and a number. Then a comma and a list

of items separated by commas.

The number says how many (minimum) from the list are needed to be a

rejection.

Example: n=5, r=3, Order=no, Replace=no

Which normally produces:

{a,b,c} {a,b,d} {a,b,e} {a,c,d} {a,c,e} {a,d,e} {b,c,d} {b,c,e}

{b,d,e} {c,d,e}

But when we add a "no" rule like this:

a,b,c,d,e,f,g

no 2,a,b

We get:

{a,c,d} {a,c,e} {a,d,e} {b,c,d} {b,c,e} {b,d,e} {c,d,e}

The entries {a,b,c}, {a,b,d} and {a,b,e} are missing because the rule

says you can't have 2 from the list a,b (having an a or b is fine, but not

together)

Example: no 2,a,b,c

Allows only these:

{a,d,e} {b,d,e} {c,d,e}

It has rejected any with a and b, or a and c, or b and c, or even all

three a,b and c.

So {a,d,e} is allowed (only one out of a,b and c is in that)

But {b,c,d} is rejected (it has 2 from the list a,b,c)

Example: no 3,a,b,c

Allows all of these:

{a,b,d} {a,b,e} {a,c,d} {a,c,e} {a,d,e} {b,c,d} {b,c,e} {b,d,e}

{c,d,e}

Only {a,b,c} is missing because that is the only one that has 3 from the

list a,b,c
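To see how the "has" and "no" rules behave, here is a small sketch that re-creates them as plain filters. This is not the calculator's own code: the function names has_rule, no_rule and count_listed are made up for this illustration, and the entries are the 10 combinations of a,b,c,d,e taken 3 at a time.

```python
from itertools import combinations

def count_listed(entry, items):
    """How many of the listed items appear in this entry."""
    return sum(1 for item in items if item in entry)

def has_rule(entries, minimum, items):
    """Keep entries that contain at least `minimum` of the listed items."""
    return [e for e in entries if count_listed(e, items) >= minimum]

def no_rule(entries, minimum, items):
    """Reject entries that contain `minimum` or more of the listed items."""
    return [e for e in entries if count_listed(e, items) < minimum]

entries = list(combinations("abcde", 3))
print(len(entries))                      # 10
print(len(no_rule(entries, 2, "ab")))    # 7
print(no_rule(entries, 2, "abc"))        # [('a', 'd', 'e'), ('b', 'd', 'e'), ('c', 'd', 'e')]
print(len(no_rule(entries, 3, "abc")))   # 9
print(len(has_rule(entries, 2, "abc")))  # 7
```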

The "pattern" Rule

The word "pattern" followed by a space and a list of items separated by

commas.

You can include these "special" items:

? (question mark) means any item. It is like a "wildcard".

* (an asterisk) means any number of items (0, 1, or more). Like a

"super wildcard".

Example: pattern ?,c,*,f

Means "any item, followed by c, followed by zero or more items, then f"

So {a,c,d,f} is allowed

And {b,c,f,g} is also allowed (there are no items between c and f,

which is OK)

But {c,d,e,f} is not, because there is no item before c.

Example: how many ways can Alex, Betty, Carol and John be

lined up, with John after Alex?

Use: n=4, r=4, order=yes, replace=no.

Alex, Betty, Carol, John

pattern *,Alex,*,John

The result is:

{Alex,Betty,Carol,John} {Alex,Betty,John,Carol}

{Alex,Carol,Betty,John} {Alex,Carol,John,Betty}

{Alex,John,Betty,Carol} {Alex,John,Carol,Betty}

{Betty,Alex,Carol,John} {Betty,Alex,John,Carol}

{Betty,Carol,Alex,John} {Carol,Alex,Betty,John}

{Carol,Alex,John,Betty} {Carol,Betty,Alex,John}
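That result can be checked without the calculator by listing all 24 line-ups and keeping those where Alex comes somewhere before John. A sketch (it tests positions directly rather than parsing the pattern text):

```python
from itertools import permutations

people = ["Alex", "Betty", "Carol", "John"]

lineups = [
    order for order in permutations(people)
    if order.index("Alex") < order.index("John")   # John is after Alex
]

print(len(list(permutations(people))))  # 24
print(len(lineups))                     # 12
print(lineups[0])                       # ('Alex', 'Betty', 'Carol', 'John')
```

Exactly half of the 24 line-ups qualify, because in every line-up either Alex is before John or John is before Alex.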

Random Variables

A Random Variable is a set of possible values from a random experiment.

Example: Tossing a coin: we could get Heads or Tails.

Let's give them the values Heads=0 and Tails=1 and we have a

Random Variable "X":

In short:

X = {0, 1}

Note: We could have chosen Heads=100 and Tails=150 if we wanted! It

is our choice.

So:

We have an experiment (such as tossing a coin)

We give values to each event

The set of values is a Random Variable

Not Like an Algebra Variable

In Algebra a variable, like x, is an unknown value:

Example: x + 2 = 6

In this case we can find that x=4

But a Random Variable is different ...

A Random Variable has a whole set of values ...

... and it could take on any of those values, randomly.

Example: X = {0, 1, 2, 3}

X could be 0, 1, 2 or 3, randomly.

And they might each have a different probability.

Capital Letters

We use a capital letter, like X or Y, to avoid confusion with the Algebra type

of variable.

Sample Space

A Random Variable's set of values is the Sample Space.

Example: Throw a die once

Random Variable X = "The score shown on the top face".

X could be 1, 2, 3, 4, 5 or 6

So the Sample Space is {1, 2, 3, 4, 5, 6}

Probability

We can show the probability of any one value using this style:

P(X = value) = probability of that value

Example (continued): Throw a die once

X = {1, 2, 3, 4, 5, 6}

In this case they are all equally likely, so the probability of any one is

1/6

P(X = 1) = 1/6

P(X = 2) = 1/6

P(X = 3) = 1/6

P(X = 4) = 1/6

P(X = 5) = 1/6

P(X = 6) = 1/6

Note that the sum of the probabilities = 1, as it should be.

Example: Toss three coins.

X = "The number of Heads" is the Random Variable.

In this case, there could be 0 Heads (if all the coins land Tails up), 1 Head, 2

Heads or 3 Heads.

So the Sample Space = {0, 1, 2, 3}

But this time the outcomes are NOT all equally likely.

The three coins can land in eight possible ways:

Outcome    X = "number of Heads"
HHH        3
HHT        2
HTH        2
HTT        1
THH        2
THT        1
TTH        1
TTT        0

Looking at the table we see just 1 case of Three Heads, but 3 cases of Two

Heads, 3 cases of One Head, and 1 case of Zero Heads. So:

P(X = 3) = 1/8

P(X = 2) = 3/8

P(X = 1) = 3/8

P(X = 0) = 1/8
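The counting in the table can be reproduced with a few lines of Python. This is a minimal sketch; the variable names are illustrative and not part of the original lesson.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 2 x 2 x 2 = 8 equally likely ways three coins can land.
outcomes = list(product("HT", repeat=3))

# X = number of Heads in each outcome.
heads_counts = Counter(outcome.count("H") for outcome in outcomes)

# P(X = x) = (outcomes with x Heads) / (total outcomes)
distribution = {x: Fraction(n, len(outcomes)) for x, n in heads_counts.items()}

for x in sorted(distribution, reverse=True):
    print(f"P(X = {x}) = {distribution[x]}")
```

Using Fraction keeps the results as exact fractions (1/8, 3/8) instead of rounded decimals.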

Example: Two dice are tossed.

The Random Variable is X = "The sum of the scores on the two dice".

Let's make a table of all possible values:

                    1st Die
2nd Die    1    2    3    4    5    6
   1       2    3    4    5    6    7
   2       3    4    5    6    7    8
   3       4    5    6    7    8    9
   4       5    6    7    8    9   10
   5       6    7    8    9   10   11
   6       7    8    9   10   11   12

There are 6 × 6 = 36 of them, and the Sample Space = {2, 3, 4, 5, 6,

7, 8, 9, 10, 11, 12}

Let's count how often each value occurs, and work out the probabilities:

2 occurs just once, so P(X = 2) = 1/36

3 occurs twice, so P(X = 3) = 2/36 = 1/18

4 occurs three times, so P(X = 4) = 3/36 = 1/12

5 occurs four times, so P(X = 5) = 4/36 = 1/9

6 occurs five times, so P(X = 6) = 5/36

7 occurs six times, so P(X = 7) = 6/36 = 1/6

8 occurs five times, so P(X = 8) = 5/36

9 occurs four times, so P(X = 9) = 4/36 = 1/9

10 occurs three times, so P(X = 10) = 3/36 = 1/12

11 occurs twice, so P(X = 11) = 2/36 = 1/18

12 occurs just once, so P(X = 12) = 1/36

A Range of Values

We could also calculate the probability that a Random Variable takes on a

range of values.

Example (continued) What is the probability that the sum of

the scores is 5, 6, 7 or 8?

In other words: What is P(5 ≤ X ≤ 8)?

P(5 ≤ X ≤ 8) = P(X = 5) + P(X = 6) + P(X = 7) + P(X = 8) =

(4+5+6+5)/36 = 20/36 = 5/9

Solving

We can also solve a Random Variable equation.

Example (continued) If P(X = x) = 1/12, what is the value of

x?

P(X = 4) = 1/12, and P(X = 10) = 1/12

So there are two solutions: x = 4 or x = 10

Notice the different uses of X and x:

X represents the Random Variable "The sum of the scores on the two

dice".

x represents a value that X can take.
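The two-dice probabilities, the range question and the solving question above can all be checked by brute force. The sketch below assumes two fair six-sided dice; the names counts, pmf, p_5_to_8 and solutions are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how often each sum appears among the 36 equally likely rolls.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
pmf = {total: Fraction(n, 36) for total, n in counts.items()}

# A range of values: P(5 <= X <= 8)
p_5_to_8 = sum(pmf[x] for x in range(5, 9))

# Solving: every x for which P(X = x) = 1/12
solutions = sorted(x for x, p in pmf.items() if p == Fraction(1, 12))

print(p_5_to_8)   # 5/9
print(solutions)  # [4, 10]
```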

Continuous

Random Variables can be either Discrete or Continuous:

Discrete Data can only take certain values (such as 1,2,3,4,5)

Continuous Data can take any value within a range (such as a person's

height)

All our examples have been Discrete.

Learn more at Continuous Random Variables.

Mean, Variance, Standard Deviation

You can also learn how to find the Mean, Variance and Standard Deviation of

Random Variables.

Summary

A Random Variable is a set of possible values from a random

experiment.

The set of possible values is called the Sample Space.

A Random Variable is given a capital letter, such as X or Z.

Random Variables can be discrete or continuous.

So, we should really call this a "Permutation Lock"!

In other words:

A Permutation is an ordered Combination.

To help you to remember, think "Permutation ... Position"

Permutations

There are basically two types of permutation:

Repetition is Allowed: such as the lock above. It could be "333".

No Repetition: for example the first three people in a running race. You can't be first and second.

1. Permutations with Repetition

These are the easiest to calculate.

When you have n things to choose from ... you have n choices each time!

When choosing r of them, the permutations are:

n × n × ... (r times)

(In other words, there are n possibilities for the first choice, THEN there are n possibilities for the second choice, and so on, multiplying each time.)

Which is easier to write down using an exponent of r:

n × n × ... (r times) = n^r

Example: in the lock above, there are 10 numbers to choose from (0, 1, ..., 9) and you choose 3 of them:

10 × 10 × ... (3 times) = 10³ = 1,000 permutations

So, the formula is simply:

n^r

where n is the number of things to choose

from, and you choose r of them

(Repetition allowed, order matters)
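The lock example can be checked by listing every code. This is a small sketch using Python's itertools.product; the variable names are illustrative.

```python
from itertools import product

n, r = 10, 3  # 10 digits to choose from, 3 positions on the lock

# Every ordered selection of r digits, with repeats allowed.
codes = list(product(range(n), repeat=r))

print(len(codes))          # 1000
print(len(codes) == n**r)  # True
print((3, 3, 3) in codes)  # True: "333" is a valid code
```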

2. Permutations without Repetition

In this case, you have to reduce the number of available choices each time.

For example, what order could 16

pool balls be in?

After choosing, say, number "14" you

can't choose it again.

So, your first choice would have 16 possibilities, and your next choice would then have 15 possibilities, then 14, 13, etc. And the total permutations would be:

16 × 15 × 14 × 13 × ... = 20,922,789,888,000

But maybe you don't want to choose them all, just 3 of them, so that would

be only:

16 × 15 × 14 = 3,360

In other words, there are 3,360 different ways that 3 pool balls could be

selected out of 16 balls.

But how do we write that mathematically? Answer: we use the "factorial

function"

The factorial function (symbol: !) just means to multiply a series

of descending natural numbers. Examples:

4! = 4 × 3 × 2 × 1 = 24

7! = 7 × 6 × 5 × 4 × 3 × 2 × 1 = 5,040

1! = 1

Note: it is generally agreed that 0! = 1. It may seem funny that multiplying no

numbers together gets you 1, but it helps simplify a lot of equations.

So, if you wanted to select all of the billiard balls the permutations would be:

16! = 20,922,789,888,000

But if you wanted to select just 3, then you have to stop the multiplying after

14. How do you do that? There is a neat trick ... you divide by 13! ...

(16 × 15 × 14 × 13 × 12 × ...) / (13 × 12 × ...) = 16 × 15 × 14 = 3,360

Do you see? 16! / 13! = 16 × 15 × 14

The formula is written:

n! / (n − r)!

where n is the number of things to choose

from, and you choose r of them

(No repetition, order matters)

Examples:

Our "order of 3 out of 16 pool balls example" would be:

16! / (16-3)! = 16! / 13! = 20,922,789,888,000 / 6,227,020,800 = 3,360

(which is just the same as: 16 × 15 × 14 = 3,360)

How many ways can first and second place be awarded to 10 people?

10! / (10-2)! = 10! / 8! = 3,628,800 / 40,320 = 90

(which is just the same as: 10 × 9 = 90)
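The factorial formula translates directly into code. The helper name below is an illustrative assumption, not something from the original lesson.

```python
import math

def permutations_without_repetition(n, r):
    """Number of ordered selections of r items from n with no repeats: n! / (n - r)!"""
    return math.factorial(n) // math.factorial(n - r)

print(permutations_without_repetition(16, 3))   # 3360
print(permutations_without_repetition(10, 2))   # 90
print(permutations_without_repetition(16, 16))  # 20922789888000
```

From Python 3.8 onward, math.perm(n, r) gives the same values without building the two large factorials.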

Notation

Instead of writing the whole formula, people use different notations such as P(n,r) or nPr.

Example: P(10,2) = 90

Combinations

There are also two types of combinations (remember the order

does not matter now):

Repetition is Allowed: such as coins in your pocket (5,5,5,10,10)

No Repetition: such as lottery numbers (2,14,15,27,30,33)

1. Combinations with Repetition

Actually, these are the hardest to explain, so I will come back to this later.

2. Combinations without Repetition

This is how lotteries work. The numbers are drawn one at a time, and if you

have the lucky numbers (no matter what order) you win!

The easiest way to explain it is to:

assume that the order does matter (ie permutations),

then alter it so the order does not matter.

Going back to our pool ball example, let us say that you just want to know

which 3 pool balls were chosen, not the order.

We already know that 3 out of 16 gave us 3,360 permutations.

But many of those will be the same to us now, because we don't care what

order!

For example, let us say balls 1, 2 and 3 were chosen. These are the possibilities:

Order does matter    Order doesn't matter
1 2 3                1 2 3
1 3 2
2 1 3
2 3 1
3 1 2
3 2 1

So, the permutations will have 6 times as many possibilities.

In fact there is an easy way to work out how many ways "1 2 3" could be

placed in order, and we have already talked about it. The answer is:

3! = 3 × 2 × 1 = 6

(Another example: 4 things can be placed in 4! = 4 × 3 × 2 × 1 =

24 different ways, try it for yourself!)

So, all we need to do is adjust our permutations formula to reduce it by how many ways the objects could be in order (because we aren't interested in the order any more):

n! / (n − r)! × 1 / r! = n! / (r!(n − r)!)

That formula is so important it is often just written in big parentheses, with n stacked above r:

n! / (r!(n − r)!)

where n is the number of things to choose

from, and you choose r of them

(No repetition, order doesn't matter)

It is often called "n choose r" (such as "16 choose 3")

And is also known as the "Binomial Coefficient"

Notation

As well as the "big parentheses", people also use notations such as C(n,r) or nCr.

Example

So, our pool ball example (now without order) is:

16! / (3!(16-3)!) = 16! / (3! × 13!) = 20,922,789,888,000 / (6 × 6,227,020,800) = 560

Or you could do it this way:

(16×15×14) / (3×2×1) = 3360 / 6 = 560

So remember, do the permutation, then reduce by a further "r!"

... or better still ...

Remember the Formula!

It is interesting to also note how this formula is nice and symmetrical: "n choose r" is the same as "n choose (n − r)".

In other words, choosing 3 balls out of 16 or choosing 13 balls out of 16 gives the same number of combinations:

16! / (3!(16-3)!) = 16! / (13!(16-13)!) = 16! / (3! × 13!) = 560
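The pool ball count and the symmetry can be checked both by brute force and with the formula. This sketch assumes Python 3.8 or later for math.comb; the variable names are illustrative.

```python
import math
from itertools import combinations

balls = range(1, 17)  # pool balls numbered 1 to 16

# Brute force: list every unordered choice of 3 balls.
triples = list(combinations(balls, 3))

print(len(triples))       # 560
print(math.comb(16, 3))   # 560
print(math.comb(16, 13))  # 560, the same by symmetry
```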

Pascal's Triangle

You can also use Pascal's Triangle to find the values. Go down to row "n" (the

top row is 0), and then along "r" places and the value there is your answer.

Here is an extract showing row 16:

1 14 91 364 ...

1 15 105 455 1365 ...

1 16 120 560 1820 4368 ...
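Rows of Pascal's Triangle are easy to generate, because each inner number is the sum of the two numbers above it. The function name pascal_row is an illustrative assumption.

```python
def pascal_row(n):
    """Return row n of Pascal's Triangle (the top row is row 0)."""
    row = [1]
    for _ in range(n):
        # Each inner entry is the sum of the two entries above it.
        row = [1] + [a + b for a, b in zip(row, row[1:])] + [1]
    return row

print(pascal_row(14)[:4])  # [1, 14, 91, 364]
print(pascal_row(15)[:5])  # [1, 15, 105, 455, 1365]
print(pascal_row(16)[:6])  # [1, 16, 120, 560, 1820, 4368]
print(pascal_row(16)[3])   # 560, matching "16 choose 3"
```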

1. Combinations with Repetition

OK, now we can tackle this one ...

Let us say there are five flavors of ice cream: banana, chocolate, lemon, strawberry and vanilla. You can have three scoops. How many variations will there be?

Let's use letters for the flavors: {b, c, l, s, v}. Example

selections would be

{c, c, c} (3 scoops of chocolate)

{b, l, v} (one each of banana, lemon and vanilla)

{b, v, v} (one of banana, two of vanilla)

(And just to be clear: There are n=5 things to choose from, and you

choose r=3 of them.

Order does not matter, and you can repeat!)

Now, I can't describe directly to you how to calculate this, but I can show you a special technique that lets you work it out.

Think about the ice cream being in boxes, you could

say "move past the first box, then take 3 scoops,

then move along 3 more boxes to the end" and you

will have 3 scoops of chocolate!

So, it is like you are ordering a robot to get your ice cream, but it doesn't change anything, you still get what you want.

Now you could write this down as (arrow means move, circle means scoop):

→ ○ ○ ○ → → →

In fact the three examples above would be written like this:

{c, c, c} (3 scoops of chocolate): → ○ ○ ○ → → →

{b, l, v} (one each of banana, lemon and vanilla): ○ → → ○ → → ○

{b, v, v} (one of banana, two of vanilla): ○ → → → → ○ ○

OK, so instead of worrying about different flavors, we have

a simpler problem to solve: "how many different ways can you arrange

arrows and circles"

Notice that there are always 3 circles (3 scoops of ice cream) and 4 arrows

(you need to move 4 times to go from the 1st to 5th container).

So (being general here) there are r + (n-1) positions, and we want to

choose r of them to have circles.

This is like saying "we have r + (n-1) pool balls and want to choose r of them". In other words it is now like the pool balls problem, but with slightly changed numbers. And you would write it like this:

(r + n − 1)! / (r!(n − 1)!)

where n is the number of things to choose

from, and you choose r of them

(Repetition allowed, order doesn't matter)

Interestingly, we could have looked at the arrows instead of the circles, and

we would have then been saying "we have r + (n-1) positions and want to

choose (n-1) of them to have arrows", and the answer would be the same

...

So, what about our example, what is the answer?

(5+3-1)! / (3!(5-1)!) = 7! / (3! × 4!) = 5040 / (6 × 24) = 35
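The ice cream answer can be confirmed by listing every selection. The sketch below uses the flavor letters from the example; the variable names are illustrative, and math.comb assumes Python 3.8 or later.

```python
import math
from itertools import combinations_with_replacement

flavors = "bclsv"  # banana, chocolate, lemon, strawberry, vanilla
scoops = 3

# Every unordered selection of 3 scoops, repeats allowed.
selections = list(combinations_with_replacement(flavors, scoops))

# The arrows-and-circles argument: choose r of the r + (n - 1) positions.
formula = math.comb(scoops + len(flavors) - 1, scoops)

print(len(selections))  # 35
print(formula)          # 35
```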

In Conclusion

Phew, that was a lot to absorb, so maybe you could read it again to be sure!

But knowing how these formulas work is only half the battle. Figuring out

how to interpret a real world situation can be quite hard.

But at least now you know how to calculate all 4 variations of "Order

does/does not matter" and "Repeats are/are not allowed".

Random Variables - Continuous

A Random Variable is a set of possible values from a random experiment.

Example: Tossing a coin: we could get Heads or Tails.

Let's give them the values Heads=0 and Tails=1 and we have a

Random Variable "X":

In short:

X = {0, 1}

Note: We could have chosen Heads=100 and Tails=150 if we wanted! It

is our choice.

Continuous

Random Variables can be either Discrete or Continuous:

Discrete Data can only take certain values (such as 1,2,3,4,5)

Continuous Data can take any value within a range (such as a person's

height)

In our Introduction to Random Variables (please read that first!) we look at

many examples of Discrete Random Variables.

But here we look at the more advanced topic of Continuous Random

Variables.

The Uniform Distribution

(Also called the Rectangular Distribution).

The Uniform Distribution has equal probability density for all values of the Random Variable between a and b.

The height of the distribution between a and b is p.

We also know that p = 1/(b−a), because the total of all probabilities must be 1, so

the area of the rectangle = 1

p × (b−a) = 1

p = 1/(b−a)

Because X is continuous, the probability of any single exact value is zero, so we describe X by the height f(x) of the distribution:

f(x) = 1/(b−a) for a ≤ x ≤ b

f(x) = 0 otherwise

Example: Old Faithful erupts every 91 minutes. You arrive

there at random and wait for 20 minutes ... what is the

probability you will see it erupt?

This is actually easy to calculate, 20 minutes out of 91 minutes is:

p = 20/91 = 0.22 (to 2 decimals)

But let's use the Uniform Distribution for practice.

To find the probability between a and a+20, find the blue area:

Area = (1/91) × (a+20 − a) = (1/91) × 20 = 20/91 = 0.22 (to 2 decimals)

So there is a 0.22 probability you will see Old Faithful erupt.

If you waited the full 91 minutes you would be sure (p=1) to have seen

it erupt.

But remember this is a random thing! It might erupt the moment you

arrive, or any time in the 91 minutes.

Cumulative Uniform Distribution

We can have the Uniform Distribution as a cumulative (adding up as it goes

along) distribution:

The probability starts at 0 and builds up to 1

This type of thing is called a "Cumulative distribution function", often

shortened to "CDF"

Example (continued):

Let's use the "CDF" of the Uniform Distribution to work out the

probability:

At a+20 the probability has accumulated to about 0.22
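The Old Faithful calculation can be written as a small cumulative distribution function. This is a sketch under one stated assumption: the wait until the next eruption is uniform between 0 and 91 minutes. The function names are illustrative.

```python
def uniform_cdf(x, a, b):
    """P(X <= x) for X uniform on the interval [a, b]."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)

def uniform_probability(low, high, a, b):
    """P(low <= X <= high): the area of the rectangle between low and high."""
    return uniform_cdf(high, a, b) - uniform_cdf(low, a, b)

# Assumption: the wait until the next eruption is uniform on 0..91 minutes.
p = uniform_probability(0, 20, 0, 91)
print(round(p, 2))                        # 0.22
print(uniform_probability(0, 91, 0, 91))  # 1.0
```

Subtracting two CDF values is the same as finding the area of the rectangle between the two points.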

Other Distributions

Knowing how to use the Uniform

Distribution helps when dealing with

more complicated distributions like

this one:

The general name for any of these is probability density function or "pdf"

The Normal Distribution

The most important continuous distribution is the Standard Normal

Distribution

It is so important the Random Variable has its own special letter Z.

The graph for Z is a symmetrical bell-shaped curve:

Usually we want to find the probability of Z being between certain values.

Example: P(0 < Z < 0.45)

(What is the probability that Z is between 0 and 0.45)

This is found by using the Standard Normal Distribution Table

Start at the row for 0.4, and read along until 0.45: there is the value

0.1736

P(0 < Z < 0.45) = 0.1736
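The table value can also be computed instead of looked up. The sketch below is not the table method described above: it uses the standard relationship between the Standard Normal cumulative distribution and the error function, available as math.erf in Python. The function name is illustrative.

```python
import math

def standard_normal_cdf(z):
    """P(Z <= z) for the Standard Normal Distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# P(0 < Z < 0.45) is the difference between two cumulative values.
p = standard_normal_cdf(0.45) - standard_normal_cdf(0.0)
print(round(p, 4))  # 0.1736
```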

Summary

A Random Variable is a variable whose possible values are

numerical outcomes of a random experiment.

Random Variables can be discrete or continuous.

An important example of a continuous Random Variable is the Standard Normal variable, Z.

Random Variables - Mean, Variance, Standard Deviation

A Random Variable is a set of possible values from a random experiment.

Example: Tossing a coin: we could get Heads or Tails.

Let's give them the values Heads=0 and Tails=1 and we have a

Random Variable "X":

So:

We have an experiment (like tossing a coin)

We give values to each event

The set of values is a Random Variable

Learn more at Random Variables.

Mean, Variance and Standard Deviation

They have special notation:

μ is the Mean of X and is also called the Expected Value of X

Var(X) is the Variance of X

σ is the Standard Deviation of X

Mean or Expected Value

When we know the probability p of every value x we can calculate the

Expected Value (Mean) of X:

μ = Σxp

Note: Σ is Sigma Notation, and means to sum up.

To calculate the Expected Value:

multiply each value by its probability

sum them up

It is a weighted mean: values with higher probability have higher

contribution to the mean.

Variance

The Variance is:

Var(X) = Σx²p − μ²

To calculate the Variance:

square each value and multiply by its probability

sum them up and we get Σx²p

then subtract the square of the Expected Value μ²

Standard Deviation

The Standard Deviation is the square root of the Variance:

σ = √Var(X)

An example will help!

You plan to open a new McDougals Fried Chicken, and found

these stats for similar restaurants:

Percent Year's Earnings

20% $50,000 Loss

30% $0

40% $50,000 Profit

10% $150,000 Profit

Using that as probabilities for your new restaurant's profit, what is the

Expected Value and Standard Deviation?

The Random Variable is X = 'possible profit'.

Sum up xp and x²p:

Probability p    Earnings ($'000s) x    xp     x²p
0.2              -50                    -10    500
0.3              0                      0      0
0.4              50                     20     1000
0.1              150                    15     2250

Sums: Σp = 1, Σxp = 25, Σx²p = 3750

μ = Σxp = 25

Var(X) = Σx²p − μ² = 3750 − 25² = 3750 − 625 = 3125

σ = √3125 = 56 (to nearest whole number)

But remember these are in thousands of dollars, so:

μ = $25,000

σ = $56,000

So you might expect to make $25,000, but with a very wide deviation

possible.

Let's try that again, but with a much higher probability for $50,000:

Example (continued):

Now with different probabilities (the $50,000 value has a high

probability of 0.7 now):

Probability p    Earnings ($'000s) x    xp     x²p
0.1              -50                    -5     250
0.1              0                      0      0
0.7              50                     35     1750
0.1              150                    15     2250

Sums: Σp = 1, Σxp = 45, Σx²p = 4250

μ = Σxp = 45

Var(X) = Σx²p − μ² = 4250 − 45² = 4250 − 2025 = 2225

σ = √2225 = 47 (to nearest whole number)

In thousands of dollars:

μ = $45,000

σ = $47,000

The mean is now much closer to the most probable value.

And the standard deviation is a little smaller (showing that the values

are more central.)
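Both restaurant examples follow the same three steps, so they can be sketched as one small function. The function name mean_var_sd and the variable names are illustrative assumptions; the values are in thousands of dollars, as in the tables.

```python
import math

def mean_var_sd(values, probs):
    """Return (mean, variance, standard deviation) of a discrete Random Variable."""
    mean = sum(x * p for x, p in zip(values, probs))                      # Σxp
    variance = sum(x * x * p for x, p in zip(values, probs)) - mean ** 2  # Σx²p − μ²
    return mean, variance, math.sqrt(variance)

earnings = [-50, 0, 50, 150]  # in $'000s

first = mean_var_sd(earnings, [0.2, 0.3, 0.4, 0.1])
second = mean_var_sd(earnings, [0.1, 0.1, 0.7, 0.1])

print([round(v) for v in first])   # [25, 3125, 56]
print([round(v) for v in second])  # [45, 2225, 47]
```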

Continuous

Random Variables can be either Discrete or Continuous:

Discrete Data can only take certain values (such as 1,2,3,4,5)

Continuous Data can take any value within a range (such as a person's

height)

Here we looked only at discrete data, as finding the Mean, Variance and

Standard Deviation of continuous data needs Integration.

Summary

A Random Variable is a variable whose possible values are

numerical outcomes of a random experiment.

The Mean (Expected Value) is: μ = Σxp

The Variance is: Var(X) = Σx²p − μ²

The Standard Deviation is: σ = √Var(X)

Quincunx Explained

A Quincunx or "Galton Board" (named after Sir Francis

Galton) is a triangular array of pegs.

Balls are dropped onto the top peg and then bounce their

way down to the bottom where they are collected in little

bins.

Each time a ball hits one of the pegs, it bounces either left

or right.

But this is interesting: if there is an equal chance of bouncing left or right, then the balls collecting in the bins form the classic "bell-shaped" curve of the normal distribution.

(If the probabilities are not even, you still

get a nice "skewed" version of the normal

distribution.)

Formula

You can actually calculate the probabilities!

Think about this: a ball would end up in the bin k places

from the right if it has taken k left turns.

In this example, the ball has taken two bounces to the left,

and all other bounces were to the right. It ended up in the

bin two places from the right.

In the general case, if the quincunx has n rows then a possible path for the ball would be k bounces to the left and (n-k) bounces to the right.

And if the probability of bouncing to the left is p then we can calculate the probability of a certain path like this:

The ball bounces k times to the left with a probability of p: p^k

And the other bounces (n-k) have the opposite probability of: (1-p)^(n-k)

So, the probability of following such a path is p^k (1-p)^(n-k)

But there could be many such paths! For example the left turns could be

the 1st and 2nd, or 1st and 3rd, or 2nd and 7th, etc.

You could list all such paths (LLRRR.., LRLRR..., LRRL...), but there are two

easier ways.

How Many Paths

You can look at Pascal's Triangle. In fact, the Quincunx is just like Pascal's

Triangle, with pegs instead of numbers. The number on each peg shows you

how many different paths can be taken to get to that peg. Amazing but true.

Or you can use this formula from the subject of Combinations:

n! / (k!(n-k)!)

This is commonly called "n choose k" and written C(n,k).

It is the calculation of the number of ways of distributing k things in a sequence of n.

(The "!" means "factorial", for example 4! = 1×2×3×4 = 24)

Putting it all together, the resulting formula is:

P(k) = n! / (k!(n-k)!) × p^k (1-p)^(n-k)

(Which, by the way, is the formula for the binomial distribution.)

Example:

For 10 rows (n=10) and probability of bouncing left of 0.5 (p=0.5), we can calculate the probability of being in the 3rd bin from the right (k=3) as:

p^k (1-p)^(n-k) = 0.5³ × 0.5⁷ = 0.0009765625

also:

C(10,3) = 10! / (3! × 7!) = 120

(This means there are 120 different paths that would end up with the ball in the 3rd bin from the right.)

So we get:

120 × 0.0009765625 = 0.1171875

In fact we can build a whole table for rows=10 and probability=0.5 like this:

From Right:              10      9      8      7      6      5      4      3      2      1      0
Probability:          0.001  0.010  0.044  0.117  0.205  0.246  0.205  0.117  0.044  0.010  0.001
Example: 100 balls        0      1      4     12     21     24     21     12      4      1      0

Now, of course, this is a random thing so your results may vary from this

ideal situation.

Another Example:

If the probability were 0.8 then the table would look like this:

From Right:              10      9      8      7      6      5      4      3      2      1      0
Probability:          0.107  0.268  0.302  0.201  0.088  0.026  0.006  0.001  0.000  0.000  0.000
Example: 100 balls       11     26     30     20      9      3      1      0      0      0      0
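The "how many paths" count can be verified by listing every possible path through a 10-row quincunx. This is a brute-force sketch, not how the original software works; the variable names are illustrative, and math.comb assumes Python 3.8 or later.

```python
from itertools import product
from math import comb

n, k = 10, 3  # 10 rows, 3 bounces to the left

# Every possible path is a sequence of n bounces, each Left or Right.
paths = list(product("LR", repeat=n))

# Paths with exactly k left bounces all end in the same bin.
matching = [path for path in paths if path.count("L") == k]

print(len(paths))                   # 1024
print(len(matching))                # 120
print(len(matching) == comb(n, k))  # True

# With p = 0.5 every path is equally likely.
print(len(matching) / len(paths))   # 0.1171875
```

The last line only works as a simple ratio because p = 0.5 makes all paths equally likely; for other values of p each path has to be weighted by its own probability.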

Try It Yourself

Run 100 (or more) balls through the Quincunx and see what results you get.

I have done this many times myself while developing the software. I never

got the perfect result, but always something surprisingly close. Good Luck!

The Binomial Distribution

"Bi" means "two" (like a bicycle has two wheels) ...

... so this is about things with two results.

Tossing a Coin:

Did we get Heads (H) or Tails (T)?

We say the probability of the coin landing H is ½

And the probability of the coin landing T is ½

Throwing a Die:

Did we get a four ... ?

... or not?

We say the probability of a four is 1/6 (one of the six faces is a four).

And the probability of not four is 5/6 (five of the six faces are not a four)

Let's Toss a Coin!

Toss a fair coin three times ... what is the chance of getting two Heads?

Outcome: the result of three coin tosses

Event: "Two Heads" out of three coin tosses

We could get any one of these outcomes (H stands for heads and T for

Tails):

HHH

HHT

HTH

HTT

THH

THT

TTH

TTT

Which outcomes do we want?

"Two Heads" could be in any order: "HHT", "THH" and "HTH" all have two

Heads (and one Tail).

So 3 of the outcomes produce "Two Heads".

What is the probability of each outcome?

Each outcome is equally likely, and there are 8 of them. So each has a

probability of 1/8

So the probability of event "Two Heads" is:

Number of outcomes we want × Probability of each outcome = 3 × 1/8 = 3/8

Let's Calculate Them All:

The calculations are (P means "Probability of"):

P(Three Heads) = P(HHH) = 1/8

P(Two Heads) = P(HHT) + P(HTH) + P(THH) = 1/8 + 1/8 + 1/8 = 3/8

P(One Head) = P(HTT) + P(THT) + P(TTH) = 1/8 + 1/8 + 1/8 = 3/8

P(Zero Heads) = P(TTT) = 1/8

We can write this in terms of a Random Variable, X, = "The number of Heads

from 3 tosses of a coin":

P(X = 3) = 1/8

P(X = 2) = 3/8

P(X = 1) = 3/8

P(X = 0) = 1/8

And we can also draw a Bar Graph:

It is symmetrical!

Making a Formula

Now ... what are the chances of 5 heads in 9 tosses ... to list all outcomes

(512) would take a long time!

So let's make a formula.

In our previous example, how could we get the values 1, 3, 3 and 1 ?

They are actually in the third row of Pascal's Triangle ... !

Can we make them using a formula?

Sure we can, and here it is:

n! / (k!(n-k)!)

n = total number

k = number we want

It is often called "n choose k" and you can read more

about it at Combinations and Permutations.

Note: the "!" means "factorial", for example 4! = 1×2×3×4 = 24

Let's use it:

Example: 3 tosses getting 2 Heads

We have n=3 and k=2

n! / (k!(n-k)!) = 3! / (2!(3-2)!) = (3×2×1) / (2×1 × 1) = 3

So there are 3 outcomes for "2 Heads"

(We knew that already, but now we have a formula for it.)

Let's use it for a harder question:

Example: what are the chances of 5 heads in 9 tosses?

We have n=9 and k=5

n! / (k!(n-k)!) = 9! / (5!(9-5)!) = (9×8×7×6×5×4×3×2×1) / (5×4×3×2×1 × 4×3×2×1) = 126

And for 9 tosses there are 2⁹ = 512 total outcomes, so we get the probability:

Number of outcomes we want × Probability of each outcome = 126 × 1/512 = 126/512

P(X=5) = 126/512 = 63/256 = 0.24609375

About a 25% chance.

(Easier than listing them all.)

Bias!

So far the chances of success or failure have been equally likely.

But what if the coins are biased (land more on one side than another) or choices are not 50/50?

Example: You sell sandwiches. 70% of people choose

chicken, the rest choose pork.

What is the probability of selling 2 chicken sandwiches to the

next 3 customers?

This is just like the heads and tails example, but with 70/30 instead of

50/50.

Let's draw a tree diagram:

The "Two Chicken" cases are highlighted.

Notice that the probabilities for "two chickens" all work out to be 0.147, because we are multiplying two 0.7s and one 0.3 in each case.

Can we get the 0.147 from a formula? What we want is "two 0.7s and one

0.3"

0.7 is the probability of each choice we want, call it p

2 is the number of choices we want, call it k

Probability of "choices we want" (two chickens) is: p^k

And

The probability of the opposite choice is: 1-p

The total number of choices is: n

The number of opposite choices is: n-k

Probability of "opposite choices" (one pork) is: (1-p)^(n-k)

So all choices together is:

p^k (1-p)^(n-k)

Example: (continued)

p = 0.7 (chance of chicken)

n = 3

k = 2

So we get:

$$p^k (1-p)^{n-k} = 0.7^2 (1-0.7)^{3-2} = 0.7^2 (0.3)^1 = 0.7 \times 0.7 \times 0.3 = 0.147$$

which is the probability of each outcome.

And the total number of those outcomes is:

$$\frac{n!}{k!(n-k)!} = \frac{3!}{2!(3-2)!} = \frac{3 \times 2 \times 1}{2 \times 1 \times 1} = 3$$

And we get:

(Number of outcomes we want) × (Probability of each outcome) = 3 × 0.147 = 0.441

So the probability of event "2 people out of 3 choose chicken" = 0.441

OK. That was a lot of work for something we knew already, but now we can

answer harder questions.

Example: You say "70% choose chicken, so 7 of the next 10

customers should choose chicken" ... what are the chances

you are right?

p = 0.7

n = 10

k = 7

So we get:

$$p^k (1-p)^{n-k} = 0.7^7 (1-0.7)^{10-7} = 0.7^7 (0.3)^3 = 0.0022235661$$

That is the probability of each outcome.

And the total number of those outcomes is:

$$\frac{n!}{k!(n-k)!} = \frac{10!}{7!(10-7)!} = \frac{10 \times 9 \times 8}{3 \times 2 \times 1} = 120$$

And we get:

(Number of outcomes we want) × (Probability of each outcome) = 120 × 0.0022235661 = 0.266827932

In fact the probability of 7 out of 10 choosing chicken is only

about 27%

Moral of the story: even though the long-run average is 70%, don't

expect 7 out of the next 10.
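
Both sandwich examples follow the same two steps: count the outcomes, then multiply by the probability of each one. The Python sketch below is an illustration rather than part of the original text; the function name `binomial_probability` is made up for this example, and `math.comb` needs Python 3.8 or later.

```python
from math import comb

def binomial_probability(k, n, p):
    """Probability of exactly k successes in n independent trials,
    each with success probability p."""
    ways = comb(n, k)                     # how many outcomes have k successes
    each = p ** k * (1 - p) ** (n - k)    # probability of each such outcome
    return ways * each

two_of_three = binomial_probability(2, 3, 0.7)    # 2 chicken out of 3 customers
seven_of_ten = binomial_probability(7, 10, 0.7)   # 7 chicken out of 10 customers

print(round(two_of_three, 3), round(seven_of_ten, 4))
# prints: 0.441 0.2668
```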

Putting it Together

Now we know how to calculate how many:

$$\frac{n!}{k!(n-k)!}$$

And the probability of each:

$$p^k (1-p)^{n-k}$$

We can multiply them together:

Probability of k out of n ways:

$$P(k \text{ out of } n) = \frac{n!}{k!(n-k)!} \, p^k (1-p)^{n-k}$$

The General Binomial Probability Formula

Important Notes:

The trials are independent,

There are only two possible outcomes at each trial,

The probability of "success" at each trial is constant.

Throw the Die

A fair die is thrown four times. Calculate the probabilities of getting:

0 Twos

1 Two

2 Twos

3 Twos

4 Twos

In this case n=4, p = P(Two) = 1/6

X is the Random Variable "Number of Twos from four throws".

Substitute k = 0 to 4 into the formula:

$$P(k \text{ out of } n) = \frac{n!}{k!(n-k)!} \, p^k (1-p)^{n-k}$$

Like this (to 4 decimal places):

$P(X = 0) = \frac{4!}{0!\,4!} \times (1/6)^0 (5/6)^4 = 1 \times 1 \times (5/6)^4 = 0.4823$

$P(X = 1) = \frac{4!}{1!\,3!} \times (1/6)^1 (5/6)^3 = 4 \times (1/6) \times (5/6)^3 = 0.3858$

$P(X = 2) = \frac{4!}{2!\,2!} \times (1/6)^2 (5/6)^2 = 6 \times (1/6)^2 \times (5/6)^2 = 0.1157$

$P(X = 3) = \frac{4!}{3!\,1!} \times (1/6)^3 (5/6)^1 = 4 \times (1/6)^3 \times (5/6) = 0.0154$

$P(X = 4) = \frac{4!}{4!\,0!} \times (1/6)^4 (5/6)^0 = 1 \times (1/6)^4 \times 1 = 0.0008$

Summary: "for the 4 throws, there is a 48% chance of no twos, 39% chance

of 1 two, 12% chance of 2 twos, 1.5% chance of 3 twos, and a tiny 0.08%

chance of all throws being a two (but it still could happen!)"
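
The five die probabilities can be reproduced exactly with fractions, which avoids any rounding until the final display step. This is a hedged sketch and not part of the original text; the names `distribution` and `p` are illustrative, and it assumes Python 3.8 or later.

```python
from fractions import Fraction
from math import comb

n = 4
p = Fraction(1, 6)   # probability of a Two on one throw of a fair die

# Exact probability of k Twos for k = 0..4
distribution = {k: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}

for k, prob in distribution.items():
    print(k, prob, round(float(prob), 4))

# The five cases cover every possibility, so the probabilities sum to 1
total = sum(distribution.values())
print(total)
```

Checking that the probabilities add up to 1 is a quick way to catch an arithmetic slip in a table like the one above.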

This time the Bar Graph is not symmetrical:

It is not symmetrical!

It is skewed because p is not 0.5

Sports Bikes

Your company makes sports bikes. 90% pass final inspection (and 10% fail

and need to be fixed).

What is the expected Mean and Variance of the next 4 inspections?

First, let's calculate all probabilities.

n = 4,

p = P(Pass) = 0.9

X is the Random Variable "Number of passes from four inspections".

Substitute k = 0 to 4 into the formula:

$$P(k \text{ out of } n) = \frac{n!}{k!(n-k)!} \, p^k (1-p)^{n-k}$$

Like this:

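$P(X = 0) = \frac{4!}{0!\,4!} \times 0.9^0 \times 0.1^4 = 1 \times 1 \times 0.0001 = 0.0001$

$P(X = 1) = \frac{4!}{1!\,3!} \times 0.9^1 \times 0.1^3 = 4 \times 0.9 \times 0.001 = 0.0036$

$P(X = 2) = \frac{4!}{2!\,2!} \times 0.9^2 \times 0.1^2 = 6 \times 0.81 \times 0.01 = 0.0486$

$P(X = 3) = \frac{4!}{3!\,1!} \times 0.9^3 \times 0.1^1 = 4 \times 0.729 \times 0.1 = 0.2916$

$P(X = 4) = \frac{4!}{4!\,0!} \times 0.9^4 \times 0.1^0 = 1 \times 0.6561 \times 1 = 0.6561$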

Summary: "for the next 4 bikes, there is a tiny 0.01% chance of no passes,

0.36% chance of 1 pass, 5% chance of 2 passes, 29% chance of 3 passes,

and a whopping 66% chance they all pass the inspection."

Mean, Variance and Standard Deviation

Let's calculate the Mean, Variance and Standard Deviation for the Sports Bike

inspections.

There are (relatively) simple formulas for them. They are a little hard to

prove, but they do work!

The mean, or "expected value", is:

μ = np

For the sports bikes:

μ = 4 × 0.9 = 3.6

So we would expect 3.6 bikes (out of 4) to pass the inspection.

Makes sense really ... 0.9 chance for each bike times 4 bikes equals 3.6

The formula for Variance is:

Variance: $\sigma^2 = np(1-p)$

And Standard Deviation is the square root of variance:

σ = √(np(1-p))

For the sports bikes:

Variance: $\sigma^2 = 4 \times 0.9 \times 0.1 = 0.36$

Standard Deviation is:

σ = √(0.36) = 0.6

Note: we could also calculate them manually, by making a table like this:

X      P(X)     X × P(X)   X² × P(X)
0      0.0001   0          0
1      0.0036   0.0036     0.0036
2      0.0486   0.0972     0.1944
3      0.2916   0.8748     2.6244
4      0.6561   2.6244     10.4976
SUM:            3.6        13.32

The mean is the Sum of (X × P(X)):

μ = 3.6

The variance is the Sum of (X² × P(X)) minus Mean²:

Variance: $\sigma^2 = 13.32 - 3.6^2 = 0.36$

Standard Deviation is:

σ = √(0.36) = 0.6

And we got the same results as before (yay!)
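
The agreement between the shortcut formulas and the table method can also be confirmed numerically. The sketch below is an illustration, not part of the original text; the variable names are made up here, and it assumes Python 3.8 or later.

```python
from math import comb, sqrt

n, p = 4, 0.9   # four inspections, 90% pass rate

# P(X = k) for k = 0..4, the same values as the P(X) column of the table
pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

# Table method: sum of X * P(X), and sum of X^2 * P(X) minus mean^2
mean_table = sum(k * pk for k, pk in enumerate(pmf))
var_table = sum(k * k * pk for k, pk in enumerate(pmf)) - mean_table**2

# Shortcut formulas
mean_formula = n * p
var_formula = n * p * (1 - p)
sd_formula = sqrt(var_formula)

print(round(mean_table, 4), round(var_table, 4))
print(round(mean_formula, 4), round(var_formula, 4), round(sd_formula, 4))
```

Both lines of output should show a mean of 3.6 and a variance of 0.36, up to floating-point rounding.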

Summary

The General Binomial Probability Formula

$$P(k \text{ out of } n) = \frac{n!}{k!(n-k)!} \, p^k (1-p)^{n-k}$$

Mean value of X: μ = np

Variance of X: $\sigma^2 = np(1-p)$

Standard Deviation of X: σ = √(np(1-p))

Normal Distribution

Data can be "distributed" (spread out) in different ways.

It can be spread out more on the left

Or more on the right

Or it can be all jumbled up

But there are many cases where the data tends to be around a central value

with no bias left or right, and it gets close to a "Normal Distribution" like this:

A Normal Distribution

The "Bell Curve" is a Normal Distribution.

And the yellow histogram shows some data that follows it closely, but not

perfectly (which is usual).

It is often called a "Bell Curve"

because it looks like a bell.

Many things closely follow a Normal Distribution:

heights of people

size of things produced by machines

errors in measurements

blood pressure

marks on a test

We say the data is "normally distributed".

The Normal Distribution has:

mean = median = mode

symmetry about the center

50% of values less than the

mean

and 50% greater than the

mean

Quincunx

You can see a normal distribution being created by random

chance!

It is called the Quincunx and it is an amazing machine.

Standard Deviations

The Standard Deviation is a measure of how spread out numbers are (read

that page for details on how to calculate it).

When you calculate the standard deviation of your data, you will find that

(generally):

68% of values are within

1 standard deviation of the mean

95% are within 2 standard

deviations

99.7% are within 3 standard

deviations
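
The 68%, 95% and 99.7% figures are rounded values for an exactly normal distribution. As a rough check (not part of the original text), they can be computed with `statistics.NormalDist` from the Python standard library, available in Python 3.8 and later; the helper name `fraction_within` is illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def fraction_within(num_sd):
    """Fraction of a normal population within num_sd standard
    deviations either side of the mean."""
    return standard_normal.cdf(num_sd) - standard_normal.cdf(-num_sd)

for num_sd in (1, 2, 3):
    print(num_sd, round(fraction_within(num_sd), 4))
# prints:
# 1 0.6827
# 2 0.9545
# 3 0.9973
```

Real data only follows these percentages approximately, which is why the text says "generally".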

Example: 95% of students at school are between 1.1m and

1.7m tall.

Assuming this data is normally distributed can you calculate the

mean and standard deviation?

The mean is halfway between 1.1m and 1.7m:

Mean = (1.1m + 1.7m) / 2 = 1.4m

95% is 2 standard deviations either side of the mean (a

total of 4 standard deviations) so:

1 standard deviation = (1.7m-1.1m) / 4

= 0.6m / 4 = 0.15m

And this is the result:

It is good to know the standard deviation, because we can say that any value

is:

likely to be within 1 standard deviation (68 out of 100 should be)

very likely to be within 2 standard deviations (95 out of 100 should

be)

almost certainly within 3 standard deviations (997 out of 1000

should be)

Standard Scores

The number of standard deviations from the mean is also called the

"Standard Score", "sigma" or "z-score". Get used to those words!

Example: In that same school one of your friends is 1.85m

tall

You can see on the bell curve that 1.85m is 3 standard

deviations from the mean of 1.4, so:

Your friend's height has a "z-score" of 3.0

It is also possible to calculate how many standard deviations 1.85 is

from the mean

How far is 1.85 from the mean?

It is 1.85 - 1.4 = 0.45m from the mean

How many standard deviations is that? The standard deviation is

0.15m, so:

0.45m / 0.15m = 3 standard deviations

So to convert a value to a Standard Score ("z-score"):

first subtract the mean,

then divide by the Standard Deviation

And doing that is called "Standardizing":

You can take any Normal Distribution and convert it to The Standard Normal

Distribution.

Example: Travel Time

A survey of daily travel time had these results (in minutes):

26, 33, 65, 28, 34, 55, 25, 44, 50, 36, 26, 37, 43, 62, 35, 38, 45, 32,

28, 34

The Mean is 38.8 minutes, and the Standard Deviation is 11.4

minutes (you can copy and paste the values into the Standard

Deviation Calculator if you want).

Convert the values to z-scores ("standard scores").

To convert 26:

first subtract the mean: 26 - 38.8 = -12.8,

then divide by the Standard Deviation: -12.8/11.4 = -1.12

So 26 is -1.12 Standard Deviations from the Mean

Here are the first three conversions:

Original Value   Calculation          Standard Score (z-score)
26               (26-38.8) / 11.4 =   -1.12
33               (33-38.8) / 11.4 =   -0.51
65               (65-38.8) / 11.4 =   +2.30
...              ...                  ...

And here they are graphically:

You can calculate the rest of the z-scores yourself!
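
For readers who would rather let a computer do the remaining conversions, here is a minimal Python sketch (not part of the original text). It assumes the quoted standard deviation of 11.4 is the population standard deviation, which is what `statistics.pstdev` computes and which matches the figure in the example; the variable names are illustrative.

```python
from statistics import mean, pstdev

travel_times = [26, 33, 65, 28, 34, 55, 25, 44, 50, 36,
                26, 37, 43, 62, 35, 38, 45, 32, 28, 34]

mu = mean(travel_times)        # 38.8 minutes
sigma = pstdev(travel_times)   # population standard deviation, about 11.4

# Standardize: subtract the mean, then divide by the standard deviation
z_scores = [(x - mu) / sigma for x in travel_times]

for x, z in zip(travel_times, z_scores):
    print(x, round(z, 2))
```

The first three lines of output correspond to the three conversions shown above (-1.12, -0.51 and 2.3).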

Here is the formula for z-score that we have been using:

$$z = \frac{x - \mu}{\sigma}$$

z is the "z-score" (Standard Score)

x is the value to be standardized

μ is the mean

σ is the standard deviation

Why Standardize ... ?

It can help you make decisions about your data.

Example: Professor Willoughby is marking a test.

Here are the students' results (out of 60 points):

20, 15, 26, 32, 18, 28, 35, 14, 26, 22, 17

Most students didn't even get 30 out of 60, and most will fail.

The test must have been really hard, so the Prof decides to Standardize

all the scores and only fail people 1 standard deviation below the mean.

The Mean is 23, and the Standard Deviation is 6.6, and these are

the Standard Scores:

-0.45, -1.21, 0.45, 1.36, -0.76, 0.76, 1.82, -1.36, 0.45, -0.15, -0.91

Only 2 students will fail (the ones who scored 15 and 14 on the test)
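
The professor's rule can be sketched in Python as follows. This is an illustration, not part of the original text. It uses the unrounded population standard deviation (about 6.63) where the example rounds to 6.6, so a few z-scores differ in the second decimal place, but the same two students fail.

```python
from statistics import mean, pstdev

scores = [20, 15, 26, 32, 18, 28, 35, 14, 26, 22, 17]

mu = mean(scores)          # 23
sigma = pstdev(scores)     # about 6.63 (the example rounds this to 6.6)

# Fail only the students more than 1 standard deviation below the mean
failing = [s for s in scores if (s - mu) / sigma < -1]

print(failing)
# prints: [15, 14]
```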

It also makes life easier because we only need one table (the Standard

Normal Distribution Table), rather than doing calculations individually for

each value of mean and standard deviation.

In More Detail

Here is the Standard Normal Distribution with percentages for every half of

a standard deviation, and cumulative percentages:

Example: Your score in a recent test was 0.5 standard

deviations above the average, how many people scored lower than

you did?

Between 0 and 0.5 is 19.1%

Less than 0 is 50% (left half of the curve)

So the total less than you is:

50% + 19.1% = 69.1%

In theory 69.1% scored less than you did (but with real data the

percentage may be different)
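
The same "percentage below a given z-score" question can be answered for any z-score without reading values off a chart. A minimal sketch, assuming Python 3.8 or later for `statistics.NormalDist`; the function name `percent_below` is hypothetical and not from the original text.

```python
from statistics import NormalDist

def percent_below(z):
    """Percentage of a normal population with a standard score below z."""
    return 100 * NormalDist().cdf(z)

print(round(percent_below(0.5), 1))   # 0.5 standard deviations above average
# prints: 69.1
print(round(percent_below(0.0), 1))   # exactly at the mean
# prints: 50.0
```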

A Practical Example: Your company packages sugar in 1 kg bags.

When you weigh a sample of bags you get these

results:

1007g, 1032g, 1002g, 983g, 1004g, ... (a

hundred measurements)

Mean = 1010g

Standard Deviation = 20g

Some values are less than 1000g ... can you fix

that?

The normal distribution of your measurements looks like this:

31% of the bags are less than 1000g,

which is cheating the customer!

Because it is a random thing we can't stop bags having less than 1000g, but

we can reduce it a lot ...

if 1000g was at -3 standard deviations there would be

only 0.1% (very small)

at -2.5 standard deviations we can calculate:

below -3 is 0.1% and between -3 and -2.5 standard deviations

is 0.5%, together that is 0.1%+0.5% = 0.6% (a good

choice I think)

So let us adjust the machine to have 1000g at -2.5 standard

deviations from the mean.

We could adjust it to:

increase the amount of sugar in each bag (this would change the

mean), or

make it more accurate (this would reduce the standard deviation)

Let us try both:

Adjust the mean amount in each bag

The standard deviation is 20g,

and we need 2.5 of them:

2.5 × 20g = 50g

So the machine should

average 1050g, like this:

Adjust the accuracy of the machine

Or we can keep the same mean (of 1010g),

but then we

need 2.5 standard deviations to be equal to

10g:

10g / 2.5 = 4g

So the standard deviation should be 4g,

like this:

(We hope the machine is that accurate!)

Or perhaps we could have some combination of better accuracy and slightly

larger average size, I will leave that up to you!
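
To compare the two adjustments, the share of underweight bags can be estimated for any mean and standard deviation. The sketch below is an illustration and not part of the original text; it assumes bag weights are exactly normally distributed, and the names `LIMIT` and `fraction_underweight` are made up for this example.

```python
from statistics import NormalDist

LIMIT = 1000  # grams stated on the bag

def fraction_underweight(mean_g, sd_g):
    """Fraction of bags expected to weigh less than LIMIT."""
    return NormalDist(mean_g, sd_g).cdf(LIMIT)

# Current machine: mean 1010 g, standard deviation 20 g
current = fraction_underweight(1010, 20)

# Option 1: raise the mean so 1000 g sits 2.5 standard deviations below it
new_mean = LIMIT + 2.5 * 20                  # 1050 g
raise_mean = fraction_underweight(new_mean, 20)

# Option 2: keep the mean, shrink the standard deviation instead
new_sd = (1010 - LIMIT) / 2.5                # 4 g
tighten_sd = fraction_underweight(1010, new_sd)

print(round(current, 3), round(raise_mean, 4), round(tighten_sd, 4))
# prints: 0.309 0.0062 0.0062
```

Both options leave about 0.6% of bags underweight, so the choice between them comes down to the cost of extra sugar versus the cost of a more accurate machine.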

In Even More Detail!

We have a Standard Normal Distribution Table if you want more accurate

values.

Standard Normal Distribution Table

This is the "bell-shaped" curve of the Standard Normal Distribution.

It is a Normal Distribution with mean 0 and standard deviation 1.

It shows you the percent of population:

between 0 and Z (option "0 to Z")

less than Z (option "Up to Z")

greater than Z (option "Z onwards")

It is correct to 0.1%, for example 17.36% is rounded to 17.4%

The Table

You can get more accurate values from the table below. The table shows the

area from 0 to Z.

Instead of one LONG table, we have put the "0.1"s running down, then the

"0.01"s running along. (Example of how to use is below)

Z    0.00    0.01    0.02    0.03    0.04    0.05    0.06    0.07    0.08    0.09
0.0  0.0000  0.0040  0.0080  0.0120  0.0160  0.0199  0.0239  0.0279  0.0319  0.0359
0.1  0.0398  0.0438  0.0478  0.0517  0.0557  0.0596  0.0636  0.0675  0.0714  0.0753
0.2  0.0793  0.0832  0.0871  0.0910  0.0948  0.0987  0.1026  0.1064  0.1103  0.1141
0.3  0.1179  0.1217  0.1255  0.1293  0.1331  0.1368  0.1406  0.1443  0.1480  0.1517
0.4  0.1554  0.1591  0.1628  0.1664  0.1700  0.1736  0.1772  0.1808  0.1844  0.1879
0.5  0.1915  0.1950  0.1985  0.2019  0.2054  0.2088  0.2123  0.2157  0.2190  0.2224
0.6  0.2257  0.2291  0.2324  0.2357  0.2389  0.2422  0.2454  0.2486  0.2517  0.2549
0.7  0.2580  0.2611  0.2642  0.2673  0.2704  0.2734  0.2764  0.2794  0.2823  0.2852
0.8  0.2881  0.2910  0.2939  0.2967  0.2995  0.3023  0.3051  0.3078  0.3106  0.3133
0.9  0.3159  0.3186  0.3212  0.3238  0.3264  0.3289  0.3315  0.3340  0.3365  0.3389
1.0  0.3413  0.3438  0.3461  0.3485  0.3508  0.3531  0.3554  0.3577  0.3599  0.3621
1.1  0.3643  0.3665  0.3686  0.3708  0.3729  0.3749  0.3770  0.3790  0.3810  0.3830
1.2  0.3849  0.3869  0.3888  0.3907  0.3925  0.3944  0.3962  0.3980  0.3997  0.4015
1.3  0.4032  0.4049  0.4066  0.4082  0.4099  0.4115  0.4131  0.4147  0.4162  0.4177
1.4  0.4192  0.4207  0.4222  0.4236  0.4251  0.4265  0.4279  0.4292  0.4306  0.4319
1.5  0.4332  0.4345  0.4357  0.4370  0.4382  0.4394  0.4406  0.4418  0.4429  0.4441
1.6  0.4452  0.4463  0.4474  0.4484  0.4495  0.4505  0.4515  0.4525  0.4535  0.4545
1.7  0.4554  0.4564  0.4573  0.4582  0.4591  0.4599  0.4608  0.4616  0.4625  0.4633
1.8  0.4641  0.4649  0.4656  0.4664  0.4671  0.4678  0.4686  0.4693  0.4699  0.4706
1.9  0.4713  0.4719  0.4726  0.4732  0.4738  0.4744  0.4750  0.4756  0.4761  0.4767
2.0  0.4772  0.4778  0.4783  0.4788  0.4793  0.4798  0.4803  0.4808  0.4812  0.4817
2.1  0.4821  0.4826  0.4830  0.4834  0.4838  0.4842  0.4846  0.4850  0.4854  0.4857
2.2  0.4861  0.4864  0.4868  0.4871  0.4875  0.4878  0.4881  0.4884  0.4887  0.4890
2.3  0.4893  0.4896  0.4898  0.4901  0.4904  0.4906  0.4909  0.4911  0.4913  0.4916
2.4  0.4918  0.4920  0.4922  0.4925  0.4927  0.4929  0.4931  0.4932  0.4934  0.4936
2.5  0.4938  0.4940  0.4941  0.4943  0.4945  0.4946  0.4948  0.4949  0.4951  0.4952
2.6  0.4953  0.4955  0.4956  0.4957  0.4959  0.4960  0.4961  0.4962  0.4963  0.4964
2.7  0.4965  0.4966  0.4967  0.4968  0.4969  0.4970  0.4971  0.4972  0.4973  0.4974
2.8  0.4974  0.4975  0.4976  0.4977  0.4977  0.4978  0.4979  0.4979  0.4980  0.4981
2.9  0.4981  0.4982  0.4982  0.4983  0.4984  0.4984  0.4985  0.4985  0.4986  0.4986
3.0  0.4987  0.4987  0.4987  0.4988  0.4988  0.4989  0.4989  0.4989  0.4990  0.4990

Example: Percent of Population Between 0 and 0.45

Start at the row for 0.4, and read along until 0.45: there is the value

0.1736

And 0.1736 is 17.36%

So 17.36% of the population are between 0 and 0.45 Standard

Deviations from the Mean.

Because the curve is symmetrical, the same table can be used for values

going either direction, so a negative 0.45 also has an area of 0.1736

Example: Percent of Population Z Between -1 and 2

From −1 to 0 is the same as from 0 to +1:

At the row for 1.0, first column 1.00, there is the value 0.3413

From 0 to +2 is:

At the row for 2.0, first column 2.00, there is the value 0.4772

Add the two to get the total between -1 and 2:

0.3413 + 0.4772 = 0.8185

And 0.8185 is 81.85%
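
The table entries themselves can be reproduced in code, which is handy for z-values that fall between rows. This is a hedged sketch, not part of the original text; `area_0_to_z` is a hypothetical helper name, and `statistics.NormalDist` requires Python 3.8 or later.

```python
from statistics import NormalDist

def area_0_to_z(z):
    """Area under the standard normal curve between 0 and z (for z >= 0)."""
    return NormalDist().cdf(z) - 0.5

# Table entry for row 0.4, column 0.05
print(round(area_0_to_z(0.45), 4))
# prints: 0.1736

# Between -1 and 2: the area from -1 to 0 equals the area from 0 to +1
between = area_0_to_z(1.0) + area_0_to_z(2.0)
print(round(between, 4))
# prints: 0.8186
```

The last digit differs from the 0.8185 found with the table because each table entry has already been rounded to four decimal places before the two are added.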

Skewed Data

Data can be "skewed", meaning it tends to have a long tail on one side or

the other:

Negative Skew No Skew Positive Skew

Negative Skew?

Why is it called negative skew? Because

the long "tail" is on the negative side of the

peak.

People sometimes say it is "skewed to the

left" (the long tail is on the left hand side)

The mean is also on the left of the peak.

The Normal Distribution has No Skew

A Normal Distribution is not skewed.

It is perfectly symmetrical.

And the Mean is exactly at the peak.

Positive Skew

And positive skew is when the long tail is

on the positive side of the peak, and some

people say it is "skewed to the right".

The mean is on the right of the peak

value.

Example: Income Distribution

Here is some data I

extracted from a recent

Census.

As you can see it

is positively skewed ...

in fact the tail continues

way past $100,000

Calculating Skewness

"Skewness" (the amount of skew) can be calculated, for example you could

use the SKEW() function in Excel or OpenOffice Calc.
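
Outside a spreadsheet, skewness can be computed directly. The sketch below is an illustration, not part of the original text. It uses the adjusted sample skewness formula, n / ((n-1)(n-2)) × Σ((x − mean) / s)³ with s the sample standard deviation, which to my understanding is the formula documented for Excel's SKEW(); the function name and both data sets are made up for the example.

```python
from statistics import mean, stdev

def sample_skewness(values):
    """Adjusted sample skewness; needs at least 3 values that are not all equal."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 values")
    m = mean(values)
    s = stdev(values)   # sample standard deviation (n - 1 in the denominator)
    return n / ((n - 1) * (n - 2)) * sum(((x - m) / s) ** 3 for x in values)

symmetric = [1, 2, 3, 4, 5]          # hypothetical data with no skew
long_right_tail = [1, 2, 2, 3, 10]   # hypothetical data with a long tail on the right

print(round(sample_skewness(symmetric), 4))
print(sample_skewness(long_right_tail) > 0)
```

A result near zero indicates no skew, a positive result indicates a long tail on the right (positive skew), and a negative result a long tail on the left.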