
MAT199: Math Alive

Probability and Statistics

Ian Griffiths

Mathematical Institute, University of Oxford,

Department of Mathematics, Princeton University

Probability and Statistics

• Probability arises in many forms of everyday life, e.g.,

• There is a one in six chance of rolling a 6 on a die.

• There is a 30% chance of rain today.

• There is a 1% chance that Princeton will win March Madness in the next ten years.

Disease testing

• While at the doctor’s surgery, the doctor decides to do a routine test for a very rare disease which affects only 1 in 10,000 people.

• The test is 99% accurate.

• The test comes back positive. What should the doctor say?

(a) “It’s bad news I’m afraid…”

(b) “Don’t worry, it’s still very unlikely you have the disease.”

The doctor should say (b). Out of every 10,000 people tested, about 1 has the disease, while about 100 healthy people (1% of the rest) also test positive. In fact, the patient still has less than a 1% chance of having the disease.
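The figure of less than 1% can be checked with a short calculation. The sketch below assumes that "99% accurate" means the test gives the right answer 99% of the time for sick and healthy patients alike; the slide does not split the accuracy into separate rates, so this is an assumption. The function name is ours, chosen for illustration.

```python
def probability_of_disease_given_positive(prevalence, accuracy):
    """Chance that a patient who tests positive really has the disease.

    Assumes the test is right with probability `accuracy` for both
    sick and healthy patients (an illustrative assumption).
    """
    true_positive = prevalence * accuracy                # sick and flagged
    false_positive = (1 - prevalence) * (1 - accuracy)   # healthy but flagged
    return true_positive / (true_positive + false_positive)


p = probability_of_disease_given_positive(prevalence=1 / 10_000, accuracy=0.99)
print(f"Chance of disease after a positive test: {p:.2%}")
```

Almost all positive results come from the large group of healthy people, which is why the answer stays just under 1%.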

Cups and prizes

[Figure: three cups labelled 1, 2 and 3]

• A prize is placed under one cup. What is the probability that you guess the correct cup?

• The probability of guessing the correct cup is one in three.

• You choose a cup. I then reveal that the prize is not under one of the other cups and offer you a chance to switch. Should you stick or switch?

• If you stick then your chance of winning is still one in three. But what are your chances if you switch…?

Cups and prizes

[Figure: three cups labelled 1, 2 and 3]

• Let’s suppose the prize is under cup 1. (The argument is exactly the same if the prize is under cup 2 or cup 3.)

• If you choose to switch then there are three equally likely outcomes:

1) You choose cup 1 (containing the prize). I uncover one of the other empty cups. You switch to the remaining covered cup. You lose.

2) You choose cup 2. I uncover the empty cup (cup 3). You switch to the remaining covered cup (cup 1). You win.

3) You choose cup 3. I uncover the empty cup (cup 2). You switch to the remaining covered cup (cup 1). You win.

So if you switch then you win two times out of three. This is twice as good a chance of winning as if you had stayed with your original choice.
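The stick-or-switch argument can also be tested by playing the game many times. The following is a minimal simulation sketch, not part of the lecture: the function names `play` and `win_rate`, the number of rounds and the random seed are all our own choices.

```python
import random


def play(switch, rng):
    """Play one round with three cups; return True if the player wins."""
    prize = rng.randrange(3)
    choice = rng.randrange(3)
    # The host uncovers an empty cup that the player did not choose.
    revealed = rng.choice([c for c in range(3) if c != choice and c != prize])
    if switch:
        # Move to the one cup that is neither chosen nor uncovered.
        choice = next(c for c in range(3) if c != choice and c != revealed)
    return choice == prize


def win_rate(switch, rounds=100_000, seed=0):
    """Fraction of rounds won when always sticking or always switching."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(rounds)) / rounds


stick_rate = win_rate(switch=False)
switch_rate = win_rate(switch=True)
print(f"stick:  {stick_rate:.3f}")
print(f"switch: {switch_rate:.3f}")
```

With this many rounds the two rates should settle close to one third and two thirds.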

Thomas Bayes, 1701–1761

Bayes’ theorem

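Summary of lecture 13

• The probability of an event A occurring is denoted by p(A).

• We use three rules for calculating probabilities:

p(A or B) = p(A) + p(B) (1), when A and B cannot both happen

p(A and B) = p(A) x p(B) (2), when A and B do not influence each other

p(A given B) = p(A and B) / p(B) (3)

Rule (3), the conditional probability rule, is the basis of Bayes’ theorem.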
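The three rules can be checked by counting outcomes. The sketch below uses two fair dice as an example of our own choosing: rule (1) is applied to two results that cannot both happen, rule (2) to two dice that do not influence each other, and rule (3) to a conditional probability. The helper names `p`, `first_is` and `second_is` are illustrative.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely results of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))


def p(event):
    """Probability of an event, given as a true/false test on an outcome."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))


def first_is(n):
    return lambda o: o[0] == n


def second_is(n):
    return lambda o: o[1] == n


# Rule (1): the first die cannot show 1 and 2 at once, so probabilities add.
rule1_lhs = p(lambda o: first_is(1)(o) or first_is(2)(o))
rule1_rhs = p(first_is(1)) + p(first_is(2))

# Rule (2): the dice do not influence each other, so probabilities multiply.
rule2_lhs = p(lambda o: first_is(6)(o) and second_is(6)(o))
rule2_rhs = p(first_is(6)) * p(second_is(6))

# Rule (3): probability the total is 12, given that the first die shows 6.
rule3 = p(lambda o: sum(o) == 12 and first_is(6)(o)) / p(first_is(6))

print(rule1_lhs, rule1_rhs)
print(rule2_lhs, rule2_rhs)
print(rule3)
```

Using `Fraction` keeps every probability exact, so the two sides of each rule can be compared directly.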

What is the probability of two people in a room sharing a birthday?

[Figure: probability of sharing a birthday against number of people in the room]

• You need only 23 people in a room for the probability of two people sharing a birthday to be greater than 0.5.
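The size of the room needed can be computed directly by working out the chance that everyone has a different birthday. This is a minimal sketch that assumes 365 equally likely birthdays and ignores leap years; the function name is ours.

```python
def probability_shared_birthday(people, days=365):
    """Chance that at least two of `people` share a birthday.

    Assumes every day is equally likely and ignores leap years.
    """
    all_different = 1.0
    for k in range(people):
        # Person k + 1 must avoid the k birthdays already taken.
        all_different *= (days - k) / days
    return 1 - all_different


# Smallest room for which a shared birthday is more likely than not.
smallest = next(n for n in range(1, 366) if probability_shared_birthday(n) > 0.5)
print(smallest, round(probability_shared_birthday(smallest), 3))
```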

Statistics

• “I can prove anything with statistics except the truth.” George Canning.

Unjustified accuracy

• 87.452522% of all statistics claim a precision of results that is not backed up by the method used.

Manipulating averages

• A statistician died crossing a river that was, on average, six inches deep.

Plain wrong

• Math illiteracy affects 8 out of every 5 people.

• If there is a 50/50 chance of something going wrong then 9 times out of 10 it will.

Summary of lecture 14

• The expected value for an event,

E = (probability of event 1 happening) x (value for event 1) + (probability of event 2 happening) x (value for event 2) + … + (probability of event N happening) x (value for event N)

• The variance of an event,

Variance = (probability of event 1 happening) x (value for event 1 – E)² + (probability of event 2 happening) x (value for event 2 – E)² + … + (probability of event N happening) x (value for event N – E)²

• The standard deviation of an event = √(variance)
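As a worked example of these definitions, the sketch below applies them to a single roll of a fair six-sided die, taking the standard deviation to be the square root of the variance. The die is our own choice of example, not one taken from the lecture.

```python
from math import sqrt

# A fair six-sided die: each value 1..6 has probability 1/6.
values = [1, 2, 3, 4, 5, 6]
probabilities = [1 / 6] * 6

# Expected value: sum of probability x value over all events.
expected = sum(p * v for p, v in zip(probabilities, values))

# Variance: sum of probability x (value - E) squared over all events.
variance = sum(p * (v - expected) ** 2 for p, v in zip(probabilities, values))

std_dev = sqrt(variance)

print(f"E = {expected:.4f}, variance = {variance:.4f}, "
      f"standard deviation = {std_dev:.4f}")
```

The expected value is 3.5 and the variance is 35/12, so a typical roll lies roughly 1.7 away from the mean.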

Pascal’s triangle

1
1 1
1 2 1
1 3 3 1
1 4 6 4 1
1 5 10 10 5 1
1 6 15 20 15 6 1
1 7 21 35 35 21 7 1
1 8 28 56 70 56 28 8 1
1 9 36 84 126 126 84 36 9 1
1 10 45 120 210 252 210 120 45 10 1
…
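Each entry in the triangle is the sum of the two entries above it, which makes the rows easy to generate. A minimal sketch follows; the function name `pascal_rows` is ours.

```python
def pascal_rows(count):
    """Return the first `count` rows of Pascal's triangle as lists."""
    rows = [[1]]
    while len(rows) < count:
        previous = rows[-1]
        # Each inner entry is the sum of the two entries above it.
        inner = [a + b for a, b in zip(previous, previous[1:])]
        rows.append([1] + inner + [1])
    return rows[:count]


for row in pascal_rows(11):
    print(" ".join(str(x) for x in row).center(40))
```

Row n (counting from 0) sums to 2 to the power n, which is the number of possible results of n coin tosses.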

What happens if we take more points?

[Figure: plots for n = 10, n = 20 and n = 100]

This is called the binomial distribution.

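Analysing the final curve

• We can scale the distribution so that the sum of all the possible outcomes = 1 and the values correspond to probabilities.

• For n = 100 the result is approximated by the formula:

$$p \approx \frac{1}{5\sqrt{2\pi}} \exp\left(-\frac{(r-50)^2}{2 \times 5^2}\right)$$

The value 50 is the mean value. The value 5 is related to the spread of the data.

[Figure: p plotted against r for n = 100]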

The normal distribution

• In general we write:

$$p = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(r-m)^2}{2\sigma^2}\right)$$

where m is the mean value and σ is the standard deviation.

• This is called the normal distribution.

• The normal distribution is also known as a bell curve.
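To see how closely the bell curve follows the binomial distribution, the sketch below compares the two for 100 tosses of a fair coin. It assumes the normal curve is given the same mean (n/2) and standard deviation (√n/2) as the number of heads; the function names `binomial` and `normal` are ours.

```python
from math import comb, exp, pi, sqrt

n = 100                 # number of coin tosses
mean = n / 2            # mean number of heads
sigma = sqrt(n) / 2     # standard deviation of the number of heads


def binomial(r):
    """Exact probability of exactly r heads in n fair tosses."""
    return comb(n, r) / 2 ** n


def normal(r):
    """Normal curve with the same mean and standard deviation."""
    return exp(-((r - mean) ** 2) / (2 * sigma ** 2)) / (sigma * sqrt(2 * pi))


for r in (40, 45, 50, 55, 60):
    print(f"r = {r}: binomial {binomial(r):.4f}, normal {normal(r):.4f}")

# Largest difference between the two over every possible number of heads.
largest_gap = max(abs(binomial(r) - normal(r)) for r in range(n + 1))
print(f"largest gap: {largest_gap:.5f}")
```

The two sets of values agree to about three decimal places, which is why the smooth curve is a useful stand-in for the exact probabilities.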

Summary of lecture 15

• The normal distribution:

$$p = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(r-m)^2}{2\sigma^2}\right)$$

where m is the mean value and σ is the standard deviation.

[Figure: tossing a coin 100 times. This shows the probability, p, of getting a head a total of r times.]

In this example the mean is m = 50 and the standard deviation is σ = 5.

• This kind of distribution arises in many situations, e.g. the distribution of heights of people in the U.S.

The Griffiths University

• Here are some statistics for the 2012 intake of students at the Griffiths University:

Success rate for males who applied to the Griffiths University: 17%

Success rate for females who applied to the Griffiths University: 82%

• Is the Griffiths University a fair university?

(a) Yes.

(b) No.

(c) Don’t know.

Representing data

Good forms of data representation

• This is a visualization of the military campaign of Bonaparte against Russia. The band shows the route taken by Napoleon’s army. The thickness of the band is proportional to the number of surviving troops.

Representing data

Good forms of data representation

• This shows the air pollution in Los Angeles at different times of the day.

Representing data

Bad forms of data representation

• Where is the best place to sit in the Math Alive lectures?

Midterm grades for students who sit at the front of the lecture theatre.

Midterm grades for students who sit at the back of the lecture theatre.

Representing data

Bad forms of data representation – exploiting viewing angles

• We can also use perspective to skew data…

• This is a fairer way of representing the data:

Representing data

Bad forms of data representation

• Here we show the Nobel prizes won by various countries. Can you see why this representation might be unfair?