Oct. 10 Statistic for the day: Practical significance UFO ...personal.psu.edu/drh20/100/fall2008/lecture/lectures/lecture19.pdf · UFO sightings in the same state: Something positive


Oct. 10 Statistic for the day: Correlation between (per capita, 1997-2007) number of bigfoot sightings in a state and number of UFO sightings in the same state:

Something positive (yes, annoyingly vague!)

Practical significance

From the previous lecture: 15.3% of non-eyelens wearers and 18.8% of lens-wearers reported that they abstain from alcohol. A difference of this size, even if it really exists in the population, is probably uninteresting. Yet we have seen that a large sample size can make it statistically significant.

Hence, in the interpretation of statistical significance, we should also address the issue of practical significance.

In other words, we should answer the skeptic’s second question: WHO CARES?

Dr. Jonas Salk (1914-1995)

Best known for the development of a killed-virus polio vaccine, called the Salk vaccine.

Research question: Is the Salk polio vaccine effective?

Randomized, double-blinded experiment carried out in 1954 on 400,000 children.

Control proportion = 142 / 200,000 = .00071 or .071%

Treatment proportion = 56 / 200,000 = .00028 or .028%

Difference: Control – Treatment = .00043 or .043%

A very small difference, but this was expected, so they took large samples. Is the difference statistically significant? Does the research advocate (Dr. Jonas Salk) win?

Expected counts are printed below observed

         polio        not      Total
C          142    199,858    200,000
            99    199,901
T           56    199,944    200,000
            99    199,901
Total      198    399,802    400,000

Chi-Sq = 18.677 + 0.009 + 18.677 + 0.009 = 37.372

The research advocate wins easily. We say that the effect of the vaccine is statistically significant. But is it practically significant?
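The chi-square arithmetic for the polio table can be checked with a few lines of Python. This is only a sketch: the function name `chi_square` and the dictionary layout are illustrative choices, not part of the lecture, and the counts are the ones printed in the table.

```python
# Observed counts from the 1954 Salk vaccine trial table.
observed = {
    "control":   {"polio": 142, "no_polio": 199_858},
    "treatment": {"polio": 56,  "no_polio": 199_944},
}

def chi_square(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    row_totals = {row: sum(cells.values()) for row, cells in table.items()}
    col_totals = {}
    for cells in table.values():
        for col, count in cells.items():
            col_totals[col] = col_totals.get(col, 0) + count
    grand_total = sum(row_totals.values())

    statistic = 0.0
    for row, cells in table.items():
        for col, obs in cells.items():
            # Expected count = row total * column total / overall total
            expected = row_totals[row] * col_totals[col] / grand_total
            statistic += (obs - expected) ** 2 / expected
    return statistic

chi_sq = chi_square(observed)
print(round(chi_sq, 3))
```

Up to rounding, the result agrees with the Chi-Sq value of 37.372 shown on the slide.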


Recall the difference in proportions (risk):

Control – Treatment = .00043

This represents an estimate of the proportion of children saved from polio by the vaccine. It’s a small proportion, but “small” depends on the context.

Population of US in 2000: 286,196,812.

Population of Children under age of 20: 82,997,075

Very rough approximation of the number of children saved from polio by the vaccine: 83 million × .00043 ≈ 36,000

That is certainly practically significant.
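The back-of-the-envelope calculation can be written out as a short Python sketch. The variable names are illustrative. Like the slide, it assumes, very roughly, that the trial's risk difference applies to every child under 20 in the population figure quoted above.

```python
# Counts from the 1954 trial (200,000 children per group).
control_cases, treatment_cases, group_size = 142, 56, 200_000

# Population of children under 20, as quoted on the slide.
children_under_20 = 82_997_075

# Difference in risk: control proportion minus treatment proportion.
risk_difference = control_cases / group_size - treatment_cases / group_size

# Very rough estimate of children saved from polio by the vaccine.
children_saved = risk_difference * children_under_20

print(f"risk difference = {risk_difference:.5f}")
print(f"children saved  = about {children_saved:,.0f}")
```

Using the unrounded population gives a figure just under the slide's rounded 36,000.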

Simpson’s paradox: Remember this example?

Notice: Combined mean weight is 8 pounds heavier in 2005. But women are only 6 pounds heavier on average, and men are actually lighter. How is this possible?

A missing third variable (percent of men) explains the difference. Read section 12.4, page 229.

Probability

• Relative frequency (experiment, repeated sampling): calculate probability by physical world assumptions and check by repeated sampling, or estimate probability by repeated sampling
• Personal opinion (experience, non-repeatable event)

Chapter 16

Example 1

Relative frequency, repeated sampling: physical world assumption

• coin
• dice
• cards

Roll this strange die. What is the probability of 4? What is the physical world assumption?

Probability of getting a 4 = 4/6 = 2/3, assuming all sides are equally likely
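A probability calculated from a physical world assumption can be checked by repeated sampling. The sketch below does this in Python for the die. The picture of the die is not reproduced here, so the face values are hypothetical: the answer 4/6 only implies that four of the six equally likely sides show a 4, and the other two values (1 and 6) are made up for illustration.

```python
from fractions import Fraction
import random

# Hypothetical faces: four sides show 4; the other two values are invented.
faces = [4, 4, 4, 4, 1, 6]

# Calculated probability, assuming all six sides are equally likely.
exact = Fraction(faces.count(4), len(faces))

def estimate(n_rolls, seed=0):
    """Estimate P(4) by simulating n_rolls rolls of the die."""
    rng = random.Random(seed)
    hits = sum(rng.choice(faces) == 4 for _ in range(n_rolls))
    return hits / n_rolls

print("calculated:", exact)
print("estimated :", estimate(100_000))
```

With many rolls the relative frequency should settle near 2/3. With only a handful of rolls it can be far off.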

Example 2

Relative frequency, repeated sampling: estimate probability by repeated sampling

Throw a tack. What is the probability of landing point up?

Example 3

Personal probability: opinion, experience, non-repeatable event

What is the probability of getting an A in STAT 100?


Rows: Attend regular religious services   Columns: Ban same-sex marriage in Const

            No Amend   Yes Amend   All
No Serv     83.0%      17.0%       100.0%
Yes Serv    66.7%      33.3%       100.0%
All         78.1%      21.9%       100.0%

Pick someone at random from the class. The probability that he/she favors the Constitutional amendment is 21.9%.

Probability provides a new terminology for some old ideas.

“He/she favors the Constitutional amendment” is called an event. In this lecture, events will be shown in red.

If A is an event, then P(A) is shorthand for “probability of A”.

Rules for combining probabilities

0 ≤ Probability ≤ 1

1. If there are only two possible outcomes, then their probabilities must sum to 1.

2. If two events cannot happen at the same time, they are called mutually exclusive. The probability of at least one happening (one or the other) is the sum of their probabilities. [Rule 1 is a special case of this.]

3. If two events do not influence each other, they are called independent. The probability that they happen at the same time is the product of their probabilities.

4. If the occurrence of one event forces the occurrence of another event, then the probability of the second event is always at least as large as the probability of the first event.

Rule 1: If there are only two possible outcomes, then their probabilities must sum to 1.

According to Example 3, page 302:

P(lost luggage) = 1/176 = .0057
Thus, P(luggage not lost) = 1 – 1/176 = 175/176 = .9943

The point of rule 1 is that P(lost) + P(not lost) = 1 so if we know P(lost), then we can find P(not lost).

Sounds simple, right? It can be surprisingly powerful.

Application of Rule 1: The birthday problem

With 150 people, the probability of no matching birthdays is 0.000000000000000245, so a match is all but certain.

With 40 people, P(at least one match) = 1 – P(no match) = 1 – .109 = .891


$$P(\text{No Match}) = \frac{365 \times 364 \times \cdots \times 326}{365 \times 365 \times \cdots \times 365} = 0.109$$
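The product for P(no match) is tedious by hand but short in code. The following Python sketch uses an illustrative function name, `p_no_match`, and makes the usual simplifying assumptions: 365 equally likely birthdays, no leap days, and birthdays independent from person to person.

```python
def p_no_match(n_people, days=365):
    """Probability that n_people all have different birthdays."""
    p = 1.0
    for k in range(n_people):
        # The (k+1)-th person must avoid the k birthdays already taken.
        p *= (days - k) / days
    return p

for n in (40, 150):
    no_match = p_no_match(n)
    # Rule 1: P(at least one match) = 1 - P(no match)
    print(n, no_match, 1 - no_match)
```

Rule 1 does the real work here: counting the ways to get at least one match directly is hard, while the probability of no match is a single product.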


Rule 2: If two events cannot happen at the same time, they are called mutually exclusive. In this case, the probability of at least one happening is the sum of their probabilities. [Rule 1 is a special case of this.]

Example 5, page 303:

Suppose P(A in stat) = .50 and P(B in stat) = .30. Then P(A or B in stat) = .50 + .30 = .80

Note that the events ‘A in stat’ and ‘B in stat’ are mutually exclusive. Do you see why?

Select a student at random from STAT 100, Spring 2004

Data from Spring 2004
Rows: sex   Columns: cell phone

          no    yes   All
female    12    124   136
male      14     87   101
All       26    211   237

P(female with cell phone) = 124/237 = .523
P(male without cell phone) = 14/237 = .059

These are mutually exclusive events. Therefore, P(female w. cell or male w/o cell) = .523 + .059 = .582.


Data from Spring 2004
Rows: gender   Columns: cell phone

          no    yes   All
female    12    124   136
male      14     87   101
All       26    211   237

P(female) = 136/237 = .574
P(cell phone) = 211/237 = .890

These events are not mutually exclusive; they can happen at the same time. Thus, P(female or cell phone) is NOT .574 + .890.
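One way to see why the sum fails is to count students directly from the table. The Python sketch below is illustrative only (the helper `prob` and the dictionary layout are not from the lecture). It shows that Rule 2 works for the mutually exclusive pair, while for ‘female’ and ‘cell phone’ the plain sum exceeds 1, because the 124 women with cell phones are counted twice.

```python
from fractions import Fraction

# Cell counts from the Spring 2004 table: (sex, has cell phone) -> students
counts = {
    ("female", "no"): 12, ("female", "yes"): 124,
    ("male", "no"): 14,   ("male", "yes"): 87,
}
total = sum(counts.values())  # 237 students

def prob(event):
    """Probability that a randomly selected student satisfies `event`."""
    favorable = sum(n for cell, n in counts.items() if event(*cell))
    return Fraction(favorable, total)

# Mutually exclusive events: Rule 2 applies, so the probabilities add.
p_fem_cell = prob(lambda sex, phone: sex == "female" and phone == "yes")
p_male_nocell = prob(lambda sex, phone: sex == "male" and phone == "no")
p_either = prob(lambda sex, phone: (sex, phone) in {("female", "yes"), ("male", "no")})

# Overlapping events: count each student once instead of adding.
p_female = prob(lambda sex, phone: sex == "female")
p_cell = prob(lambda sex, phone: phone == "yes")
p_female_or_cell = prob(lambda sex, phone: sex == "female" or phone == "yes")

print(float(p_either), float(p_fem_cell + p_male_nocell))
print(float(p_female + p_cell), float(p_female_or_cell))
```

Counting directly gives 223/237, about .941. Subtracting the overlap P(female and cell phone) from the plain sum gives the same answer.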

Select a student at random from STAT 100, Spring 2004

Rule 3: If two events do not influence each other, they are called independent. In this case, the probability that they happen at the same time is the product of their probabilities.

Example 8, page 303:

Suppose you believe that P(A in stat) = .5 and P(A in history) = .6.

Further, you believe that the two events are independent, so that they do not influence each other.

Then P(A in stat and A in history) = (.5)×(.6) = .3

Is this a reasonable assumption?

An informal test for independence of events:

Ask whether the probability of getting an A in history would change if you learned you had an A in statistics. (If not, then they’re independent!)

Standard example of independent events: Flip two coins, say, one nickel and one quarter.

The probability of heads on the nickel does not change if you learn that you got heads on the quarter!
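The two-coin example is small enough to enumerate. The Python sketch below lists all four outcomes and applies the informal test for independence. It assumes both coins are fair, so that the four outcomes are equally likely. The helper `prob` is an illustrative name.

```python
from fractions import Fraction
from itertools import product

# All equally likely outcomes as (nickel, quarter) pairs: HH, HT, TH, TT
outcomes = list(product("HT", repeat=2))

def prob(event, given=lambda o: True):
    """P(event | given), computed by counting equally likely outcomes."""
    conditioned = [o for o in outcomes if given(o)]
    return Fraction(sum(1 for o in conditioned if event(o)), len(conditioned))

def nickel_heads(o):
    return o[0] == "H"

def quarter_heads(o):
    return o[1] == "H"

p_nickel = prob(nickel_heads)
p_quarter = prob(quarter_heads)
p_both = prob(lambda o: nickel_heads(o) and quarter_heads(o))

# Informal test: does learning about the quarter change P(heads on nickel)?
p_nickel_given_quarter = prob(nickel_heads, given=quarter_heads)

print(p_nickel, p_nickel_given_quarter)  # unchanged, so independent
print(p_both, p_nickel * p_quarter)      # Rule 3: the product rule
```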

Rule 4: If the occurrence of one event forces the occurrence of another event, then the probability of the second event is always at least as large as the probability of the first event.

If event A forces event B to occur, then P(A) ≤ P(B)

Special case: P(E and F) ≤ P(E)

P(E and F) ≤ P(F)

(because ‘E and F’ forces E to occur and it forces F to occur).

Tattoos and Pierces, STAT 100 Fall 2008

The event ‘have tattoo and have body pierce’ forces the event ‘have tattoo’ to occur.

Hence P(tattoo and pierce) ≤ P(tattoo)

We can see it in the table: P(tattoo and pierce) = 11/231 < P(tattoo) = 34/231

Rows: Tattoo   Columns: Body pierce

             No pierce   Yes pierce   All
No Tattoo    164         33           197
Yes Tattoo   23          11           34
All          187         44           231
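Rule 4 can be checked against the tattoo table by counting. This is a short Python sketch with illustrative names. The counts are the four interior cells of the table.

```python
from fractions import Fraction

# Interior cells of the Fall 2008 table: (tattoo, pierce) -> students
counts = {
    ("no", "no"): 164, ("no", "yes"): 33,
    ("yes", "no"): 23, ("yes", "yes"): 11,
}
total = sum(counts.values())  # 231 students

def prob(event):
    """Probability that a randomly selected student satisfies `event`."""
    return Fraction(sum(n for cell, n in counts.items() if event(*cell)), total)

p_tattoo = prob(lambda tattoo, pierce: tattoo == "yes")
p_pierce = prob(lambda tattoo, pierce: pierce == "yes")
p_both = prob(lambda tattoo, pierce: tattoo == "yes" and pierce == "yes")

# 'tattoo and pierce' forces 'tattoo' and forces 'pierce' (Rule 4).
print(float(p_both), "<=", float(p_tattoo))
print(float(p_both), "<=", float(p_pierce))
```

The inequality holds for any table, because every student counted in ‘tattoo and pierce’ is also counted in ‘tattoo’ and in ‘pierce’.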

Mary likes earrings and spends time at festivals shopping for jewelry. Her boyfriend and several of her close girlfriends have tattoos. They have encouraged her to get a tattoo as well.

Unknown to you, Mary will be sitting next to you in the next STAT 100 class.

Rank the following statements from most likely to least likely:

A. Mary is a physics major.

B. Mary is a physics major with pierced ears.

C. Mary has pierced ears.
