Tuesday, September 10, 2013 Introduction to hypothesis testing


Last time:

Probability & the Distribution of Sample Means

• We can use the Central Limit Theorem to calculate z-scores associated with individual sample means (the z-scores are based on the distribution of all possible sample means).

• Each z-score describes the exact location of its respective sample mean, relative to the distribution of sample means.

• Since the distribution of sample means is normal, we can then use the unit normal table to determine the likelihood of obtaining a sample mean greater/less than a specific sample mean.


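• When using z-scores to represent sample means, the correct formula to use is:

$$z = \frac{M - \mu}{\sigma_M}, \qquad \sigma_M = \frac{\sigma}{\sqrt{n}}$$

• EXAMPLE: What is the probability of obtaining a sample mean greater than M = 60 for a random sample of n = 16 scores selected from a normal population with a mean of μ = 65 and a standard deviation of σ = 20?

• M = 60; μ = 65; σ = 20; n = 16

$$\sigma_M = \frac{20}{\sqrt{16}} = 5, \qquad z = \frac{60 - 65}{5} = -1$$

• From the unit normal table, p(z > −1) = 0.8413.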
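The example above can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the function name `z_for_sample_mean` is made up for illustration, and `NormalDist().cdf` stands in for the unit normal table.

```python
from math import sqrt
from statistics import NormalDist

def z_for_sample_mean(M, mu, sigma, n):
    """z-score of a sample mean relative to the distribution of sample means."""
    standard_error = sigma / sqrt(n)   # sigma_M, the standard error of the mean
    return (M - mu) / standard_error

z = z_for_sample_mean(M=60, mu=65, sigma=20, n=16)
p_greater = 1 - NormalDist().cdf(z)    # area to the right of z

print(f"z = {z:.2f}")
print(f"P(sample mean > 60) = {p_greater:.4f}")
```

The printed probability should agree with the value read from the unit normal table for z = −1.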

Last topic before the exam:

• Hypothesis testing (pulls together everything we’ve learned so far and applies it to testing hypotheses about sample means).

• Before we move on, questions about CLT, distributions of samples, standard error of the mean and how to calculate it?

Hypothesis testing

• Example: Testing the effectiveness of a new memory treatment for patients with memory problems

– Our pharmaceutical company develops a new drug treatment that is designed to help patients with impaired memories.

– Before we market the drug we want to see if it works.

– The drug is designed to work on all memory patients, but we can’t test them all (the population).

– So we decide to use a sample and conduct the following experiment.

– Based on the results from the sample we will make conclusions about the population.

Hypothesis testing

• Example: Testing the effectiveness of a new memory treatment for patients with memory problems

– Memory patients are split into two groups: one receives the memory treatment, the other receives no memory treatment. Both groups then take the memory test.

– Memory treatment group: 55 errors

– No memory treatment group: 60 errors

– Difference: 5 errors

• Is the 5 error difference:

– A “real” difference due to the effect of the treatment?

– Or is it just sampling error?

Testing Hypotheses

• Hypothesis testing

– Procedure for deciding whether the outcome of a study (results for a sample) supports a particular theory (which is thought to apply to a population)

– Core logic of hypothesis testing

• Considers the probability that the result of a study could have come about by chance if the experimental procedure had no effect

• If this probability is low, the scenario of no effect is rejected and the theory behind the experimental procedure is supported

Hypothesis testing

• We can make predictions about the likelihood of outcomes based on the distribution of possible outcomes (of a particular sample size, n).

• In hypothesis testing, we compare our observed samples with the distribution of possible samples (transformed into standardized distributions).

• This distribution of possible samples is often normally distributed (this follows from the Central Limit Theorem).
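To make the "distribution of possible samples" concrete, here is a small simulation sketch. It borrows μ = 60, σ = 8 and n = 16 from the memory example used later in these slides; the number of simulated samples and the fixed random seed are arbitrary choices made only for illustration.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(0)  # fixed seed so the sketch is repeatable

MU, SIGMA, N = 60, 8, 16   # borrowed from the memory example
N_SAMPLES = 10_000         # assumed number of simulated samples

# Draw many samples of size N and keep only each sample's mean.
sample_means = [
    mean(random.gauss(MU, SIGMA) for _ in range(N))
    for _ in range(N_SAMPLES)
]

print(f"mean of sample means: {mean(sample_means):.2f} (theory: {MU})")
print(f"SD of sample means:   {stdev(sample_means):.2f} (theory: {SIGMA / sqrt(N):.2f})")
```

The simulated sample means should pile up around μ with a spread close to the standard error σ/√n, which is the distribution an observed sample mean is compared against.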

Inferential statistics

• Hypothesis testing

– Core logic of hypothesis testing

• Considers the probability that the result of a study could have come about if the experimental procedure had no effect

• If this probability is low, the scenario of no effect is rejected and the theory behind the experimental procedure is supported

– A four-step program

• Step 1: State your hypotheses

• Step 2: Set your decision criteria

• Step 3: Collect your data & compute your test statistics

• Step 4: Make a decision about your null hypothesis

Hypothesis testing

• Hypothesis testing: a four-step program

– Step 1: State your hypotheses as a research hypothesis and a null hypothesis about the populations

• Null hypothesis (H0): There are no differences between conditions (no effect of treatment). This is the one that you test.

• Research hypothesis (HA): Generally, not all groups are equal.

– You aren’t out to prove the alternative hypothesis

• If you reject the null hypothesis, then you’re left with support for the alternative(s) (NOT proof!)

Testing Hypotheses

– Step 1: State your hypotheses

In our memory example experiment:

• One-tailed (direction specified)

– Our theory is that the treatment should improve memory (fewer errors).

– H0: μTreatment ≥ μNo Treatment

– HA: μTreatment < μNo Treatment

• Two-tailed (no direction specified)

– Our theory is that the treatment has an effect on memory.

– H0: μTreatment = μNo Treatment

– HA: μTreatment ≠ μNo Treatment

One-Tailed and Two-Tailed Hypothesis Tests

• Directional hypotheses

– One-tailed test

• Nondirectional hypotheses

– Two-tailed test

Testing Hypotheses

– Step 1: State your hypotheses

– Step 2: Set your decision criteria

• Your alpha (α) level will be your guide for when to reject or fail to reject the null hypothesis.

– Based on the probability of making a certain type of error

Testing Hypotheses

– Step 1: State your hypotheses

– Step 2: Set your decision criteria

– Step 3: Collect your data & compute sample statistics

• Descriptive statistics (means, standard deviations, etc.)

• Inferential statistics (z-test, t-tests, ANOVAs, etc.)

Testing Hypotheses

– Step 1: State your hypotheses

– Step 2: Set your decision criteria

– Step 3: Collect your data & compute sample statistics

– Step 4: Make a decision about your null hypothesis

• Based on the outcomes of the statistical tests researchers will either:

– Reject the null hypothesis

– Fail to reject the null hypothesis

• This could be the correct conclusion or the incorrect conclusion

Error types

• Type I error (α): concluding that there is a difference between groups (“an effect”) when there really isn’t.

– The probability of making this error is sometimes called the “significance level” or “alpha level”

– We try to minimize this (keep it low)

• Type II error (β): concluding that there isn’t an effect, when there really is.

– Related to the statistical power of a test (1 − β)
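One way to see what α means is to simulate many experiments in which the null hypothesis really is true and count how often it gets rejected anyway. The sketch below assumes the memory example's population (μ = 60, σ = 8), samples of n = 16, and a one-tailed test in the lower tail at α = 0.05; the number of simulated experiments and the seed are arbitrary.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

random.seed(1)  # fixed seed so the sketch is repeatable

MU, SIGMA, N = 60, 8, 16   # population and sample size from the memory example
ALPHA = 0.05
N_EXPERIMENTS = 20_000     # assumed number of simulated experiments

standard_error = SIGMA / sqrt(N)
z_critical = NormalDist().inv_cdf(ALPHA)   # lower-tail boundary, about -1.645

rejections = 0
for _ in range(N_EXPERIMENTS):
    # H0 is true here: every sample comes from the untreated population.
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    z = (mean(sample) - MU) / standard_error
    if z <= z_critical:
        rejections += 1    # a Type I error: rejecting a true H0

type_i_rate = rejections / N_EXPERIMENTS
print(f"proportion of false rejections: {type_i_rate:.3f} (alpha = {ALPHA})")
```

The proportion of false rejections should land close to α, which is why keeping α low keeps the Type I error rate low.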

Error types

                                        Real world (‘truth’)
Experimenter’s conclusions              H0 is correct                     H0 is wrong
                                        (there really isn’t an effect)    (there really is an effect)

Reject H0                               Type I error                      Correct conclusion
(I conclude that there is an effect)

Fail to reject H0                       Correct conclusion                Type II error
(I can’t detect an effect)
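The Type II error rate β and the power 1 − β can be worked out for a specific scenario. The sketch below assumes a one-tailed z-test in the lower tail with the memory example's values (μ = 60, σ = 8, n = 16, α = 0.05) and, purely as a hypothetical, a true treated-population mean of 55 errors; power always depends on an assumed true effect like this one.

```python
from math import sqrt
from statistics import NormalDist

MU_NULL, SIGMA, N, ALPHA = 60, 8, 16, 0.05   # from the memory example
MU_TRUE = 55   # hypothetical true mean of treated patients (an assumption)

standard_error = SIGMA / sqrt(N)

# Largest sample mean that still leads to rejecting H0 (lower-tail test).
m_critical = MU_NULL + NormalDist().inv_cdf(ALPHA) * standard_error

# Distribution of sample means if the treatment really works as assumed.
treated_means = NormalDist(mu=MU_TRUE, sigma=standard_error)
power = treated_means.cdf(m_critical)   # P(reject H0 | H0 is wrong)
beta = 1 - power                        # P(Type II error)

print(f"critical sample mean = {m_critical:.2f}")
print(f"power = {power:.3f}, beta = {beta:.3f}")
```

A smaller assumed effect (a value of `MU_TRUE` closer to 60) would lower the power and raise β.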

Performing your statistical test

• What are we doing when we test the hypotheses?

Real world (‘truth’):

– H0 is true (no treatment effect): one population. The memory treatment sample (mean MA) is the same as the population of memory patients.

– H0 is false (there is a treatment effect): two populations. The memory treatment sample (mean MA) isn’t the same as the population of memory patients.

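Performing your statistical test

• What are we doing when we test the hypotheses?

– Computing a test statistic: generic test

$$\text{test statistic} = \frac{\text{observed difference}}{\text{difference expected by chance}}$$

– Observed difference: could be the difference between a sample and a population, or between different samples

– Difference expected by chance: based on the standard error or an estimate of the standard error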
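The generic test can be sketched as a single division. In the code below the function name `generic_test_statistic` is made up for illustration. The first case uses the memory example's numbers (M = 55, μ = 60, σ = 8, n = 16); the second case compares two samples and relies on hypothetical values (both groups n = 16, known σ = 8) that the lecture does not give.

```python
from math import sqrt

def generic_test_statistic(observed_difference, standard_error):
    """Observed difference divided by the difference expected by chance."""
    return observed_difference / standard_error

# Case 1: a sample compared with a population (memory example numbers).
se_one_sample = 8 / sqrt(16)                       # sigma / sqrt(n)
z_one_sample = generic_test_statistic(55 - 60, se_one_sample)

# Case 2: two samples compared with each other.
# Hypothetical: both groups have n = 16 and a known sigma = 8.
se_two_samples = sqrt(8**2 / 16 + 8**2 / 16)       # standard error of a difference
z_two_samples = generic_test_statistic(55 - 60, se_two_samples)

print(f"sample vs. population: {z_one_sample:.2f}")
print(f"sample vs. sample:     {z_two_samples:.2f}")
```

The same 5 error difference gives a smaller test statistic in the second case because a difference between two sample means has a larger standard error than a single sample mean.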

“Generic” statistical test

• The generic test statistic distribution (think of this as the distribution of sample means)

– To reject the H0, you want a computed test statistic that is large

– What’s large enough?

• The alpha level gives us the decision criterion

[Figure: distribution of the test statistic. The α-level determines where the boundaries go. If the test statistic falls in a tail beyond a boundary: Reject H0. If it falls between the boundaries: Fail to reject H0.]



• Two-tailed, α = 0.05: the 0.05 is split up into the two tails (0.025 in each). Reject H0 if the test statistic falls in either tail; otherwise fail to reject H0.

• One-tailed, α = 0.05: all of the 0.05 is in one tail. Reject H0 only if the test statistic falls in that tail; otherwise fail to reject H0. Which tail holds the rejection region depends on the direction specified by the hypotheses.
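The boundaries for α = 0.05 can be looked up in the unit normal table or computed. A minimal sketch, assuming the test statistic is a z-score and using the standard library's `NormalDist.inv_cdf` in place of the table:

```python
from statistics import NormalDist

ALPHA = 0.05
std_normal = NormalDist()   # standard normal: mean 0, SD 1

# One-tailed: all of alpha sits in a single tail.
z_lower_tail = std_normal.inv_cdf(ALPHA)        # boundary for a lower-tail test
z_upper_tail = std_normal.inv_cdf(1 - ALPHA)    # boundary for an upper-tail test

# Two-tailed: alpha is split, 0.025 in each tail.
z_two_tailed = std_normal.inv_cdf(1 - ALPHA / 2)

print(f"one-tailed (lower): reject H0 if z <= {z_lower_tail:.3f}")
print(f"one-tailed (upper): reject H0 if z >= {z_upper_tail:.3f}")
print(f"two-tailed:         reject H0 if |z| >= {z_two_tailed:.3f}")
```

The two-tailed boundary is farther from zero than the one-tailed boundary because each tail holds only half of α.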

“Generic” statistical test

An example: One-sample z-test

Memory example experiment:

• We give n = 16 memory patients a memory improvement treatment.

• How do they compare to the general population of memory patients who have a distribution of memory errors that is Normal, μ = 60, σ = 8?

• After the treatment they have an average score of M = 55 memory errors.

• Step 1: State the hypotheses

H0: The treatment sample is the same as (or worse than) the population of memory patients: μTreatment ≥ μpop = 60

HA: The treatment sample does better than the population (fewer errors): μTreatment < μpop = 60






• Step 2: Set your decision criteria

α = 0.05, one-tailed

• Step 3: Collect your data & compute your test statistics

$$z = \frac{M - \mu}{\sigma_M} = \frac{55 - 60}{8 / \sqrt{16}} = \frac{-5}{2} = -2.5$$

• Step 4: Make a decision about your null hypothesis

– The test statistic z = −2.5 falls in the 5% rejection region in the lower tail: Reject H0

– Support for our HA: the evidence suggests that the treatment decreases the number of memory errors
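The four steps of the memory example can be put together in a short sketch. The function name `one_sample_z_test_lower` is hypothetical; it assumes a one-tailed test in the lower tail (HA predicts fewer errors) and a known population σ, as in the example.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z_test_lower(M, mu, sigma, n, alpha=0.05):
    """One-tailed (lower tail) one-sample z-test.

    Returns the test statistic, the critical value, and the decision.
    """
    standard_error = sigma / sqrt(n)            # sigma_M
    z = (M - mu) / standard_error               # Step 3: test statistic
    z_critical = NormalDist().inv_cdf(alpha)    # Step 2: decision criterion
    reject_h0 = z <= z_critical                 # Step 4: decision
    return z, z_critical, reject_h0

# Step 1 (hypotheses): H0: mu_treatment >= 60, HA: mu_treatment < 60
z, z_critical, reject_h0 = one_sample_z_test_lower(M=55, mu=60, sigma=8, n=16)

print(f"z = {z:.2f}, critical z = {z_critical:.3f}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```

Because z = −2.5 lies beyond the critical value of about −1.645, the sketch reaches the same decision as the slide: reject H0.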


