48
Introduction to statistics for hypothesis testing Hari Bharadwaj Martinos Center for Biomedical Imaging, Massachusetts General Hospital January 29, 2015 Hari (MGH) WhyNHow Stats January 29, 2015 1 / 26

Introduction to statistics for hypothesis testing...Detection and Hypothesis testing Rejecting a hypothesis aka detection H 0: The \null" hypothesis i.e., the hypothesis that the data

  • Upload
    others

  • View
    10

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Introduction to statistics for hypothesis testing...Detection and Hypothesis testing Rejecting a hypothesis aka detection H 0: The \null" hypothesis i.e., the hypothesis that the data

Introduction to statistics for hypothesis testing

Hari Bharadwaj

Martinos Center for Biomedical Imaging, Massachusetts General Hospital

January 29, 2015

Hari (MGH) WhyNHow Stats January 29, 2015 1 / 26

Page 2: Introduction to statistics for hypothesis testing...Detection and Hypothesis testing Rejecting a hypothesis aka detection H 0: The \null" hypothesis i.e., the hypothesis that the data

Outline

1 Detection and Hypothesis testing

2 Parametric modeling

3 The Universal Frequentist Recipe

4 Common test statistics for comparing averages

5 Non-parametric approaches

6 Multiple Comparisons and Topological Inference

Hari (MGH) WhyNHow Stats January 29, 2015 2 / 26

Page 3: Introduction to statistics for hypothesis testing...Detection and Hypothesis testing Rejecting a hypothesis aka detection H 0: The \null" hypothesis i.e., the hypothesis that the data

Detection and Hypothesis testing

Rejecting a hypothesis aka detection

H0: The “null” hypothesis i.e., the hypothesis that the data mightallow you to reject

H1: The alternate hypothesis

Example: H0: Average IQ of Group1 subjects = Group2 subjectsH1: Average IQ of Group1 subjects < Group2 subjects

Given data we wish to probabilistically test out the hypotheses

Traditional (most used in neuroscience till date) approach: Does thedata amount to evidence against the null hypothesis? → Our focusfor today

Modern approach: What is the best (among a hypothesized set) ofmodels for the data?

Hari (MGH) WhyNHow Stats January 29, 2015 3 / 26

Page 4: Introduction to statistics for hypothesis testing...Detection and Hypothesis testing Rejecting a hypothesis aka detection H 0: The \null" hypothesis i.e., the hypothesis that the data

Detection and Hypothesis testing

Frequentist and Bayesian approaches

Frequentist - When H0 is true (i.e., the correct model for the data),what is the probability (p value) that we’ll see the data that we havei.e p(data|H0)?

Bayesian - Given the data we have, what is the probability that H0 istrue i.e p(H0|data)? Which is more likely: H0 or H1?

Hari (MGH) WhyNHow Stats January 29, 2015 4 / 26

Page 5: Introduction to statistics for hypothesis testing...Detection and Hypothesis testing Rejecting a hypothesis aka detection H 0: The \null" hypothesis i.e., the hypothesis that the data

Parametric modeling

Normal distribution

Figure: p(χ) = 1σ√

2πe

(χ−µ)2

2σ2 , Normal distributions are good models of most real

life data where clustering around the average happens, example: Adult humanheight

Hari (MGH) WhyNHow Stats January 29, 2015 5 / 26

Page 6: Introduction to statistics for hypothesis testing...Detection and Hypothesis testing Rejecting a hypothesis aka detection H 0: The \null" hypothesis i.e., the hypothesis that the data

Parametric modeling

‘Alien’ example

H1: A is an alienH0: A is a human being

Given: Adult human height is normally distributed with µ = 170cmand σ = 10 cm

A is 195 cm tall (Our data)

Frequentist: Given H0, the height of A is normally distributed

p(χ > µ+ 2σ) < 0.05 ⇒ With p < 0.05, H0 is false. Is A is an alien?

Hari (MGH) WhyNHow Stats January 29, 2015 6 / 26

Page 7: Introduction to statistics for hypothesis testing...Detection and Hypothesis testing Rejecting a hypothesis aka detection H 0: The \null" hypothesis i.e., the hypothesis that the data

Parametric modeling

Frequentist versus Bayesian

Clinical test to screen school children for a certain disease. The test is96% accurate. That is, if the test is administered on a population ofchildren with disease (H1), it tests +ve 96% of the time. Similarly if wetest a population of children with no disease (H0), it tests -ve 96%percent of the time.

Is this a good test?

If a random school child tests positive:

1 What is the conclusion based on the frequentist approach with ap < 0.05 threshold?

2 What is the probability that he/she actually does not have the disease?

Bayes Rule: p(H0|data) ∝ p(data|H0)p(H0)

Hari (MGH) WhyNHow Stats January 29, 2015 7 / 26

Page 8: Introduction to statistics for hypothesis testing...Detection and Hypothesis testing Rejecting a hypothesis aka detection H 0: The \null" hypothesis i.e., the hypothesis that the data

Parametric modeling

Frequentist versus Bayesian

Clinical test to screen school children for a certain disease. The test is96% accurate. That is, if the test is administered on a population ofchildren with disease (H1), it tests +ve 96% of the time. Similarly if wetest a population of children with no disease (H0), it tests -ve 96%percent of the time.

Is this a good test?

If a random school child tests positive:

1 What is the conclusion based on the frequentist approach with ap < 0.05 threshold?

2 What is the probability that he/she actually does not have the disease?

Bayes Rule: p(H0|data) ∝ p(data|H0)p(H0)

Hari (MGH) WhyNHow Stats January 29, 2015 7 / 26

Page 9: Introduction to statistics for hypothesis testing...Detection and Hypothesis testing Rejecting a hypothesis aka detection H 0: The \null" hypothesis i.e., the hypothesis that the data

Parametric modeling

Frequentist versus Bayesian

Clinical test to screen school children for a certain disease. The test is96% accurate. That is, if the test is administered on a population ofchildren with disease (H1), it tests +ve 96% of the time. Similarly if wetest a population of children with no disease (H0), it tests -ve 96%percent of the time.

Is this a good test?

If a random school child tests positive:1 What is the conclusion based on the frequentist approach with a

p < 0.05 threshold?

2 What is the probability that he/she actually does not have the disease?

Bayes Rule: p(H0|data) ∝ p(data|H0)p(H0)

Hari (MGH) WhyNHow Stats January 29, 2015 7 / 26

Page 10: Introduction to statistics for hypothesis testing...Detection and Hypothesis testing Rejecting a hypothesis aka detection H 0: The \null" hypothesis i.e., the hypothesis that the data

Parametric modeling

Frequentist versus Bayesian

Clinical test to screen school children for a certain disease. The test is96% accurate. That is, if the test is administered on a population ofchildren with disease (H1), it tests +ve 96% of the time. Similarly if wetest a population of children with no disease (H0), it tests -ve 96%percent of the time.

Is this a good test?

If a random school child tests positive:1 What is the conclusion based on the frequentist approach with a

p < 0.05 threshold?2 What is the probability that he/she actually does not have the disease?

Bayes Rule: p(H0|data) ∝ p(data|H0)p(H0)

Hari (MGH) WhyNHow Stats January 29, 2015 7 / 26

Page 11: Introduction to statistics for hypothesis testing...Detection and Hypothesis testing Rejecting a hypothesis aka detection H 0: The \null" hypothesis i.e., the hypothesis that the data

The Universal Frequentist Recipe

Where we are...

1 Detection and Hypothesis testing

2 Parametric modeling

3 The Universal Frequentist Recipe

4 Common test statistics for comparing averages

5 Non-parametric approaches

6 Multiple Comparisons and Topological Inference

Hari (MGH) WhyNHow Stats January 29, 2015 8 / 26

Page 12: Introduction to statistics for hypothesis testing...Detection and Hypothesis testing Rejecting a hypothesis aka detection H 0: The \null" hypothesis i.e., the hypothesis that the data

The Universal Frequentist Recipe

Universal Frequentist Recipe

ALL univariate statistical tests entail the following:

1 Construct H0 and H1, could be competing models

2 Calculate a statistic, a scalar (T ), that summarizes the effect you aretrying to capture (example: difference in mean IQs of 2 groups)

3 Model the distribution of T when H0 is true (Here is where usuallymany assumptions come in)

4 If p(T |H0) < 0.05 or any other ad hoc threshold, reject H0 (Thisdoesn’t necessarily mean we have evidence for H1)

Hari (MGH) WhyNHow Stats January 29, 2015 9 / 26

Page 13: Introduction to statistics for hypothesis testing...Detection and Hypothesis testing Rejecting a hypothesis aka detection H 0: The \null" hypothesis i.e., the hypothesis that the data

The Universal Frequentist Recipe

Universal Frequentist Recipe

ALL univariate statistical tests entail the following:

1 Construct H0 and H1, could be competing models

2 Calculate a statistic, a scalar (T ), that summarizes the effect you aretrying to capture (example: difference in mean IQs of 2 groups)

3 Model the distribution of T when H0 is true (Here is where usuallymany assumptions come in)

4 If p(T |H0) < 0.05 or any other ad hoc threshold, reject H0 (Thisdoesn’t necessarily mean we have evidence for H1)

Hari (MGH) WhyNHow Stats January 29, 2015 9 / 26

Page 14: Introduction to statistics for hypothesis testing...Detection and Hypothesis testing Rejecting a hypothesis aka detection H 0: The \null" hypothesis i.e., the hypothesis that the data

The Universal Frequentist Recipe

Universal Frequentist Recipe

ALL univariate statistical tests entail the following:

1 Construct H0 and H1, could be competing models

2 Calculate a statistic, a scalar (T ), that summarizes the effect you aretrying to capture (example: difference in mean IQs of 2 groups)

3 Model the distribution of T when H0 is true (Here is where usuallymany assumptions come in)

4 If p(T |H0) < 0.05 or any other ad hoc threshold, reject H0 (Thisdoesn’t necessarily mean we have evidence for H1)

Hari (MGH) WhyNHow Stats January 29, 2015 9 / 26

Page 15: Introduction to statistics for hypothesis testing...Detection and Hypothesis testing Rejecting a hypothesis aka detection H 0: The \null" hypothesis i.e., the hypothesis that the data

The Universal Frequentist Recipe

Universal Frequentist Recipe

ALL univariate statistical tests entail the following:

1 Construct H0 and H1, could be competing models

2 Calculate a statistic, a scalar (T ), that summarizes the effect you aretrying to capture (example: difference in mean IQs of 2 groups)

3 Model the distribution of T when H0 is true (Here is where usuallymany assumptions come in)

4 If p(T |H0) < 0.05 or any other ad hoc threshold, reject H0 (Thisdoesn’t necessarily mean we have evidence for H1)

Hari (MGH) WhyNHow Stats January 29, 2015 9 / 26

Page 16: Introduction to statistics for hypothesis testing...Detection and Hypothesis testing Rejecting a hypothesis aka detection H 0: The \null" hypothesis i.e., the hypothesis that the data

Common test statistics for comparing averages

Where we are...

1 Detection and Hypothesis testing

2 Parametric modeling

3 The Universal Frequentist Recipe

4 Common test statistics for comparing averages

5 Non-parametric approaches

6 Multiple Comparisons and Topological Inference

Hari (MGH) WhyNHow Stats January 29, 2015 10 / 26

Page 17: Introduction to statistics for hypothesis testing...Detection and Hypothesis testing Rejecting a hypothesis aka detection H 0: The \null" hypothesis i.e., the hypothesis that the data

Common test statistics for comparing averages

One sample t-test

Testing for the average of a normal population to have a certainmean µ0

Example: sample of 10 subjectsH1: The average IQ of TDs is different from 100H0: The average IQ of TDs is 100

IQs = 87, 110, 93, 99, 75, 102, 90, 83, 100, 70

x̄ =x1 + x2 + · · ·+ xk

k(1)

S =1

k − 1

k∑1

(xi − x̄)2 (2)

t =x̄ − 100√

S/k(3)

t = −2.3, p = 0.047 ⇒ H0 is rejected

Hari (MGH) WhyNHow Stats January 29, 2015 11 / 26

Page 18: Introduction to statistics for hypothesis testing...Detection and Hypothesis testing Rejecting a hypothesis aka detection H 0: The \null" hypothesis i.e., the hypothesis that the data

Common test statistics for comparing averages

Two (independent) sample t-test

Testing for the means of 2 independent populations to be equal

Example: sample of 10 subjects in each group (need not be samenumber)H1: The average IQ of TDs is different from ASDsH0: The average IQ of TDs is same as ASDs

TDs = 87, 110, 93, 99, 75, 102, 90, 83, 100, 70ASDs = 77, 81, 64,100, 84, 72, 69, 90, 68, 70

x̄td , x̄asd =x1 + x2 + · · ·+ xk

k(4)

Std , Sasd =1

k − 1

k∑1

(xi − x̄)2 (5)

t =x̄td − x̄asd√

Sasd/kasd + Std/ktd(6)

H1 can be one-sided: IQ of TDs > IQ of ASD

Hari (MGH) WhyNHow Stats January 29, 2015 12 / 26

Page 19: Introduction to statistics for hypothesis testing...Detection and Hypothesis testing Rejecting a hypothesis aka detection H 0: The \null" hypothesis i.e., the hypothesis that the data

Common test statistics for comparing averages

Paired t-test

Testing for changes between conditions in the same block (subject)Example: sample of 10 TDs at ages 5 and 15H1: IQ increases when you grow to 15 from 5H0: IQ does not change between ages 5 and 1515 years = 87, 110, 93, 99, 75, 102, 90, 83, 100, 705 years = 68, 90, 63, 80, 70, 70, 88, 83, 90, 60

x̄ =(x15y

1 − x5y1 ) + (x15y

2 − x5y2 ) + · · ·+ (x15y

k − x5yk )

k(7)

S =1

k − 1

k∑1

(x15yi − x5y

i − x̄)2 (8)

t =x̄√S

(9)

More ‘sensitive’ than an unpaired (H0: IQs of 5 and 15 year olds isthe same on an average), Unpaired with this data ⇒ block effects!

Hari (MGH) WhyNHow Stats January 29, 2015 13 / 26

Page 20: Introduction to statistics for hypothesis testing...Detection and Hypothesis testing Rejecting a hypothesis aka detection H 0: The \null" hypothesis i.e., the hypothesis that the data

Common test statistics for comparing averages

t-test summary

One sample t-test: IID normal data, test for mean being a certainvalue (variance unknown)

2 sample t-test: IID normal data, test for means being equal (givenvariances are equal but unknown)

Paired t-test: IID normal differences, test for paireddifferences/changes being 0 or not 0.

We know the p-value under the null hypothesis ⇒ We don’t know themiss rate, we only guarantee a false-alarm rate

To know the sensitivity, (i.e) the probability that we detect an ‘effect’when there is an effect, we need to analyze distributions of dataunder H1

Hari (MGH) WhyNHow Stats January 29, 2015 14 / 26

Page 21: Introduction to statistics for hypothesis testing...Detection and Hypothesis testing Rejecting a hypothesis aka detection H 0: The \null" hypothesis i.e., the hypothesis that the data

Common test statistics for comparing averages

t-test summary

One sample t-test: IID normal data, test for mean being a certainvalue (variance unknown)

2 sample t-test: IID normal data, test for means being equal (givenvariances are equal but unknown)

Paired t-test: IID normal differences, test for paireddifferences/changes being 0 or not 0.

We know the p-value under the null hypothesis ⇒ We don’t know themiss rate, we only guarantee a false-alarm rate

To know the sensitivity, (i.e) the probability that we detect an ‘effect’when there is an effect, we need to analyze distributions of dataunder H1

Hari (MGH) WhyNHow Stats January 29, 2015 14 / 26

Page 22: Introduction to statistics for hypothesis testing...Detection and Hypothesis testing Rejecting a hypothesis aka detection H 0: The \null" hypothesis i.e., the hypothesis that the data

Common test statistics for comparing averages

t-test summary

One sample t-test: IID normal data, test for mean being a certainvalue (variance unknown)

2 sample t-test: IID normal data, test for means being equal (givenvariances are equal but unknown)

Paired t-test: IID normal differences, test for paireddifferences/changes being 0 or not 0.

We know the p-value under the null hypothesis ⇒ We don’t know themiss rate, we only guarantee a false-alarm rate

To know the sensitivity, (i.e) the probability that we detect an ‘effect’when there is an effect, we need to analyze distributions of dataunder H1

Hari (MGH) WhyNHow Stats January 29, 2015 14 / 26

Page 23: Introduction to statistics for hypothesis testing...Detection and Hypothesis testing Rejecting a hypothesis aka detection H 0: The \null" hypothesis i.e., the hypothesis that the data

Common test statistics for comparing averages

t-test summary

One sample t-test: IID normal data, test for mean being a certainvalue (variance unknown)

2 sample t-test: IID normal data, test for means being equal (givenvariances are equal but unknown)

Paired t-test: IID normal differences, test for paireddifferences/changes being 0 or not 0.

We know the p-value under the null hypothesis ⇒ We don’t know themiss rate, we only guarantee a false-alarm rate

To know the sensitivity, (i.e) the probability that we detect an ‘effect’when there is an effect, we need to analyze distributions of dataunder H1

Hari (MGH) WhyNHow Stats January 29, 2015 14 / 26

Page 24: Introduction to statistics for hypothesis testing...Detection and Hypothesis testing Rejecting a hypothesis aka detection H 0: The \null" hypothesis i.e., the hypothesis that the data

Common test statistics for comparing averages

t-test summary

One sample t-test: IID normal data, test for mean being a certainvalue (variance unknown)

2 sample t-test: IID normal data, test for means being equal (givenvariances are equal but unknown)

Paired t-test: IID normal differences, test for paireddifferences/changes being 0 or not 0.

We know the p-value under the null hypothesis ⇒ We don’t know themiss rate, we only guarantee a false-alarm rate

To know the sensitivity, (i.e) the probability that we detect an ‘effect’when there is an effect, we need to analyze distributions of dataunder H1

Hari (MGH) WhyNHow Stats January 29, 2015 14 / 26

Page 25: Introduction to statistics for hypothesis testing...Detection and Hypothesis testing Rejecting a hypothesis aka detection H 0: The \null" hypothesis i.e., the hypothesis that the data

Non-parametric approaches

Where we are...

1 Detection and Hypothesis testing

2 Parametric modeling

3 The Universal Frequentist Recipe

4 Common test statistics for comparing averages

5 Non-parametric approaches

6 Multiple Comparisons and Topological Inference

Hari (MGH) WhyNHow Stats January 29, 2015 15 / 26

Page 26: Introduction to statistics for hypothesis testing...Detection and Hypothesis testing Rejecting a hypothesis aka detection H 0: The \null" hypothesis i.e., the hypothesis that the data

Non-parametric approaches

Universal Frequentist Recipe

ALL univariate statistical tests entail the following:

1 Construct H0 and H1, could be competing models

2 Calculate a statistic, a scalar (T ), that summarizes the effect you aretrying to capture (example: difference in mean IQs of 2 groups)

3 Determine the distribution of T when H0 is true (Here is whereusually many assumptions come in)

4 If p(T |H0) < 0.05 or any other ad hoc threshold, reject H0 (Thisdoesn’t necessarily mean we have evidence for H1)

Hari (MGH) WhyNHow Stats January 29, 2015 16 / 26

Page 27: Introduction to statistics for hypothesis testing...Detection and Hypothesis testing Rejecting a hypothesis aka detection H 0: The \null" hypothesis i.e., the hypothesis that the data

Non-parametric approaches

Universal Frequentist Recipe

ALL univariate statistical tests entail the following:

1 Construct H0 and H1, could be competing models

2 Calculate a statistic, a scalar (T ), that summarizes the effect you aretrying to capture (example: difference in mean IQs of 2 groups)

3 Determine the distribution of T when H0 is true (Here is whereusually many assumptions come in)

4 If p(T |H0) < 0.05 or any other ad hoc threshold, reject H0 (Thisdoesn’t necessarily mean we have evidence for H1)

Hari (MGH) WhyNHow Stats January 29, 2015 16 / 26

Page 28: Introduction to statistics for hypothesis testing...Detection and Hypothesis testing Rejecting a hypothesis aka detection H 0: The \null" hypothesis i.e., the hypothesis that the data

Non-parametric approaches

Universal Frequentist Recipe

ALL univariate statistical tests entail the following:

1 Construct H0 and H1, could be competing models

2 Calculate a statistic, a scalar (T ), that summarizes the effect you aretrying to capture (example: difference in mean IQs of 2 groups)

3 Determine the distribution of T when H0 is true (Here is whereusually many assumptions come in)

4 If p(T |H0) < 0.05 or any other ad hoc threshold, reject H0 (Thisdoesn’t necessarily mean we have evidence for H1)

Hari (MGH) WhyNHow Stats January 29, 2015 16 / 26

Page 29: Introduction to statistics for hypothesis testing...Detection and Hypothesis testing Rejecting a hypothesis aka detection H 0: The \null" hypothesis i.e., the hypothesis that the data

Non-parametric approaches

Universal Frequentist Recipe

ALL univariate statistical tests entail the following:

1 Construct H0 and H1, could be competing models

2 Calculate a statistic, a scalar (T ), that summarizes the effect you aretrying to capture (example: difference in mean IQs of 2 groups)

3 Determine the distribution of T when H0 is true (Here is whereusually many assumptions come in)

4 If p(T |H0) < 0.05 or any other ad hoc threshold, reject H0 (Thisdoesn’t necessarily mean we have evidence for H1)

Hari (MGH) WhyNHow Stats January 29, 2015 16 / 26

Page 30: Introduction to statistics for hypothesis testing...Detection and Hypothesis testing Rejecting a hypothesis aka detection H 0: The \null" hypothesis i.e., the hypothesis that the data

Non-parametric approaches

Permutation tests: Professors versus politicians

H0: Both groups have the same IQ on average

Hari (MGH) WhyNHow Stats January 29, 2015 17 / 26

Page 31: Introduction to statistics for hypothesis testing...Detection and Hypothesis testing Rejecting a hypothesis aka detection H 0: The \null" hypothesis i.e., the hypothesis that the data

Non-parametric approaches

Permutation tests: Professors versus politicians

H0: Both groups have the same IQ on average

Hari (MGH) WhyNHow Stats January 29, 2015 17 / 26

Page 32: Introduction to statistics for hypothesis testing...Detection and Hypothesis testing Rejecting a hypothesis aka detection H 0: The \null" hypothesis i.e., the hypothesis that the data

Non-parametric approaches

Permutation tests: Professors versus politicians

IQ of 2 groups of 10 subjects each: Use our recipe

Let T = mean IQ of group 1 - mean IQ of group 2

Under H0, we want to know what the distribution of T is

Non-parametric approach: Under H0, group does not have any effecton data ⇒ We can assign group to subjects randomly

Thus we can get many groupings, here we can have up to(2010

)> 180, 000 permutations where the subjects from the 2 groups

are mixed

For each of these permutations we can get a Tperm ⇒ We have adistribution for T under H0

Is this generalizable to the population or applicable only to the cohort?

Hari (MGH) WhyNHow Stats January 29, 2015 18 / 26

Page 33: Introduction to statistics for hypothesis testing...Detection and Hypothesis testing Rejecting a hypothesis aka detection H 0: The \null" hypothesis i.e., the hypothesis that the data

Non-parametric approaches

Permutation tests: Professors versus politicians

IQ of 2 groups of 10 subjects each: Use our recipe

Let T = mean IQ of group 1 - mean IQ of group 2

Under H0, we want to know what the distribution of T is

Non-parametric approach: Under H0, group does not have any effecton data ⇒ We can assign group to subjects randomly

Thus we can get many groupings, here we can have up to(2010

)> 180, 000 permutations where the subjects from the 2 groups

are mixed

For each of these permutations we can get a Tperm ⇒ We have adistribution for T under H0

Is this generalizable to the population or applicable only to the cohort?

Hari (MGH) WhyNHow Stats January 29, 2015 18 / 26

Page 34: Introduction to statistics for hypothesis testing...Detection and Hypothesis testing Rejecting a hypothesis aka detection H 0: The \null" hypothesis i.e., the hypothesis that the data

Non-parametric approaches

Permutation tests: Professors versus politicians

IQ of 2 groups of 10 subjects each: Use our recipe

Let T = mean IQ of group 1 - mean IQ of group 2

Under H0, we want to know what the distribution of T is

Non-parametric approach: Under H0, group does not have any effecton data ⇒ We can assign group to subjects randomly

Thus we can get many groupings, here we can have up to(2010

)> 180, 000 permutations where the subjects from the 2 groups

are mixed

For each of these permutations we can get a Tperm ⇒ We have adistribution for T under H0

Is this generalizable to the population or applicable only to the cohort?

Hari (MGH) WhyNHow Stats January 29, 2015 18 / 26

Page 35: Introduction to statistics for hypothesis testing...Detection and Hypothesis testing Rejecting a hypothesis aka detection H 0: The \null" hypothesis i.e., the hypothesis that the data

Non-parametric approaches

Permutation tests: Professors versus politicians

IQ of 2 groups of 10 subjects each: Use our recipe

Let T = mean IQ of group 1 - mean IQ of group 2

Under H0, we want to know what the distribution of T is

Non-parametric approach: Under H0, group does not have any effecton data ⇒ We can assign group to subjects randomly

Thus we can get many groupings, here we can have up to(2010

)> 180, 000 permutations where the subjects from the 2 groups

are mixed

For each of these permutations we can get a Tperm ⇒ We have adistribution for T under H0

Is this generalizable to the population or applicable only to the cohort?

Hari (MGH) WhyNHow Stats January 29, 2015 18 / 26

Page 36: Introduction to statistics for hypothesis testing...Detection and Hypothesis testing Rejecting a hypothesis aka detection H 0: The \null" hypothesis i.e., the hypothesis that the data

Non-parametric approaches

Permutation tests: Professors versus politicians

IQ of 2 groups of 10 subjects each: Use our recipe

Let T = mean IQ of group 1 - mean IQ of group 2

Under H0, we want to know what the distribution of T is

Non-parametric approach: Under H0, group does not have any effecton data ⇒ We can assign group to subjects randomly

Thus we can get many groupings, here we can have up to(2010

)> 180, 000 permutations where the subjects from the 2 groups

are mixed

For each of these permutations we can get a Tperm ⇒ We have adistribution for T under H0

Is this generalizable to the population or applicable only to the cohort?

Hari (MGH) WhyNHow Stats January 29, 2015 18 / 26

Page 37: Introduction to statistics for hypothesis testing...Detection and Hypothesis testing Rejecting a hypothesis aka detection H 0: The \null" hypothesis i.e., the hypothesis that the data

Non-parametric approaches

Permutation tests: Professors versus politicians

IQ of 2 groups of 10 subjects each: Use our recipe

Let T = mean IQ of group 1 - mean IQ of group 2

Under H0, we want to know what the distribution of T is

Non-parametric approach: Under H0, group does not have any effecton data ⇒ We can assign group to subjects randomly

Thus we can get many groupings, here we can have up to(2010

)> 180, 000 permutations where the subjects from the 2 groups

are mixed

For each of these permutations we can get a Tperm ⇒ We have adistribution for T under H0

Is this generalizable to the population or applicable only to the cohort?

Hari (MGH) WhyNHow Stats January 29, 2015 18 / 26

Page 38: Introduction to statistics for hypothesis testing...Detection and Hypothesis testing Rejecting a hypothesis aka detection H 0: The \null" hypothesis i.e., the hypothesis that the data

Non-parametric approaches

Permutation tests: Example 2

10 subjects who are politicians or have an IQ score less than 80 or bothH1: Politicians are more likely to have IQ < 80H0: They are unrelated attributes

The data is not normally distributed, its categorical

Let x1 = (0 1 1 0 0 0 1 1 1 1) and x2 = (1 0 1 0 0 0 1 1 1 1) denotepolitician or not and IQ less than 80 or not respectively for the 10subjects

d = norm(x1 − x2) is a good measure of the conjunction between the2 attributes, d =

√2 here

Permute x1 or x2 values randomly and get a distribution for d andfind p(d ≤

√2)

Hari (MGH) WhyNHow Stats January 29, 2015 19 / 26

Page 39: Introduction to statistics for hypothesis testing...Detection and Hypothesis testing Rejecting a hypothesis aka detection H 0: The \null" hypothesis i.e., the hypothesis that the data

Non-parametric approaches

Permutation tests: Example 2

10 subjects who are politicians or have an IQ score less than 80 or bothH1: Politicians are more likely to have IQ < 80H0: They are unrelated attributes

The data is not normally distributed, its categorical

Let x1 = (0 1 1 0 0 0 1 1 1 1) and x2 = (1 0 1 0 0 0 1 1 1 1) denotepolitician or not and IQ less than 80 or not respectively for the 10subjects

d = norm(x1 − x2) is a good measure of the conjunction between the2 attributes, d =

√2 here

Permute x1 or x2 values randomly and get a distribution for d andfind p(d ≤

√2)

Hari (MGH) WhyNHow Stats January 29, 2015 19 / 26

Page 40: Introduction to statistics for hypothesis testing...Detection and Hypothesis testing Rejecting a hypothesis aka detection H 0: The \null" hypothesis i.e., the hypothesis that the data

Non-parametric approaches

Permutation tests: Example 2

10 subjects who are politicians or have an IQ score less than 80 or bothH1: Politicians are more likely to have IQ < 80H0: They are unrelated attributes

The data is not normally distributed, its categorical

Let x1 = (0 1 1 0 0 0 1 1 1 1) and x2 = (1 0 1 0 0 0 1 1 1 1) denotepolitician or not and IQ less than 80 or not respectively for the 10subjects

d = norm(x1 − x2) is a good measure of the conjunction between the2 attributes, d =

√2 here

Permute x1 or x2 values randomly and get a distribution for d andfind p(d ≤

√2)

Hari (MGH) WhyNHow Stats January 29, 2015 19 / 26

Page 41: Introduction to statistics for hypothesis testing...Detection and Hypothesis testing Rejecting a hypothesis aka detection H 0: The \null" hypothesis i.e., the hypothesis that the data

Non-parametric approaches

Permutation tests: Example 2

10 subjects who are politicians or have an IQ score less than 80 or bothH1: Politicians are more likely to have IQ < 80H0: They are unrelated attributes

The data is not normally distributed, its categorical

Let x1 = (0 1 1 0 0 0 1 1 1 1) and x2 = (1 0 1 0 0 0 1 1 1 1) denotepolitician or not and IQ less than 80 or not respectively for the 10subjects

d = norm(x1 − x2) is a good measure of the conjunction between the2 attributes, d =

√2 here

Permute x1 or x2 values randomly and get a distribution for d andfind p(d ≤

√2)

Hari (MGH) WhyNHow Stats January 29, 2015 19 / 26

Page 42: Introduction to statistics for hypothesis testing...Detection and Hypothesis testing Rejecting a hypothesis aka detection H 0: The \null" hypothesis i.e., the hypothesis that the data

Multiple Comparisons and Topological Inference

Where we are...

1 Detection and Hypothesis testing

2 Parametric modeling

3 The Universal Frequentist Recipe

4 Common test statistics for comparing averages

5 Non-parametric approaches

6 Multiple Comparisons and Topological Inference

Hari (MGH) WhyNHow Stats January 29, 2015 20 / 26

Page 43: Introduction to statistics for hypothesis testing...Detection and Hypothesis testing Rejecting a hypothesis aka detection H 0: The \null" hypothesis i.e., the hypothesis that the data

Multiple Comparisons and Topological Inference

MCP

H1: Coin is biasedH0: Coin is unbiased

Test: Toss coin 10 times, if Head or Tail shows up 9 or more times,reject H0 (p(9 or more heads/H0) ≈ 0.02)

This means if we repeat the test 100 times we’ll get 9 or more headsonly 2 times on an average

What if we test 100 coins (each of them once) with this test?

We will get on average 2 coins that actually show 9 or more heads...Are they biased coind?

Hari (MGH) WhyNHow Stats January 29, 2015 21 / 26

Page 44: Introduction to statistics for hypothesis testing...Detection and Hypothesis testing Rejecting a hypothesis aka detection H 0: The \null" hypothesis i.e., the hypothesis that the data

Multiple Comparisons and Topological Inference

Why should we worry about the MCP

Figure: Of the order of 10,000 sources ⇒ Large number of correlated tests

Hari (MGH) WhyNHow Stats January 29, 2015 22 / 26

Page 45: Introduction to statistics for hypothesis testing...Detection and Hypothesis testing Rejecting a hypothesis aka detection H 0: The \null" hypothesis i.e., the hypothesis that the data

Multiple Comparisons and Topological Inference

Why should we worry about the MCP

Figure: 1000 time bins × 50 frequency bins ⇒ Large number of correlated tests

Hari (MGH) WhyNHow Stats January 29, 2015 23 / 26

Page 46: Introduction to statistics for hypothesis testing...Detection and Hypothesis testing Rejecting a hypothesis aka detection H 0: The \null" hypothesis i.e., the hypothesis that the data

Multiple Comparisons and Topological Inference

Multiple testing corrections

For discrete tests (example: multiple end point drug trials):Non-parametric family-wise testing or False Discovery Rate (FDR)approaches

For data sets with an inherent topology (example: Time courses,whole brain signals, time-frequency maps): Random field theory orNon-parametric topological inference tests

Hari (MGH) WhyNHow Stats January 29, 2015 24 / 26

Page 47: Introduction to statistics for hypothesis testing...Detection and Hypothesis testing Rejecting a hypothesis aka detection H 0: The \null" hypothesis i.e., the hypothesis that the data

Multiple Comparisons and Topological Inference

Topological Inference

We have a topological map of statistics (mean power, t-values,F-values, TF coherence etc.)

Unlikely excursions under H0 of this map should be identified asevidence for H1

Hari (MGH) WhyNHow Stats January 29, 2015 25 / 26

Page 48: Introduction to statistics for hypothesis testing...Detection and Hypothesis testing Rejecting a hypothesis aka detection H 0: The \null" hypothesis i.e., the hypothesis that the data

Multiple Comparisons and Topological Inference

General References

1 Venables, W. N., & Ripley, B. D. (2002). Modern applied statistics with S. Springer.

2 Pinheiro, J. C., & Bates, D. M. (2000). Mixed-effects models in S and S-PLUS. Springer.

3 Box, G. E., & Tiao, G. C. (2011). Bayesian inference in statistical analysis (Vol. 40). John Wiley & Sons.

4 K.J. Friston, A.P. Holmes, K.J. Worsley, J.-P. Poline, C.D. Frith, R.S.J. Frackowiak (1995): Statistical Parametric Mapsin Functional Imaging: A General Linear Approach. Human Brain Mapping 2:189-210.

5 Y. Benjamini & Y. Hochberg (1995): Controlling the false discovery rate: a practical and powerful approach to multipletesting. J Roy Statist Soc Ser B. 57: 289 - 300.

6 Genovese, C.R., Lazar, N.A. & Nichols, T.E. (2002): Thresholding of statistical maps in functional neuroimaging usingthe false discovery rate. NeuroImage, 15:772-786.

7 Nichols T.E. & Holmes A.P. (2001): Nonparametric permutation tests for functional neuroimaging, a primer withexamples. Human Brain Mapping 15:125.

8 Worsley, K.J., Marrett, S., Neelin, P., Vandal, A.C., Friston, K.J. & Evans, A.C. (1996a): A unified statistical approachfor determining significant signals in images of cerebral activation. Human Brain Mapping, 4:58-73.

Hari (MGH) WhyNHow Stats January 29, 2015 26 / 26