
CH4. Multiple Comparison Procedures

Michigan Technological University, Department of Mathematical Sciences

Shurong Fang


• In population-based association studies, we generally aim to test for the presence of associations between the trait and each of multiple genotypes across several SNPs and gene loci.

• However, testing multiple hypotheses can result in an inflation of the error rate.


• Measures of error (Section 4.1)

• Methods for adjusting for multiple testing (Section 4.2)

• Resampling-based approaches (Section 4.3)


4.1 Measures of error

• Much of the literature on methods for adjusting for multiple comparisons describes controlling one of two error rates:

• the family-wise error rate (FWER) and

• the false discovery rate (FDR).


Family-wise error rate

• FWER is the probability of making at least one type-I error.

• FWER under the complete null (FWEC) is the probability that at least one type-I error occurs given that all nulls are true.

• FWER under a partial null (FWEP) is the probability that at least one type-I error occurs given that a specific subset of the nulls is true.


• Weak control of FWER implies FWEC ≤ α.

• Strong control of FWER implies FWEP ≤ α for all subsets of null hypotheses.

[Eg] Consider m = 2 null hypotheses: H0¹, H0². There are a total of four configurations of true nulls (partial nulls): both true, only H0¹ true, only H0² true, and neither true.

• Strong control of the FWER means that FWEP ≤ α for all four of these configurations of true nulls.


False discovery rate

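• FDR is the expected proportion of null hypotheses that are true among those that are declared significant: FDR = E[V/R | R > 0] Pr(R > 0), where V is the number of true nulls that are rejected and R is the total number of rejections.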


Relationship between FDR and FWER

• Assume all nulls are true, so that V = R (every rejection is a type-I error).

• If all null hypotheses are true, then the FDR is equal to the FWER.


• If not all null hypotheses are true, then V ≤ R, with V < R whenever a false null is rejected.

• In general, FDR ≤ FWER.

• This implies that any approach that controls the FWER will also control the FDR.
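The inequality can be made explicit. As a sketch, write V for the number of true nulls that are rejected and R for the total number of rejections, and assume the usual convention that V/R is taken to be 0 when R = 0:

```latex
\[
\mathrm{FDR}
  = E\!\left[\frac{V}{R}\,\mathbf{1}\{R>0\}\right]
  \le E\!\left[\mathbf{1}\{V>0\}\right]
  = \Pr(V>0)
  = \mathrm{FWER}.
\]
```

The middle step holds because V/R is at most 1 when V > 0 and is exactly 0 when V = 0. When all nulls are true, V = R, so V/R = 1 whenever R > 0 and the inequality becomes an equality.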


4.2 Single-step and step-down adjustments

(1) Single-step adjustment

• A single criterion is used to assess the significance of all test statistics or corresponding p-values.

(2) Step-down adjustment

• It involves ordering test statistics or p-values and then using a potentially different criterion for each of the ordered values.


Bonferroni adjustment (Single-step)

• Consider m hypotheses $H_0^1, \ldots, H_0^m$, where each test is controlled at a level α:

$\Pr(\text{Type-I error}_i) = \Pr(\text{reject } H_0^i \mid H_0^i \text{ is true}) \le \alpha$

• Assume that the m tests are independent.


• Consider a level of 5%.

• If m=1, we can control FWEC at a level of 5%.

• If m=10, we can only control FWEC at a level of 1 − (1 − 0.05)^10 = 40.1%. This means that although we are controlling each of the ten individual tests at a level of 5%, the overall error may be as great as 40.1%.


• Bonferroni adjustment for multiple comparisons is a single-step procedure that involves using $\alpha' = \alpha/m$ as the significance level for each test.

• If m=10 and we want to control the FWEC at an overall level of 5%, then let $\alpha' = 0.05/10 = 0.005$.

• Now, $\text{FWEC} \le 1 - (1 - 0.005)^{10} = 0.049$.

• If we control each of the m tests at level $\alpha' = \alpha/m$, then our overall FWEC will be controlled at a level of $\alpha$.
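The Bonferroni arithmetic can be checked in a few lines of Python. This is a minimal sketch that assumes independent tests, as the slides do; the function name `fwec` is introduced here only for illustration.

```python
def fwec(alpha, m):
    """FWER under the complete null for m independent tests, each run at level alpha."""
    return 1 - (1 - alpha) ** m

alpha, m = 0.05, 10

unadjusted = fwec(alpha, m)        # each test at the nominal 5% level
bonferroni = fwec(alpha / m, m)    # each test at alpha' = alpha / m

print(f"unadjusted FWEC: {unadjusted:.3f}")   # 0.401
print(f"Bonferroni FWEC: {bonferroni:.3f}")   # 0.049
```

The adjusted value sits just below the target α = 0.05, which is why the per-test level α/m is sufficient under independence.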


Bonferroni adjustment is quite conservative:

(1) its calibration assumes all tests are independent, which is generally not the case (with correlated tests the α/m threshold is stricter than necessary), and

(2) it is based on FWER control.


Tukey and Scheffe tests (single-step)

• Tukey's studentized range test is useful for the comparison of means between groups (Tukey, 1977).

• Suppose an ANOVA model with global null hypothesis H0: α1 = α2 = … = αm.

• If the F-test of this hypothesis is significant, indicating a departure from the null, then interest may lie in the pairwise hypotheses H0: αi = αj.


• Tukey's honestly significant difference (HSD) test is based on the studentized range distribution.


HSD and t-test

• The HSD statistic is very similar to the statistic we construct in performing a two-sample t-test for the comparison of two independent means.

• Specifically, ts = sqrt(2)*t, where ts is the HSD statistic and t is a two-sample t-statistic based on equal sample sizes.
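The factor sqrt(2) can be seen by writing both statistics side by side. The notation below is an assumption made for illustration: $\bar{y}_i$ and $\bar{y}_j$ are the two group means, n is the common group size, and MSE is the pooled error variance estimate, taken to be the same in both statistics.

```latex
\[
t_s = \frac{\bar{y}_i - \bar{y}_j}{\sqrt{\mathrm{MSE}/n}},
\qquad
t = \frac{\bar{y}_i - \bar{y}_j}{\sqrt{2\,\mathrm{MSE}/n}}
\quad\Longrightarrow\quad
t_s = \sqrt{2}\,t .
\]
```

The two statistics differ only in the standard error in the denominator, so the adjustment for multiplicity comes entirely from the reference distribution (studentized range rather than t).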


The adjustment for the number of tests

• As the number of groups m, which is the first df parameter of the studentized range distribution, increases, the critical value $q_{m;\,mn-m}$ also increases, and it becomes harder to reject the null hypothesis.


Unbalanced data

• Suppose our trait is the age of onset of breast cancer and we have a single biallelic candidate SNP taking on the values A1A1, A1A2 and A2A2.

• Let μ1, μ2 and μ3 be, respectively, the population-level mean age of onset of breast cancer for each genotype.

• H0: μ1=μ2; H0: μ1=μ3; H0: μ2=μ3

• In this setting, the sample sizes could be unequal across genotypes, and thus an extension of Tukey's method is required.


Tukey-Kramer method

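• The Tukey-Kramer method involves simply replacing n with the harmonic mean of the two sample sizes, given by $\tilde{n}_{ij} = 2 / (1/n_i + 1/n_j)$, where $n_i$ and $n_j$ are the sizes of the two groups being compared.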
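As a small illustration of the replacement sample size, the sketch below computes the harmonic mean of two group sizes. The function name `tukey_kramer_n` and the genotype counts (20 and 80) are hypothetical, chosen only to show how an unbalanced pair is pulled toward the smaller group.

```python
from statistics import harmonic_mean

def tukey_kramer_n(n_i, n_j):
    """Harmonic mean of two group sizes, used in place of the common n."""
    return 2 / (1 / n_i + 1 / n_j)

# Hypothetical unbalanced genotype groups, e.g. 20 subjects vs 80 subjects.
n_eff = tukey_kramer_n(20, 80)

# Cross-check against the standard-library implementation.
n_check = harmonic_mean([20, 80])

print(n_eff, n_check)
```

The result is about 32, well below the arithmetic mean of 50, so the comparison is treated as if both groups were fairly small.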


Tukey’s single-step method using R

• ptukey(q, nmeans, df, lower.tail = TRUE, …)

• qtukey(p, nmeans, df, lower.tail = TRUE, …)

• TukeyHSD(aov, "factor", conf.level = 0.95, …): create a set of confidence intervals on the differences between the means of the levels of a factor with the specified family-wise probability of coverage.

• TukeyHSD( ) and plot( ) will not work with a MANOVA (multivariate analysis of variance) fit.


Scheffe's method

• Tukey's method provides an adjustment for testing for differences between all pairs of means.

• Scheffe's method involves testing a larger set of hypotheses that includes all contrasts of the factor-level means.


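• A contrast in the one-way ANOVA setting is defined as a linear combination of the means, $\lambda'\mu = \sum_{i=1}^{m} \lambda_i \mu_i$, such that $\sum_{i=1}^{m} \lambda_i = 0$.

• For example, H0: μi = μj is equivalent to H0: μi − μj = 0, a contrast with λi = 1, λj = −1 and all other coefficients equal to zero.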


• We let L be the set of all linear contrasts of the factor means. Scheffe's method is an adjustment approach that controls the FWER when we are interested in testing whether each element of L is equal to zero.

[Eg]


F-test

• In order to construct an F-test statistic for testing a single contrast H0: λ’μ=0, we begin by defining the vector ρ as


Scheffe's method

• In this case, the numerator df are set equal to m-1, whereas in the usual setting for testing a single contrast, we set this equal to 1.


• Consider m = 5 groups, each of size n = 20.

• H0: μ1=μ2

• w/o making a multiple comparison adjustment: F = 4.5; the critical value is $F_{1;\,40-1} = 4.08$, so reject H0.

• w/ adjustment: Fs = F/(m−1) = 4.5/4 = 1.125; the critical value is $F_{4;\,100-5} = 2.47$, so fail to reject H0.
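The example can be reproduced with simple arithmetic. The sketch below takes the two critical values directly from the slide rather than recomputing them, and assumes that the Scheffe statistic is the single-contrast F divided by the m − 1 numerator df, as described above.

```python
m, n = 5, 20             # groups and per-group sample size from the example
F = 4.5                  # observed F statistic for the contrast mu1 - mu2

# 5% critical values quoted on the slide (not recomputed here).
crit_single = 4.08       # F with 1 numerator df
crit_scheffe = 2.47      # F with m - 1 = 4 numerator df and mn - m = 95 denominator df

F_s = F / (m - 1)        # Scheffe statistic: spread F over m - 1 numerator df

reject_unadjusted = F > crit_single
reject_scheffe = F_s > crit_scheffe

print(F_s, reject_unadjusted, reject_scheffe)   # 1.125 True False
```

The same observed difference is significant as a single pre-planned contrast but not once the full family of contrasts is protected.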


False discovery rate control (FDR)

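• To control FDR, we begin w/ the Benjamini and Hochberg (B-H) adjustment, a stepwise adjustment based on the ordered p-values (strictly a step-up procedure, although it is grouped with the step-down adjustments here).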

• Consider m independent null hypotheses given by $H_0^1, \ldots, H_0^m$, with resulting p-values $p_1, \ldots, p_m$.

• We want to control the false discovery rate at a level q.


Example

• Test for an association between each of 10 SNPs and the presence of disease.

• Assume that each SNP is in a separate gene and our tests are independent.


• For each SNP i = 1, …, 10, we construct a 2 by 3 contingency table and calculate a chi-squared statistic corresponding to H0: ORi = 1.

• The ordered p-values are given by
0.001 0.012 0.014 0.122 0.245
0.320 0.550 0.776 0.840 0.995

• The Bonferroni adjustment would lead us to use the adjusted significance level of 0.05/10 = 0.005. Then, we would reject only the 1st H0.

• Using the B-H method, we would compare the ith ordered p-value to αi = 0.05(i/10). The largest i for which the ordered p-value falls at or below αi is i = 3 (0.014 ≤ 0.015), so we would reject the first three H0.
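The comparison of Bonferroni and B-H on these ten p-values can be sketched as follows. The code assumes the standard B-H rule (reject the k smallest p-values, where k is the largest i with p(i) ≤ q·i/m); the function name `bh_reject` is hypothetical.

```python
def bh_reject(pvalues, q):
    """Number of ordered hypotheses rejected by the Benjamini-Hochberg rule."""
    m = len(pvalues)
    k = 0
    for i, p in enumerate(sorted(pvalues), start=1):
        if p <= q * i / m:
            k = i            # keep the largest i that meets its own threshold
    return k

pvals = [0.001, 0.012, 0.014, 0.122, 0.245,
         0.320, 0.550, 0.776, 0.840, 0.995]

k_bh = bh_reject(pvals, q=0.05)
k_bonf = sum(p <= 0.05 / len(pvals) for p in pvals)

print(k_bh, k_bonf)   # 3 1
```

Note that the second p-value (0.012) exceeds its own threshold of 0.010 but is still rejected, because the third p-value meets its threshold and the rule rejects everything up to the largest qualifying index.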


Defining rejection criteria

(1) Calculate adjusted p-values.

(2) Update these p-values to ensure monotonicity.
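The two steps can be sketched for the B-H adjustment. The formulas are an assumption, since the slide does not show them: the raw adjustment is taken to be m·p(i)/i capped at 1, and monotonicity is enforced with a running minimum starting from the largest p-value. The function name `bh_adjust` is hypothetical.

```python
def bh_adjust(pvalues):
    """B-H adjusted p-values, returned in the order of the sorted input."""
    m = len(pvalues)
    ordered = sorted(pvalues)
    # Step 1: raw adjustment m * p_(i) / i, capped at 1.
    adj = [min(1.0, m * p / i) for i, p in enumerate(ordered, start=1)]
    # Step 2: enforce monotonicity with a running minimum from the largest p-value down.
    for i in range(m - 2, -1, -1):
        adj[i] = min(adj[i], adj[i + 1])
    return adj

pvals = [0.001, 0.012, 0.014, 0.122, 0.245,
         0.320, 0.550, 0.776, 0.840, 0.995]

adj = bh_adjust(pvals)
n_significant = sum(a <= 0.05 for a in adj)

print([round(a, 4) for a in adj])
print(n_significant)
```

Comparing the adjusted p-values with 0.05 gives the same three rejections as the threshold form of the procedure; the monotonicity step is what pulls the second adjusted value down to match the third.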


• The B-H procedure for controlling the FDR assumes independence of the test statistics corresponding to the true null hypotheses.

• The B-H procedure will also control the FDR if the test statistics corresponding to the true null hypotheses are positively regression dependent (PRD).


• Benjamini and Yekutieli (2001) also propose an extension of the B-H approach that controls the FDR in settings for which PRD does not hold.

• This extension, the Benjamini-Yekutieli (B-Y) adjustment, is to simply replace q with $q / \sum_{i=1}^{m} (1/i)$.
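To see how much stricter the B-Y adjustment is, the sketch below applies it to the ten p-values from the earlier example. It assumes the standard form of the adjustment, in which q is divided by the harmonic sum 1 + 1/2 + … + 1/m before the usual B-H thresholds are applied.

```python
pvals = [0.001, 0.012, 0.014, 0.122, 0.245,
         0.320, 0.550, 0.776, 0.840, 0.995]
m, q = len(pvals), 0.05

harmonic_sum = sum(1 / i for i in range(1, m + 1))
q_by = q / harmonic_sum          # B-Y replacement for q

# Apply the B-H thresholds with q_by in place of q.
k_by = 0
for i, p in enumerate(sorted(pvals), start=1):
    if p <= q_by * i / m:
        k_by = i

print(round(harmonic_sum, 3), round(q_by, 4), k_by)
```

With m = 10 the harmonic sum is about 2.93, so the working level drops to roughly 0.017 and only the smallest p-value is rejected.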


The q-value

• The q-value is an alternative measure of significance based on the FDR concept that was recently proposed for GWAS.


positive false discovery rate (pFDR)

• The pFDR cannot be controlled in a traditional sense since it is identically equal to 1 if the proportion of true H0, given by m0/m, is equal to 1. That is, we cannot guarantee that pFDR ≤ α for α < 1.


q-value

• The q-value is based on an alternative paradigm that eliminates this concern.

• Specifically, rather than fixing an error rate and then estimating a significance threshold that maintains this rate on average, we fix the significance threshold and then estimate the rate over that threshold.

• Significance Analysis of Microarrays (SAM): fix rejection regions and then estimate the corresponding FDRs.


p-value and q-value

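• We can write the p-value corresponding to an observed test statistic T = t as $\text{p-value}(t) = \min_{\{\Gamma:\, t \in \Gamma\}} \Pr(T \in \Gamma \mid H_0 \text{ is true})$, where the minimum is over nested rejection regions Γ that contain t.

• The q-value is defined similarly as $\text{q-value}(t) = \inf_{\{\Gamma:\, t \in \Gamma\}} \mathrm{pFDR}(\Gamma)$.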


p-value and q-value

• In the context of SAM, the q-value is computed as the FDR for the smallest estimated rejection region for which the gene under consideration is called significant.

• Intuitively, the q-value is the minimum pFDR that can occur over the set of nested rejection regions when we reject our test statistic.

• This parallels the definition of the p-value as the minimum type-I error rate that can occur over all nested rejection regions when we reject our test statistic.


q-value

• Similar to FDR control, use of the q-value is most appropriate when the number of tests is large. In this case, the probability that at least one test is declared significant, given by Pr(R>0), is close to 1.

• If we set the tuning parameter λ = 0 (λ is a quantity that informs the estimate of the proportion of true null hypotheses), then the q-value results in the same adjusted p-values as the B-H adjusted FDR.

• However, this is a conservative estimate of the q-value, and optimizing the choice of λ is tenable.


4.3 Resampling-based methods

• One primary advantage of resampling-based methods is that they offer a natural approach to account for the underlying unknown correlation structure among multiple hypotheses.


free step-down resampling (FSDR)

• The idea behind the resampling-based approach is that by taking repeated samples of the observed data, we can simulate the distribution of the test statistics (or p-values) under the complete null hypothesis, and then compare the observed test statistics to this empirical distribution to ascertain the corresponding significance of our tests.


subset pivotality condition

• The distribution of test statistics is the same under any combination of true null hypotheses.

• The test statistic distribution is invariant to whether all null hypotheses are indeed true or a partial set of null hypotheses are true.

• The covariance between test statistics is assumed to be the same under all scenarios of true and false null hypotheses.


Algorithm: Free step-down resampling (MaxT):

1. Determine the “observed” test statistics T1, …, Tm and p-values, and let the corresponding ordered absolute values of the test statistics, sorted from smallest to largest, be given by |T|(1), …, |T|(m).


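2. Generate the (approximate) distribution of test statistics under the complete null:

(a) Estimate the residuals from the fitted model.

(b) Draw a new sample of residuals with replacement from the original set of residuals.

(c) Refit the model to the resampled data and get new test statistics.

(d) Take absolute values of the new test statistics.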


3. Compare the observed test statistics to test statistics under the complete null to get adjusted p-values. The resampling component of Step 2 is repeated B times.
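The three steps can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the multtest implementation: the trait is quantitative, each SNP is tested with a simple linear regression t statistic, the complete null is an intercept-only model, and the successive-maxima counting follows the usual free step-down (maxT) description. The names `abs_t` and `fsdr_maxt` and the simulated data are hypothetical.

```python
import math
import random

def abs_t(x, y):
    """Absolute t statistic for the slope of a simple linear regression of y on x."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    r = sxy / math.sqrt(sxx * syy)
    return abs(r) * math.sqrt((n - 2) / (1 - r * r))

def fsdr_maxt(X, y, B=200, seed=1):
    """Free step-down (maxT) adjusted p-values, one per predictor column in X."""
    rng = random.Random(seed)
    m, n = len(X), len(y)
    # Step 1: observed statistics, indexed from smallest to largest.
    obs = [abs_t(x, y) for x in X]
    order = sorted(range(m), key=lambda j: obs[j])
    # Step 2: residuals under the complete null (intercept-only model).
    ybar = sum(y) / n
    resid = [v - ybar for v in y]
    counts = [0] * m
    for _ in range(B):
        y_star = [ybar + rng.choice(resid) for _ in range(n)]
        t_star = [abs_t(x, y_star) for x in X]
        running_max = 0.0
        for rank, j in enumerate(order):
            running_max = max(running_max, t_star[j])   # successive maxima
            if running_max >= obs[j]:
                counts[rank] += 1
    # Step 3: adjusted p-values, made monotone from the largest statistic down.
    p_sorted = [c / B for c in counts]
    for rank in range(m - 2, -1, -1):
        p_sorted[rank] = max(p_sorted[rank], p_sorted[rank + 1])
    adjusted = [0.0] * m
    for rank, j in enumerate(order):
        adjusted[j] = p_sorted[rank]
    return adjusted

# Simulated example: 4 SNPs coded 0/1/2, only the first affects the trait.
data_rng = random.Random(0)
n_subjects = 60
X = [[data_rng.choice([0, 1, 2]) for _ in range(n_subjects)] for _ in range(4)]
y = [2.0 * X[0][i] + data_rng.gauss(0, 1) for i in range(n_subjects)]

adj_p = fsdr_maxt(X, y)
print(adj_p)
```

Because every resampled trait is paired with the original genotypes, the correlation among the SNP statistics is preserved in the null distribution, which is the main appeal of the approach.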


R package

• The mt.maxT() and mt.minP() functions in the multtest package can be used to implement the FSDR approach.

• However, application of these functions, originally written for gene expression data, to data arising from SNP association studies is not straightforward.

• Expression Data: association between the presence of the disease and gene expression levels across several genes.

• SNP Association Data: association between the trait and each class label.


a binary trait logistic model

• Rather than resampling from the residuals, generate new binary traits such that

• However, the subset pivotality condition is not met in this setting.


Null unrestricted bootstrap (for settings where the subset pivotality assumption is violated)

P0 is the data distribution under the complete null;
P is the true data distribution;
Qn(P0) is the distribution of test statistics under the null data-generating distribution;
Q0n is the complete null distribution of test statistics.

• The basic idea behind the null unrestricted bootstrap approach is to project the distribution of test statistics, given by Qn(P) for the true data generating distribution, onto the space of mean zero distributions to arrive at Q0n.


Alternative paradigms

Methods for reducing the number of tests:

• First, we present the approach, first described by Cheverud (2001), that takes into account LD structure to determine the effective number of tests (Meff).

• Second, we describe a global testing framework, proposed independently by Goeman et al. (2004) for gene expression data and Foulkes et al. (2005) for SNP data, that obviates the need for a multiple testing adjustment in some settings.


Effective number of tests

• Primary motivation: SNPs are usually correlated with one another.

• The variance of the eigenvalues of a correlation matrix of a set of variables captures information on the collective correlation of this set.

• The proportion reduction in the number of tests due to correlation is characterized by the ratio Var[λobs]/M, where λobs denotes the observed eigenvalues and M is the number of variables.


• The effective number of tests ranges from 1 to M and is defined by Meff = 1 + (M − 1)(1 − Var[λobs]/M).

• If all variables are perfectly correlated, then Var[λobs] = M and Meff = 1.

• If we have no correlation among the set of SNPs under investigation, then Var[λobs] = 0 and Meff = M.
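The two boundary cases can be checked numerically. The sketch below assumes the form Meff = 1 + (M − 1)(1 − Var[λobs]/M) with the sample variance of the eigenvalues, and takes the eigenvalues as given rather than computing them from genotype data; the function name `meff` is hypothetical.

```python
from statistics import variance

def meff(eigenvalues):
    """Effective number of tests from the eigenvalues of an M x M correlation matrix."""
    M = len(eigenvalues)
    return 1 + (M - 1) * (1 - variance(eigenvalues) / M)

perfectly_correlated = [4.0, 0.0, 0.0, 0.0]   # one eigenvalue carries everything
uncorrelated = [1.0, 1.0, 1.0, 1.0]           # identity correlation matrix

meff_corr = meff(perfectly_correlated)
meff_indep = meff(uncorrelated)

print(meff_corr, meff_indep)   # 1.0 4.0
```

Intermediate correlation structures give values between 1 and M, which can then be used in place of M in a Bonferroni-type adjustment.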


• Gao et al. (2008) proposed Meff-G: the number of principal components of the correlation matrix that explain C% of the variability in the data.

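• Formally, Meff-G is the minimum x such that $\sum_{i=1}^{x} \lambda_i / \sum_{i=1}^{M} \lambda_i \ge C\%$, where $\lambda_1 \ge \cdots \ge \lambda_M$ are the ordered eigenvalues of the correlation matrix.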
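A minimal sketch of the Meff-G count is shown below, assuming the eigenvalues are already available and that C is supplied as a proportion (0.9 for 90%). The function name `meff_g` and the example eigenvalues are hypothetical.

```python
def meff_g(eigenvalues, C):
    """Smallest number of leading eigenvalues whose share of the total reaches C."""
    total = sum(eigenvalues)
    running = 0.0
    for x, lam in enumerate(sorted(eigenvalues, reverse=True), start=1):
        running += lam
        if running / total >= C:
            return x
    return len(eigenvalues)

# Hypothetical eigenvalues of a 4 x 4 SNP correlation matrix.
eigs = [2.5, 1.0, 0.4, 0.1]

n_components = meff_g(eigs, C=0.9)
print(n_components)   # 3
```

Here the first two components explain 87.5% of the variability and the first three explain 97.5%, so three components are needed to reach 90%.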


• Once the effective number of tests is determined, the resulting value can be used in place of M in our usual adjustment of multiple testing.

[Eg] Bonferroni adjustment: α′ = α/Meff

• The results of this analysis may be sensitive to the choice of correlation matrix.


Global tests

• ANOVA


[Eg]

• Suppose we have two SNPs within a single gene so that there are 3^2 = 9 possible genotypes, given by


• With more groups, it actually tends to become harder to reject the null hypothesis.

• In other words, the power of our test is limited by the number of groups, as the df are expended on relatively rare genotypes.

• This well-known degrees-of-freedom problem is described for the analysis of haplotype-trait association studies in Chapman et al. (2003), Clayton et al. (2004), Tzeng et al. (2006) and Foulkes et al. (2008).


Extension of the ANOVA

• Assume

• The null is that there is no variability in the effects of genotypes on the trait. A likelihood ratio test can be applied to test this hypothesis, and the resulting test statistic has a
