Statistical Hypothesis Testing

Embed Size (px)

Text of Statistical Hypothesis Testing

  • Statistical hypothesis testing

    A statistical hypothesis test is a method of statistical inference using data from

    a scientific study. In statistics, a result is called statistically significant if it has been

    predicted as unlikely to have occurred bychance alone, according to a pre-determined

    threshold probability, the significance level. The phrase "test of significance" was

    coined by statistician Ronald Fisher.[1] These tests are used in determining what

    outcomes of a study would lead to a rejection of the null hypothesis for a pre-specified

    level of significance; this can help to decide whether results contain enough

    information to cast doubt on conventional wisdom, given that conventional wisdom

    has been used to establish the null hypothesis. The critical region of a hypothesis test

    is the set of all outcomes which cause the null hypothesis to be rejected in favor of

    the alternative hypothesis. Statistical hypothesis testing is sometimes

    called confirmatory data analysis, in contrast to exploratory data analysis, which

    may not have pre-specified hypotheses. In the Neyman-Pearson framework (see

    below), the process of distinguishing between the null & alternative hypotheses is

    aided by identifying two conceptual types of errors (type 1 & type 2), and by

    specifying parametric limits on e.g. how much type 1 error will be permitted.


    1 Variations and sub-classes

    2 The testing process

    2.1 Interpretation

    2.2 Use and importance

    2.3 Cautions

    3 Example

    3.1 Lady tasting tea

    3.2 Analogy Courtroom trial

    3.3 Example 1 Philosopher's beans

    3.4 Example 2 Clairvoyant card game

    3.5 Example 3 Radioactive suitcase

    4 Definition of terms

    5 Common test statistics

    6 Origins and early controversy

  • 6.1 Early choices of null hypothesis

    7 Null hypothesis statistical significant testing vs hypothesis testing

    8 Criticism

    8.1 Alternatives to significance testing

    9 Philosophy

    10 Education

    11 See also

    12 References

    13 Further reading

    14 External links

    Variations and sub-classes

    Statistical hypothesis testing is a key technique of both Frequentist

    inference and Bayesian inference although they have notable differences. Statistical

    hypothesis tests define a procedure that controls (fixes) the probability of

    incorrectly deciding that a default position (null hypothesis) is incorrect based on how

    likely it would be for a set of observations to occur if the null hypothesis were true.

    Note that this probability of making an incorrect decision is not the probability that

    the null hypothesis is true, nor whether any specific alternative hypothesis is true. This

    contrasts with other possible techniques of decision theory in which the null

    and alternative hypothesis are treated on a more equal basis. One

    naive Bayesian approach to hypothesis testing is to base decisions on the posterior

    probability,[2][3] but this fails when comparing point and continuous hypotheses. Other

    approaches to decision making, such as Bayesian decision theory, attempt to balance

    the consequences of incorrect decisions across all possibilities, rather than

    concentrating on a single null hypothesis. A number of other approaches to reaching a

    decision based on data are available via decision theory and optimal decisions, some

    of which have desirable properties, yet hypothesis testing is a dominant approach to

    data analysis in many fields of science. Extensions to the theory of hypothesis testing

    include the study of the power of tests, the probability of correctly rejecting the null

    hypothesis given that it is false. Such considerations can be used for the purpose

    of sample size determination prior to the collection of data.

    The testing process

    In the statistics literature, statistical hypothesis testing plays a fundamental role.[4] The

    usual line of reasoning is as follows:

  • 1. There is an initial research hypothesis of which the truth is unknown.

    2. The first step is to state the relevant null and alternative hypotheses. This is

    important as mis-stating the hypotheses will muddy the rest of the process.

    Specifically, the null hypothesis allows to attach an attribute: it should be

    chosen in such a way that it allows us to conclude whether the alternative

    hypothesis can either be accepted or stays undecided as it was before the test.[5]

    3. The second step is to consider the statistical assumptions being made about the

    sample in doing the test; for example, assumptions about the statistical

    independence or about the form of the distributions of the observations. This is

    equally important as invalid assumptions will mean that the results of the test

    are invalid.

    4. Decide which test is appropriate, and state the relevant test statistic T.

    5. Derive the distribution of the test statistic under the null hypothesis from the

    assumptions. In standard cases this will be a well-known result. For example

    the test statistic might follow a Student's t distribution or