TESTING HYPOTHESES
Two ways of arriving at a conclusion
1. Deductive inference: population → sample
2. Inductive inference: sample → population
IF YOUR DATA ARE:
1. Continuous data
2. Ratio or interval
3. Approximately normal distribution
4. Equal variance (F-test)
5. Conclusions about population based on sample (inductive)
6. Sample size > 10
Imagine the following experiment:
2 groups of crickets
Group 1 fed a diet with extra supplements
Group 2 fed a diet with no supplements

Weights
Group 1 (mean = 12.8): 12.1, 13.9, 13.0, 12.1, 14.9, 12.2, 12.9, 14.9, 13.6, 12.0, 13.5, 13.6, 12.0, 15.9, 12.4, 12.0, 10.9, 12.1, 11.0, 10.9
Group 2 (mean = 9.49): 9.1, 8.9, 11.0, 10.1, 9.9, 9.2, 8.0, 11.9, 8.6, 9.0, 8.5, 9.6, 10.0, 10.9, 9.4, 8.0, 11.9, 7.1, 10.0, 8.9
What you're doing here is comparing two samples that, because you've not violated any of the assumptions we saw before, should represent populations that look like this:
[Figure: two frequency distributions of weight (x-axis: Weight, y-axis: Frequency), with means at 9.49 and 12.8]
Are the means of these populations different??
Are the means of these populations different??
To answer this question, use a statistical test.
A statistical test is just a method of determining mathematically whether you can definitively say yes or no to this question.
What test should I use??
IF YOU HAVEN'T VIOLATED ANY OF THE ASSUMPTIONS WE MENTIONED BEFORE
Number of groups compared: 2, or other than 2

2 groups: t-test (Are the means of two populations the same?)
- Direction of difference specified? Yes: one-tailed. No: two-tailed.
- Does each data point in one data set (population) have a corresponding one in the other data set? Yes: paired t-test. No: unpaired t-test.

Other than 2 groups: ANOVA (Are the means of more than two populations the same?)
- Number of factors being tested: 1, 2 or >2
- 1 factor: Does each data point in one data set (population) have a corresponding one in the other data sets? Yes: repeated measures ANOVA. No: one way ANOVA.
- 2 factors: two way ANOVA
- >2 factors: other tests
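The flow chart above can be read as a small decision function. The following is only a sketch: the function name and argument names are hypothetical, and it assumes that the "Yes" answer to the corresponding-data-points question leads to the paired or repeated measures test, as it does for the t-test branch.

```python
def choose_parametric_test(n_groups, paired, n_factors=1, directional=False):
    """Walk the flow chart and return the name of the suggested test.

    All argument names are hypothetical; they mirror the chart's questions.
    """
    if n_groups == 2:
        tails = "one-tailed" if directional else "two-tailed"
        kind = "paired" if paired else "unpaired"
        return f"{kind} t-test ({tails})"
    # Any number of groups other than 2 goes down the ANOVA branch.
    if n_factors == 1:
        return "repeated measures ANOVA" if paired else "one-way ANOVA"
    if n_factors == 2:
        return "two-way ANOVA"
    return "other tests"


# The cricket experiment: two independent groups, no direction specified.
print(choose_parametric_test(n_groups=2, paired=False))
```

For the cricket experiment this lands on the unpaired, two-tailed t-test.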
A simple t-test
1. State hypotheses
Ho - there is no difference between the means of the two populations of crickets (i.e. the extra nutrients had no effect on weight)
H1 - there is a difference between the means of the two populations of crickets (i.e. the extra nutrients had an effect on weight)
A simple t-test
2. Calculate a t-value (any stats program does this for you; the hand calculation is for the truly masochistic)
3. Use a probability table for the test you used to determine the probability that corresponds to the t-value that was calculated.
Data → Test statistic → Probability
Unpaired t test
Do the means of Nutrient fed and No nutrient differ significantly?

P value
The two-tailed P value is < 0.0001, considered extremely significant.
t = 7.941 with 38 degrees of freedom.

95% confidence interval
Mean difference = -3.307 (Mean of No nutrient minus mean of Nutrient fed)
The 95% confidence interval of the difference: -4.150 to -2.464

Assumption test: Are the standard deviations equal?
The t test assumes that the columns come from populations with equal SDs.
The following calculations test that assumption.
F = 1.192
The P value is 0.7062.
This test suggests that the difference between the two SDs is not significant.

Assumption test: Are the data sampled from Gaussian distributions?
The t test assumes that the data are sampled from populations that follow Gaussian distributions.
This assumption is tested using the method Kolmogorov and Smirnov:

Group            KS      P Value  Passed normality test?
===============  ======  =======  ======================
Nutrient fed     0.1676  >0.10    Yes
No nutrient      0.1279  >0.10    Yes
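For the truly masochistic, the t-value and the F ratio in the output above can be approximated by hand. This is a minimal sketch using only Python's standard library and the cricket weights as transcribed from the slide; the variable names are illustrative. Because the transcribed weights appear to be rounded, the result should be close to the reported t = 7.941 but not necessarily identical.

```python
from math import sqrt
from statistics import mean, variance

# Cricket weights as transcribed from the slide (assumed to be rounded).
nutrient_fed = [12.1, 13.9, 13.0, 12.1, 14.9, 12.2, 12.9, 14.9, 13.6, 12.0,
                13.5, 13.6, 12.0, 15.9, 12.4, 12.0, 10.9, 12.1, 11.0, 10.9]
no_nutrient = [9.1, 8.9, 11.0, 10.1, 9.9, 9.2, 8.0, 11.9, 8.6, 9.0,
               8.5, 9.6, 10.0, 10.9, 9.4, 8.0, 11.9, 7.1, 10.0, 8.9]

n1, n2 = len(nutrient_fed), len(no_nutrient)
mean_diff = mean(no_nutrient) - mean(nutrient_fed)
var1, var2 = variance(nutrient_fed), variance(no_nutrient)  # sample variances (n - 1)

# Unpaired (pooled-variance) t-test.
df = n1 + n2 - 2
pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
std_err = sqrt(pooled_var * (1 / n1 + 1 / n2))
t_value = abs(mean_diff) / std_err

# F ratio for the equal-variance assumption: larger variance over smaller.
f_ratio = max(var1, var2) / min(var1, var2)

print(f"mean difference = {mean_diff:.3f}")
print(f"t = {t_value:.3f} with {df} degrees of freedom")
print(f"F = {f_ratio:.3f}")
```

Turning the t-value into a probability still needs a t table or a stats program, which is step 3 of the procedure.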
Interpretation of p < .0001?
This means that, if the two samples came from the same population, there would be less than 1 chance in 10,000 of seeing a difference between the means at least this large.
In the world of statistics, that is too small a chance to have happened randomly, and so the Ho is rejected and the H1 accepted.
For all statistical tests that you'll use, the convention is a cutoff of 5%, or p = .05: if the probability of two samples differing this much while still being from the same population is below .05, the Ho is rejected.
Nonparametric Statistics (Nominal Data) & Goodness-of-Fit Tests
What happens if you violate any of the assumptions?
Step 1 - Panic
Step 2 - It depends on what assumptions have been violated.
Assumption               Other tests      Another solution?
=======================  ===============  ==================
1. Continuous data       Yes
2. Ratio/interval        Yes
3. Normal distribution   Yes              Transform the data
4. Equal variance        Yes - Welch's
5. Sample → population   Yes
6. N
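The Welch's entry for the equal-variance assumption refers to Welch's t-test, which drops the pooled variance and adjusts the degrees of freedom instead. The sketch below applies it to the cricket weights purely for illustration (those data passed the F-test, so Welch's test is not actually needed for them). The function name is hypothetical, and the weights are as transcribed from the slide.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t-test."""
    # Squared standard error of each sample mean.
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


nutrient_fed = [12.1, 13.9, 13.0, 12.1, 14.9, 12.2, 12.9, 14.9, 13.6, 12.0,
                13.5, 13.6, 12.0, 15.9, 12.4, 12.0, 10.9, 12.1, 11.0, 10.9]
no_nutrient = [9.1, 8.9, 11.0, 10.1, 9.9, 9.2, 8.0, 11.9, 8.6, 9.0,
               8.5, 9.6, 10.0, 10.9, 9.4, 8.0, 11.9, 7.1, 10.0, 8.9]

t_welch, df_welch = welch_t(nutrient_fed, no_nutrient)
print(f"Welch t = {t_welch:.3f}, df = {df_welch:.1f}")
```

With equal group sizes and similar variances, the result stays very close to the ordinary unpaired t-test, with slightly fewer than 38 degrees of freedom.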
Nonparametric Tests
These tests are used when the assumptions of t-tests and ANOVA have been violated.
They are called nonparametric because there is no estimation of parameters (means, standard deviations or variances) involved.
Several kinds:
- Goodness-of-Fit tests - when you calculate an expected value
- Non-parametric equivalents of parametric tests
Goodness-of-Fit Tests
Use with nominal scale data, e.g. results of genetic crosses.
Also, you're using the population to deduce what the sample should look like.
Classic example - genetic crosses
Do they conform to an expected Mendelian ratio?
Back to our little ball creatures - Critterus sphericales
Phenotypes: A_B_, A_bb, aaB_, aabb
Mendelian inheritance - predict a 9:3:3:1 ratio
- sampled 320 animals

                 A_B_    A_bb    aaB_    aabb
===============  ======  ======  ======  ======
Observed (o)     194     53      67      6
Expected (e)     180     60      60      20
o - e            14      -7      7       -14
(o - e)^2        196     49      49      196
(o - e)^2 / e    1.089   0.817   0.817   9.800

$$\chi^2 = \sum \frac{(o - e)^2}{e} = 1.089 + 0.817 + 0.817 + 9.800 = 12.52$$
df = number of classes - 1 = 3
$\chi^2$ = 12.52
Critical value for 3 degrees of freedom at the .05 level is 7.82 (from the $\chi^2$ table).
Conclusion: the probability of a deviation this large from the expected distribution is < .05, therefore the sample is not from a population showing the Mendelian 9:3:3:1 ratio.
The actual probability for $\chi^2$ = 12.52 and df = 3 is .01 > p > .001.
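The whole goodness-of-fit calculation for the Critterus sphericales sample can be sketched in a few lines of standard-library Python. The variable names are illustrative. The p-value line uses a closed-form expression that holds only for 3 degrees of freedom; for other cases a chi-square table or a stats program is needed.

```python
from math import erfc, exp, pi, sqrt

observed = {"A_B_": 194, "A_bb": 53, "aaB_": 67, "aabb": 6}
ratio = {"A_B_": 9, "A_bb": 3, "aaB_": 3, "aabb": 1}  # Mendelian 9:3:3:1

# Expected counts: split the sample size according to the ratio.
n = sum(observed.values())
expected = {k: n * r / sum(ratio.values()) for k, r in ratio.items()}

# Chi-square statistic: sum of (o - e)^2 / e over all classes.
chi2 = sum((observed[k] - expected[k]) ** 2 / expected[k] for k in observed)
df = len(observed) - 1

# Upper-tail probability of the chi-square distribution, valid for df = 3 only.
p_value = erfc(sqrt(chi2 / 2)) + sqrt(2 * chi2 / pi) * exp(-chi2 / 2)

CRITICAL_05_DF3 = 7.82  # critical value quoted on the slide
print(f"chi2 = {chi2:.2f}, df = {df}, p = {p_value:.4f}")
print("reject Mendelian ratio" if chi2 > CRITICAL_05_DF3 else "cannot reject")
```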
A little $\chi^2$ wrinkle - the Yates correction
The formula is
$$\chi^2 = \sum \frac{(o - e)^2}{e}$$
except if df = 1 (i.e. you're using two categories of data). Then the formula becomes
$$\chi^2 = \sum \frac{(|o - e| - 0.5)^2}{e}$$
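To see what the Yates correction does in practice, here is a small sketch with a made-up two-category example: a hypothetical 3:1 cross with 100 offspring, which is not from the slides. The function name and its `yates` flag are likewise illustrative.

```python
def chi_square(observed, expected, yates=False):
    """Chi-square statistic, optionally with the Yates correction (df = 1)."""
    correction = 0.5 if yates else 0.0
    return sum((abs(o - e) - correction) ** 2 / e
               for o, e in zip(observed, expected))


# Hypothetical data: 70 dominant and 30 recessive offspring, 3:1 expected.
observed = [70, 30]
expected = [75, 25]

plain = chi_square(observed, expected)
corrected = chi_square(observed, expected, yates=True)
print(f"uncorrected = {plain:.2f}, Yates-corrected = {corrected:.2f}")
```

The correction shrinks each |o - e| by 0.5 before squaring, so the corrected statistic is always the smaller of the two and the test becomes more conservative.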
A second goodness-of-fit test
G-test or Log-Likelihood Ratio
Use if |o - e| < e, e.g. if o is 12 and e is 7
$$G = 2 \sum o \ln\frac{o}{e} = 4.60517 \sum o \log_{10}\frac{o}{e}$$
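Both forms of the G formula can be checked against each other numerically. As an illustration only, the sketch below reuses the Critterus sphericales counts from the chi-square example; applying the G-test to those data is an assumption made here, not something the slides do. Every observed count must be greater than zero for the logarithm to be defined.

```python
from math import log, log10

observed = [194, 53, 67, 6]
expected = [180, 60, 60, 20]

# G = 2 * sum(o * ln(o / e))
g_ln = 2 * sum(o * log(o / e) for o, e in zip(observed, expected))

# Same statistic via base-10 logs; 4.60517 is 2 * ln(10).
g_log10 = 4.60517 * sum(o * log10(o / e) for o, e in zip(observed, expected))

print(f"G (natural log) = {g_ln:.2f}")
print(f"G (log10 form)  = {g_log10:.2f}")
```

G is compared against the same chi-square table, with the same degrees of freedom, as the ordinary goodness-of-fit statistic.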
Summary!
Type of data   Number of samples   Are data related?   Test to use
=============  ==================  ==================  ===============
Nominal        2                   Yes                 McNemar
Nominal        2                   No                  Fisher's Exact
Nominal        >2                  Yes                 Cochran's Q
All of the parametric tests (remember the big flow chart!) have non-parametric equivalents (or analogues)
Type of data   Number of samples   Are data related?   Test to use
=============  ==================  ==================  ==========================================
Nominal        2                   Yes                 McNemar
Nominal        2                   No                  Fisher's Exact
Nominal        >2                  Yes                 Cochran's Q
Ordinal        1                   No                  Kolmogorov-Smirnov
Ordinal+       2                   Yes                 Wilcoxon (paired t-test analogue)
Ordinal+       2                   No                  Mann Whitney U (unpaired t-test analogue)
Ordinal+       >2                  No                  Kruskal Wallis (analogue of one-way ANOVA)
Ordinal        >2                  Yes                 Friedman two-way ANOVA