Chapter 8: Parameter Estimates and Hypothesis Testing
Estimating the Population Standard Deviation
• The SD and the mean of a population are estimates because we don’t have all the scores (this is why it’s called “inferential statistics”: we are estimating)
• Estimating σ: the sample SD tends to underestimate the σ
-this is due to sampling error
-since the sample SD underestimates σ it is called a biased estimator
• To correct for the biased estimator we subtract 1 from N in the denominator of the SD formula (divide by N – 1 instead of N)
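A minimal sketch of the N versus N – 1 correction, using Python's standard `statistics` module. The scores are hypothetical and only illustrate that the N – 1 version always comes out a little larger.

```python
import statistics

# Hypothetical sample of scores (not from the slides).
scores = [4, 7, 9, 10, 15]

# Sample SD: divides the sum of squared deviations by N.
sample_sd = statistics.pstdev(scores)

# Estimated population SD: divides by N - 1 to correct the bias.
estimated_sigma = statistics.stdev(scores)

print(f"SD dividing by N:     {sample_sd:.3f}")
print(f"SD dividing by N - 1: {estimated_sigma:.3f}")
```

The gap between the two values shrinks as N grows, which is why the correction matters most for small samples.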
• Estimated Standard Error of the Mean: allows us to predict what the standard deviation of an entire distribution of means would be if we had measured the whole population.
-ie. the standard deviation of the sampling distribution of means
-σM symbolizes the standard deviation of an entire distribution of means
-We can estimate σM from a single sample
*when we do this it is called the Estimated Standard Error of the Mean
*symbolized as SEM
• Formula:
• Simplest raw score formula:
Estimating the Standard Error of the Mean
Standard error of the mean can’t be a negative number, in the same way an SD can’t!
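The formulas on this slide did not survive extraction, so the sketch below uses the standard textbook form as an assumption: SEM = s / √N, where s is the estimated population SD (N – 1 denominator). This is algebraically the same as SD / √(N – 1) when SD is the sample SD computed with N. The function name `estimated_sem` and the scores are hypothetical.

```python
import math
import statistics

def estimated_sem(scores):
    """Estimated standard error of the mean from one sample of raw scores."""
    n = len(scores)
    s = statistics.stdev(scores)  # estimated population SD (N - 1 denominator)
    return s / math.sqrt(n)

# Hypothetical raw scores (not from the slides).
scores = [4, 7, 9, 10, 15]
sem = estimated_sem(scores)

# Equivalent form: sample SD (N denominator) divided by sqrt(N - 1).
sem_alt = statistics.pstdev(scores) / math.sqrt(len(scores) - 1)

print(f"SEM = {sem:.3f} (alternate form: {sem_alt:.3f})")
```

Because it is built from a standard deviation and a square root, the result is never negative.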
Estimating the Population Mean
• Point Estimate: the sample mean (M) is used to estimate μ
-the most precise (best) estimate
• Confidence Intervals: a range of values is estimated within which it is assumed that μ is contained
-The goal is to bracket μ between a specific low value and a specific high value.
EX: Let’s say we want to predict the average temperature for the next 4th of July. We could predict that the temp will fall between 95 & 105 degrees.
Q: What would be the problem with saying the temp will be between 0 & 120 degrees?
A: We wouldn’t be able to plan a picnic with that confidence interval because it isn’t precise at all!
**.99 confidence interval isn’t as precise as .95 because it gives a bigger range that the mean could fall in
Confidence Intervals
• Using the normal curve to calculate confidence intervals (when σ is known):
-A probability value can be calculated that indicates the degree of confidence we might have that μ is really in this interval
-Typically, we like to be at least 95% sure that the predicted μ falls within our confidence interval.
.95 confidence interval = M ± 1.96 σM
**At the .95 level, any sample mean that falls beyond z = ±1.96 leads to the conclusion that the mean is not part of the known population
.99 confidence interval = M ± 2.58 σM
**At the .99 level, any sample mean that falls beyond z = ±2.58 leads to the conclusion that the mean is not part of the known population
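The two interval formulas above can be sketched as a small function. The values M = 100 and σM = 2 are hypothetical, chosen only to show that the .99 interval is wider than the .95 interval.

```python
def z_confidence_interval(mean, sigma_m, z=1.96):
    """Return (low, high) for mean +/- z * sigma_m (sigma known)."""
    margin = z * sigma_m
    return mean - margin, mean + margin

# Hypothetical values: M = 100, sigma_M = 2.
low95, high95 = z_confidence_interval(100, 2, z=1.96)
low99, high99 = z_confidence_interval(100, 2, z=2.58)

print(f".95 interval: {low95:.2f} to {high95:.2f}")
print(f".99 interval: {low99:.2f} to {high99:.2f}")
```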
Usually we don’t know σ so we can’t calculate confidence intervals this way
t-distribution
• Use the t-distribution to calculate confidence intervals
-when σ is unknown or you have a small sample size (N=30 or less)
*usually it is unknown & with sampling, we often have a small N!
• The smaller the sample size, the less certain we are of normality of the entire sampling distribution.
• Therefore, we use a t-distribution which is a family of distributions each of which deviates from normality depending on sample size.
• t-distributions are distinguished by their degrees of freedom which are based on sample size.
-df = N – 1
-as the df increases the t-distribution becomes more like the normal distribution
• The critical values on the t-distribution are at the .05 (95% confidence level) and .01 (99% confidence level) levels
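To illustrate how the t-distribution approaches the normal curve as df increases, here is a short sketch with a few widely published two-tailed critical values. The course's Table C is not reproduced in these notes, so the dictionary below is a stand-in, not the table itself.

```python
# Two-tailed critical values of t: df -> (t at .05, t at .01).
# Stand-in for Table C; standard published values, only a few rows shown.
T_CRITICAL = {
    5: (2.571, 4.032),
    10: (2.228, 3.169),
    29: (2.045, 2.756),
    60: (2.000, 2.660),
    120: (1.980, 2.617),
}

# Normal-curve values that t approaches (the slides round 2.576 to 2.58).
Z_CRITICAL = (1.960, 2.576)

for df, (t05, t01) in T_CRITICAL.items():
    print(f"df={df:>3}  t.05={t05:.3f}  t.01={t01:.3f}")
print(f"normal  z.05={Z_CRITICAL[0]:.3f}  z.01={Z_CRITICAL[1]:.3f}")
```

With small df the critical values are noticeably larger than 1.96 and 2.58, so the same confidence level requires a wider interval.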
William Sealy Gosset: developer of the t-distribution
Calculating Confidence Intervals
• Use the t-distribution to determine critical values
-NOTE: Critical values of t should be calculated using 3 digits after the decimal (as they appear in Table C & D).
Step 1: Calculate df
df = N – 1
Step 2: find ±t.05 or ±t.01 (as instructed)
look up value in Table C at either the .05 level or .01 using the df
Step 3: Complete this formula
*Note answer will be 2 numbers (a range)
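The formula for Step 3 is missing from these notes. The sketch below assumes the usual form, confidence interval = M ± t × SEM, with SEM = s / √N. The sample values (M = 50, s = 8, N = 16) are hypothetical; the critical values for df = 15 are the standard two-tailed table entries.

```python
import math

def t_confidence_interval(mean, s, n, t_crit):
    """Return (low, high) for mean +/- t_crit * SEM, with SEM = s / sqrt(n)."""
    sem = s / math.sqrt(n)
    margin = t_crit * sem
    return mean - margin, mean + margin

# Step 1: df = N - 1 (hypothetical sample: M = 50, s = 8, N = 16).
n = 16
df = n - 1

# Step 2: two-tailed critical values for df = 15 (standard table entries).
t_05, t_01 = 2.131, 2.947

# Step 3: the answer is two numbers (a range).
ci95 = t_confidence_interval(50, 8, n, t_05)
ci99 = t_confidence_interval(50, 8, n, t_01)

print(f".95 interval: {ci95[0]:.3f} to {ci95[1]:.3f}")
print(f".99 interval: {ci99[0]:.3f} to {ci99[1]:.3f}")
```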
• We can use t-tests to answer research questions
• T-tests answer statistical questions such as:
1) Is the difference between the sample mean & μ statistically significant?
2) What is the probability that a sample mean could deviate from μ the amount that it does?
3) Is the sample from this population or not?
• Example Research Problem: A researcher theorizes that the population mean among college students taking the new Social Conformity Test is a “neutral” 100. Scores higher than 100 represent more conformity than average and scores lower than 100 represent less conformity than average. A random sample of 30 students was selected and found to have a mean of 103 with a standard deviation of 10.83.
-Q: Using the t-test what statistical questions could we ask about this problem?
One Sample t Tests
• We must test the Null Hypothesis (Ho), the hypothesis of no difference
Ho: μ1 = μ2 There is no significant difference between the sample mean and the population mean.
OR
The sample is from the population.
Ha: μ1 ≠ μ2 The alternate hypothesis (the hypothesis of difference) says that the sample mean deviates enough from μ that we can conclude the sample is NOT from the population in question.
***Note: μ1 is our sample mean or our “point estimate”
-ie. It’s a μ because theoretically, it estimates our population mean
One Sample t Tests
Step 1: determine the number of degrees of freedom
df = N – 1
Step 2: calculate a t-value for our sample mean so we can see its relation to μ
EX: using the previous example N=30 M=103 SD=10.83
Step 3: Look at Table C Handout under the calculated df at the .05 level & compare the calculated t-value.
-If your calculated t-value is equal to or greater than the table value then you reject the null hypothesis.
-We can NOT reject our null because our calculated t-value (about 1.5) is smaller than the table value of 2.045 (df=29)
**we conclude that there is no significant difference between the sample mean & μ
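The three steps can be sketched with the numbers from the example. The worked calculation is missing from these notes, so one assumption is labeled: the SD of 10.83 is treated as the sample SD (N denominator), giving SEM = SD / √(N – 1). If 10.83 is instead the estimated population SD, SEM = SD / √N; the t-value changes only slightly and the decision is the same.

```python
import math

# Values from the example problem.
n, m, sd, mu = 30, 103, 10.83, 100

# Step 1: degrees of freedom.
df = n - 1

# Step 2: t-value for the sample mean.
# Assumption: sd uses the N denominator, so SEM = sd / sqrt(N - 1).
sem = sd / math.sqrt(n - 1)
t = (m - mu) / sem

# Step 3: compare with the table value at the .05 level for df = 29.
t_table = 2.045
reject_null = abs(t) >= t_table

print(f"df = {df}, t = {t:.2f}, reject Ho: {reject_null}")
```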
Calculating One Sample t Tests
• In order to reject Ho, the t-values must fall within the .05 or .01 critical areas
• Using the previous example, let’s say we did reject the null hypothesis & accept the alternate hypothesis
• We must now say how unlikely our null hypothesis is
-Instead of saying: “It is highly unlikely that there is not a difference between the sample mean and population mean” OR “it is highly unlikely that our sample mean is from our population”
-We would say ”there is a significant difference between the population mean and the sample mean” or “the means have been found to be significantly different”
-ie. the probability that the groups are the same or that the sample mean is from the population is very, very low
• Well, how low is very low?
-when the probability that the groups or means are the same is less than 5% or 1%
-ie. “There is less than a 5% chance that these groups are the same”
• Significance levels are called alpha levels and are represented by the alpha symbol (α) or reported as p-values (p<.05 or p<.01)
Statistical Significance
• We use strict levels of significance to reduce the probability of committing a Type I Error (aka Alpha Error); a stricter alpha level does not protect against a Type II Error
-Alpha Error: we reject the null (Ho) but we should have accepted it
*ie. Saying there is a difference between the groups when there isn’t
• The probability of making a type one error is equal to alpha (.05 or .01)
-p <.05 (with α = .05) means that the probability of making a mistake in rejecting Ho is less than 5 in 100
*ie. At the .05 significance level (95% confidence), we are willing to make a mistake in rejecting Ho 5% of the time
-p <.01 (with α = .01) means that the probability of making a mistake in rejecting Ho is less than 1 in 100
*ie. At the .01 significance level (99% confidence), we are willing to make a mistake in rejecting Ho 1% of the time
Type I Error
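A small simulation can make the 5% figure concrete. This is a sketch under stated assumptions, not part of the slides: every sample is drawn from a population where Ho is true (hypothetical μ = 100, σ = 15, N = 25), and σ is treated as known so the ±1.96 cutoff applies. Any rejection is therefore a Type I error.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population in which Ho is true.
MU, SIGMA, N, TRIALS = 100, 15, 25, 4000
sigma_m = SIGMA / N ** 0.5  # standard error of the mean (sigma known)

rejections = 0
for _ in range(TRIALS):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    z = (statistics.fmean(sample) - MU) / sigma_m
    if abs(z) >= 1.96:  # falls in the .05 critical area
        rejections += 1  # Ho is true, so this is a Type I error

type_i_rate = rejections / TRIALS
print(f"Type I error rate: {type_i_rate:.3f}")
```

The rate should land close to .05, which is what alpha = .05 means in practice.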
Two Tail (Nondirectional)
Sample two-tail question: Is there a difference between groups?
•Use Table C for critical values
•It doesn’t matter if the t-value is positive or negative
One Tail (Directional)
Sample one-tail question: Does one group perform better (or worse), score higher (or lower) than another? Is one drug more (or less) likely to be effective?
•Use Table D for critical values
•It DOES matter if the t-value is positive or negative and will depend on the Ha
•If you hypothesize a positive t-value & get a negative one or vice versa, you must accept Ho (even if its absolute value is larger than the table value)
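The one-tail and two-tail rules above can be sketched as a single decision function. The function name `reject_null` and its parameters are hypothetical; the critical values for df = 29 at the .05 level (2.045 two-tail, 1.699 one-tail) are the standard table entries.

```python
def reject_null(t, critical, tails=2, predicted_sign=1):
    """Decide whether to reject Ho.

    tails=2: the sign of t does not matter.
    tails=1: t must be in the direction Ha predicts (+1 or -1).
    """
    if tails == 2:
        return abs(t) >= critical
    return t * predicted_sign >= critical

# df = 29 at the .05 level: two-tail 2.045, one-tail 1.699.
two_tail = reject_null(-1.80, 2.045, tails=2)
one_tail_right_direction = reject_null(-1.80, 1.699, tails=1, predicted_sign=-1)
one_tail_wrong_direction = reject_null(3.00, 1.699, tails=1, predicted_sign=-1)

print(two_tail, one_tail_right_direction, one_tail_wrong_direction)
```

The last case shows the rule from the final bullet: a large t in the wrong direction still does not reject Ho.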