Chapter 8 Parameter Estimates and Hypothesis Testing


Estimating the Population Standard Deviation

• The SD and the mean of a population are estimates because we don’t have all the scores (this is why it’s called “inferential statistics”: we are estimating)

• Estimating σ: the sample SD tends to underestimate the σ

-this is due to sampling error

-since the sample SD underestimates σ it is called a biased estimator

• To correct for the bias, we subtract 1 from N in the denominator of the SD formula
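A minimal sketch of the N – 1 correction, using Python's standard `statistics` module. The scores are made up for illustration and do not come from the chapter. `pstdev` divides by N, while `stdev` divides by N – 1 and so gives the slightly larger, corrected estimate of σ.

```python
import statistics

# Hypothetical sample of N = 8 scores (illustration only).
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Uncorrected sample SD: sum of squared deviations divided by N.
sd_uncorrected = statistics.pstdev(scores)

# Corrected estimate of sigma: sum of squared deviations divided by N - 1.
s_corrected = statistics.stdev(scores)

print(f"SD (divide by N):      {sd_uncorrected:.3f}")
print(f"s  (divide by N - 1):  {s_corrected:.3f}")
```

The corrected value is always at least as large as the uncorrected one, and the gap shrinks as N grows.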

• Estimated Standard Error of the Mean: allows us to predict what the standard deviation of an entire distribution of means would be if we had measured the whole population.

-ie. the standard deviation of the sampling distribution of means

-σM symbolizes the standard deviation of an entire distribution of means

-We can estimate σM from a single sample

*when we do this it is called the Estimated Standard Error of the Mean

*symbolized as SEM

• Formula:

• Simplest raw score formula:
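One standard way to write the estimated standard error of the mean, consistent with the N – 1 correction described earlier, is sketched here. The symbols s (the corrected estimate of σ), SD (the uncorrected sample standard deviation) and X (a raw score) are assumed notation rather than notation taken from these notes. The two forms are algebraically equal.

```latex
SE_M = \frac{s}{\sqrt{N}} = \frac{SD}{\sqrt{N-1}},
\qquad
s = \sqrt{\frac{\sum (X - M)^2}{N - 1}},
\qquad
SD = \sqrt{\frac{\sum (X - M)^2}{N}}
```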

Estimating the Standard Error of the Mean

Standard error of the mean can’t be a negative number, in the same way an SD can’t!
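As a rough illustration, the estimated standard error of the mean can be computed from a single sample of raw scores. The function name `estimated_sem` and the scores are hypothetical. The sketch assumes the usual estimate, the corrected SD divided by √N, and checks that dividing the uncorrected SD by √(N – 1) gives the same number.

```python
import math
import statistics

def estimated_sem(scores):
    """Estimated standard error of the mean from one sample of raw scores."""
    n = len(scores)
    s = statistics.stdev(scores)  # corrected SD (divides by N - 1)
    return s / math.sqrt(n)

# Hypothetical sample (illustration only).
scores = [2, 4, 4, 4, 5, 5, 7, 9]
sem = estimated_sem(scores)

# Equivalent form: uncorrected SD divided by sqrt(N - 1).
sem_alt = statistics.pstdev(scores) / math.sqrt(len(scores) - 1)

print(f"SEM = {sem:.4f}  (alternative form: {sem_alt:.4f})")
```

Because the result is a square root divided by a square root, it can never be negative.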

Estimating the Population Mean

• Point Estimate: the sample mean (M) is used to estimate μ

-the most precise (best) estimate

• Confidence Intervals: a range of values is estimated within which it is assumed that μ is contained

-The goal is to bracket μ between specific high and low values.

EX: Let’s say we want to predict the average temperature for the next 4th of July. We could predict that the temp will fall between 95 & 105 degrees.

Q: What would be the problem with saying the temp will be between 0 & 120 degrees?

A: We wouldn’t be able to plan a picnic with that confidence interval because it isn’t precise at all!

**.99 confidence interval isn’t as precise as .95 because it gives a bigger range that the mean could fall in

Confidence Intervals

• Using the normal curve to calculate confidence intervals (when σ is known):

-A probability value can be calculated that indicates the degree of confidence we might have that μ is really in this interval

-Typically, we like to be at least 95% sure that the predicted μ falls within our confidence interval.

.95 confidence interval = ±1.96 σM + M

**At the .95 level, any sample mean that falls beyond z = ±1.96 leads to the conclusion that the sample is not part of the known population

.99 confidence interval = ±2.58 σM + M

**At the .99 level, any sample mean that falls beyond z = ±2.58 leads to the conclusion that the sample is not part of the known population

Usually we don’t know σ so we can’t calculate confidence intervals this way
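For the case where σ is known, the two interval formulas above can be sketched in Python. The function name and the numbers (M = 100, σ = 15, N = 25) are hypothetical, and the sketch assumes the usual relation σM = σ / √N. The z values come from the standard library's `NormalDist` rather than a printed table.

```python
import math
from statistics import NormalDist

def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Confidence interval for mu when the population sigma is known."""
    sigma_m = sigma / math.sqrt(n)                  # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for .95, about 2.58 for .99
    margin = z * sigma_m
    return sample_mean - margin, sample_mean + margin

# Hypothetical values (not from the chapter): M = 100, sigma = 15, N = 25.
low95, high95 = z_confidence_interval(100, 15, 25, confidence=0.95)
low99, high99 = z_confidence_interval(100, 15, 25, confidence=0.99)

print(f".95 interval: {low95:.2f} to {high95:.2f}")
print(f".99 interval: {low99:.2f} to {high99:.2f}")
```

The .99 interval comes out wider than the .95 interval, which is the loss of precision noted earlier.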

t-distribution

• Use the t-distribution to calculate confidence intervals

-when σ is unknown or you have a small sample size (N=30 or less)

*usually it is unknown & with sampling, we often have a small N!

• The smaller the sample size, the less certain we are of normality of the entire sampling distribution.

• Therefore, we use a t-distribution which is a family of distributions each of which deviates from normality depending on sample size.

• t-distributions are distinguished by their degrees of freedom which are based on sample size.

-df = N – 1

-as the df increases the t-distribution becomes more like the normal distribution

• The critical values on the t-distribution are at the .05 (95% confidence level) and .01 (99% confidence level) levels
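The way the t-distribution approaches the normal curve can be seen in a short excerpt of two-tailed critical values. These are standard t-table values for a few selected df. The chapter's own Table C is not reproduced here, and the dictionary layout is only one possible way to store them.

```python
# Two-tailed critical values of t at the .05 and .01 levels for selected df.
# Excerpt of a standard t table; not a copy of the chapter's Table C.
T_CRITICAL = {
    5:   {0.05: 2.571, 0.01: 4.032},
    10:  {0.05: 2.228, 0.01: 3.169},
    29:  {0.05: 2.045, 0.01: 2.756},
    60:  {0.05: 2.000, 0.01: 2.660},
    120: {0.05: 1.980, 0.01: 2.617},
}
Z_CRITICAL = {0.05: 1.960, 0.01: 2.576}  # normal-curve values for comparison

def degrees_of_freedom(n):
    """df for a single sample of size n."""
    return n - 1

for df, row in T_CRITICAL.items():
    print(f"df={df:>3}  t.05={row[0.05]:.3f}  t.01={row[0.01]:.3f}")
print(f"normal  z.05={Z_CRITICAL[0.05]:.3f}  z.01={Z_CRITICAL[0.01]:.3f}")
```

As df increases, the critical values shrink toward 1.96 and 2.58, the normal-curve values.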

William Sealy Gosset: developer of the t-distribution

Calculating Confidence Intervals

• Use the t-distribution to determine critical values

-NOTE: Critical values of t should be calculated using 3 digits after the decimal (as they appear in Table C & D).

Step 1: Calculate df

df = N – 1

Step 2: find ±t.05 or ±t.01 (as instructed)

look up value in Table C at either the .05 level or .01 using the df

Step 3: Complete this formula

*Note answer will be 2 numbers (a range)
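The three steps can be sketched as follows, assuming the interval takes the usual form M ± (t)(SEM). The numbers are borrowed from the chapter's Social Conformity Test example (N = 30, M = 103, SD = 10.83), and 2.045 is the .05 table value for df = 29. Treating 10.83 as the corrected (N – 1) standard deviation is an assumption, and the function name is hypothetical.

```python
import math

def t_confidence_interval(sample_mean, s, n, t_critical):
    """Return (low, high) = M -/+ t * SEM, with SEM estimated as s / sqrt(N)."""
    sem = s / math.sqrt(n)
    margin = t_critical * sem
    return sample_mean - margin, sample_mean + margin

# Values from the Social Conformity example: N = 30, M = 103, SD = 10.83.
# Assumption: 10.83 is the corrected (N - 1) standard deviation.
n, m, s = 30, 103, 10.83
df = n - 1      # Step 1: df = 29
t_05 = 2.045    # Step 2: two-tailed .05 table value for df = 29

low, high = t_confidence_interval(m, s, n, t_05)  # Step 3
print(f".95 confidence interval: {low:.2f} to {high:.2f}")
```

Under these assumptions the interval runs from roughly 98.96 to 107.04.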

• We can use t-tests to answer research questions

• T-tests answer statistical questions such as:

1) Is the difference between the sample mean & μ statistically significant?

2) What is the probability that a sample mean could deviate from μ the amount that it does?

3) Is the sample from this population or not?

• Example Research Problem: A researcher theorizes that the population mean among college students taking the new Social Conformity Test is a “neutral” 100. Scores higher than 100 represent more conformity than average and scores lower than 100 represent less conformity than average. A random sample of 30 students was selected and found to have a mean of 103 with a standard deviation of 10.83.

-Q: Using the t-test what statistical questions could we ask about this problem?

One Sample t Tests

• We must test the Null Hypothesis (Ho), the hypothesis of no difference

Ho: μ1 = μ2 There is no significant difference between the sample mean and the population mean.

OR

The sample is from the population.

Ha: μ1 ≠ μ2 The alternate hypothesis (the hypothesis of difference) says that the sample mean deviates enough from μ that we can conclude the sample is NOT from the population in question.

***Note: μ1 is our sample mean or our “point estimate”

-ie. It’s a μ because theoretically, it estimates our population mean

One Sample t Tests

Step 1: determine the number of degrees of freedom

df = N – 1

Step 2: calculate a t-value for our sample mean so we can see its relation to the μ

EX: using the previous example N=30 M=103 SD=10.83

Step 3: Look at Table C Handout under the calculated df at the .05 level & compare the calculated t-value.

-If your calculated t-value is equal to or greater than the table value then you reject the null hypothesis.

-We can NOT reject our null because our calculated t-value is smaller than the table value of 2.045 (df=29)

**we conclude that there is no significant difference between the sample mean & μ
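A minimal sketch of the three steps for the example, assuming t = (M – μ) / SEM with SEM estimated as the standard deviation divided by √N. The function name is hypothetical, and treating 10.83 as the corrected (N – 1) standard deviation is an assumption.

```python
import math

def one_sample_t(sample_mean, mu, s, n):
    """t = (M - mu) / SEM, with SEM estimated as s / sqrt(N)."""
    sem = s / math.sqrt(n)
    return (sample_mean - mu) / sem

n, m, mu, s = 30, 103, 100, 10.83  # from the example; s assumed to be the N - 1 SD
df = n - 1                          # Step 1: 29
t = one_sample_t(m, mu, s, n)       # Step 2
t_table = 2.045                     # Step 3: table value at .05, df = 29

reject_null = abs(t) >= t_table
print(f"df = {df}, t = {t:.3f}, reject Ho: {reject_null}")
```

Under this assumption t comes out near 1.52. If 10.83 were instead the uncorrected SD, t would be near 1.49. Either way it is below 2.045, so the null hypothesis is not rejected.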

Calculating One Sample t Tests

• In order to reject Ho, the t-values must fall within the .05 or .01 critical areas

• Using the previous example, let’s say we did reject the null hypothesis & accept the alternate hypothesis

• We must now say how unlikely our null hypothesis is

-Instead of saying: “It is highly unlikely that there is not a difference between the sample mean and population mean” OR “it is highly unlikely that our sample mean is from our population”

-We would say “there is a significant difference between the population mean and the sample mean” or “the means have been found to be significantly different”

-ie. the probability that the groups are the same or that the sample mean is from the population is very, very low

• Well, how low is very low?

-when the probability of getting a difference this large, if the groups or means really were the same, is less than 5% or 1%

-ie. “There is less than a 5% chance of a difference this large if these groups are the same”

• Significance levels are called alpha levels and are represented by the alpha symbol (α) or by p-values (p<.05 or p<.01)

Statistical Significance

• We use strict levels of significance to reduce the probability of committing a Type I Error (aka Alpha Error)

-Alpha Error: we reject the null (Ho) but we should have accepted it

*ie. Saying there is a difference between the groups when there isn’t

• The probability of making a type one error is equal to alpha (.05 or .01)

-p <.05 (or α <.05) means that the probability of making a mistake in rejecting Ho is less than 5 in 100

*ie. At the .05 significance level (.95 confidence), we are willing to make a mistake in rejecting Ho 5% of the time

-p <.01 (or α <.01) means that the probability of making a mistake in rejecting Ho is less than 1 in 100

*ie. At the .01 significance level (.99 confidence), we are willing to make a mistake in rejecting Ho 1% of the time
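The claim that the Type I error probability equals alpha can be checked with a small simulation. This is only a sketch: the population (normal, mean 100, SD 10), the number of trials, and the function name are all assumptions chosen for illustration. Every sample is drawn from a population where Ho is true, so every rejection is a Type I error.

```python
import math
import random
import statistics

def type_i_error_rate(trials=5000, n=30, mu=100, sigma=10, t_table=2.045, seed=0):
    """Fraction of samples that wrongly reject a true Ho at the two-tailed .05 level."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # Ho is true by construction: each sample comes from a population with mean mu.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        sem = statistics.stdev(sample) / math.sqrt(n)
        t = (statistics.mean(sample) - mu) / sem
        if abs(t) >= t_table:
            rejections += 1
    return rejections / trials

rate = type_i_error_rate()
print(f"Proportion of false rejections: {rate:.3f}")
```

With the .05 table value for df = 29, the proportion of false rejections should land close to .05.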

Type I Error

Two Tail (Nondirectional)

Sample two-tail question: Is there a difference between groups?

•Use Table C for critical values

•It doesn’t matter if the t-value is positive or negative

http://www.youtube.com/watch?v=B9u_grPccUs

One Tail (Directional)

Sample one-tail question: Does one group perform better (or worse), score higher (or lower) than another? Is one drug more (or less) likely to be effective?

•Use Table D for critical values

•It DOES matter if the t-value is positive or negative and will depend on the Ha

•If you hypothesize a positive t-value & get a negative one or vice versa, you must accept Ho (even if the absolute value of t is larger than the table value)
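The difference between the two decision rules can be sketched as a small function. The name `reject_null` and its parameters are hypothetical. The table values are standard .05 values for df = 29, two-tailed and one-tailed, standing in for Tables C and D.

```python
def reject_null(t, t_table, tails="two", predicted_sign=+1):
    """Decision rule for a t test, given a calculated t and a table value.

    tails="two": the sign of t is ignored.
    tails="one": t must also fall in the direction predicted by Ha.
    """
    if tails == "two":
        return abs(t) >= t_table
    if tails == "one":
        in_predicted_direction = (t > 0) == (predicted_sign > 0)
        return in_predicted_direction and abs(t) >= t_table
    raise ValueError("tails must be 'one' or 'two'")

# Table values for df = 29 at the .05 level (standard t-table values).
TWO_TAIL_05 = 2.045
ONE_TAIL_05 = 1.699

print(reject_null(-2.50, TWO_TAIL_05, tails="two"))                     # sign ignored
print(reject_null(-2.50, ONE_TAIL_05, tails="one", predicted_sign=+1))  # wrong direction
print(reject_null(+2.50, ONE_TAIL_05, tails="one", predicted_sign=+1))  # predicted direction
```

In the second call, t is larger in size than the table value but falls in the direction opposite to the prediction, so Ho is not rejected.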
