
STATISTICAL ANALYSIS

Figure 1. Histogram of final exam scores.

[Figure: frequency distribution curves (y-axis: # individuals) with the median and mean marked on each.]

Figure 2. Frequency distributions of three different samples: A (negative skew), B (normal), C (positive skew).

Descriptive Statistics: used to describe, simplify, and summarize a collection of data in a clear, understandable way. Ex. mean, standard deviation, frequency.

Inferential Statistics: allow you to make inferences about a population from a sample. They are used to test the null hypothesis (Ho). Ex. t-test, ANOVA, ANCOVA.

Raw Data: the data you collect directly from the organisms or environment you are studying.

Population (N): the total # of individuals in the population of interest.

Sample (n): the number of observations or individuals measured.

Mean:

$$ \bar{X} = \frac{\sum X}{n} $$

$$ \bar{X} = \frac{1.2 + 3.0 + 0.5 + 2.3 + 1.5}{5} = 1.7 \text{ m} $$

Median: Middle number in a data set. Order from smallest to largest: 0.5, 1.2, 1.5, 2.3, 3.0. Median = 1.5 m.

Range: Difference between the largest and smallest data points in a sample. 3.0 - 0.5 = 2.5 m

Fern Height (m)

1.2 3.0 0.5 2.3 1.5
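The mean, median, and range above can be checked with a few lines of Python. This is only a minimal sketch using the standard library; the variable names are illustrative and not part of the original slides.

```python
from statistics import mean, median

# Fern heights in meters, copied from the slide.
fern_heights_m = [1.2, 3.0, 0.5, 2.3, 1.5]

sample_mean = mean(fern_heights_m)      # sum of values divided by n
sample_median = median(fern_heights_m)  # middle value after sorting
sample_range = max(fern_heights_m) - min(fern_heights_m)

print(f"mean   = {sample_mean:.1f} m")
print(f"median = {sample_median:.1f} m")
print(f"range  = {sample_range:.1f} m")
```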

Hand calculations

Sample Variance

$$ s^2 = \frac{\sum (X - \bar{X})^2}{n - 1} $$

X      X - X̄    (X - X̄)^2
0.5    -1.2     1.44
1.2    -0.5     0.25
1.5    -0.2     0.04
2.3     0.6     0.36
3.0     1.3     1.69
       Sum:     3.78

n = 5, $\bar{X}$ = 1.7

$$ s^2 = \frac{3.78}{4} = 0.945 $$

Standard Deviation

$$ s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}} $$

$$ s = \sqrt{0.945} = 0.972 $$

Standard Error

$$ SE = \sqrt{\frac{s^2}{n}} = \frac{s}{\sqrt{n}} $$

$$ SE = \sqrt{\frac{0.945}{5}} = \frac{0.972}{\sqrt{5}} = 0.435 $$
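The hand calculations for the sample variance, standard deviation, and standard error can be reproduced as follows. This is a sketch using Python's standard library, where `variance` and `stdev` already use the n - 1 denominator; the variable names are illustrative.

```python
from math import sqrt
from statistics import stdev, variance

# Fern heights in meters, copied from the slide.
fern_heights_m = [1.2, 3.0, 0.5, 2.3, 1.5]
n = len(fern_heights_m)

s2 = variance(fern_heights_m)  # sample variance: sum of squared deviations / (n - 1)
s = stdev(fern_heights_m)      # sample standard deviation: square root of s2
se = s / sqrt(n)               # standard error of the mean

print(f"s^2 = {s2:.3f}")
print(f"s   = {s:.3f}")
print(f"SE  = {se:.3f}")
```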

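Confidence Interval

$$ CI_{95} = \bar{X} \pm (t_{df} \times SE) $$

Here CI95 is the 95% confidence interval, $\bar{X}$ is the mean, SE is the standard error, and t is the critical t value for df = n-1.

Mean = 1.7, SE = 0.435

df = 5-1 = 4

CI95 = 1.7 ± (2.78 × 0.435)

CI95 = 1.7 ± 1.21 m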
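The same interval can be computed in a short script. This sketch hard-codes the critical value 2.776 (df = 4, two-tail alpha = .05) as read from the t-table rather than computing it; the constant and variable names are assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, stdev

# Critical t for df = 4, alpha = .05 (2-tail), read from the t-table.
T_CRIT_DF4 = 2.776

fern_heights_m = [1.2, 3.0, 0.5, 2.3, 1.5]
n = len(fern_heights_m)

x_bar = mean(fern_heights_m)
se = stdev(fern_heights_m) / sqrt(n)

margin = T_CRIT_DF4 * se  # half-width of the 95% confidence interval
ci_lower = x_bar - margin
ci_upper = x_bar + margin

print(f"CI95 = {x_bar:.1f} ± {margin:.2f} m")
print(f"interval: {ci_lower:.2f} m to {ci_upper:.2f} m")
```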

df   α (2-tail):  .10     .05     .02     .01     .005    .002    .001
     α (1-tail):  .05     .025    .01     .005    .0025   .001    .0005
1                 6.314   12.71   31.82   63.66   127.3   318.3   636.6
2                 2.920   4.303   6.965   9.925   14.09   22.33   31.60
3                 2.353   3.182   4.541   5.841   7.453   10.21   12.92
4                 2.132   2.776   3.747   4.604   5.598   7.173   8.610
5                 2.015   2.571   3.365   4.032   4.773   5.893   6.869

Testing Ho using t-test

$$ t = \frac{\bar{X} - \mu}{s / \sqrt{n}} $$

df = n-1

Distribution must be normal. Use when n > 30.

Ho: μ = 0
Ha: μ > 0

$$ t = \frac{1.7 - 0}{0.972 / \sqrt{5}} = 3.91 $$

df = 5-1 = 4

Ignore "-" signs: compare the absolute value of t with t-critical.

t-Table

df   α (2-tail):  .10     .05     .02     .01     .005    .002    .001
     α (1-tail):  .05     .025    .01     .005    .0025   .001    .0005
1                 6.314   12.71   31.82   63.66   127.3   318.3   636.6
2                 2.920   4.303   6.965   9.925   14.09   22.33   31.60
3                 2.353   3.182   4.541   5.841   7.453   10.21   12.92
4                 2.132   2.776   3.747   4.604   5.598   7.173   8.610
5                 2.015   2.571   3.365   4.032   4.773   5.893   6.869
6                 1.943   2.447   3.143   3.707   4.317   5.208   5.959

Two-tail is more strict: at α = .05 and df = 4, t-critical is 2.132 (one-tail) and 2.776 (two-tail).

Your t-value (3.91) is greater than t-critical (2.776). Reject Ho.
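The one-sample t formula can be evaluated directly to check the arithmetic. This is a minimal sketch: the critical values are copied from the df = 4 row of the t-table (alpha = .05), and the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

fern_heights_m = [1.2, 3.0, 0.5, 2.3, 1.5]
mu_0 = 0.0  # value of the mean under Ho

n = len(fern_heights_m)
df = n - 1

# t = (sample mean - hypothesized mean) / (s / sqrt(n))
t_value = (mean(fern_heights_m) - mu_0) / (stdev(fern_heights_m) / sqrt(n))

# Critical values for df = 4 at alpha = .05, read from the t-table.
t_crit_one_tail = 2.132
t_crit_two_tail = 2.776

# Ignore the sign of t when comparing against the critical value.
reject_two_tail = abs(t_value) > t_crit_two_tail

print(f"t = {t_value:.2f}, df = {df}")
print("Reject Ho" if reject_two_tail else "Fail to reject Ho")
```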

Comparing two sample means

All right. It is time to use Excel!

X1   X2
5    6
4    7
3    5
5    9
4    6
2    8
5    7

$$ t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} $$

df = n1 + n2 - 2

Variances must be equal

Ho: $\bar{X}_1 = \bar{X}_2$

Ha: $\bar{X}_1 \neq \bar{X}_2$

Using Excel

Click Tools, Data Analysis, t-test: Two-sample Assuming Unequal Variances

Enter data

Data Set

x1   x2
5    6
4    7
3    5
5    9
4    6
2    8
5    7

t-Test: Two-Sample Assuming Unequal Variances

                               x1      x2
Mean                           4       6.86
Variance                       1.33    1.81
Observations                   7       7
Hypothesized Mean Difference   0
df                             12
t Stat                         -4.26
P(T<=t) one-tail               0.00
t Critical one-tail            1.78
P(T<=t) two-tail               0.00
t Critical two-tail            2.18

In the output, "t Critical" is the t-critical value and "t Stat" is your value.

Do you accept or reject the Ho?

Reporting results in the literature: Sample X1 and sample X2 are significantly different, t(12) = -4.26, p < 0.05, 2-tail.

In this notation, 12 is the df. Writing p < 0.05 lets the reader know that you set alpha to 0.05. The p-value is less than 0.05, therefore the results are significant.
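For readers without Excel, the main numbers in the output can be reproduced by hand in Python. This sketch computes the t statistic from the two-sample formula and, as an assumption about how the unequal-variances df is obtained, uses the Welch-Satterthwaite approximation rounded to the nearest integer; it does not compute p-values or critical values.

```python
from math import sqrt
from statistics import mean, variance

x1 = [5, 4, 3, 5, 4, 2, 5]
x2 = [6, 7, 5, 9, 6, 8, 7]
n1, n2 = len(x1), len(x2)

m1, m2 = mean(x1), mean(x2)
v1, v2 = variance(x1), variance(x2)  # sample variances (n - 1 denominator)

# t = (mean1 - mean2) / sqrt(s1^2/n1 + s2^2/n2)
se_diff = sqrt(v1 / n1 + v2 / n2)
t_stat = (m1 - m2) / se_diff

# Welch-Satterthwaite approximation for df when variances are not pooled.
df_welch = (v1 / n1 + v2 / n2) ** 2 / (
    (v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1)
)

print(f"means: {m1:.2f}, {m2:.2f}")
print(f"variances: {v1:.2f}, {v2:.2f}")
print(f"t Stat = {t_stat:.2f}, df = {round(df_welch)}")
```

With equal sample sizes, as here, the pooled and unpooled formulas give the same t statistic, and the rounded df matches n1 + n2 - 2 = 12.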

When you compare more than 2 samples, you need to do an Analysis of Variance (ANOVA).

The relationship between an ANOVA and a t-test: F = t^2

An ANOVA is similar to a t-test in the sense that you compare the F value to the critical F value.

You also need to look at the p-value. If you set alpha to 0.05 and your p-value is less than alpha, you can reject the Ho; if it is greater than alpha, then fail to reject the Ho.
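The F = t^2 relationship can be checked numerically. It holds when exactly two groups are compared and the t-test pools the variances. The sketch below reuses the X1 and X2 samples from the t-test example and computes both statistics by hand; the variable names are illustrative.

```python
from math import sqrt
from statistics import mean

x1 = [5, 4, 3, 5, 4, 2, 5]
x2 = [6, 7, 5, 9, 6, 8, 7]
n1, n2 = len(x1), len(x2)
m1, m2 = mean(x1), mean(x2)

# Within-group variation (the pooled variance of the two samples).
ss_within = sum((x - m1) ** 2 for x in x1) + sum((x - m2) ** 2 for x in x2)
df_within = n1 + n2 - 2
ms_within = ss_within / df_within

# Pooled-variance two-sample t statistic.
t_stat = (m1 - m2) / sqrt(ms_within * (1 / n1 + 1 / n2))

# One-way ANOVA with two groups: F = MS_between / MS_within.
grand_mean = mean(x1 + x2)
ss_between = n1 * (m1 - grand_mean) ** 2 + n2 * (m2 - grand_mean) ** 2
df_between = 2 - 1  # number of groups minus one
f_stat = (ss_between / df_between) / ms_within

print(f"t = {t_stat:.2f}, t^2 = {t_stat ** 2:.2f}, F = {f_stat:.2f}")
```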

[Figure: ANOVA output; labeled parts: ANOVA, Descriptive Stats.]

Correlation & Regression

Correlation: describes the strength of association between two variables. The correlation coefficient is designated as r. r can range from -1 to +1.

+1 = a strong positive correlation
0 = no correlation
-1 = a strong negative correlation

[Figure: three scatter plots illustrating r = +1, r = -1, and r = 0.]

This relationship can be represented by a regression equation.

Table 1. Sample data of tree fern age, height, and basal area.

Age (years)   Height (m)   Basal Area (m2)
10            0.5          .025
25            1.2          .053
30            1.5          .052
40            2.3          .110
60            3.0          .180

Equation for a straight line: y = bx + a
y = DV (dependent variable), b = slope, x = IV (independent variable), a = intercept

Simple Linear Regression: shows relationship between variables.

[Figure: Correlation of Tree Fern Height & Basal Area. x-axis: Basal Area (m2); y-axis: Height (m). r = .967, p < .01]
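The correlation coefficient reported for height and basal area can be recomputed from Table 1. This is a sketch of the usual Pearson formula written out by hand, with illustrative variable names; it does not compute the p-value.

```python
from math import sqrt
from statistics import mean

# Height (m) and basal area (m2) from Table 1.
height_m = [0.5, 1.2, 1.5, 2.3, 3.0]
basal_area_m2 = [0.025, 0.053, 0.052, 0.110, 0.180]

mean_h = mean(height_m)
mean_b = mean(basal_area_m2)

# Sums of cross-products and squared deviations.
s_hb = sum((h - mean_h) * (b - mean_b) for h, b in zip(height_m, basal_area_m2))
s_hh = sum((h - mean_h) ** 2 for h in height_m)
s_bb = sum((b - mean_b) ** 2 for b in basal_area_m2)

# Pearson correlation coefficient.
r = s_hb / sqrt(s_hh * s_bb)

print(f"r = {r:.3f}")
```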

[Figure: Tree Fern Height vs. Age. x-axis: Age (years); y-axis: Height (m).]

[Figure: Tree Fern Height vs. Age, with fitted regression line. x-axis: Age (years); y-axis: Height (m). y = 0.0518 (± .02) x - 0.0098, R2 = 0.98, p < .01]
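The slope, intercept, and R2 of the regression line can be reproduced from Table 1 with ordinary least squares. The sketch below writes the formulas out by hand; the variable names are illustrative, and the ± term and p-value from the figure are not computed.

```python
from statistics import mean

# Age (years) is the IV (x); height (m) is the DV (y). Data from Table 1.
age_years = [10, 25, 30, 40, 60]
height_m = [0.5, 1.2, 1.5, 2.3, 3.0]

mean_x = mean(age_years)
mean_y = mean(height_m)

s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(age_years, height_m))
s_xx = sum((x - mean_x) ** 2 for x in age_years)
s_yy = sum((y - mean_y) ** 2 for y in height_m)

slope = s_xy / s_xx                  # b in y = bx + a
intercept = mean_y - slope * mean_x  # a in y = bx + a
r_squared = s_xy ** 2 / (s_xx * s_yy)

print(f"y = {slope:.4f} x + ({intercept:.4f})")
print(f"R2 = {r_squared:.2f}")
```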
