
Describing distributions with numbers

William P. Wattles

Psychology 302

Measuring the Center of a distribution

Mean
– The arithmetic average
– Requires measurement data

Median
– The middle value

Mode
– The most common value

Measuring the center with the Mean

$$\bar{X} = \frac{\sum X}{n}$$

Our first formula

$\bar{X}$ = the mean

$X$ = the individual score

$n$ = the number of individuals

$\sum$ = sum of

The Mean

One number that tells us about the middle using all the data.

The group, not the individual, has a mean.

[Figure: population and sample]

Sample mean: $\bar{X}$

Mu ($\mu$), the population mean

[Figure: population and sample, with the sample mean $\bar{X}$]

Calculate the mean with Excel

Save the file psy302 to your hard drive
– right click on the file
– save to desktop or temp
Open file psy302
Move flower trivia score to new sheet
Rename Sheet
– double click sheet tab, type flower
Calculate the sum
– type label: total
Calculate the mean
– type label: mean
Check with average function
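The same spreadsheet steps can be sketched in Python. This is only an illustration: the `scores` list is hypothetical, since the actual flower trivia scores in the psy302 file are not shown in these slides.

```python
from statistics import mean

# Hypothetical trivia scores (an assumption; the psy302 data is not shown).
scores = [7, 9, 4, 6, 8, 5, 10]

total = sum(scores)        # the cell labeled "total"
n = len(scores)            # the number of individuals
mean_by_hand = total / n   # the cell labeled "mean"

# Cross-check, playing the role of Excel's AVERAGE function.
mean_check = mean(scores)

print(total, mean_by_hand, mean_check)
```

Computing the mean twice, once from the sum and once with a built-in function, mirrors the "check with average function" step.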

Measuring the center with the Median

Rank order the values.
If the number of observations is odd, the median is the center observation.
If the number of observations is even, the median is the mean of the middle two observations (halfway between them).

Measuring the center with the Median

$$\text{Location of the median} = \frac{n + 1}{2}$$
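As a minimal sketch of the odd/even rule, the function below (the name `median_by_rule` is made up for this illustration) sorts the values and picks the observation at position (n + 1)/2, averaging the two middle values when that position falls between them.

```python
def median_by_rule(values):
    """Median via rank ordering: center value if n is odd,
    mean of the two middle values if n is even."""
    ordered = sorted(values)   # rank order the values
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]    # the center observation
    # Even n: halfway between the two middle observations.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_by_rule([3, 1, 2]))     # odd number of observations
print(median_by_rule([4, 1, 3, 2]))  # even number of observations
```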

The mean versus the median

The Mean
– uses all the data
– has arithmetic properties
The Median
– less influenced by outliers and extreme values

Mean vs. Median

Betty      $28,514.00
Mike       $22,316.00
Tom        $30,112.00
Miriam     $29,521.00
Stacy      $21,555.00
David      $125,366.00
Mary Lou   $22,132.00
John       $27,561.00
Gail       $24,635.00
Arthur     $30,125.00

mean       $36,183.70
median     $28,037.50
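The salary figures above can be checked with Python's standard `statistics` module. The second pair of numbers, computed without David's salary, is an addition for illustration: it shows how much one extreme value moves the mean compared with the median.

```python
from statistics import mean, median

salaries = {
    "Betty": 28514, "Mike": 22316, "Tom": 30112, "Miriam": 29521,
    "Stacy": 21555, "David": 125366, "Mary Lou": 22132, "John": 27561,
    "Gail": 24635, "Arthur": 30125,
}

everyone = list(salaries.values())
# Drop the one extreme salary to see its influence (illustrative step).
without_david = [pay for name, pay in salaries.items() if name != "David"]

mean_all, median_all = mean(everyone), median(everyone)
mean_rest, median_rest = mean(without_david), median(without_david)

print(mean_all, median_all)              # with the outlier
print(round(mean_rest, 2), median_rest)  # without the outlier
```

Removing the single outlier lowers the mean by almost $10,000, while the median changes by less than $500.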

Three things to remember about the mean

The mean uses all the data.
The group, not the individual, has a mean.
We calculate the mean on quantitative data.

The mean tells us where the middle of the data lies.

We also need to know how spread out the data are.

Mean = 68 inches, Std Dev = 0.57 inches
Mean = 68 inches, Std Dev = 8.54 inches

Measuring Spread

Knowing about the middle only tells us part of the story of the data.

We need to know how spread out the data are.

Variability

Variety is the spice of life

Without variability things are just boring

exam3, Psy314 Health Psychology

69%   61%   79%   100%
54%   60%   85%   83%
58%   75%   85%   73%
87%   57%   80%   83%
65%   68%   58%   50%
83%   55%   59%   79%
89%   74%   85%   63%

Why is the mean alone not enough to describe a distribution?

Outliers is NOT the answer!

The mean tells us the middle but not how spread out the scores are.


Example of Spread

New York mean annual high temperature: 62
San Francisco mean annual high temperature: 65

                mean   max   min   range   sd
New York        62     84    39    45      17.1
San Francisco   65     73    55    18      6.4

Example of Variability

[Chart: Psy 302 Spring 2003. Grade (0% to 120%) by student (1 to 23), with series Final and Quiz 7]

Measuring Spread

Range
Quartiles
Five-number summary
– Minimum
– First quartile
– Median
– Third quartile
– Maximum
Standard Deviation
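A rough sketch of the range and five-number summary follows, using the seven metabolic rates from the Moore example that appears later in these slides. Quartile conventions differ between textbooks and software; this sketch assumes the rule that the first and third quartiles are the medians of the observations below and above the median's position. The function name is invented for the example.

```python
from statistics import median

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum).

    Assumption: Q1 and Q3 are the medians of the lower and upper halves,
    leaving out the middle observation when n is odd.
    """
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]             # observations below the median position
    upper = ordered[half + (n % 2):]   # observations above the median position
    return (ordered[0], median(lower), median(ordered),
            median(upper), ordered[-1])


rates = [1792, 1666, 1362, 1614, 1460, 1867, 1439]
summary = five_number_summary(rates)
value_range = summary[-1] - summary[0]   # range = maximum - minimum

print(summary, value_range)
```

Spreadsheet quartile functions may interpolate differently and return slightly different quartiles for the same data.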

Mean 50.63%

Mean 33.19%

Std Dev 21.4%

Std Dev 13.2%

Deviation score

Each individual has a deviation score. It measures how far that individual deviates from the mean.
Deviation scores always sum to zero.
Deviation scores contain information.
– How far and in which direction the individual lies from the mean
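A small illustration of deviation scores, using a hypothetical list of four scores (not data from the course): each deviation is the score minus the mean, its sign gives the direction, and the deviations sum to zero.

```python
# Hypothetical scores, chosen only to keep the arithmetic easy to follow.
scores = [2, 4, 6, 8]

mean_score = sum(scores) / len(scores)

# Deviation score = X minus the mean; negative means below the mean.
deviations = [x - mean_score for x in scores]

print(mean_score)        # 5.0
print(deviations)        # [-3.0, -1.0, 1.0, 3.0]
print(sum(deviations))   # 0.0
```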


Measuring spread with the standard deviation

Measures spread by looking at how far the observations are from their mean.

The standard deviation is the square root of the variance.

The variance is also a measure of spread

Individual deviation scores

Mean      X         Deviation score
$28,756   $32,092   $3,336 dollars   The average teacher: $28,756; John: $32,092
64.5      68        3.5 inches       The average woman is 5'4 1/2"; Mary is 5'8"
110       90        -20 points       The average IQ is 110 and Bubba has a 90

Standard deviation

One number that tells us about the spread using all the data.

The group not the individual has a standard deviation.

Note!!

Standard Deviation

$$s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$$

Variance

$$s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$$

Properties of the standard deviation

s measures the spread about the mean.
s = 0 only when there is no spread. This happens when all the observations have the same value.
s is strongly influenced by extreme values.

Calculate Standard Deviation with Excel

New column headed: deviation
Deviation score = X – the mean
In new column type heading: dev2
Enter formula to square deviation
Total squared deviations
– type label: sum of squares
Divide sum of squares by n-1
– type label: variance

Moore, Example 2.7, page 50

subject    MetabolicRate
subject1   1792
subject2   1666
subject3   1362
subject4   1614
subject5   1460
subject6   1867
subject7   1439

Example 2.6, page 42

subject    MetabolicRate   dev     dev2
subject1   1792            192     36864
subject2   1666            66      4356
subject3   1362            -238    56644
subject4   1614            14      196
subject5   1460            -140    19600
subject6   1867            267     71289
subject7   1439            -161    25921

total      11200           0       214870        ss
mean       1600                    35811.6667    var
                                   189.239707    stdev
                                   189.239707    stdev check

To Calculate Standard Deviation:

1. Total the raw scores and divide by n to get the mean.
2. Calculate the deviation score for each subject (X minus the mean).
3. Square each deviation score.
4. Sum the squared deviation scores to obtain the sum of squares.
5. Divide by n-1 to obtain the variance.
6. Take the square root of the variance to get the standard deviation.
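The procedure can be sketched step by step in Python with the seven metabolic rates from the Moore example. The variable names are chosen for this illustration; the final line cross-checks against `statistics.stdev`, which also divides by n-1.

```python
import math
from statistics import stdev

rates = [1792, 1666, 1362, 1614, 1460, 1867, 1439]

n = len(rates)
mean_rate = sum(rates) / n                       # total / n
deviations = [x - mean_rate for x in rates]      # X minus the mean
squared = [d ** 2 for d in deviations]           # square each deviation
sum_of_squares = sum(squared)                    # sum the squared deviations
variance = sum_of_squares / (n - 1)              # divide by n-1
std_dev = math.sqrt(variance)                    # square root of the variance

print(mean_rate, sum_of_squares)
print(round(variance, 4), round(std_dev, 6))
print(round(stdev(rates), 6))                    # library cross-check
```

The intermediate values match the worked table: mean 1600, sum of squares 214870, variance about 35811.67, standard deviation about 189.24.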

[Figure: population and sample]

                     Population    Sample
Variance             $\sigma^2$    $s^2$
Standard deviation   $\sigma$      $s$

Little sigma ($\sigma$) is the population standard deviation; $s$ is the sample standard deviation.
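To make the population/sample distinction concrete, Python's `statistics` module provides both versions. The sample functions divide the sum of squares by n-1, as in the formulas in these slides. The population functions divide by n; that detail is standard but is stated here as background, not taken from the slides. The data are the metabolic rates from the Moore example.

```python
from statistics import pstdev, pvariance, stdev, variance

rates = [1792, 1666, 1362, 1614, 1460, 1867, 1439]

sample_var = variance(rates)       # s squared: sum of squares / (n - 1)
sample_sd = stdev(rates)           # s
population_var = pvariance(rates)  # sigma squared: sum of squares / n
population_sd = pstdev(rates)      # sigma

print(round(sample_var, 4), round(sample_sd, 4))
print(round(population_var, 4), round(population_sd, 4))
```

Treating the same seven values as a whole population gives a slightly smaller spread than treating them as a sample.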

To analyze data

1. Make a frequency distribution and plot the data.
2. Look for overall pattern and outliers or skewness.
3. Create a numerical summary: mean and standard deviation.

Start with a list of scores

Cathy    400     Alice    300
Paula    300     Mitzi    200
Sandy    500     Jack     700
Lois     400     Mike     500
Anne     500     Dawn     600
Miriam   600     Vicki    400
June     400     George   500
David    500     Ashley   800

Make a frequency distribution

200 xxxxxxxxxxx
300 xxxxxxxxxxxxxxxxxxxxxxxx
400 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
500 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
600 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
700 xxxxxxxxxxxxxxxxxxxxxxxxx
800 xxxxxxxxxxxxx

Frequency distribution

score   frequency
200     50
300     100
400     150
500     250
600     150
700     100
800     50
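As an illustration of tallying, the sketch below counts the sixteen named scores from the "Start with a list of scores" slide using `collections.Counter`. Those sixteen scores are a much smaller set than the one summarized in the frequency table above, so the counts differ from that table.

```python
from collections import Counter

# The sixteen scores listed by name on the earlier slide.
scores = [400, 300, 300, 200, 500, 700, 400, 500,
          500, 600, 600, 400, 400, 500, 500, 800]

frequency = Counter(scores)

# Print a text frequency distribution, one x per observation.
for score in sorted(frequency):
    print(score, "x" * frequency[score], frequency[score])
```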

Represent with a chart (histogram)

[Histogram: SAT Scores. Frequency (0 to 250) by Score (200 to 800)]

Represent with line chart

[Line chart: SAT Scores. Frequency (0 to 250) by Score (200 to 800)]

Density Curve

Replaces the histogram when we have many observations.

Transform a score

Hotel Atlantico: 200 pesos
Peso: a unit of measure

Transform a score

1 dollar = 28.38 pesos
200/28.38 = $7.05
Dollar: a unit of measure

Standardized observations or values

To standardize is to transform a score into standard deviation units.
Frequently referred to as z-scores.
A z-score tells how many standard deviations the score or observation falls from the mean, and in which direction.

Standard Scores (Z-scores)

Individual scores expressed in terms of the mean and standard deviation of the sample or population.

Z = (X minus the mean) / standard deviation

Z-score

$$z = \frac{X - \mu}{\sigma}$$

New symbols

$\mu$ = the population mean

$\sigma$ = the population standard deviation

$z$ = the standardized value

Calculate Z-scores for trivia data

Label column E as Z-score.
Type formula: deviation score/std dev.
Make the std dev reference absolute (use F4 to insert dollar signs).
Copy the formula down.
Check: the z-scores should sum to zero.
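The spreadsheet column can be mimicked in Python as a rough sketch. The scores are hypothetical, because the trivia data are not shown in these slides. The single `std_dev` value reused for every row plays the role of the absolute cell reference.

```python
from statistics import mean, stdev

# Hypothetical trivia scores (an assumption; the real data are in psy302).
scores = [7, 9, 4, 6, 8, 5, 10]

mean_score = mean(scores)
std_dev = stdev(scores)   # one fixed value, like an absolute reference

# z-score = deviation score / standard deviation, copied down the column.
z_scores = [(x - mean_score) / std_dev for x in scores]

for x, z in zip(scores, z_scores):
    print(x, round(z, 2))

# Check: z-scores sum to zero, up to floating-point rounding.
print(abs(sum(z_scores)) < 1e-9)
```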

File extensions

Word: .doc
Excel: .xls
Text files: .txt

To view file extensions

Open Windows Explorer.
Choose Tools/Folder Options/View.
Uncheck “hide extensions for known file types.”

Z Scores

Height of young women
– Mean = 64 inches
– Standard deviation = 2.7 inches
How tall in standard deviations is a woman 70 inches tall? z = 2.22
A woman 5 feet tall (60 inches) is how tall in standard deviations? z = -1.48

Calculating Z scores

$$z_{\text{height}} = \frac{60 - 64}{2.7} = -1.48$$

$$z_{\text{height}} = \frac{70 - 64}{2.7} = 2.22$$
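The two height calculations can be reproduced with a short helper. The function name `z_score` is chosen for this sketch; the mean of 64 inches and standard deviation of 2.7 inches come from the slide.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / std_dev


MEAN_HEIGHT = 64.0   # inches, from the slide
SD_HEIGHT = 2.7      # inches, from the slide

z_tall = z_score(70, MEAN_HEIGHT, SD_HEIGHT)
z_short = z_score(60, MEAN_HEIGHT, SD_HEIGHT)

print(round(z_tall, 2))    # 2.22
print(round(z_short, 2))   # -1.48
```

The sign carries the direction: the 70-inch woman is above the mean, the 60-inch woman below it.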

Calculating X from Z scores

$$X = \mu + z\sigma$$
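Going from a z-score back to a raw score reverses the standardization: multiply z by the standard deviation and add the mean. The sketch below uses the height figures from the earlier slides (mean 64 inches, standard deviation 2.7 inches); the function name is invented for the example.

```python
def x_from_z(z, mean, std_dev):
    """Raw score that lies z standard deviations from the mean."""
    return mean + z * std_dev


MEAN_HEIGHT = 64.0   # inches
SD_HEIGHT = 2.7      # inches

# Two standard deviations above the mean.
two_sd_above = x_from_z(2.0, MEAN_HEIGHT, SD_HEIGHT)

# Round trip: standardize 70 inches, then convert back.
z = (70 - MEAN_HEIGHT) / SD_HEIGHT
recovered = x_from_z(z, MEAN_HEIGHT, SD_HEIGHT)

print(round(two_sd_above, 1), round(recovered, 1))
```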

Types of data

Categorical or Qualitative data
– Nominal: Assign individuals to mutually exclusive categories. Exhaustive: everyone is in one category.
– Ordinal: Involves putting individuals in rank order. Categories are still mutually exclusive and exhaustive, but the order cannot be changed.

Types of data

Measurement or Quantitative Data
– Interval data: There is a consistent interval or difference between the numbers. Zero point is arbitrary.
– Ratio data: Interval scale plus a meaningful zero. Zero means none. Weight, money and the Kelvin temperature scale exemplify ratio data.
– Measurement data allows for arithmetic operations.

The End