7/21/2019 4 Role of Statistics in Research
The Role of Statistics in Research

Scales of Measurement
Nominal Scale
Ordinal Scale
Interval Scale
Ratio Scale
Importance of Scales of Measurement
[...] project is meant to generalize. A population can be as broadly defined as all of the people in the world, or all living organisms, or the population can be more narrowly defined: all 18- to 22-year-olds or [...] psychology majors at a particular school. Typically, a researcher cannot test all of the members of a target population. Instead, only a small percentage of that population, a sample of the members, can be tested. The sample is [...] the entire population, so that if we identify a characteristic of the sample (for example, that the men in the sample are taller than the women) [...] that a characteristic of our sample can be generalized to the population.

Let's assume that a researcher has conducted a memory experiment, and the mnemonic (memory strategy) group performs better than the control group. Perhaps the participants in the [...] average of 18 out of 20 words and the [...] 16 out of 20 words. At this point, the researcher has not yet supported the alternative hypothesis that mnemonic instructions lead to better performance, even though 18 is greater than 16. The researcher [...] know that the sample of participants in the [...]
This chapter offers a nontechnical introduction to some statistical concepts. The purpose is to familiarize you with some of the terms and [...] organizing and analyzing data statistically so that you will [...] encounter as part of your studies [...]
Scales of Measurement

[...] satisfactory work depends on using the right tools. [...] appropriate [...] statistical tool [...] the most [...] technique is to identify the type of data being analyzed.

In psychology, researchers assume that anything that exists, be it a [...] psychological construct, [...] how happy [...] measurement. It entails identifying a characteristic, your happiness, and quantifying the amount of happiness you are experiencing. The rules used in this example are that the number you assign to your happiness must be between 1 and 10, where 1 refers to a lack of happiness and 10 refers to an abundance of happiness.

Not all measurement systems are equivalent. Some measurements can be mathematically manipulated (for instance, by adding a constant or by taking the square root of each number) but still keep their primary characteristics. Other systems are very intolerant of any mathematical manipulation; adding a constant or taking the square root renders the data meaningless. Measurement systems can be assigned to one of four scales of measurement that vary by the level of mathematical manipulation they can tolerate. These four scales of measurement (also called levels of measurement) are the nominal, ordinal, interval, and ratio scales.

Nominal Scale

The nominal scale of measurement merely classifies objects or individuals as belonging to different categories. The order of the categories is arbitrary and unimportant. Thus, participants might be categorized as male or female, and the male category may be assigned the number 1 and the female category assigned the number 2. These numbers say nothing about the importance of one category as compared to the other. The numbers could just as well be 17.35 and 29.46. Other examples of nominal scales of measurement are numbers on basketball players' jerseys or the numbers assigned by the Department of Motor Vehicles to the license plates of cars. Numbers, when used in a nominal scale of measurement, serve as labels only, and provide no information on the magnitude or amount of the characteristic being measured.

Ordinal Scale

An ordinal scale differs from a nominal scale in that the order of the categories is important. A grading system with the grades A, B, C, D, and F is an ordinal scale. The order of the categories reflects a decrease in the amount of the stuff being measured (in this case, knowledge). Note, however, that the distance between the categories is not necessarily equal. Thus, the difference between one A and one B is not necessarily the same as the difference between another A and another B. Similarly, the difference between any A and B is not necessarily the same as the difference between a B and a C.

Rank-order data are also measured on an ordinal scale. An observer may rank-order participants according to attractiveness, or a researcher may ask tasters to rank-order a number of crackers according to saltiness. When your eyesight is tested and you are asked to choose which of two lenses results in a clearer image, you are being asked to provide ordinal data. Again, when data are rank-ordered, a statement is being made about the magnitude or amount of the characteristic being measured, but the intervals between units need not be equivalent. If seven people are rank-ordered on attractiveness, the difference in attractiveness between the first and second person is not necessarily the same as the difference between the second and the third. The first and second persons may both be very attractive, with only the smallest difference between them, while the third person might be substantially less attractive than the second.
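The difference between labels and ordered categories can be sketched in a few lines of Python. This is only an illustration: the data values, the variable names, and the numeric codes given to the grades are assumptions made up for the example, not part of the chapter.

```python
from collections import Counter

# Hypothetical nominal data: the numeric codes are arbitrary labels.
codes_a = [1, 2, 2, 1, 2]                  # 1 = male, 2 = female
relabel = {1: 17.35, 2: 29.46}             # an equally valid labeling
codes_b = [relabel[c] for c in codes_a]

# Category counts survive relabeling ...
counts_a = sorted(Counter(codes_a).values())
counts_b = sorted(Counter(codes_b).values())

# ... but arithmetic on the labels changes with the labeling, so it says nothing.
mean_a = sum(codes_a) / len(codes_a)
mean_b = sum(codes_b) / len(codes_b)

# Hypothetical ordinal data: grades ordered A > B > C > D > F.
grade_code = {"A": 5, "B": 4, "C": 3, "D": 2, "F": 1}
grades = ["B", "A", "F", "C"]
best_first = sorted(grades, key=grade_code.get, reverse=True)

# Squaring the codes changes the spacing between grades but not their order.
squared_code = {g: c ** 2 for g, c in grade_code.items()}
best_first_squared = sorted(grades, key=squared_code.get, reverse=True)
```

With nominal codes only the counts per category are stable; with ordinal codes the ordering is stable too, but the distances between codes still carry no information.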
Interval Scale

The interval scale of measurement is characterized by equal units of measurement throughout the scale. Thus, measurements made with an interval scale provide information about both the order and the relative quantity of the characteristic being measured. Interval scales of measurement, however, do not have a true zero value. A true zero means that none of the characteristic being measured remains. Temperature measurements in degrees Fahrenheit or in degrees Celsius (also called centigrade) correspond to interval scales. The distance between degrees is equal over the full length of the scale; the difference between 20° and 40° is the same as that between 40° and 60°. In neither scale, however, is there a true zero; zero simply represents another point on the scale, and negative numbers are possible and meaningful. Because there is no true zero on these scales, it is inappropriate to say that 40° is twice as warm as 20°. (In other words, ratios cannot be computed with interval scale data.)
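A short Python sketch can make the point about ratios concrete. It converts the same three temperatures from Celsius to Fahrenheit with the standard conversion formula; the function and variable names are chosen for this illustration only.

```python
def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32

c20, c40, c60 = 20.0, 40.0, 60.0
f20, f40, f60 = c_to_f(c20), c_to_f(c40), c_to_f(c60)

# Equal intervals stay equal on both scales.
equal_steps_c = (c40 - c20) == (c60 - c40)
equal_steps_f = (f40 - f20) == (f60 - f40)

# Ratios do not survive the change of scale, because neither zero is a true zero.
ratio_c = c40 / c20   # 2.0 in Celsius
ratio_f = f40 / f20   # 104 / 68 in Fahrenheit, not 2.0
```

The same two temperatures give a ratio of 2 on one scale and about 1.5 on the other, so "twice as warm" is a property of the numbering, not of the weather.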
There is a controversy among psychological researchers regarding interval and ordinal scales in relation to ratings. Suppose that a participant is asked to rate something on a scale with particular end points, such as 1 to 7 or 0 to 5. For example, a person might be asked the following rating question:

How satisfied are you with your friendships?

1    2    3    4    5    6    7    8    9    10
very dissatisfied                        very satisfied

The end numbers usually have labels, but the middle numbers sometimes do not. The controversy arises as to whether the ratings should be considered ordinal data or interval data. What has never been ascertained is whether the scales that people use in their heads have units of equal size. If the units are equal in size, the data could be regarded as interval data; if they are unequal, the data should be regarded as ordinal data. This is a point of contention because interval data often permit the use of more powerful statistics than do ordinal data.

There is still no consensus about the nature of rating scale data. In some research areas, ratings tend to be treated cautiously and are considered ordinal data. In other areas, such as language and memory studies, where participants may be asked to rate how familiar a phrase is or how strong their feeling of knowing is, ratings tend to be treated as interval data. The particular philosophy of any particular area of study is possibly best ascertained from previous research in that area.
Ratio Scale

The ratio scale of measurement provides information about order; all units are of equal size throughout the scale, and there is a true zero value that represents an absence of the characteristic being measured. The true zero allows ratios of values to be formed. Thus, a person who is 50 years old is twice as old as a person who is 25. Age in years is a ratio scale. Each year represents the same amount of time no matter where it occurs on the scale; the year between 20 and 21 years of age is the same amount of time as the year between 54 and 55.

As you may have noticed, the scales of measurement can be arranged hierarchically from nominal to ratio. Starting with the ordinal scale, each scale includes all the capabilities of the preceding scale plus something new. Thus, nominal scales are simply categorical, while ordinal scales are categorical with the addition of ordering of the categories. Interval scales of measurement involve ordered categories of equal size; in other words, the intervals between numbers on the scale are equivalent throughout the scale. Ratio scales also have equal intervals but, in addition, begin at a true zero score that represents an absence of the characteristic being measured and allows for the computation of ratios.
Importance of Scales of Measurement

The statistical techniques that are appropriate for one scale of measurement may not be appropriate for another. Therefore, the researcher must be able to identify the scale of measurement being used, so that appropriate statistical techniques can be applied. Sometimes, the inappropriateness of a technique is subtle; at other times, it can be quite obvious, and quite embarrassing to a researcher who lets an inappropriate statistic slip by. For example, imagine that ten people are rank-ordered according to height. In addition, information about the individuals' weight in pounds and age in years is recorded. When instructing the computer to calculate arithmetic averages, the researcher absentmindedly includes the height rankings along with the other variables. The computer calculates that the average age of the participants is 22.6 years, that the average weight of the group is 155.6 pounds, and that the average height is 5.5. Calculating an average of ordinal data, such as the height rankings in this example, will yield little useful information. Meaningful results will only be obtained by using the statistical technique appropriate to the data's scale of measurement.
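Why the average of height rankings is uninformative can be sketched in Python. The two sets of heights below are invented for illustration, and the ranking helper assumes no two people share a height; whatever the actual heights are, ten ranks always average 5.5.

```python
from statistics import mean

def ranks_tallest_first(heights):
    """Rank 1 = tallest; assumes all heights are distinct."""
    order = sorted(heights, reverse=True)
    return [order.index(h) + 1 for h in heights]

# Two hypothetical groups of ten people (heights in inches).
short_group = [58, 59, 60, 61, 62, 63, 64, 65, 66, 67]
tall_group = [66, 68, 69, 70, 71, 72, 73, 74, 76, 80]

mean_rank_short = mean(ranks_tallest_first(short_group))
mean_rank_tall = mean(ranks_tallest_first(tall_group))

# The mean of the actual heights does distinguish the groups.
mean_height_short = mean(short_group)
mean_height_tall = mean(tall_group)
```

Both groups produce the same mean rank even though one group is much taller, which is why the mean of rank-order data tells the researcher nothing about height.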
On what scale of measurement would each of the following data be measured?
a. The number of dollars in one's wallet.
b. The rated sweetness of a can of soda.
c. Whether one responds yes or no to a question.
d. Height measured in inches.
e. The gender of individuals.
f. [...]
Types of Statistical Techniques

Having recognized the type of data collected, the researcher needs also to consider the question that he or she wants to answer. You can't tighten a screw with a hammer, and you can't answer one research question with a statistical test meant for a different question. Let's consider three questions that a researcher might ask:

1. How can I describe the data?
2. To what degree are these two variables related to each other?
3. Do the participants in this group have different scores than the participants in the other group?

These three questions require the use of different types of statistical techniques. The scale of measurement on which the data were collected determines more specifically which statistical tool to use.
Describing the Data

When a researcher begins organizing a set of data, it can be very useful to determine typical characteristics of the different variables. The statistical techniques used for this task are aptly called descriptive statistics. Usually, researchers use two types of descriptive statistics: a description of the average score and a description of how spread out or close together the data lie.
Averages

Perhaps the most commonly discussed characteristic of a data set is its average. However, there are three different averages that can be calculated: the mode, the median, and the mean. Each provides somewhat different information. The scale of measurement on which the data are collected will, in part, determine which average is most appropriate to use.

Let's consider a researcher who has collected data on people's weight measured in pounds; hair color categorized according to 10 shades ranging from light to dark; and eye color labeled as blue, green, brown, or other. This researcher has measured data on three different scales of measurement: ratio, ordinal, and nominal, respectively. When describing the eye colors of the participants, the researcher will need to use a different statistic than when describing the participants' average weight.

To describe the eye colors of the participants, the researcher would use the mode. The mode is defined as the score that occurs most frequently. Thus, if most of the participants had brown eyes, brown would be the modal eye color. Sometimes a set of data will have two scores that tie for occurring most frequently. In that case, the distribution is said to be bimodal. If three or more scores are tied for occurring most frequently, the distribution is said to be multimodal.
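Finding the mode is a counting exercise, which the following Python sketch illustrates with `statistics.multimode` from the standard library (Python 3.8 or later). The eye-color lists and the helper name `describe_modes` are assumptions made up for this example.

```python
from statistics import multimode

def describe_modes(scores):
    """Return the most frequent score(s) and a label for the distribution."""
    modes = multimode(scores)
    if len(modes) == 1:
        label = "unimodal"
    elif len(modes) == 2:
        label = "bimodal"
    else:
        label = "multimodal"
    return modes, label

# Hypothetical eye-color data (nominal scale).
mostly_brown = ["brown", "blue", "brown", "green", "brown", "blue"]
two_way_tie = ["brown", "blue", "brown", "blue", "green"]

modes_1, label_1 = describe_modes(mostly_brown)
modes_2, label_2 = describe_modes(two_way_tie)
```

Because the mode only requires counting how often each category occurs, it works even when the categories have no order at all.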
In our example, hair color is measured on an ordinal scale of measurement, since we have no evidence that the ten shades of hair color are equally distant from each other. To describe average hair color, the researcher could use the mode, the median, or perhaps both. The median is defined as the middle point in a set of scores, the point below which 50% of the scores fall. The median is especially useful because it provides information about the distribution of other scores in the set. If the median is the eighth darkest hair category, then we know that half of the participants had hair in categories 8 to 10, and that the other half of the participants had hair in categories 1 to 8.
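The median of ordered categories can be sketched in Python as follows. The hair-shade codes are invented for illustration, and `statistics.median_low` is used here (an assumption of this sketch, not the chapter's method) so that the result is always one of the observed categories rather than a value halfway between two of them.

```python
from statistics import median_low

# Hypothetical hair-shade categories, coded 1 (lightest) to 10 (darkest).
hair_shades = [2, 5, 8, 8, 9, 10, 8, 3, 9]

median_shade = median_low(hair_shades)

# At least half of the scores sit at or above the median,
# and at least half sit at or below it.
at_or_above = sum(1 for s in hair_shades if s >= median_shade)
at_or_below = sum(1 for s in hair_shades if s <= median_shade)
half = len(hair_shades) / 2
```

Only the order of the categories is used here; no assumption is made that neighboring shades are equally far apart.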
Finally, our researcher will want to describe the participants' average weight. The researcher could use the mode or the median here, or the researcher might wish to use the mean. The mean is the arithmetic average of the scores in a distribution; it is calculated by adding up the scores in the distribution and dividing by the number of scores. The mean is probably the most commonly used type of average, in part because it is mathematically very manipulable. It is difficult to write a formula that describes how to calculate the mode or median, but it is not difficult to write a formula for adding a set of scores and dividing the sum by the number of scores. Because of this, the mean can be embedded within other formulas.

The mean does, however, have its limitations. Scores that are inordinately large or small (called outliers) are given as much weight as every other score in the distribution. This can affect the mean score, which will be inflated if the outlier is large and deflated if the outlier is small. For example, suppose a set of exam scores is 82, 88, 84, 86, and 20. The mean of these scores is 72, although four of the five people earned scores in the 80s. The inordinately small score, the outlier 20, deflated the mean. Researchers need to watch out for this problem when using means. Nevertheless, the mean is still a very popular average. The mean can be used with data measured on interval and ratio scales. It is sometimes used with numerical ordinal data (such as rating scales), but it cannot be used with rank-order data or data measured on a nominal scale.
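The effect of the outlier in the exam-score example can be checked with a few lines of Python. The scores are the five given in the text; the comparison with the median and with the mean of the remaining four scores is added here for illustration.

```python
from statistics import mean, median

exam_scores = [82, 88, 84, 86, 20]

mean_all = mean(exam_scores)        # sum of scores divided by number of scores
median_all = median(exam_scores)    # middle score, barely affected by the outlier

# Dropping the outlier (for comparison only) shows how far it pulled the mean.
without_outlier = [s for s in exam_scores if s != 20]
mean_without = mean(without_outlier)
```

The single score of 20 pulls the mean from 85 down to 72, while the median stays at 84, in the range where most of the class actually scored.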
The mode, median, and mean are ways of describing the average score among a set of data. They are often called measures of central tendency because they tend to describe the scores in the middle of the distribution (although the mode need not be in the middle at all).
A researcher observes cars entering a parking lot and records the gender of the driver, the number of people in the car, the type of car (Ford, Chevrolet, Mazda, etc.), and the speed at which the car drives through the lot (measured with a radar gun in mph).
a. For each type of data measured, what would be an appropriate average to calculate (mode, median, and/or mean)?
b. One driver traveled through the parking lot at a speed 20 mph higher than any other driver. What type of average would be most affected by this one score?
Another [...] which the scores are close to the average or are spread out. Statistics that describe this characteristic are called measures of dispersion.
Measures of Dispersion

Although they can be used with nominal and ordinal data, measures of dispersion are used primarily with interval or ratio data. The most straightforward measure of dispersion is the range. The range [...] values [...] in a discrete data set or continuous distribution. In a discrete data set, fractions of scores are not possible, such as the number of times a sample of women have been pregnant; as they say, you can't be [...] pregnant: she either is or she isn't. In a continuous distribution set, fractions of scores are possible, such as [...] people in a sample [...]. The range is [...] by subtracting the lowest score from the highest [...]:

Range = Highest - Lowest + 1

We add 1 so that the range will include both the highest value and the lowest value. [...] 110, [...] 125, 176, [...]

202 - 110 + 1 = 93

The sample of scores covers 93 pounds from the lightest to the heaviest.
The range tells us over how many scores the data are spread, but it does not give us any information about how the scores are distributed over the range. It is limited because it relies on only two scores from the entire distribution. But it does provide us with some useful information about the spread of the scores and it is appropriate for use with ordinal, interval, and ratio data.
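The range formula can be written as a one-line Python function. The weights below are invented for illustration and are not the chapter's sample; the function name is likewise an assumption of this sketch.

```python
def inclusive_range(scores):
    """Range = Highest - Lowest + 1, so both end values are counted."""
    return max(scores) - min(scores) + 1

# Hypothetical weights in pounds.
weights = [120, 135, 150, 162, 180]
weight_range = inclusive_range(weights)

# Only the two extreme scores matter: changing the middle scores changes nothing.
clustered = [120, 121, 122, 123, 180]
clustered_range = inclusive_range(clustered)
```

The two lists have the same range even though their scores are distributed very differently, which is the limitation described above.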
A more commonly used measure of dispersion is the standard deviation. The standard deviation may be thought of as expressing the average distance that the scores in a set of data fall from the mean. For example, imagine that the mean score on an exam was 74. If the class all performed about the same, the scores might range from 67 to 81; this set of data would have a relatively small standard deviation, and the average distance from the mean of 74 would be fairly small. On the other hand, if the members of the class performed less consistently (if some did very well, but others did quite poorly, perhaps with scores ranging from 47 to 100), the standard deviation would be quite large; the average distance from the mean of 74 would be fairly big.

The standard deviation and its counterpart, the variance (the standard deviation squared), are probably the most commonly used measures of dispersion. They are used individually and also are embedded within other more complex formulas. To calculate a standard deviation or variance, you need to know the mean. Because we typically calculate a mean with data measured on interval or ratio scales, standard deviation and variance are not appropriate for use with nominal data.

Learning to calculate standard deviation and variance is not necessary for the purposes of this book (although it is presented in appendix A). The underlying concept (the notion of how spread out or clustered the data are) is important, however, especially in research where two or more groups of data are being compared. This issue will be discussed a little later in the chapter.
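The contrast between a consistent class and an inconsistent one can be sketched in Python. The two sets of scores are invented so that both have a mean of 74 and match the ranges mentioned above. The sketch uses the population formulas `pstdev` and `pvariance` from the standard library; whether appendix A divides by n or by n - 1 is not stated here, so that choice is an assumption.

```python
from statistics import mean, pstdev, pvariance

# Hypothetical exam scores, both with a mean of 74.
consistent_class = [67, 70, 74, 78, 81]
varied_class = [47, 60, 74, 89, 100]

sd_consistent = pstdev(consistent_class)   # small: scores cluster near 74
sd_varied = pstdev(varied_class)           # large: scores are spread out

# The variance is the standard deviation squared.
var_varied = pvariance(varied_class)
```

Both classes have the same average, so only a measure of dispersion reveals that they performed very differently.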
The weather report includes information about the normal temperature for the day. Suppose that today the temperature is 10 degrees above normal. To determine if today is a very strange day or not especially strange, we need to know the standard deviation. If we learn that the standard deviation is 15 degrees, what might we conclude about how normal or abnormal the weather is today? If the standard deviation is 5 degrees, what does this suggest about today's weather?
Measures of Relationships

Often a researcher will want to know more than the average and degree of dispersion for different variables. Sometimes, the researcher wants to learn how much two variables are related to one another. In this case, the researcher would want to calculate a correlation. A correlation is a measure of the degree of relationship between two variables. For example, if we collected data on the number of hours students studied for a midterm exam and the grades received on that exam, a correlation could be calculated between the hours studied and the midterm grade. We might find that those with higher midterm grades tended to study more hours, while those with lower midterm grades tended to study for fewer hours. This is described as a positive correlation. With a positive correlation, an increase in one variable is accompanied by an increase in the other variable. With a negative correlation, by contrast, an increase in one variable is accompanied by a decrease in the other variable. A possible negative correlation might occur between the number of hours spent watching television the night before an exam and the scores on the exam. As the number of hours of viewing increases, the exam scores decrease.
A mathematical formula is used to calculate a correlation coefficient, and the resulting number will be somewhere between -1.00 and +1.00. The closer the number is to either +1.00 or -1.00, the stronger the relationship between the variables is. The closer the number is to 0.00, the weaker the correlation is. Thus, +.85 represents a relatively strong positive correlation, but +.03 represents a weak positive correlation. Similarly, -.91 represents a strong negative correlation, but -.12 represents a relatively weak negative correlation. The strength of the relationship is represented by the absolute value of the correlation coefficient. The direction of the relationship is represented by the sign of the correlation coefficient. Therefore, -.91 represents a stronger correlation than does +.85.
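The chapter does not give the formula, but a common one is Pearson's product-moment coefficient, sketched below in plain Python. The function name and the study, television, and exam data are assumptions invented for this illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r for paired scores (one x and one y per participant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical paired data for five students.
hours_studied = [1, 2, 3, 4, 5]
hours_of_tv = [5, 4, 3, 2, 1]
exam_scores = [60, 65, 75, 80, 90]

r_study = pearson_r(hours_studied, exam_scores)   # positive: both rise together
r_tv = pearson_r(hours_of_tv, exam_scores)        # negative: one rises, one falls

# Strength is the absolute value; direction is the sign.
stronger = max([-0.91, 0.85], key=abs)
```

Comparing coefficients by absolute value is what makes -.91 a stronger correlation than +.85.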
A particular type of graph called a scattergram is used to demonstrate the relationship between two variables. The two variables (typically called the x and the y variables) are plotted on the same graph. The x variable is plotted along the horizontal x-axis, and the y variable is plotted along the vertical y-axis. Figure 4.1 is a scattergram of the hypothetical data for number of hours studied and midterm exam scores.

Each point on figure 4.1 represents the two scores for each person. To calculate a correlation there must be pairs of scores generated by one set of participants, not two separate sets of scores generated by separate sets of participants. Notice that the points tend to form a pattern from the lower left corner to the upper right corner. This lower left to upper right pattern is an indication of a positive correlation. For a negative correlation, the points show a pattern from the top left corner to the bottom right corner. Furthermore, the more closely the points fall along a straight line, the stronger the correlation between the two variables. Figure 4.2 presents several scattergrams representing positive and negative correlations of various strengths.
Several types of correlations can be calculated. The two most common are Pearson's product-moment correlation (more often called Pearson's r) and Spearman's rho (for which the corresponding Greek symbol is ρ). Pearson's r is used when the two variables being correlated are measured on interval or ratio scales. When one or both variables are measured on an ordinal scale, especially if the variables are rank-ordered, Spearman's rho is appropriate. Other correlation coefficients can be calculated for situations when, for example, one variable is measured on an [...]
[Figure 4.1: Scattergram of midterm exam scores and numbers of hours spent studying for the exam. Labeled axis: midterm exam score.]

[Figure 4.2: Scattergrams representing correlations of different strengths and directions. Panels: (a) strong positive, (b) strong negative, (c) weak positive, (d) weak negative, (e) no correlation.]
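Spearman's rho, mentioned above for rank-ordered data, can be sketched as Pearson's r computed on ranks. This is a minimal illustration that assumes no tied scores; the function names and the data are invented for the example.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r for paired scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def to_ranks(scores):
    """Rank 1 = smallest score; assumes no ties."""
    order = sorted(scores)
    return [order.index(s) + 1 for s in scores]

def spearman_rho(xs, ys):
    """Spearman's rho as Pearson's r on the ranks (no-ties case)."""
    return pearson_r(to_ranks(xs), to_ranks(ys))

# Hypothetical paired scores: y always rises with x, but not in a straight line.
x_scores = [10, 20, 30, 40, 50]
y_scores = [1, 4, 9, 16, 30]

rho = spearman_rho(x_scores, y_scores)
r = pearson_r(x_scores, y_scores)
```

Because rho uses only the ordering of the scores, it reaches its maximum whenever one variable consistently rises with the other, while Pearson's r falls short of 1 unless the points lie on a straight line.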
Table 4.1 Some Appropriate Statistics for Different Scales of Measurement

Statistical technique, by scale of measurement:

1. Averages
   Nominal: mode
   Ordinal: mode, median
   Interval: mode, median, mean
   Ratio: mode, median, mean
2. Measures of dispersion
   Interval: range, s.d.,(a) variance
   Ratio: range, s.d., variance
3. Correlations
   Nominal: φ (phi) coefficient
   Ordinal: Spearman's ρ
   Interval: Pearson's r
   Ratio: Pearson's r
4. Single group compared to a population
   Nominal: χ² goodness of fit
   Ordinal: χ² goodness of fit
   Interval: z-test, single-sample t
   Ratio: z-test, single-sample t
5. Two separate groups: χ² TOI,(b) Wilcoxon's rank-sum, independent-samples t
6. Three or more groups: χ² TOI, Kruskal-Wallis, ANOVA
7. One group tested twice: Mann-Whitney U, dependent-samples t

(a) Standard deviation
(b) χ² test of independence

Summary

Researchers use statistics to help them test their hypotheses. Often, statistics are used to generalize the results from a sample to a larger population.

Which type of statistical technique is used depends on the scale of measurement on which the data are collected. Data measured on a nominal scale are classified in different categories. Order is not important for nominal data, but it is for data measured on an ordinal scale of measurement. The data measured on an interval scale of measurement are also ordered but, in addition, the units of measurement are equal throughout the scale. The ratio scale of measurement is much like the interval scale, except that it includes a true zero, which indicates a zero amount of the construct being measured.

The scale of measurement for the data and the question being asked by the researcher determine what statistical technique should be used. When describing data, descriptive statistics are used. These include averages and measures of dispersion.

There are three ways to measure an average: the mode, the median, and the mean. The mode is the most frequent score; the median is the central score; and the mean is the arithmetic average of the data set.

Measures of dispersion provide information about how clustered together or spread out the data are in a distribution. The range describes the number of score values the data are spread across. The variance and standard deviation provide information about the average distance the scores fall from the mean.

A researcher might also ask if two variables are related to each other. This question is answered by calculating a correlation coefficient. The correlation coefficient is a number between -1.00 and +1.00. The closer the coefficient is to either -1.00 or +1.00, the stronger the correlation is. The negative and positive signs indicate whether the variables are changing in the same direction (a positive correlation) or in opposite directions (a negative correlation).

Finally, a researcher may wish to compare sets of scores in order to determine if an independent variable had an effect on a dependent variable. A number of statistical techniques can be used to look for this difference. The appropriate technique depends on a number of factors, such as the number of groups being compared and the scale of measurement on which the data were collected.

If data at the ratio or interval level were collected, the statistical techniques that look for differences between groups have the same underlying logic. A difference between groups is considered to exist when the variation among the scores between the groups is considerably greater than the variation among the scores within the group.

When data are measured on ordinal or nominal scales, other statistical techniques can be used; these tend to be less powerful than those used for data on ratio and interval scales, though.

Statistical techniques are necessary to test research hypotheses once data have been collected. Knowledge of this field is essential for research psychologists.

Important Terms and Concepts

analysis of variance (ANOVA)
between-groups variance
bimodal
correlation
descriptive statistics
error variance
interval scale
mean
measurement
measures of central tendency
measures of dispersion
median
mode
multimodal
negative correlation
nominal scale
nonparametric tests
ordinal scale
outliers
parameter
parametric tests
population
positive correlation
range
ratio scale
sample
scattergram
standard deviation
t-test
variance
within-group variance

Exercises
1. [...] examples of variables corresponding to each of the scales of measurement.
2. a. If a researcher measures height in inches, what averages might be calculated?
   b. If a researcher measures height by assigning people to the categories short, medium, and tall, what averages might be calculated?
3. a. If a researcher measures weight in pounds, what measures of dispersion could be calculated?
   b. If a researcher measures weight in ounces, what measures of dispersion could be calculated?
   c. If a researcher measures weight by assigning each person to either [...], what measures of dispersion could be calculated?
4. Which correlation is stronger: -.97 or +.55?
5. What is the difference between a positive and a negative correlation? Provide an example other than the one in the chapter.
6. A researcher studying the effect of a speed-reading course on reading times compares the scores of a group that has taken the course with those of a control group. The researcher finds that the ratio of the variation between the groups to the variation within the group is equal to .76. A colleague does a similar study and finds a ratio of between-groups variation to within-group variation of 7.32. Which ratio is more likely to suggest a significant difference between reading groups?

Answers to Concept Questions and Odd-Numbered Exercises

Note: There will often be more than one correct answer for each of these questions. Consult with your instructor about your own answers.

Concept Question 4.1
a. ratio
b. ordinal or interval
c. nominal
d. ratio
e. nominal
f. ordinal

Concept Question 4.2
a. For gender, the mode; for the number in the car, the median and/or mode; for the type of car, the mode; for the speed, the mean, median, and/or mode.
b. The mean.

Concept Question 4.3
If the standard deviation is 15, a day that is 10 degrees above the normal temperature is not an unusually warm day; however, if the standard deviation is 5, a day that is 10 degrees above the normal temperature is twice the average distance from the mean (roughly), and thus is an unusually warm day.

Exercises
1. Nominal: license plate numbers, eye color. Ordinal: ordered preference for five types of cookies, class rank. Interval: degrees Fahrenheit, money in your checking account (assuming you can overdraw). Ratio: loudness in decibels, miles per gallon. There are of course any number of other correct answers.
3. a. range, standard deviation, and variance
   b. range, standard deviation, and variance
   c. range
5. A positive correlation describes a relationship in which two variables change together in the same direction. For example, if the number of violent crimes increases as crowding increases, that would be a positive correlation. A negative correlation describes a relationship in which two variables change together in opposite directions. For instance, if weight gained increases as the amount of exercise decreases, that would be a negative correlation.