8/20/2019 Quantitative Variables.docx
Quantitative Variables
Author(s): David M. Lane
Prerequisites: Variables
1. Stem and Leaf Displays
2. Histograms
3. Frequency Polygons
4. Box Plots
5. Box Plot Demonstration
6. Bar Charts
7. Line Graphs
8. Dot Plots
As discussed in the section on variables in Chapter 1, quantitative variables are variables measured on a numeric scale. Height, weight, response time, subjective rating of pain, temperature, and score on an exam are all examples of quantitative variables. Quantitative variables are distinguished from categorical (sometimes called qualitative) variables such as favorite color, religion, city of birth, and favorite sport in which there is no ordering or measuring involved.
There are many types of graphs that can be used to portray distributions of quantitative variables. The upcoming sections cover the following types of graphs: (1) stem and leaf displays, (2) histograms, (3) frequency polygons, (4) box plots, (5) bar charts, (6) line graphs, (7) scatter plots (discussed in a different chapter), and (8) dot plots. Some graph types such as stem and leaf displays are best-suited for small to moderate amounts of data, whereas others such as histograms are best-suited for large amounts of
data. Graph types such as box plots are good at depicting differences between distributions. Scatter plots are used to show the relationship between two variables.
Stem and Leaf Displays
Author(s): David M. Lane
Prerequisites: Distributions
Learning Objectives
1. Create and interpret basic stem and leaf displays
2. Create and interpret back-to-back stem and leaf displays
3. Judge whether a stem and leaf display is appropriate for a given data set
A stem and leaf display is a graphical method of displaying data. It is particularly useful when your data are not too numerous. In this section, we will explain how to construct and interpret this kind of graph.
As usual, an example will get us started. Consider Table 1 that shows the number of touchdown passes (TD passes) thrown by each of the 31 teams in the National Football League in the 2000 season.
Table 1. Number of touchdown passes.
37, 33, 33, 32, 29, 28, 28, 23, 22,
22, 22, 21, 21, 21, 20, 20, 19, 19,
18, 18, 18, 18, 16, 15, 14, 14, 14,
12, 12, 9, 6
A stem and leaf display of the data is shown in Figure 1. The left portion of Figure 1 contains the stems. They are the numbers 3, 2,
1, and 0, arranged as a column to the left of the bars. Think of these numbers as 10's digits. A stem of 3, for example, can be used to represent the 10's digit in any of the numbers from 30 to 39.
reserved for the numbers from 30 to 34 and holds the 32, 33, and 33 TD passes made by the next three teams in the table. You can see for yourself what the other rows represent.
3|7
3|233
2|889
2|001112223
1|56888899
1|22444
0|69
Figure 2. Stem and leaf display with the stems split in two.
Figure 2 is more revealing than Figure 1 because the latter figure lumps too many values into a single row. Whether you should split stems in a display depends on the exact form of your data. If rows get too long with single stems, you might try splitting them into two or more parts.
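The construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name stem_and_leaf and its split option are our own, the data are the 31 values from Table 1, and the sketch handles only non-negative whole numbers. Rows that would contain no leaves are simply omitted.

```python
from collections import defaultdict

# Touchdown passes for the 31 teams (Table 1).
td_passes = [37, 33, 33, 32, 29, 28, 28, 23, 22, 22, 22, 21, 21, 21, 20, 20,
             19, 19, 18, 18, 18, 18, 16, 15, 14, 14, 14, 12, 12, 9, 6]


def stem_and_leaf(values, split=False):
    """Return the rows of a stem and leaf display, largest stem first.

    Assumes non-negative whole numbers. With split=True each stem gets two
    rows: leaves 5-9 on the upper row and leaves 0-4 on the lower row.
    Rows with no leaves are omitted.
    """
    rows = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)      # 10's digit and 1's digit
        upper_half = split and leaf >= 5    # which half of a split stem
        rows[(stem, upper_half)].append(leaf)
    lines = []
    for stem, upper_half in sorted(rows, reverse=True):
        leaves = "".join(str(leaf) for leaf in rows[(stem, upper_half)])
        lines.append(f"{stem}|{leaves}")
    return lines


print("\n".join(stem_and_leaf(td_passes, split=True)))
```

Calling the function with split=True gives the seven rows of Figure 2; with the default split=False all values sharing a 10's digit are placed on a single row.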
There is a variation of stem and leaf displays that is useful for comparing distributions. The two distributions are placed back to back along a common column of stems. The result is a "back-to-back stem and leaf graph." Figure 3 shows such a graph. It compares the numbers of TD passes in the 1
Figure 3. Back-to-back stem and leaf display. The left side shows the 1
another subject was 27.4 milliseconds slower pronouncing aggressive words when they were preceded by weapon words.
The data are displayed with stems and leaves in Figure 4. Since stem and leaf displays can only portray two whole digits (one for the stem and one for the leaf), the numbers are first rounded. Thus, the value 43.2 is rounded to 43 and represented with a stem of 4 and a leaf of 3. Similarly, 42.9 is rounded to 43. To represent negative numbers, we simply use negative stems. For example, the bottom row of the figure represents the number -27. The second-to-last row represents the numbers -10, -10, -15, etc. Once again, we have rounded the original values from Table 2.
4|33
3|6
2|00456
1|00134
0|1245589
-0|0679
-1|005559
-2|7
Figure 4. Stem and leaf display with negative numbers and rounding.
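The rounding and the negative stems used in Figure 4 can be sketched as a small helper. The name stem_and_leaf_parts is hypothetical, and stems are returned as text so that "0" and "-0" remain distinct. Note one assumption: Python's built-in round() sends exact .5 ties to the nearest even number, which may differ from the rounding convention used to prepare the figure.

```python
def stem_and_leaf_parts(value):
    """Split a number into a (stem, leaf) pair after rounding.

    The stem is returned as text so that negative values below 10 in size
    get the stem "-0" rather than "0". Assumes the rounded value has at
    least one digit; larger values simply get longer stems.
    """
    rounded = round(value)                  # ties go to the even neighbor
    sign = "-" if value < 0 else ""         # sign taken from the original value
    stem, leaf = divmod(abs(rounded), 10)   # 10's digit(s) and 1's digit
    return f"{sign}{stem}", leaf


for value in (43.2, 42.9, -27, -6.7):
    print(value, stem_and_leaf_parts(value))
```

With this helper, 43.2 and 42.9 both land on stem 4 with leaf 3, and -27 lands on stem -2 with leaf 7, matching the rows described in the text.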
Observe that the figure contains a row headed by "0" and another headed by "-0." The stem of 0 is for numbers between 0 and
Figure 5. Stem and leaf display of populations of 185 US cities with populations between 100,000 and 500,000 in 1
Question 1 out of 7.
A stem and leaf display is a good method of displaying large amounts of data.
True
False
Stem and leaf displays can be unwieldy with large amounts of data because every single data value is shown in the figure.
Histograms
Author(s): David M. Lane
Prerequisites: Distributions, Graphing Qualitative Data
Learning Objectives
1. Create a grouped frequency distribution
2. Create a histogram based on a grouped frequency distribution
3. Determine an appropriate bin width
A histogram is a graphical method for displaying the shape of a distribution. It is particularly useful when there are a large number of observations. We begin with an example consisting of the scores of 642 students on a psychology test. The test consists of 1
The first step is to create a frequency table. Unfortunately, a simple frequency table would be too big, containing over 100 rows. To simplify the table, we group scores together as shown in Table 1.
Table 1. Grouped Frequency Distribution of Psychology Test Scores
Interval's Lower Limit    Interval's Upper Limit    Class Frequency
39.5                      49.5                      3
49.5                      59.5                      10
59.5                      69.5                      53
69.5                      79.5                      107
79.5                      89.5                      147
89.5                      99.5                      130
99.5                      109.5                     78
109.5                     119.5                     59
119.5                     129.5                     36
129.5                     139.5                     11
139.5                     149.5                     6
149.5                     159.5                     1
159.5                     169.5                     1
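A grouped frequency table like Table 1 can be built mechanically once a lowest limit and an interval width are chosen. The sketch below is ours, not part of the original analysis: the function name grouped_frequency is hypothetical, and because the 642 test scores are not listed here, the example uses a handful of made-up scores purely to show the mechanics.

```python
from collections import Counter


def grouped_frequency(scores, lowest_limit, width):
    """Return (lower limit, upper limit, class frequency) rows.

    Assumes a non-empty list of scores, all at or above lowest_limit.
    A score is counted in the interval whose lower limit it reaches
    and whose upper limit it stays below.
    """
    counts = Counter(int((score - lowest_limit) // width) for score in scores)
    table = []
    for i in range(max(counts) + 1):
        lower = lowest_limit + i * width
        table.append((lower, lower + width, counts[i]))
    return table


# Hypothetical scores, invented for illustration only.
sample_scores = [42, 55, 57, 61, 68, 69, 70]
for lower, upper, frequency in grouped_frequency(sample_scores, 39.5, 10):
    print(lower, upper, frequency)
```

Using limits that end in .5 with whole-number scores, as in Table 1, means no score can fall exactly on a boundary.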
To create this table, the range of scores was broken into intervals, called class intervals. The first interval is from 39.5 to 49.5.
More information on choosing the widths of class intervals is presented later in this section. Placing the limits of the class intervals midway between two numbers (e.g., 4
We'll have more to say about shapes of distributions in the chapter "Summarizing Distributions."
In our example, the observations are whole numbers. Histograms can also be used when the scores are measured on a more continuous scale such as the length of time (in milliseconds) required to perform a task. In this case, there is no need to worry about fence-sitters since they are improbable. (It would be quite a coincidence for a task to require exactly 7 seconds, measured to the nearest thousandth of a second.) We are therefore free to choose whole numbers as boundaries for our class intervals, for example, 4000, 5000, etc. The class frequency is then the number of observations that are greater than or equal to the lower bound, and strictly less than the upper bound. For example, one interval might hold times from 4000 to 4
seemed clearest. The best advice is to experiment with different choices of width, and to choose a histogram according to how well it communicates the shape of the distribution.
To provide experience in constructing histograms, we have developed an interactive demonstration. The demonstration reveals the consequences of different choices of bin width and of lower boundary for the first interval.
Frequency Polygons
Author(s): David M. Lane
Prerequisites: Histograms
Learning Objectives
1. Create and interpret frequency polygons
2. Create and interpret cumulative frequency polygons
3. Create and interpret overlaid frequency polygons
Frequency polygons are a graphical device for understanding the shapes of distributions. They serve the same purpose as histograms, but are especially helpful for comparing sets of data. Frequency polygons are also a good choice for displaying cumulative frequency distributions.
To create a frequency polygon, start just as for histograms, by choosing a class interval. Then draw an X-axis representing the values of the scores in your data. Mark the middle of each class interval with a tick mark, and label it with the middle value represented by the class. Draw the Y-axis to indicate the frequency of each class. Place a point in the middle of each class interval at the height corresponding to its frequency. Finally, connect the points. You should include one class interval below the lowest value
in your data and one above the highest value. The graph will then touch the X-axis on both sides.
A frequency polygon for 642 psychology test scores shown in Figure 1 was constructed from the frequency table shown in Table 1.
Table 1. Frequency Distribution of Psychology Test Scores.
Lower Limit  Upper Limit  Count  Cumulative Count
29.5 39.5 0 0
39.5 49.5 3 3
49.5 59.5 10 13
59.5 69.5 53 66
69.5 79.5 107 173
79.5 89.5 147 320
89.5 99.5 130 450
99.5 109.5 78 528
109.5 119.5 59 587
119.5 129.5 36 623
129.5 139.5 11 634
139.5 149.5 6 640
149.5 159.5 1 641
159.5 169.5 1 642
169.5 179.5 0 642
The first label on the X-axis is 35. This represents an interval extending from 29.5 to 39.5.
distribution is not symmetric inasmuch as good scores (to the right) trail off more gradually than poor scores (to the left). In the terminology of Chapter 3 (where we will study shapes of distributions more systematically), the distribution is skewed.
Figure 1. Frequency polygon for the psychology test scores.
A cumulative frequency polygon for the same test scores is shown in Figure 2. The graph is the same as before except that the Y value for each point is the number of students in the corresponding class interval plus all numbers in lower intervals. For example, there are no scores in the interval labeled "35," three in the interval "45," and 10 in the interval "55." Therefore, the Y value corresponding to "55" is 13. Since 642 students took the test, the cumulative frequency for the last interval is 642.
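The cumulative counts are just running totals of the class counts, which a short sketch makes concrete. The counts below are copied from the Count column of Table 1; the variable names are our own.

```python
from itertools import accumulate

# Class counts from Table 1, from the interval 29.5-39.5 up to 169.5-179.5.
counts = [0, 3, 10, 53, 107, 147, 130, 78, 59, 36, 11, 6, 1, 1, 0]

# Each cumulative count is the class count plus all counts in lower intervals.
cumulative = list(accumulate(counts))

for i, (count, total) in enumerate(zip(counts, cumulative)):
    lower = 29.5 + 10 * i
    print(lower, lower + 10, count, total)
```

The third running total is 0 + 3 + 10 = 13, and the last one equals the number of students, 642, as stated in the text.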
Figure 2. Cumulative frequency polygon for the psychology test scores.
Frequency polygons are useful for comparing distributions. This is achieved by overlaying the frequency polygons drawn for different data sets. Figure 3 provides an example. The data come from a task in which the goal is to move a computer cursor to a target on the screen as fast as possible. On 20 of the trials, the target was a small rectangle; on the other 20, the target was a large rectangle. Time to reach the target was recorded on each trial. The two distributions (one for each target) are plotted together in Figure 3. The figure shows that, although there is some overlap in times, it generally took longer to move the cursor to the small target than to the large one.
Figure 3. Overlaid frequency polygons.
It is also possible to plot two cumulative frequency distributions in the same graph. This is illustrated in Figure 4 using the same data from the cursor task. The difference in distributions for the two targets is again evident.
Figure 4. Overlaid cumulative frequency polygons.
You might be curious about your own performance in the cursor task. Try the task yourself, and compare your times with ours.
Question 1.
A frequency polygon is very similar to a
histogram
stem and leaf display
listing of raw data
Frequency polygons do not list the raw data, as stem and leaf plots do. Frequency polygons are very similar to histograms, except histograms have bars and frequency polygons have dots and lines connecting the frequencies of each class interval.
Box Plots
Author(s): David M. Lane
Prerequisites: Percentiles, Histograms, Frequency Polygons
Learning Objectives
1. Define basic terms including hinges, H-spread, step, adjacent value, outside value, and far out value
2. Create a box plot
3. Create parallel box plots
4. Determine whether a box plot is appropriate for a given data set
We have already discussed techniques for visually representing data (see histograms and frequency polygons). In this section, we present another important graph called a box plot. Box plots are useful for identifying outliers and for comparing distributions. We will explain box plots with the help of data from an in-class experiment. As part of the "Stroop Interference Case Study," students in introductory statistics were presented with a page containing 30 colored rectangles. Their task was to name the colors as quickly as possible. Their times (in seconds) were recorded. We'll compare the scores for the 16 men and 31 women who participated in the experiment by making separate box plots for each gender. Such a display is said to involve parallel box plots.
There are several steps in constructing a box plot. The first relies on the 25th, 50th, and 75th percentiles in the distribution of scores. Figure 1 shows how these three statistics are used. For each gender, we draw a box extending from the 25th percentile to the 75th percentile. The 50th percentile is drawn inside the box. Therefore,
the bottom of each box is the 25th percentile,
the top is the 75th percentile,
and the line in the middle is the 50th percentile.
The data for the women in our sample are shown in Table 1.
Table 1. Women's times.
14, 15, 16, 16, 17, 17, 17, 17,
17, 18, 18, 18, 18, 18, 18, 19,
19, 19, 20, 20, 20, 20, 20, 20,
21, 21, 22, 23, 24, 24, 29
For these data, the 25th percentile is 17, the 50th percentile is 19, and the 75th percentile is 20.
Figure 1. The first step in creating box plots.
Before proceeding, the terminology in Table 2 is helpful.
Table 2. Box plot terms and values for women's times.
Name                 Formula                                                       Value
Upper Hinge          75th Percentile                                               20
Lower Hinge          25th Percentile                                               17
H-Spread             Upper Hinge - Lower Hinge                                     3
Step                 1.5 x H-Spread                                                4.5
Upper Inner Fence    Upper Hinge + 1 Step                                          24.5
Lower Inner Fence    Lower Hinge - 1 Step                                          12.5
Upper Outer Fence    Upper Hinge + 2 Steps                                         29
Lower Outer Fence    Lower Hinge - 2 Steps                                         8
Upper Adjacent       Largest value below Upper Inner Fence                         24
Lower Adjacent       Smallest value above Lower Inner Fence                        14
Outside Value        A value beyond an Inner Fence but not beyond an Outer Fence   29
Far Out Value        A value beyond an Outer Fence                                 None
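The quantities in Table 2 can be computed directly from the women's times. The sketch below is one possible way to do it, not the procedure used to prepare the table: the function name box_plot_terms is hypothetical, and percentiles can be defined in several ways. Here we assume the "inclusive" method of Python's statistics.quantiles, which for these data gives the same hinges as the table.

```python
from statistics import quantiles

# Women's times in seconds (Table 1).
womens_times = [14, 15, 16, 16, 17, 17, 17, 17, 17, 18, 18, 18, 18, 18, 18, 19,
                19, 19, 20, 20, 20, 20, 20, 20, 21, 21, 22, 23, 24, 24, 29]


def box_plot_terms(data):
    """Compute the box plot terms of Table 2 for a list of numbers."""
    data = sorted(data)
    # Assumption: "inclusive" quartiles; other percentile rules may differ.
    lower_hinge, median, upper_hinge = quantiles(data, n=4, method="inclusive")
    h_spread = upper_hinge - lower_hinge
    step = 1.5 * h_spread
    upper_inner, lower_inner = upper_hinge + step, lower_hinge - step
    upper_outer, lower_outer = upper_hinge + 2 * step, lower_hinge - 2 * step
    return {
        "upper_hinge": upper_hinge,
        "lower_hinge": lower_hinge,
        "median": median,
        "h_spread": h_spread,
        "step": step,
        "upper_inner_fence": upper_inner,
        "lower_inner_fence": lower_inner,
        "upper_outer_fence": upper_outer,
        "lower_outer_fence": lower_outer,
        # Largest value below the upper inner fence, smallest above the lower.
        "upper_adjacent": max(x for x in data if x < upper_inner),
        "lower_adjacent": min(x for x in data if x > lower_inner),
        # Beyond an inner fence but not beyond an outer fence.
        "outside": [x for x in data
                    if upper_inner < x <= upper_outer
                    or lower_outer <= x < lower_inner],
        # Beyond an outer fence.
        "far_out": [x for x in data if x > upper_outer or x < lower_outer],
    }


for name, value in box_plot_terms(womens_times).items():
    print(name, value)
```

For the women's data the single outside value is 29, which sits exactly on the upper outer fence and so is not counted as far out.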
Continuing with the box plots, we put "whiskers" above and below each box to give additional information about the spread of the data. Whiskers are vertical lines that end in a horizontal stroke. Whiskers are drawn from the upper and lower hinges to the upper and lower adjacent values (24 and 14 for the women's data).
Figure 2. The box plots with the whiskers drawn.
Although we don't draw whiskers all the way to outside or far out values, we still wish to represent them in our box plots. This is achieved by adding additional marks beyond the whiskers. Specifically, outside values are indicated by small "o's" and far out values are indicated by asterisks (*). In our data, there are no far out values and just one outside value. This outside value of 29 is for the women and is shown in Figure 3.
Figure 3. The box plots with the outside value shown.
There is one more mark to include in box plots (although sometimes it is omitted). We indicate the mean score for a group by inserting a plus sign. Figure 4 shows the result of adding means to our box plots.
Figure 4. The completed box plots.
Figure 4 provides a revealing summary of the data. Since half the scores in a distribution are between the hinges (recall that the hinges are the 25th and 75th percentiles), we see that half the women's times are between 17 and 20 seconds, whereas half the men's times are between 19 and 25.5. We also see that women generally named the colors faster than the men did, although one woman was slower than almost all of the men. Figure 5 shows the box plot for the women's data with detailed labels.
Figure 5. The box plot for the women's data with detailed labels.
Box plots provide basic information about a distribution. For example, a distribution with a positive skew would have a longer whisker in the positive direction than in the negative direction. A larger mean than median would also indicate a positive skew. Box plots are good at portraying extreme values and are especially good at showing differences between distributions. However, many of the details of a distribution are not revealed in a box plot, and to examine these details one should create a histogram and/or a stem and leaf display.
Here are some other examples of box plots:
Time to move the mouse over a target
Draft lottery
VARIATIONS ON BOX PLOTS
Statistical analysis programs may offer options on how box plots are created. For example, the box plots in Figure 6 are constructed from our data but differ from the previous box plots in several ways.
1. It does not mark outliers.
2. The means are indicated by green lines rather than plus signs.
3. The mean of all scores is indicated by a gray line.
4. Individual scores are represented by dots. Since the scores have been rounded to the nearest second, any given dot might represent more than one score.
5. The box for the women is wider than the box for the men because the widths of the boxes are proportional to the number of subjects of each gender (31 women and 16 men).
Figure 6. Box plots showing the individual scores and the means.
Each dot in Figure 6 represents a group of subjects with the same score (rounded to the nearest second). An alternative graphing technique is to jitter the points. This means spreading out different dots at the same horizontal position, one dot for each subject. The exact horizontal position of a dot is determined randomly (under the constraint that different dots don't overlap exactly). Spreading out the dots helps you to see multiple occurrences of a given score. However, depending on the dot size and the screen resolution, some points may be obscured even if the points are jittered. Figure 7 shows what jittering looks like.
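Jittering can be sketched as drawing a random horizontal offset for each dot and redrawing whenever two dots would coincide exactly. The function below is our own illustration, not the procedure used to draw Figure 7; the name jitter_positions, the default offset of 0.2, and the use of a seed for repeatable output are all assumptions.

```python
import random


def jitter_positions(center, count, max_offset=0.2, seed=None):
    """Return `count` distinct horizontal positions scattered around `center`.

    Each position is center plus a random offset in [-max_offset, max_offset].
    Collecting positions in a set enforces the constraint that no two dots
    land on exactly the same spot.
    """
    rng = random.Random(seed)
    positions = set()
    while len(positions) < count:
        positions.add(center + rng.uniform(-max_offset, max_offset))
    return sorted(positions)


# Six subjects shared the score 18 in the women's data, so six dots
# are spread around the horizontal position of the women's box.
print(jitter_positions(center=1.0, count=6, seed=0))
```

The dots stay close to the box they belong to because the offsets are bounded, yet repeated scores become visible as separate dots.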
Figure 7. Box plots with the individual scores jittered.
Different styles of box plots are best for different situations, and there are no firm rules for which to use. When exploring your data, you should try several ways of visualizing them. Which graphs you include in your report should depend on how well different graphs reveal the aspects of the data you consider most important.
Question 1.
What is the upper hinge?
B
C
D
F
The upper hinge is the 75th percentile. It is the top of the box.
Bar Charts
Author(s): David M. Lane
Prerequisites: Graphing Qualitative Variables
Learning Objectives
1. Create and interpret bar charts
2. Judge whether a bar chart or another graph such as a box plot would be more appropriate
In the section on qualitative variables, we saw how bar charts could be used to illustrate the frequencies of different categories. For example, the bar chart shown in Figure 1 shows how many purchasers of iMac computers were previous Macintosh users, previous Windows users, and new computer purchasers.
Figure 1. iMac buyers as a function of previous computer ownership.
In this section, we show how bar charts can be used to present other kinds of quantitative information, not just frequency counts. The bar chart in Figure 2 shows the percent increases in the Dow Jones, Standard and Poor 500 (S & P), and Nasdaq stock indexes from May 24th 2000 to May 24th 2001. Notice that both the S & P and the Nasdaq had "negative increases" which means that they
decreased in value. In this bar chart, the Y-axis is not frequency but rather the signed quantity percentage increase.
Figure 2. Percent increase in three stock indexes from May 24th 2000 to May 24th 2001.
Bar charts are particularly effective for showing change over time. Figure 3, for example, shows the percent increase in the Consumer Price Index (CPI) over four three-month periods. The fluctuation in inflation is apparent in the graph.
Figure 3. Percent change in the CPI over time. Each bar represents percent increase for the three months ending at the date indicated.
Bar charts are often used to compare the means of different experimental conditions. Figure 4 shows the mean time it took one of us (DL) to move the mouse to either a small target or a large
target. On average, more time was required for small targets than for large ones.
Figure 4. Bar chart showing the means for the two conditions.
Although bar charts can display means, we do not recommend them for this purpose. Box plots should be used instead since they provide more information than bar charts without taking up more space. For example, a box plot of the mouse-movement data is shown in Figure 5. You can see that Figure 5 reveals more about the distribution of movement times than does Figure 4.
Figure 5. Box plots of times to move the mouse to the small and large targets.
The section on qualitative variables presented earlier in this chapter discussed the use of bar charts for comparing distributions. Some common graphical mistakes were also noted. The earlier discussion applies equally well to the use of bar charts to display quantitative variables.
Question 1.
Bar charts can only be used for qualitative variables.
True
False
Although bar charts can be used for qualitative variables, they can also portray quantitative variables.
Line Graphs
Author(s): David M. Lane
Prerequisites: Bar Graphs
Learning Objectives
1. Create and interpret line graphs
2. Judge whether a line graph would be appropriate for a given data set
A line graph is a bar graph with the tops of the bars represented by points joined by lines (the rest of the bar is suppressed). For
example, Figure 1 was presented in the section on bar charts and shows changes in the Consumer Price Index (CPI) over time.
Figure 1. A bar chart of the percent change in the CPI over time. Each bar represents percent increase for the three months ending at the date indicated.
A line graph of these same data is shown in Figure 2. Although the figures are similar, the line graph emphasizes the change from period to period.
Figure 2. A line graph of the percent change in the CPI over time. Each point represents percent increase for the three months ending at the date indicated.
Line graphs are appropriate only when both the X- and Y-axes display ordered (rather than qualitative) variables. Although bar graphs can also be used in this situation, line graphs are generally better at comparing changes over time. Figure 3, for example, shows percent increases and decreases in five components of the Consumer Price Index (CPI). The figure makes it easy to see that medical costs had a steadier progression than the other components. Although you could create an analogous bar chart, its interpretation would not be as easy.
Figure 3. A line graph of the percent change in five components of the CPI over time.
Let us stress that it is misleading to use a line graph when the X-axis contains merely qualitative variables. Figure 4 inappropriately shows a line graph of the card game data from Yahoo, discussed in the section on qualitative variables. The defect in Figure 4 is that it gives the false impression that the games are naturally ordered in a numerical way.
Figure 4. A line graph, inappropriately used, depicting the number of people playing different card games on Sunday and Wednesday.
Question 1.
Line graphs are most similar to
bar charts.
histograms.
stem and leaf displays.
frequency polygons.
A line graph is a bar graph with the tops of the bars represented by points joined by lines.
Dot Plots
Author(s): David M. Lane
Prerequisites: Bar Charts
Learning Objectives
1. Create and interpret dot plots
2. Judge whether a dot plot would be appropriate for a given data set
Dot plots can be used to display various types of information. Figure 1 uses a dot plot to display the number of M & M's of each color found in a bag of M & M's. Each dot represents a single M & M. From the figure, you can see that there were 3 blue M & M's, 19 brown M & M's, etc.
Figure 1. A dot plot showing the number of M & M's of various colors in a bag of M & M's.
The dot plot in Figure 2 shows the number of people playing various card games on the Yahoo website on a Wednesday. Unlike Figure 1, the location rather than the number of dots represents the frequency.
Figure 2. A dot plot showing the number of people playing various card games on a Wednesday.
The dot plot in Figure 3 shows the number of people playing on a Sunday and on a Wednesday. This graph makes it easy to compare the popularity of the games separately for the two days, but does not make it easy to compare the popularity of a given game on the two days.
Figure 3. A dot plot showing the number of people playing various card games on a Sunday and on a Wednesday.
Figure 4. An alternate way of showing the number of people playing various card games on a Sunday and on a Wednesday.
The dot plot in Figure 4 makes it easy to compare the days of the week for specific games while still portraying differences among games.
Question 1.
Dot plots are typically used to represent frequencies.
True
False
Although dot plots could be used to represent statistics such as means, it is not recommended. They are typically used for frequencies.
Statistical Literacy
Author(s): Seyd Ercan and David Lane
Are Commercial Vehicles in Texas Unsafe?
Prerequisites: Graphing Distributions
A news report on the safety of commercial vehicles in Texas stated that one out of five commercial vehicles have been pulled off the road in 2012 because they were unsafe. In addition, 12,301 commercial drivers have been banned from the road for safety violations.
The author presents the bar chart below to provide information about the percentage of fatal crashes involving commercial vehicles in Texas since 2006. The author also quotes DPS director Steven McCraw:
Commercial vehicles are responsible for approximately 15 percent of the fatalities in Texas crashes. Those who choose to drive unsafe commercial vehicles or drive a commercial vehicle unsafely pose a serious threat to the motoring public.
WHAT DO YOU THINK?
Based on what you have learned in this chapter, does this bar chart provide enough information to conclude that unsafe or unsafely driven commercial vehicles pose a serious threat to the motoring public? What might you conclude if 30 percent of all the vehicles on the roads of Texas in 2010 were commercial and accounted for 16 percent of fatal crashes?
This bar chart does not provide enough information to draw such a conclusion because we don't know, on the average, in a given year what percentage of all vehicles on the road are commercial vehicles. For example, if 30 percent of all the vehicles on the roads of Texas in 2010 are commercial ones and only 16 percent of fatal crashes involved commercial vehicles, then commercial vehicles are safer than non-commercial ones. Note that in this case 70 percent of
8/20/2019 Quantitative Variables.docx
41/86
ve(iles are non5ommerial and t(ey are responsible for +"perent of t(e fatal ras(es.
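The comparison in this example can be checked with a short calculation. The sketch below is only an illustration in Python; the function name `crash_ratio` is an assumption made here, and the inputs are the hypothetical 30 percent and 16 percent figures from the example. It divides each group's share of fatal crashes by its share of vehicles on the road, so a ratio below 1 means the group is involved in fewer fatal crashes than its share of vehicles would suggest.

```python
def crash_ratio(share_of_crashes, share_of_vehicles):
    """Share of fatal crashes divided by share of vehicles on the road."""
    return share_of_crashes / share_of_vehicles

# Hypothetical percentages taken from the example, not real Texas data.
commercial = crash_ratio(16, 30)
non_commercial = crash_ratio(100 - 16, 100 - 30)

print(round(commercial, 2))      # 0.53
print(round(non_commercial, 2))  # 1.2
```

Under these assumed figures the commercial ratio is below 1 and the non-commercial ratio is above 1, which is the sense in which commercial vehicles would be "safer".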
Linear By Design
Prerequisites Graphing Distributions
Fox News aired the line graph below showing the number unemployed during four quarters between 2007 and 2010.
What do you think?
Does Fox News' line graph provide misleading information? Why or why not?
There are major flaws with the Fox News graph. First, the title of the graph is misleading. Although the data show the number unemployed, Fox News' graph is titled "Job Loss by Quarter." Second, the intervals on the X-axis are misleading. Although there are 6 months between September 2008 and March 2009 and 15 months between March 2009 and June 2010, the intervals are represented in the graph by very similar lengths. This gives the false impression that unemployment increased steadily.
The graph presented below is corrected so that distances on the X-axis are proportional to the number of days between the dates. This graph shows clearly that the rate of increase in the number unemployed is greater between September 2008 and March 2009 than it is between March 2009 and June 2010.
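One way to place dates on an axis so that distances are proportional to elapsed time is to convert each date to a number of days since the first date. The following Python sketch illustrates the idea; because the example gives only months and years, it assumes each date falls on the first day of its month.

```python
from datetime import date

# Assumption: each date is placed on the first day of its month,
# because the example gives only months and years.
dates = [date(2008, 9, 1), date(2009, 3, 1), date(2010, 6, 1)]

# Days elapsed since the first date give x-positions proportional to time.
x_positions = [(d - dates[0]).days for d in dates]
gaps = [later - earlier for earlier, later in zip(x_positions, x_positions[1:])]

print(x_positions)  # [0, 181, 638]
print(gaps)         # [181, 457]
```

The second gap is about two and a half times as long as the first, so drawing the two intervals with similar lengths distorts the rate of change.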
Learning Objectives
• Construct a stem-and-leaf plot.
• Understand the importance of a stem-and-leaf plot in statistics.
• Construct and interpret a pie chart.
• Construct and interpret a bar graph.
• Create a frequency distribution chart.
• Construct and interpret a histogram.
• Use technology to create graphical representations of data.
What is the puppet doing? She can’t be cutting a pizza, because the pieces are all different colors and sizes. It seems like she is drawing some type of a display to show different amounts of a whole circle. The colors must represent different parts of the whole. As you proceed through this lesson, refer back to this picture so that you will be able to create a meaningful and detailed answer to the question, “What is the puppet doing?”
Pie Charts
Pie charts, or circle graphs, are used extensively in statistics. These graphs appear often in newspapers and magazines. A pie chart shows the relationship of the parts to the whole by visually comparing the sizes of the sections (slices). Pie charts can be constructed by using a hundreds disk or by using a circle. The hundreds disk is built on the concept that the whole of anything is 100%, while the circle is built on the concept that $360^\circ$ is the whole of anything. Both methods of creating a pie chart are acceptable, and both will produce the same result. The sections have different colors to enable an observer to clearly see the differences in the sizes of the sections. The following example will first be done by using a hundreds disk and then by using a circle.
Example 10
The Red Cross Blood Donor Clinic had a very successful morning collecting blood donations. Within 3 hours, 25 people had made donations, and the following is a table showing the blood types of the donations:
Blood Type          A    B    O    AB
Number of donors    7    5    9    4
Construct a pie chart to represent the data.
Solution:
Step 1: Determine the total number of donors: $7 + 5 + 9 + 4 = 25$.
Step 2: Express each donor number as a percent of the whole by using the formula $\text{Percent} = \frac{f}{n} \cdot 100\%$, where f is the frequency and n is the total number.
$\frac{7}{25} \cdot 100\% = 28\%$
$\frac{5}{25} \cdot 100\% = 20\%$
$\frac{9}{25} \cdot 100\% = 36\%$
$\frac{4}{25} \cdot 100\% = 16\%$
Step 3: Use a hundreds disk and simply count the correct number for each blood type (1 line = 1 percent).
Step 4: Graph each section. Write the name and correct percentage inside the section. Color each section a different color.
The above pie chart was created by using a hundreds disk, which is a circle with 100 divisions in groups of 5. Each division (line) represents 1 percent. From the graph, you can see that more donations were of Type O than any other type. The fewest number of donations of blood collected was of Type AB. If the percentages had not been entered in each section, these same conclusions could have been made based simply on the size of each section.
Solution:
Step 1: Determine the total number of donors: $7 + 5 + 9 + 4 = 25$.
Step 2: Express each donor number as the number of degrees of a circle that it represents by using the formula $\text{Degrees} = \frac{f}{n} \cdot 360^\circ$, where f is the frequency and n is the total number.
$\frac{7}{25} \cdot 360^\circ = 100.8^\circ$
$\frac{5}{25} \cdot 360^\circ = 72^\circ$
$\frac{9}{25} \cdot 360^\circ = 129.6^\circ$
$\frac{4}{25} \cdot 360^\circ = 57.6^\circ$
Step 3: Using a protractor, graph each section of the circle.
Step 4: Write the name and correct percentage inside each section. Color each section a different color.
The above pie chart was created by using a protractor and graphing each section of the circle according to the number of degrees needed. From the graph, you can see that more donations were of Type O than any other type. The fewest number of donations of blood collected was of Type AB. Notice that the percentages have been entered in each section of the graph and not the numbers of degrees. This is because degrees would not be meaningful to an observer trying to interpret the graph. In order to create a pie chart by using a circle, it is necessary to use the formula to calculate the number of degrees for each section, and in order to create a pie chart by using a hundreds disk, it is necessary to use the formula to determine the percentage for each section. In the end, however, both methods result in identical graphs.
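Both calculations from this example can be sketched together in Python. The function name `pie_sections` is chosen here for illustration only; the blood-type frequencies are those given in the example.

```python
def pie_sections(frequencies):
    """Return {category: (percent, degrees)} for a pie chart."""
    total = sum(frequencies.values())
    return {name: (f * 100 / total, f * 360 / total)
            for name, f in frequencies.items()}

donors = {"A": 7, "B": 5, "O": 9, "AB": 4}
for blood_type, (percent, degrees) in pie_sections(donors).items():
    print(f"Type {blood_type}: {percent:.0f}% or {degrees:.1f} degrees")
```

Each section's percentage and angle describe the same fraction of the whole, which is why the hundreds disk and the protractor produce the same graph.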
Example 11
A new restaurant is opening in town, and the owner is trying very hard to complete the menu. He wants to include a choice of 5 salads and has presented his partner with the following pie chart to represent the results of a recent survey that he conducted of the town’s people. The survey asked the question, “What is your favorite kind of salad?”
Use the pie chart to answer the following questions:
1. Which salad was the most popular choice?
2. Which salad was the least popular choice?
3. If 300 people were surveyed, how many people chose each type of salad?
4. What is the difference between the number of people who chose the spinach salad and the number of people who chose the garden salad?
Solution:
1. The most popular salad was the caesar salad.
2. The least popular salad was the taco salad.
3. Caesar salad: $35\% = \frac{35}{100} = 0.35$
$(300)(0.35) = 105$ people
Taco salad: $10\% = \frac{10}{100} = 0.10$
$(300)(0.10) = 30$ people
Spinach salad: $17\% = \frac{17}{100} = 0.17$
$(300)(0.17) = 51$ people
Garden salad: $13\% = \frac{13}{100} = 0.13$
$(300)(0.13) = 39$ people
Chef salad: $25\% = \frac{25}{100} = 0.25$
$(300)(0.25) = 75$ people
4. The difference between the number of people who chose the spinach salad and the number of people who chose the garden salad is $51 - 39 = 12$ people.
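Going from percentages back to counts, as in question 3, can be sketched as follows. This is a minimal illustration in Python using the percentages quoted in the solution; the variable names are chosen here.

```python
surveyed = 300
percentages = {"Caesar": 35, "Taco": 10, "Spinach": 17, "Garden": 13, "Chef": 25}

# Integer arithmetic avoids rounding issues: count = surveyed * percent / 100.
counts = {salad: surveyed * p // 100 for salad, p in percentages.items()}

print(counts)
print(counts["Spinach"] - counts["Garden"])  # 12
```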
If we revisit the puppet who was introduced at the beginning of the lesson, you should now be able to create a story that details what she is doing. An example would be that she is in charge of the student body and is presenting to the students the results of a questionnaire regarding student activities for the first semester.
28 34 23 35 16 17 47 05 60 26
39 35 47 35 38 35 55 47 54 48
Solution:
Step 1: Create the stem-and-leaf plot.
Some people prefer to arrange the data in order before the stems and leaves are created. This will ensure that the values of the leaves are in order. However, this is not necessary and can take a great deal of time if the data set is large. We will first create the stem-and-leaf plot, and then we will organize the values of the leaves.
The leading digit of a data value is used as the stem, and the trailing digit is used as the leaf. The numbers in the stem column should be consecutive numbers that begin with the smallest class and continue to the largest class. If there are no values in a class, do not enter a value in the leaf; just leave it blank.
Step 2:
The number of values in the leaf column should equal the number of data values that were given in the table. The value that appears the most often in the same leaf row is the trailing digit of the mode of the data set. The mode of this data set is 35. For 7 of the 20 days, the number of animals receiving treatment was between 34 and 39. The veterinarian school treated a minimum of 5 animals and a maximum of 60 animals on any one day. The median of the data can be quickly calculated by using the values in the leaf column to locate the value in the middle position. In this stem-and-leaf plot, the median is the mean of the numbers represented by the 10th and the 11th leaves: $\frac{35 + 35}{2} = \frac{70}{2} = 35$.
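The construction described in this example can be sketched in Python. The sketch assumes the digits given before the solution are read as 20 two-digit values (so 05 is the value 5); the function name `stem_and_leaf` is chosen here for illustration.

```python
from collections import defaultdict
from statistics import median, mode

# Assumption: the data are the 20 two-digit values from the example.
data = [28, 34, 23, 35, 16, 17, 47, 5, 60, 26,
        39, 35, 47, 35, 38, 35, 55, 47, 54, 48]

def stem_and_leaf(values):
    """Map each stem (tens digit) to its sorted leaves (ones digits)."""
    plot = defaultdict(list)
    for v in sorted(values):
        plot[v // 10].append(v % 10)
    # Include empty stems so the stem column stays consecutive.
    return {stem: plot.get(stem, []) for stem in range(min(plot), max(plot) + 1)}

for stem, leaves in stem_and_leaf(data).items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))

print("mode:", mode(data), "median:", median(data))
```

Sorting the values first means the leaves come out already ordered, which combines the two steps described above.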
Example 13
The following numbers represent the growth (in centimeters) of some plants after 25 days.
Construct a stem-and-leaf plot to represent the data, and list 3 facts that you know about the growth of the plants.
18 10 37 36 61 39 41 49 50 52 57 53 51 57 39 48 56 33 36 19 30 41 51 38 60
Solution:
Answers will vary, but the following are some possible responses:
• From the stem-and-leaf plot, the growth of the plants ranged from a minimum of 10 cm to a maximum of 61 cm.
• The median of the data set is the value in the 13th position, which is 41 cm.
• There was no growth recorded in the class of 20 cm, so there is no number in the leaf row.
• The data set is multimodal.
Bar Graphs
The different types of graphs that you have seen so far are plots to use with quantitative variables. A qualitative variable can be plotted using a bar graph. A bar graph is a plot made of bars whose heights (vertical bars) or lengths (horizontal bars) represent the frequencies of each category. There is 1 bar for each category, with space between each bar, and the data that is plotted is discrete data. Each category is represented by intervals of the same width. When constructing a bar graph, the category is usually placed on the horizontal axis, and the frequency is usually placed on the vertical axis. These values can be reversed if the bar graph has horizontal bars.
Example 14
Construct a bar graph to represent the depth of the Great Lakes:
Lake Superior – 1,333 ft.
Lake Michigan – 923 ft.
Lake Huron – 750 ft.
Lake
1. What type of show is watched the most?
2. What type of show is watched the least?
3. Approximately how many students participated in the survey?
4. Does the graph show the differences between the preferences of males and females?
Solution:
1. Sit-coms are watched the most.
2. Quiz shows are watched the least.
3. Approximately $45 + 20 + 18 + 6 + 35 + 16 = 140$ students participated in the survey.
4. No, the graph does not show the differences between the preferences of males and females.
If bar graphs are constructed on grid paper, it is very easy to keep the intervals the same size and to keep the bars evenly spaced. In addition to helping in the appearance of the graph, grid paper also enables you to more accurately determine the frequency of each class.
Example 16
The following bar graph represents the part-time jobs held by a group of grade students:
Using the above bar graph, answer the following questions:
1. What was the most popular part-time job?
2. What was the part-time job held by the least number of students?
3. Which part-time jobs employed or more of the students?
4. Is it possible to create a table of values for the bar graph? If so, construct the table of values.
5. What percentage of the students worked as a delivery person?
Solution:
1. The most popular part-time job was in the fast food industry.
2. The part-time job of tutoring was the one held by the least number of students.
3. The part-time jobs that employed or more students were in the fast food, delivery, lawn maintenance, and grocery store businesses.
4. Yes, it’s possible to create a table of values for the bar graph.
Part-Time Job: Baby Sitting, Fast Food, Delivery, Lawn Care, Grocery Store, Tutoring
Number of Students: 8 4 2 3 5
5. The percentage of the students who worked as a delivery person was approximately 9.7.
Histograms
An extension of the bar graph is the histogram. A histogram is a type of vertical bar graph in which the bars represent grouped continuous data. While there are similarities between a bar graph and a histogram, such as each bar being the same width, a histogram has no spaces between the bars. The quantitative data is grouped according to a determined bin size, or interval. The bin size refers to the width of each bar, and the data is placed in the appropriate bin.
The bins, or groups of data, are plotted on the x-axis, and the frequencies of the bins are plotted on the y-axis. A grouped frequency distribution is constructed for the numerical data, and this table is used to create the histogram. In most cases, the grouped frequency distribution is designed so there are no breaks in the intervals. The last value of one bin is actually the first value counted in the next bin. This means that if you had groups of data with a bin size of 10, the bins would be represented by the notation [0-10), [10-20), [20-30), etc. Each bin appears to contain 11 values, which is 1 more than the desired bin size of 10. Therefore, the last digit of each bin is counted as the first digit of the following bin.
The first bin includes the values 0 through 9, and the next bin includes the values 10 through 19. This makes the bins the proper size. Bin sizes are written in this manner to simplify the process of grouping the data. The first bin can begin with the smallest number of the data set and end with the value determined by adding the bin width to this value, or the bin can begin with a reasonable value that is smaller than the smallest data value.
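The half-open bin notation can be illustrated with a small Python sketch. The function name `bin_label` and the sample values are hypothetical, chosen here only to show what happens on and around the bin boundaries.

```python
def bin_label(value, width=10, start=0):
    """Return the half-open bin [low-high) that contains value."""
    low = start + ((value - start) // width) * width
    return f"[{low}-{low + width})"

# Hypothetical values chosen to sit on and around bin boundaries.
for value in (0, 9, 10, 19, 20):
    print(value, bin_label(value))
```

A value that equals the upper boundary of one bin, such as 10, is counted in the following bin, so no value is counted twice.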
Example 17
Construct a frequency distribution table with a bin size of 10 for the following data, which represents the ages of 30 lottery winners:
38 41 29 33 40 74 66 45 60 55
25 52 54 61 46 51 59 57 66 62
32 47 65 50 39 22 35 72 77 49
Solution:
Step 1: Determine the range of the data by subtracting the smallest value from the largest value.
Range: $77 - 22 = 55$
Step 2: Divide the range by the bin size to ensure that you have at least 5 groups of data. A histogram should have from 5 to 10 bins to make it meaningful: $\frac{55}{10} = 5.5 \approx 6$.
Since you cannot have 0.5 of a bin, the result indicates that you will have at least 6 bins.
Step 3: Construct the table.
Bin        Frequency
[20−30)    3
[30−40)    5
[40−50)    6
[50−60)    8
[60−70)    5
[70−80)    3
Step 4: Determine the sum of the frequency column to ensure that all the data has been grouped. $3 + 5 + 6 + 8 + 5 + 3 = 30$
When data is grouped in a frequency distribution table, the actual data values are lost. The table indicates how many values are in each group, but it doesn’t show the actual values.
There are many different ways to create a distribution table and many different distribution tables that can be created. However, for the purpose of constructing a histogram, the method shown works very well, and it is not difficult to complete. When the number of data values is very large, another column is often inserted in the distribution table. This column is a tally column, and it is used to account for the number of values within a bin. A tally column facilitates the creation of the distribution table and usually allows the task to be completed more quickly.
Example 18
The numbers of years of service for 75 teachers in a small town are listed below:
1, 6, 11, 26, 21, 18, 2, 5, 27, 33, 7, 15, 22, 30, 8
31, 5, 25, 20, 19, 4, 9, 19, 34, 3, 16, 23, 31, 10, 4
2, 31, 26, 19, 3, 12, 14, 28, 32, 1, 17, 24, 34, 16, 1
18, 29, 10, 12, 30, 13, 7, 8, 27, 3, 11, 26, 33, 29, 20
7, 21, 11, 19, 35, 16, 5, 2, 19, 24, 13, 14, 28, 10, 31
Using the above data, construct a frequency distribution table with a bin size of 5.
Solution:
Range: $35 - 1 = 34$
$\frac{34}{5} = 6.8 \approx 7$
You will have 7 bins.
For each value that is in a bin, draw a stroke in the Tally column. To make counting the strokes easier, draw 4 strokes and cross them out with the fifth stroke. This process bundles the strokes in groups of 5, and the frequency can be readily determined.
Bin        Tally             Frequency
[0−5)      |||| |||| |       11
[5−10)     |||| ||||         9
[10−15)    |||| |||| ||      12
[15−20)    |||| |||| ||||    14
[20−25)    |||| ||           7
[25−30)    |||| ||||         10
[30−35)    |||| |||| ||      12
$11 + 9 + 12 + 14 + 7 + 10 + 12 = 75$
Now that you have constructed the frequency table, the grouped data can be used to draw a histogram. Like a bar graph, a histogram requires a title and properly labeled x- and y-axes.
Example 19
Use the data from Example 17 that displays the ages of the lottery winners to construct a histogram. The data is shown again below:
Bin        Frequency
[20−30)    3
[30−40)    5
[40−50)    6
[50−60)    8
[60−70)    5
[70−80)    3
Solution:
Use the data as it is represented in the distribution table to construct the histogram.
From looking at the tops of the bars, you can see how many winners were in each category, and by adding these numbers, you can determine the total number of winners. You can also determine how many winners were within a specific category. For example, you can see that 8 winners were 60 years of age or older. The graph can also be used to determine percentages. For example, it can answer the question, “What percentage of the winners were 50 years of age or older?” as follows:
$\frac{16}{30} = 0.5\overline{3}$, and $(0.533)(100\%) \approx 53.3\%$.
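The percentage question can also be answered directly from the frequency table. The Python sketch below is an illustration only; it stores the table from this example as a dictionary keyed by each bin's lower boundary, and the function name `percent_at_least` is chosen here.

```python
# Frequency table from the example: lower bin boundary -> frequency.
frequencies = {20: 3, 30: 5, 40: 6, 50: 8, 60: 5, 70: 3}

def percent_at_least(table, boundary):
    """Percentage of values in bins whose lower boundary is >= boundary."""
    total = sum(table.values())
    count = sum(f for low, f in table.items() if low >= boundary)
    return 100 * count / total

print(round(percent_at_least(frequencies, 50), 1))  # 53.3
```

This only works when the boundary coincides with the start of a bin, because the individual values inside each bin are not available in a grouped table.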
Example 20
a) Use the data and the distribution table that represent the years of service of the teachers from Example 18 to construct a histogram to display the data. The distribution table is shown again below:
Bin        Tally             Frequency
[0−5)      |||| |||| |       11
[5−10)     |||| ||||         9
[10−15)    |||| |||| ||      12
[15−20)    |||| |||| ||||    14
[20−25)    |||| ||           7
[25−30)    |||| ||||         10
[30−35)    |||| |||| ||      12
b) Now use the histogram to answer the following questions.
i) How many teachers teach in this small town?
ii) How many teachers have worked for less than 5 years?
iii) If teachers are able to retire when they have taught for 30 years or more, how many are eligible to retire?
iv) What percentage of the teachers still have to teach for 10 years or fewer before they are eligible to retire?
v) Do you think that the majority of the teachers are young or old? Justify your answer.
Solution:
a)
b) i) $11 + 9 + 12 + 14 + 7 + 10 + 12 = 75$
In this small town, 75 teachers are teaching.
ii) 11 teachers have taught for less than 5 years.
iii) 12 teachers are eligible to retire.
iv) $\frac{17}{75} = 0.22\overline{6}$, and $(0.2266)(100\%) \approx 23\%$
Approximately 23% of the teachers must teach for 10 years or fewer before they are eligible to retire.
v) Answers will vary, but one possible answer is that the majority of the teachers are young, because 46 have taught for less than 20 years.
Technology can also be used to plot a histogram. The TI-83 can be used to create a histogram by using STAT and STAT P
Using the TRACE feature will give you information about the data in each bar of the histogram.
The TRACE feature tells you that in the first bin, which is [6-7), there are 4 values.
The TRACE feature tells you that in the second bin, which is [7-8), there are 6 values.
To advance to the next bin, or bar, of the histogram, use the cursor and move to the right. The information obtained by using the TRACE feature will enable you to create a frequency table and to draw the histogram on paper.
The shape of a histogram can tell you a lot about the distribution of the data, as well as provide you with information about the mean, median, and mode of the data set. The following are some typical histograms, with a caption below each one explaining the distribution of the data, as well as the characteristics of the mean, median, and mode. Distributions can have other shapes besides the ones shown below, but these represent the most common ones that you will see when analyzing data. In each of the graphs below, the distributions are not perfectly shaped, but are shaped enough to identify an overall pattern.
a)
Figure a represents a bell-shaped distribution, which has a single peak and tapers off to both the left and to the right of the peak. The shape appears to be symmetric about the center of the histogram. The single peak indicates that the distribution is unimodal. The highest peak of the histogram represents the location of the mode of the data set. The mode is the data value that occurs the most often in a data set. For a symmetric histogram, the values of the mean, median, and mode are all the same and are all located at the center of the distribution.
b)
Figure b represents a distribution that is approximately uniform and forms a rectangular, flat shape. The frequency of each class is approximately the same.
c)
Figure c represents a right-skewed distribution, which has a peak to the left of the distribution and data values that taper off to the right. This distribution has a single peak and is also unimodal. For a histogram that is skewed to the right, the mean is located to the right on the distribution and is the largest value of the measures of central tendency. The mean has the largest value because it is strongly affected by the outliers on the right tail that pull the mean to the right. The mode is the smallest value, and it is located to the left on the distribution. The mode always occurs at the highest point of the peak. The median is located between the mode and the mean.
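The ordering of the three measures in a right-skewed distribution can be seen with a small made-up data set. The values below are hypothetical and were chosen here only so that most of them are small while a few large values form a long right tail.

```python
from statistics import mean, median, mode

# Hypothetical right-skewed data: most values are small, with a long right tail.
data = [1, 2, 2, 2, 3, 3, 4, 5, 9, 19]

print("mode:", mode(data))
print("median:", median(data))
print("mean:", mean(data))
```

For this data set the mode is the smallest of the three measures, the mean is the largest, and the median lies between them.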
d)
Figure d represents a left-skewed distribution, which has a peak to the right of the distribution and data values that taper off to the left. This distribution has a single peak and is also unimodal. For a histogram that is skewed to the left, the mean is located to the left on the distribution and is the smallest value of the measures of central tendency. The mean has the smallest value because it is strongly affected by the outliers on the left tail that pull the mean to the left. The median is located between the mode and the mean.
e)
Figure e has no shape that can be defined. The only defining characteristic about this distribution is that it has 2 peaks of the same height. This means that the distribution is bimodal.
Another type of graph that can be drawn to represent the same set of data as a histogram represents is a frequency polygon. A frequency polygon is a graph constructed by using lines to join the midpoints of each interval, or bin. The heights of the points represent the frequencies. A frequency polygon can be created from the histogram or by calculating the midpoints of the bins from the frequency distribution table. The midpoint of a bin is calculated by adding the upper and lower boundary values of the bin and dividing the sum by 2.
Example 22
The following histogram represents the marks made by 4 students on a math test.
Use the histogram to construct a frequency polygon to represent the data.
Solution:
There is no data value greater than and less than 2. The jagged line that is inserted on the x-axis is used to represent this fact. The area under the frequency polygon is the same as the area under the histogram and is, therefore, equal to the frequency values that would be displayed in a distribution table. The frequency polygon also shows the shape of the distribution of the data, and in this case, it resembles a bell curve.
Example 23
The following distribution table represents the number of miles run by 20 randomly selected runners during a recent road race:
Bin            Frequency
[5.5−10.5)     1
[10.5−15.5)    3
[15.5−20.5)    2
[20.5−25.5)    4
[25.5−30.5)    5
[30.5−35.5)    3
[35.5−40.5)    2
Using this table, construct a frequency polygon.
Solution:
Step 1: Calculate the midpoint of each bin by adding the 2 numbers of the interval and dividing the sum by 2.
Midpoints:
$\frac{5.5 + 10.5}{2} = \frac{16}{2} = 8$
$\frac{10.5 + 15.5}{2} = \frac{26}{2} = 13$
$\frac{15.5 + 20.5}{2} = \frac{36}{2} = 18$
$\frac{20.5 + 25.5}{2} = \frac{46}{2} = 23$
$\frac{25.5 + 30.5}{2} = \frac{56}{2} = 28$
$\frac{30.5 + 35.5}{2} = \frac{66}{2} = 33$
$\frac{35.5 + 40.5}{2} = \frac{76}{2} = 38$
Step 2: Plot the midpoints on a grid, making sure to number the x-axis with a scale that will include the bin sizes. Join the plotted midpoints with lines.
A frequency polygon usually extends 1 unit below the smallest bin value and 1 unit beyond the greatest bin value. This extension gives the frequency polygon an appearance of having a starting point and an ending point, which provides a view of the distribution of data. If the data set were very large so that the number of bins had to be increased and the bin size decreased, the frequency polygon would appear as a smooth curve.
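The midpoint calculation used for the frequency polygon in this example can be sketched in Python. The bin boundaries are those of the distribution table; the variable names are chosen here for illustration.

```python
# (lower boundary, upper boundary) of each bin in the distribution table.
bins = [(5.5, 10.5), (10.5, 15.5), (15.5, 20.5), (20.5, 25.5),
        (25.5, 30.5), (30.5, 35.5), (35.5, 40.5)]

# Midpoint of a bin = (lower boundary + upper boundary) / 2.
midpoints = [(low + high) / 2 for low, high in bins]

print(midpoints)  # [8.0, 13.0, 18.0, 23.0, 28.0, 33.0, 38.0]
```

Pairing each midpoint with its bin's frequency gives the points that are joined to form the frequency polygon.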
Lesson Summary
In this lesson, you learned how to represent data that was presented in various forms. Data that could be represented as percentages was displayed in a pie chart, or circle graph. Discrete data that was qualitative was displayed on a bar graph. Finally, continuous data that was grouped was graphed on a histogram or on a frequency polygon. You also learned to detect characteristics of a distribution by simply observing the shape of a histogram.
color of cars, a person’s status, and favorite vacation spots. The following flow chart should help you to better understand the above terms.
Example 1
Select the best descriptions for the following variables and indicate your selections by marking an ‘x’ in the appropriate boxes.
Variable                           Quantitative   Qualitative   Discrete   Continuous
Number of members in a family
A person’s marital status
Length of a person’s arm
Color of cars
Number of errors on a math test
Solution:
Variable                           Quantitative   Qualitative   Discrete   Continuous
Number of members in a family      x                            x
A person’s marital status                         x
Length of a person’s arm           x                                       x
Color of cars                                     x
Number of errors on a math test    x                            x
Variables can also be classified as dependent or independent. When there is a linear relationship between 2 variables, the values of one variable depend upon the values of the other variable. In a linear relation, the values of y depend upon the values of x. Therefore, the dependent variable is represented by the values that are plotted on the y-axis, and the independent variable is represented by the values that are plotted on the x-axis.
Example 2
Sally works at the local ballpark stadium selling lemonade. She is paid $15.00 each time she works, plus $0.75 for each glass of lemonade she sells. Create a table of values to represent Sally’s earnings if she sells 8 glasses of lemonade. Use this table of values to represent her earnings on a graph.
Solution:
The first step is to write an equation to represent her earnings and then to use this equation to create a table of values.
y = 0.75x + 15, where y represents her earnings and x represents the number of glasses of lemonade she sells.
Number of Glasses of Lemonade    Earnings
0                                $15.00
1                                $15.75
2                                $16.50
3                                $17.25
4                                $18.00
5                                $18.75
6                                $19.50
7                                $20.25
8                                $21.00
The dependent variable is the money earned, and the independent variable is the number of glasses of lemonade sold. Therefore, money is on the y-axis, and the number of glasses of lemonade is on the x-axis.
From the table of values, Sally will earn $21.00 if she sells 8 glasses of lemonade.
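The table of values for this example can be generated from the equation y = 0.75x + 15. The following Python sketch is one way to do it; the function name `earnings` is chosen here for illustration.

```python
def earnings(glasses):
    """Sally's pay in dollars: y = 0.75x + 15."""
    return 0.75 * glasses + 15

# Whole numbers of glasses only, because the data is discrete.
for x in range(9):
    print(x, f"${earnings(x):.2f}")
```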
Now that the points have been plotted, the decision has to be made as to whether or not to join them. Between every 2 points plotted on the graph are an infinite number of values. If these values are meaningful to the problem, then the plotted points can be joined. This type of data is called continuous data. If the values between the 2 plotted points are not meaningful to the problem, then the points should not be joined. This type of data is called discrete data. Since glasses of lemonade are represented by whole numbers, and since fractions or decimals are not appropriate values, the points between 2 consecutive values are not meaningful in this problem. Therefore, the points should not be joined. The data is discrete.
Now it is time to revisit the problem presented in the introduction.
The local arena is trying to attract as many participants as possible to attend the community’s “Skate for Scoliosis” event. Participants pay a fee of $10.00 for registering, and, in addition, the arena will donate $3.00 for each hour a participant skates, up to a maximum of 6 hours. Create a table of values and draw a graph to represent a participant who skates for the entire 6 hours. How much money can a participant raise for the community if he/she skates for the maximum length of time?
Solution:
The equation for this scenario is y = 3x + 10, where y represents the money made by the participant, and x represents the number of hours the participant skates.
Number of Hours Skating    Money Earned
0                          $10.00
1                          $13.00
2                          $16.00
3                          $19.00
4                          $22.00
5                          $25.00
6                          $28.00
The dependent variable is the money made, and the independent variable is the number of hours the participant skated. Therefore, money is on the y-axis, and time is on the x-axis as shown below:
A participant who skates for the entire 6 hours can make $28.00 for the Skate for Scoliosis event. The points are joined, because the fractions and decimals between 2 consecutive points are meaningful for this problem. A participant could skate for 30 minutes, and the arena would pay that skater $1.50 for the time skating. The data is continuous.
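Because the skating data is continuous, the equation y = 3x + 10 can be evaluated at fractional numbers of hours as well as whole ones. The Python sketch below illustrates this; the function name `money_raised` and the `max_hours` parameter are names chosen here, with the 6-hour cap taken from the problem statement.

```python
def money_raised(hours, max_hours=6):
    """y = 3x + 10, with skating time capped at max_hours."""
    return 3 * min(hours, max_hours) + 10

print(money_raised(6))                      # 28
print(money_raised(0.5) - money_raised(0))  # 1.5
```

The second line shows the $1.50 that the arena donates for 30 minutes of skating, a value that lies between two plotted points.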
Linear graphs are important in statistics when several data sets are used to represent information about a single topic. An example would be data sets that represent different plans available for cell phone users. These data sets can be plotted on the same grid. The resulting graph will show intersection points for the plans. These intersection points indicate a coordinate where 2 plans are equal. An observer can easily interpret the graph to decide which plan is best, and when. If the observer is trying to choose a plan to use, the choice can be made easier by seeing a graphical representation of the data.
Example 3
The following graph represents 3 plans that are available to customers interested in hiring a maintenance company to tend to their lawn. Using the graph, explain when it would be best to use each plan for lawn maintenance.
Solution:
From the graph, the base fee that is charged for each plan is obvious. These values are found on the y-axis. Plan A charges a base fee of $2., Plan C charges a base fee of $., and Plan B charges a base fee of $5.. The cost per hour can be calculated by using the values of the intersection points and the base fee in the equation y = mx + b and solving for m. Plan B is the best plan to choose if the lawn maintenance takes less than 2.5 hours. At 2.5 hours, Plan B and Plan C both cost $5. for lawn maintenance. After 2.5 hours, Plan C is the best deal, until 5 hours of lawn maintenance is needed. At 5 hours, Plan A and Plan C both cost $3. for lawn maintenance. For more than 5 hours of lawn maintenance, Plan A is the best plan. All of the above information was obvious from the graph and would enhance the decision-making process for any interested client.
The above graphs represent linear functions, and are called linear (line) graphs. Each of these graphs has a defined slope that remains constant when the line is plotted. A variation of this graph is a broken-line graph. This type of line graph is used when it is necessary to show change over time. A line is used to join the values, but the line has no defined slope. However, the points are meaningful, and they all represent an important part of the graph. Usually a broken-line graph is given to you, and you must interpret the given information from the graph.
Example 4
#he follo"ing graph is an e(ample of a bro!en-line graph, and it represents the time of a
round-trip Aourney, dri)ing from home to a popular campground and bac!.
a) How far is it from home to the picnic park?
b) How far is it from the picnic park to the campground?
c) At what 2 places did the car stop?
d) How long was the car stopped at the campground?
e) When does the car arrive at the picnic park?
f) How long did it take for the return trip?
g) What was the speed of the car from home to the picnic park?
h) What was the speed of the car from the campground to home?
Solution:
a) It is 7 miles from home to the picnic park.
b) It is B miles from the picnic park to the campground.
c) The car stopped at the picnic park and at the campground.
d) The car was stopped at the campground for 5 minutes.
e) The car arrived at the picnic park at 3 am.
f) The return trip took 1 hour.
g) The speed of the car from home to the picnic park was 7 mi/h.
h) The speed of the car from the campground to home was mi/h.
Example 5
Sam decides to spend some time with his friend Aaron. He hops on his bike and starts off to Aaron’s house, but on his way, he gets a flat tire and must walk the remaining distance.
and then Sam returns home.
b) How far is it from Aaron’s house to the mall?
c) At what time did Sam have a flat tire?
d) How long did Sam stay at Aaron’s house?
e) At what speed did Sam travel from Aaron’s house to the mall and then from the mall to home?
Solution:
a) It is @5 km from Sam’s house to Aaron’s house.
b) It is 5 km from Aaron’s house to the mall.
c) Sam had a flat tire at 3 am.
d) Sam stayed at Aaron’s house for 1 hour.
e) Sam traveled at a speed of 2 km/h from Aaron’s house to the mall and then at a speed of 7 km/h from the mall to home.
The connection is obvious: when the price of peaches was high, the sales were low, but when the price was low, the sales were high.
The following scatter plot shows the sales of a weekly newspaper and the temperature:
There is no connection between the number of newspapers sold and the temperature.
Another term used to describe 2 sets of data that have a connection or a relationship is correlation. The correlation between 2 sets of data can be positive or negative, and it can be strong or weak. The following scatter plots will help to enhance this concept.
If you look at the 2 sketches that represent a positive correlation, you will notice that the points are around a line that slopes upward to the right. When the correlation is negative, the line slopes downward to the right. The 2 sketches that show a strong correlation have points that are bunched together and appear to be close to a line that is in the middle of the points. When the correlation is weak, the points are more scattered and not as concentrated.
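The descriptions above are visual. One common way to put a number on direction and strength is the Pearson correlation coefficient, which this lesson does not introduce. The sketch below is an illustration under assumptions: the data are hypothetical, and the cutoffs that separate "strong", "weak", and "no correlation" are arbitrary choices, not standard definitions.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of the paired samples xs and ys."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sxy / sqrt(sxx * syy)


def describe_correlation(r, strong_cutoff=0.7, none_cutoff=0.1):
    """Turn r into a verbal label; both cutoffs are arbitrary assumptions."""
    if abs(r) < none_cutoff:
        return "no correlation"
    strength = "strong" if abs(r) >= strong_cutoff else "weak"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction}"


# Hypothetical data: points lying exactly on an upward-sloping line.
r_up = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
print(describe_correlation(r_up))
```

A value of r near 1 corresponds to points bunched tightly around a line that slopes upward, a value near -1 to a line that slopes downward, and a value near 0 to points with no linear pattern.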
In the sales of newspapers and the temperature, there was no connection between the 2 data sets. The following sketches represent some other possible outcomes when there is no correlation between data sets:
Example 6
Plot the following points on a scatter plot, with m as the independent variable and n as the dependent variable. Number both axes from 0 to 20. If a correlation exists between the values of m and n, describe the correlation (strong negative, weak positive, etc.).
m 4 9 13 16 17 6 7 18 10
n 5 3 11 18 6 11 18 12 16
Solution:
Example 7
Describe the correlation, if any, in the following scatter plot:
Solution:
In the above scatter plot, there is a strong positive correlation.
You now know that a scatter plot can have either a positive or a negative correlation. When this exists on a scatter plot, a line of best fit can be drawn on the graph. The line of best fit must be drawn so that the sums of the distances to the points on either side of the line are approximately equal and such that there are an equal number of points above and below the line. Using a clear plastic ruler makes it easier to meet all of these conditions when drawing the line. Another useful tool is a stick of spaghetti, since it can be easily rolled and moved on the graph until you are satisfied with its location. The edge of the spaghetti can be traced to produce the line of best fit. A line of best fit can be used to make estimations from the graph, but you must remember that the line of best fit is simply a sketch of where the line should appear on the graph. As a result, any values that you choose from this line are not very accurate; the values are more of a ballpark figure.
Example 8
The following table consists of the marks achieved by 9 students on chemistry and math tests:
Student A B C D E F G H I
Chemistry Marks 76 7B 25 5F 5 5B 57 7B 52
Math Marks @6 @2 7 2F 2B 2 @7 ?
Plot the above marks on a scatter plot, with the chemistry marks on the x-axis and the math marks on the y-axis. Draw a line of best fit, and use this line to estimate the mark that Student I would have made in math had he or she taken the test.
Solution:
If Student I had taken the math test, his or her mark would have been between 32 and 37.
Scatter plots and lines of best fit can also be drawn by using technology. The TI-83 is capable of both graphing a scatter plot and inserting the line of best fit onto the scatter plot.
Example 9
Using the data from Example 8, create a scatter plot and draw a line of best fit with the TI-83.
Student A B C D E F G H I
Chemistry Marks 76 7B 25 5F 5 5B 57 7B 52
Math Marks @6 @2 7 2F 2B 2 @7 ?
Solution:
The calculator can now be used to determine a linear regression equation for the given values. The equation can be entered into the calculator, and the line will be plotted on the scatter plot.
From the line of best fit, the calculated value for Student I's math test mark was 33.6. Remember that the mark that you estimated was between 32 and 37.
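A linear regression equation of this kind is usually found by the method of least squares. The lesson does not show the calculation itself, so the Python sketch below is an assumption about what happens behind the calculator's result, not a description of its internals. The marks in it are hypothetical and are not the values from Example 8.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = m*x + b."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x-values are equal, so the slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical marks, chosen only to illustrate the calculation.
chemistry = [40, 45, 50, 55, 60]
math_marks = [22, 26, 27, 33, 32]
slope, intercept = least_squares_line(chemistry, math_marks)

# Estimate the math mark for a hypothetical chemistry mark of 52.
estimate = slope * 52 + intercept
print(round(slope, 2), round(intercept, 2), round(estimate, 2))
```

Unlike a line sketched by eye, the least-squares line is fully determined by the data, so two people working from the same marks get the same equation and the same estimate.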
Lesson Summary
In this lesson, you learned how to represent data by graphing a straight line of the form y=mx+b, and also by using a scatter plot and a line of best fit. Interpreting a broken-line graph was also presented in this lesson. You learned about correlation as it applies to a scatter plot and how to describe the correlation of a scatter plot. You also learned how to draw a line of best fit on a scatter plot and to use this line to make estimates from the graph. The final topic that was demonstrated in the lesson was how to use the TI-83 calculator to produce a scatter plot and how to graph a line of best fit by using linear regression.
Points to Consider
• Can any of these graphs be used for comparing data?
• Can the equation for the line of best fit be used to calculate values?
• Is any other graphical representation of data used for estimations?
Learning Objectives
• Construct and interpret a box-and-whisker plot.
• Use technology to create box-and-whisker plots.
Box-and-Whisker Plots
In traditional statistics, data is organized by using a frequency distribution. The results of the frequency distribution can then be used to create various graphs, such as a histogram or a frequency polygon, which indicate the shape or nature of the distribution. The shape of the distribution will allow you to confirm various conjectures about the nature of the data.
To examine data in order to identify patterns, trends, or relationships, exploratory data analysis is used. In exploratory data analysis, organized data is displayed in order to make decisions or suggestions regarding further actions. A box-and-whisker plot (often called a box plot) can be used to graphically represent the data set, and the graph involves plotting 5 specific values. The 5 specific values are often referred to as a five-number summary of the organized data set. The five-number summary consists of the following:
1. The lowest number in the data set (minimum value)
2. The lower quartile: Q1 (median of the first half of the data set)
3. The median of the entire data set (median)
4. The upper quartile: Q3 (median of the second half of the data set)
5. The highest number in the data set (maximum value)
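The five values listed above can be computed with a short Python sketch. It assumes the median-of-halves rule used in this lesson's worked examples: when the number of values is odd, the middle value is left out of both halves. Other quartile conventions exist and can give slightly different results. The function name is invented for this sketch.

```python
from statistics import median


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) using the median-of-halves rule."""
    data = sorted(values)
    n = len(data)
    if n < 2:
        raise ValueError("need at least 2 values")
    lower_half = data[: n // 2]        # values below the middle position
    upper_half = data[(n + 1) // 2 :]  # values above the middle position
    return (data[0], median(lower_half), median(data), median(upper_half), data[-1])


print(five_number_summary([12, 16, 36, 10, 31, 23, 58]))
print(five_number_summary([144, 240, 153, 629, 540, 300]))
```

The function sorts the data first, so the values can be passed in any order.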
The display of the five-number summary produces a box-and-whisker plot as shown below:
The above model of a box-and-whisker plot shows 2 horizontal lines (the whiskers) that each contain 25% of the data and are of the same length. In addition, it shows that the median of the data set is in the middle of the box, which contains 50% of the data. The lengths of the whiskers and the location of the median with respect to the center of the box are used to describe the distribution of the data. It's important to note that this is just an example. Not all box-and-whisker plots have the median in the middle of the box and whiskers of the same size.
Information about the data set that can be determined from the box-and-whisker plot with respect to the location of the median includes the following:
a) If the median is located in the center or near the center of the box, the distribution is approximately symmetric.
b) If the median is located to the left of the center of the box, the distribution is positively skewed.
c) If the median is located to the right of the center of the box, the distribution is negatively skewed.
Information about the data set that can be determined from the box-and-whisker plot with respect to the length of the whiskers includes the following:
a) If the whiskers are the same or almost the same length, the distribution is approximately symmetric.
b) If the right whisker is longer than the left whisker, the distribution is positively skewed.
c) If the left whisker is longer than the right whisker, the distribution is negatively skewed.
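The two sets of rules above can be expressed as a small Python sketch that works from a five-number summary. The `tolerance` parameter, which decides how close counts as "near the center" or "almost the same length", is an assumption made for this illustration; the lesson does not define a cutoff. The two rules can also disagree for the same data set.

```python
def skew_from_median(summary, tolerance=0.05):
    """Classify skew from where the median sits inside the box."""
    minimum, q1, med, q3, maximum = summary
    box_width = q3 - q1
    offset = med - (q1 + q3) / 2  # negative: median is left of the box center
    if abs(offset) <= tolerance * box_width:
        return "approximately symmetric"
    return "positively skewed" if offset < 0 else "negatively skewed"


def skew_from_whiskers(summary, tolerance=0.05):
    """Classify skew by comparing the lengths of the two whiskers."""
    minimum, q1, med, q3, maximum = summary
    left, right = q1 - minimum, maximum - q3
    if abs(right - left) <= tolerance * (maximum - minimum):
        return "approximately symmetric"
    return "positively skewed" if right > left else "negatively skewed"


print(skew_from_median((60, 70, 75, 95, 100)))
print(skew_from_whiskers((10, 12, 23, 36, 58)))
```

Each summary is passed as a tuple in the order minimum, Q1, median, Q3, maximum.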
The length of the whiskers also gives you information about how spread out the data is.
A box-and-whisker plot is often used when the number of data values is large. The center of the distribution, the nature of the distribution, and the range of the data are very obvious from the graph. The five-number summary divides the data into quarters by use of the medians of the upper and lower halves of the data. Remember that, unlike the mean, the median of the entire data set is not strongly affected by outliers, so it is the measure of central tendency that is most often used in exploratory data analysis.
Example 24
For the following data sets, determine the five-number summaries:
a) 12, 16, 36, 10, 31, 23, 58
b) 144, 240, 153, 629, 540, 300
Solution:
a) The first step is to organize the values in the data set as shown below:
12, 16, 36, 10, 31, 23, 58
10, 12, 16, 23, 31, 36, 58
Now complete the following list:
Minimum value → 10
Q1 → 12
Median → 23
Q3 → 36
Maximum value → 58
b) The first step is to organize the values in the data set as shown below:
144, 240, 153, 629, 540, 300
144, 153, 240, 300, 540, 629
Now complete the following list:
Minimum value → 144
Q1 → 153
Median → 270
Q3 → 540
Maximum value → 629
Example 25
Use the data set for Example 24 part a) and the five-number summary to construct a box-and-whisker plot to model the data set.
Solution:
The five-number summary can now be used to construct a box-and-whisker plot. Be sure to provide a scale on the number line that includes the range from the minimum value to the maximum value.
a) Minimum value → 10
Q1 → 12
Median → 23
Q3 → 36
Maximum value → 58
It is clearly visible that the right whisker is much longer than the left whisker. This indicates that the distribution is positively skewed.
Example 26
For each box-and-whisker plot, list the five-number summary and describe the distribution based on the location of the median.
Solution:
a) Minimum value → 4
Q1 → 6
Median → 9
Q3 → 10
Maximum value → 12
The median of the data set is located to the right of the center of the box, which indicates that the distribution is negatively skewed.
b) Minimum value → 225
Q1 → 250
Median → 300
Q3 → 325
Maximum value → 350
The median of the data set is located to the right of the center of the box, which indicates that the distribution is negatively skewed.
c) Minimum value → 60
Q1 → 70
Median → 75
Q3 → 95
Maximum value → 100
The median of the data set is located to the left of the center of the box, which indicates that the distribution is positively skewed.
Example 27
The numbers of square feet (in 100s) of 10 of the largest museums in the world are shown below:
650, 547, 204, 213, 343, 288, 222, 250, 287, 269
Construct a box-and-whisker plot for the above data set and describe the distribution.
Solution:
The first step is to organize the data values as follows:
20,400 21,300 22,200 25,000 26,900 28,700 28,800 34,300 54,700 65,000
Now calculate the median, Q1, and Q3.
Median → (26,900 + 28,700) / 2 = 55,600 / 2 = 27,800
Q1=22,200
Q3=34,300
Next, complete the following list:
Minimum value → 20,400
Q1 → 22,200
Median → 27,800
Q3 → 34,300
Maximum value → 65,000
The right whisker is longer than the left whisker, which indicates that the distribution is positively skewed.
The TI-83 or TI-84 can also be used to create a box-and-whisker plot. In the following examples, the TI-83 is used. In the next chapter, key strokes using the TI-84 will be presented to you. The five-number summary values can be determined by using the TRACE feature of the calculator or by using CALC and 1-Var Stats.
Example 28
The following numbers represent the number of siblings in each family for 15 randomly selected students:
4, 1, 2, 2, 5, 3, 4, 2, 6, 4, 6, 1, 7, 8, 4
Use technology to construct a box-and-whisker plot to display the data. List the five-number summary values.
Solution:
Note that when creating a box-and-whisker plot with a TI calculator, you don't have to actually sort the data. The calculator will sort the data automatically when creating the box-and-whisker plot.
The five-number summary can be obtained from the calculator in 2 ways.
1. The following results are obtained by simply using the TRACE feature and the left and right arrows:
The values at the bottom of each screen are the five-number summary.
2. The second method involves pressing STAT and using 1-Var Stats on the CALC menu for L1:
Many data sets contain values that are either extremely high values or extremely low values compared to the rest of the data values. These values are called outliers. There are several reasons why a data set may contain an outlier. Some of these are listed below:
1. The value may be the result of an error made in measurement or in observation. The researcher may have measured the variable incorrectly.
2. The value may simply be an error made by the researcher in recording the value. The value may have been written or typed incorrectly.
3. The value could be a result obtained from a subject not within the defined population. A rese