Quantitative Variables.docx


  • 8/20/2019 Quantitative Variables.docx

    1/86

    Quantitative Variables

    Author(s): David M. Lane

    Prerequisites: Variables

    1. Stem and Leaf Displays

    2. Histograms

    3. Frequency Polygons

    4. Box Plots

    5. Box Plot Demonstration

    6. Bar Charts

    7. Line Graphs

    8. Dot Plots

    As discussed in the section on variables in Chapter 1, quantitative variables are variables measured on a numeric scale. Height, weight, response time, subjective rating of pain, temperature, and score on an exam are all examples of quantitative variables. Quantitative variables are distinguished from categorical (sometimes called qualitative) variables such as favorite color, religion, city of birth, and favorite sport, in which there is no ordering or measuring involved.

    There are many types of graphs that can be used to portray distributions of quantitative variables. The upcoming sections cover the following types of graphs: (1) stem and leaf displays, (2) histograms, (3) frequency polygons, (4) box plots, (5) bar charts, (6) line graphs, (7) scatter plots (discussed in a different chapter), and (8) dot plots. Some graph types such as stem and leaf displays are best-suited for small to moderate amounts of data, whereas others such as histograms are best-suited for large amounts of data. Graph types such as box plots are good at depicting differences between distributions. Scatter plots are used to show the relationship between two variables.

    Stem and Leaf Displays

    Author(s): David M. Lane

    Prerequisites: Distributions

    Learning Objectives

    1. Create and interpret basic stem and leaf displays

    2. Create and interpret back-to-back stem and leaf displays

    3. Judge whether a stem and leaf display is appropriate for a given data set

    A stem and leaf display is a graphical method of displaying data. It is particularly useful when your data are not too numerous. In this section, we will explain how to construct and interpret this kind of graph.

    As usual, an example will get us started. Consider Table 1 that shows the number of touchdown passes (TD passes) thrown by each of the 31 teams in the National Football League in the 2000 season.

    Table 1. Number of touchdown passes.

    37, 33, 33, 32, 29, 28, 28, 23, 22,

    22, 22, 21, 21, 21, 20, 20, 19, 19,

    18, 18, 18, 18, 16, 15, 14, 14, 14,

    12, 12, 9, 6

    A stem and leaf display of the data is shown in Figure 1. The left portion of Figure 1 contains the stems. They are the numbers 3, 2, 1, and 0, arranged as a column to the left of the bars. Think of these numbers as 10's digits. A stem of 3, for example, can be used to represent the 10's digit in any of the numbers from 30 to 39. [...]


    [...] reserved for the numbers from 30 to 34 and holds the 32, 33, and 33 TD passes made by the next three teams in the table. You can see for yourself what the other rows represent.

    3|7

    3|233

    2|889

    2|001112223

    1|56888899

    1|22444

    0|69

    Figure 2. Stem and leaf display with the stems split in two.

    Figure 2 is more revealing than Figure 1 because the latter figure lumps too many values into a single row. Whether you should split stems in a display depends on the exact form of your data. If rows get too long with single stems, you might try splitting them into two or more parts.
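    The construction described above can be sketched in a few lines of Python. This is a minimal illustration, not the method used to draw the figures: the function name stem_and_leaf and its split parameter are assumptions introduced here, the values are assumed to be non-negative integers, and stems with no leaves are simply omitted.

```python
from collections import defaultdict


def stem_and_leaf(values, split=False):
    """Return the rows of a stem and leaf display for non-negative integers.

    With split=True each stem gets two rows: leaves 5-9 above leaves 0-4.
    """
    rows = defaultdict(list)
    for value in values:
        stem, leaf = divmod(value, 10)  # tens digit(s) and ones digit
        half = 1 if (split and leaf >= 5) else 0
        rows[(stem, half)].append(leaf)
    lines = []
    # Largest stem first; within a stem, the upper half (5-9) comes first.
    for stem, half in sorted(rows, reverse=True):
        leaves = "".join(str(leaf) for leaf in sorted(rows[(stem, half)]))
        lines.append(f"{stem}|{leaves}")
    return lines


# The 31 touchdown-pass counts from Table 1.
td_passes = [37, 33, 33, 32, 29, 28, 28, 23, 22, 22, 22, 21, 21, 21, 20, 20,
             19, 19, 18, 18, 18, 18, 16, 15, 14, 14, 14, 12, 12, 9, 6]

for line in stem_and_leaf(td_passes, split=True):
    print(line)
```

    With split=True the printed rows match the rows of Figure 2; calling stem_and_leaf(td_passes) without splitting gives one row per stem.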

    There is a variation of stem and leaf displays that is useful for comparing distributions. The two distributions are placed back to back along a common column of stems. The result is a "back-to-back stem and leaf graph." Figure 3 shows such a graph. It compares the numbers of TD passes in the 1[...]


    Figure 3. Back-to-back stem and leaf display. The left side shows the 1[...]


    [...] another subject was 27.4 milliseconds slower pronouncing aggressive words when they were preceded by weapon words.

    The data are displayed with stems and leaves in Figure 4. Since stem and leaf displays can only portray two whole digits (one for the stem and one for the leaf), the numbers are first rounded. Thus, the value 43.2 is rounded to 43 and represented with a stem of 4 and a leaf of 3. Similarly, 42.9 is rounded to 43. To represent negative numbers, we simply use negative stems. For example, the bottom row of the figure represents the number -27. The second-to-last row represents the numbers -10, -10, -15, etc. Once again, we have rounded the original values from Table 2.

     4|33

     3|6

     2|00456

     1|00134

     0|1245589

    -0|0679

    -1|005559

    -2|7

    Figure 4. Stem and leaf display with negative numbers and rounding.
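    The rounding and negative-stem convention can be sketched as follows. The helper name stem_and_leaf_digits is hypothetical, and the negative inputs below are illustrative values rather than entries quoted from Table 2. The stem is returned as a string so that "-0" and "0" stay distinct.

```python
def stem_and_leaf_digits(value):
    """Round a value and split it into a signed stem label and a leaf digit."""
    # Note: Python's round() resolves exact .5 ties to the nearest even integer.
    rounded = round(value)
    sign = "-" if rounded < 0 else ""
    stem, leaf = divmod(abs(rounded), 10)
    return f"{sign}{stem}", leaf


print(stem_and_leaf_digits(43.2))   # stem "4", leaf 3
print(stem_and_leaf_digits(42.9))   # also stem "4", leaf 3
print(stem_and_leaf_digits(-27.4))  # stem "-2", leaf 7
print(stem_and_leaf_digits(-6.0))   # stem "-0", leaf 6
```

    Keeping the sign on the stem rather than on the leaf is what allows a separate "-0" row for values between -1 and -9.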

    Observe that the figure contains a row headed by "0" and another headed by "-0." The stem of 0 is for numbers between 0 and 9 [...]


    Figure 5. Stem and leaf display of populations of 185 US cities with populations between 100,000 and 500,000 in 1[...]


    Question 1 out of 7.

    A stem and leaf display is a good method of displaying large amounts of data.

    True

    False

    Stem and leaf displays can be unwieldy with large amounts of data because every single data value is shown in the figure.

    Histograms

    Author(s): David M. Lane

    Prerequisites: Distributions, Graphing Qualitative Data

    Learning Objectives

    1. Create a grouped frequency distribution

    2. Create a histogram based on a grouped frequency distribution

    3. Determine an appropriate bin width

    A histogram is a graphical method for displaying the shape of a distribution. It is particularly useful when there are a large number of observations. We begin with an example consisting of the scores of 642 students on a psychology test. The test consists of 1[...]


    The first step is to create a frequency table. Unfortunately, a simple frequency table would be too big, containing over 100 rows. To simplify the table, we group scores together as shown in Table 1.

    Table 1. Grouped Frequency Distribution of Psychology Test Scores

    Interval's Lower Limit   Interval's Upper Limit   Class Frequency

    39.5 49.5 3

    49.5 59.5 10

    59.5 69.5 53

    69.5 79.5 107

    79.5 89.5 147

    89.5 99.5 130

    99.5 109.5 78

    109.5 119.5 59

    119.5 129.5 36

    129.5 139.5 11

    139.5 149.5 6

    149.5 159.5 1

    159.5 169.5 1

     

    To create this table, the range of scores was broken into intervals, called class intervals. The first interval is from 39.5 [...]


    More information on choosing the widths of class intervals is presented later in this section. Placing the limits of the class intervals midway between two numbers (e.g., 4[...] We'll have more to say about shapes of distributions in the chapter "Summarizing Distributions."

    In our example, the observations are whole numbers. Histograms can also be used when the scores are measured on a more continuous scale such as the length of time (in milliseconds) required to perform a task. In this case, there is no need to worry about fence-sitters since they are improbable. (It would be quite a coincidence for a task to require exactly 7 seconds, measured to the nearest thousandth of a second.) We are therefore free to choose whole numbers as boundaries for our class intervals, for example, 4000, 5000, etc. The class frequency is then the number of observations that are greater than or equal to the lower bound, and strictly less than the upper bound. For example, one interval might hold times from 4000 to 4[...]


    [...] seemed clearest. The best advice is to experiment with different choices of width, and to choose a histogram according to how well it communicates the shape of the distribution.

    To provide experience in constructing histograms, we have developed an interactive demonstration. The demonstration reveals the consequences of different choices of bin width and of lower boundary for the first interval.
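    Experimenting with bin widths is easy to do in code. The sketch below counts observations in class intervals that include the lower bound and exclude the upper bound, as described for continuous measurements. The function class_frequencies and the response times are hypothetical and serve only as an illustration.

```python
import math
from collections import Counter


def class_frequencies(values, width, start):
    """Count values in half-open class intervals [lower, lower + width)."""
    counts = Counter()
    for value in values:
        index = math.floor((value - start) / width)
        counts[start + index * width] += 1  # key is the interval's lower bound
    return dict(sorted(counts.items()))


# Hypothetical response times in milliseconds (not data from the text).
times_ms = [4000, 4120, 4999, 5000, 5480, 6250, 6999, 7000]

print(class_frequencies(times_ms, width=1000, start=4000))
print(class_frequencies(times_ms, width=2000, start=4000))
```

    With a width of 1000 the value 4999 falls in the interval starting at 4000 while 5000 starts the next one; doubling the width merges neighboring intervals and changes the apparent shape.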

    Frequency Polygons

    Author(s): David M. Lane

    Prerequisites: Histograms

    Learning Objectives

    1. Create and interpret frequency polygons

    2. Create and interpret cumulative frequency polygons

    3. Create and interpret overlaid frequency polygons

    Frequency polygons are a graphical device for understanding the shapes of distributions. They serve the same purpose as histograms, but are especially helpful for comparing sets of data. Frequency polygons are also a good choice for displaying cumulative frequency distributions.

    To create a frequency polygon, start just as for histograms, by choosing a class interval. Then draw an X-axis representing the values of the scores in your data. Mark the middle of each class interval with a tick mark, and label it with the middle value represented by the class. Draw the Y-axis to indicate the frequency of each class. Place a point in the middle of each class interval at the height corresponding to its frequency. Finally, connect the points. You should include one class interval below the lowest value in your data and one above the highest value. The graph will then touch the X-axis on both sides.

    A frequency polygon for 642 psychology test scores shown in Figure 1 was constructed from the frequency table shown in Table 1.

    Table 1. Frequency Distribution of Psychology Test Scores.

    Lower Limit   Upper Limit   Count   Cumulative Count

    29.5 39.5 0 0

    39.5 49.5 3 3

    49.5 59.5 10 13

    59.5 69.5 53 66

    69.5 79.5 107 173

    79.5 89.5 147 320

    89.5 99.5 130 450

    99.5 109.5 78 528

    109.5 119.5 59 587

    119.5 129.5 36 623

    129.5 139.5 11 634

    139.5 149.5 6 640

    149.5 159.5 1 641

    159.5 169.5 1 642

    169.5 179.5 0 642

    The first label on the X-axis is 35. This represents an interval extending from 29.5 [...]


    [...] distribution is not symmetric inasmuch as good scores (to the right) trail off more gradually than poor scores (to the left). In the terminology of Chapter 3 (where we will study shapes of distributions more systematically), the distribution is skewed.

    Figure 1. Frequency polygon for the psychology test scores.

    A cumulative frequency polygon for the same test scores is shown in Figure 2. The graph is the same as before except that the Y value for each point is the number of students in the corresponding class interval plus all numbers in lower intervals. For example, there are no scores in the interval labeled "35," three in the interval "45," and 10 in the interval "55." Therefore, the Y value corresponding to "55" is 13. Since 642 students took the test, the cumulative frequency for the last interval is 642.
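    The cumulative counts are running totals of the class counts. As a small check, the sketch below recomputes the Cumulative Count column of Table 1 from its Count column; the variable names are ours.

```python
from itertools import accumulate

# Lower limits and counts copied from Table 1 (intervals are 10 points wide).
lower_limits = [29.5 + 10 * i for i in range(15)]
counts = [0, 3, 10, 53, 107, 147, 130, 78, 59, 36, 11, 6, 1, 1, 0]

# Running total: each entry is its own count plus all counts in lower intervals.
cumulative = list(accumulate(counts))

for lower, count, total in zip(lower_limits, counts, cumulative):
    print(f"{lower:6.1f} to {lower + 10:6.1f}  {count:4d}  {total:4d}")
```

    The third running total is 13, matching the worked example, and the last is 642, the number of students.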


    Figure 2. Cumulative frequency polygon for the psychology test scores.

    Frequency polygons are useful for comparing distributions. This is achieved by overlaying the frequency polygons drawn for different data sets. Figure 3 provides an example. The data come from a task in which the goal is to move a computer cursor to a target on the screen as fast as possible. On 20 of the trials, the target was a small rectangle; on the other 20, the target was a large rectangle. Time to reach the target was recorded on each trial. The two distributions (one for each target) are plotted together in Figure 3. The figure shows that, although there is some overlap in times, it generally took longer to move the cursor to the small target than to the large one.


    Figure 3. Overlaid frequency polygons.

    It is also possible to plot two cumulative frequency distributions in the same graph. This is illustrated in Figure 4 using the same data from the cursor task. The difference in distributions for the two targets is again evident.


    Figure 4. Overlaid cumulative frequency polygons.

    You might be curious about your own performance in the cursor task. Try the task yourself, and compare your times with ours.

    Question 1.

    A frequency polygon is very similar to a

    histogram

    stem and leaf display

    listing of raw data

    Frequency polygons do not list the raw data, as stem and leaf plots do. Frequency polygons are very similar to histograms, except histograms have bars and frequency polygons have dots and lines connecting the frequencies of each class interval.

    Box Plots

    Author(s): David M. Lane

    Prerequisites: Percentiles, Histograms, Frequency Polygons

    Learning Objectives

    1. Define basic terms including hinges, H-spread, step, adjacent value, outside value, and far out value

    2. Create a box plot

    3. Create parallel box plots

    4. Determine whether a box plot is appropriate for a given data set

    We have already discussed techniques for visually representing data (see histograms and frequency polygons). In this section, we present another important graph called a box plot. Box plots are useful for identifying outliers and for comparing distributions. We will explain box plots with the help of data from an in-class experiment. As part of the "Stroop Interference Case Study," students in introductory statistics were presented with a page containing 30 colored rectangles. Their task was to name the colors as quickly as possible. Their times (in seconds) were recorded. We'll compare the scores for the 16 men and 31 women who participated in the experiment by making separate box plots for each gender. Such a display is said to involve parallel box plots.


    There are several steps in constructing a box plot. The first relies on the 25th, 50th, and 75th percentiles in the distribution of scores. Figure 1 shows how these three statistics are used. For each gender, we draw a box extending from the 25th percentile to the 75th percentile. The 50th percentile is drawn inside the box. Therefore,

    the bottom of each box is the 25th percentile,

    the top is the 75th percentile,

    and the line in the middle is the 50th percentile.

    The data for the women in our sample are shown in Table 1.

    Table 1. Women's times.

    14  15  16  16  17  17  17  17

    17  18  18  18  18  18  18  19

    19  19  20  20  20  20  20  20

    21  21  22  23  24  24  29

    For these data, the 25th percentile is 17, the 50th percentile is 19 [...]


    Figure 1. The first step in creating box plots.

    Before proceeding, the terminology in Table 2 is helpful.

    Table 2. Box plot terms and values for women's times.

    Name                Formula                                                       Value

    Upper Hinge         75th Percentile                                               20

    Lower Hinge         25th Percentile                                               17

    H-Spread            Upper Hinge - Lower Hinge                                     3

    Step                1.5 x H-Spread                                                4.5

    Upper Inner Fence   Upper Hinge + 1 Step                                          24.5

    Lower Inner Fence   Lower Hinge - 1 Step                                          12.5

    Upper Outer Fence   Upper Hinge + 2 Steps                                         29

    Lower Outer Fence   Lower Hinge - 2 Steps                                         8

    Upper Adjacent      Largest value below Upper Inner Fence                         24

    Lower Adjacent      Smallest value above Lower Inner Fence                        14

    Outside Value       A value beyond an Inner Fence but not beyond an Outer Fence   29

    Far Out Value       A value beyond an Outer Fence                                 None
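    The entries in Table 2 can be reproduced from the women's times with a short script. This is a sketch under one stated assumption: the hinges are computed with Python's statistics.quantiles using the "exclusive" method, which happens to give 17 and 20 for these data; the text does not say which percentile formula it uses. The function name box_plot_terms and the dictionary keys are ours.

```python
from statistics import quantiles

# Women's times in seconds, from Table 1.
womens_times = [14, 15, 16, 16, 17, 17, 17, 17, 17, 18, 18, 18, 18, 18, 18,
                19, 19, 19, 20, 20, 20, 20, 20, 20, 21, 21, 22, 23, 24, 24, 29]


def box_plot_terms(values):
    """Compute the box plot terms of Table 2 for a list of numbers."""
    data = sorted(values)
    # Assumption: quartiles via the "exclusive" method stand in for the hinges.
    lower_hinge, median, upper_hinge = quantiles(data, n=4, method="exclusive")
    step = 1.5 * (upper_hinge - lower_hinge)
    upper_inner, lower_inner = upper_hinge + step, lower_hinge - step
    upper_outer, lower_outer = upper_hinge + 2 * step, lower_hinge - 2 * step
    return {
        "upper_hinge": upper_hinge,
        "lower_hinge": lower_hinge,
        "median": median,
        "h_spread": upper_hinge - lower_hinge,
        "step": step,
        "upper_inner_fence": upper_inner,
        "lower_inner_fence": lower_inner,
        "upper_outer_fence": upper_outer,
        "lower_outer_fence": lower_outer,
        "upper_adjacent": max(v for v in data if v < upper_inner),
        "lower_adjacent": min(v for v in data if v > lower_inner),
        # Beyond an inner fence but not beyond the matching outer fence.
        "outside": [v for v in data
                    if upper_inner < v <= upper_outer
                    or lower_outer <= v < lower_inner],
        "far_out": [v for v in data if v > upper_outer or v < lower_outer],
    }


for name, value in box_plot_terms(womens_times).items():
    print(f"{name}: {value}")
```

    Note that the value 29 sits exactly on the upper outer fence, so it counts as an outside value rather than a far out value.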

    Continuing with the box plots, we put "whiskers" above and below each box to give additional information about the spread of the data. Whiskers are vertical lines that end in a horizontal stroke. Whiskers are drawn from the upper and lower hinges to the upper and lower adjacent values (24 and 14 for the women's data).

    Figure 2. The box plots with the whiskers drawn.


    Although we don't draw whiskers all the way to outside or far out values, we still wish to represent them in our box plots. This is achieved by adding additional marks beyond the whiskers. Specifically, outside values are indicated by small "o's" and far out values are indicated by asterisks (*). In our data, there are no far out values and just one outside value. This outside value of 29 is for the women and is shown in Figure 3.

    Figure 3. The box plots with the outside value shown.

    There is one more mark to include in box plots (although sometimes it is omitted). We indicate the mean score for a group by inserting a plus sign. Figure 4 shows the result of adding means to our box plots.


    Figure 4. The completed box plots.

    Figure 4 provides a revealing summary of the data. Since half the scores in a distribution are between the hinges (recall that the hinges are the 25th and 75th percentiles), we see that half the women's times are between 17 and 20 seconds, whereas half the men's times are between 19 and 25.5. We also see that women generally named the colors faster than the men did, although one woman was slower than almost all of the men. Figure 5 shows the box plot for the women's data with detailed labels.


    Figure 5. The box plot for the women's data with detailed labels.

    Box plots provide basic information about a distribution. For example, a distribution with a positive skew would have a longer whisker in the positive direction than in the negative direction. A larger mean than median would also indicate a positive skew. Box plots are good at portraying extreme values and are especially good at showing differences between distributions. However, many of the details of a distribution are not revealed in a box plot, and to examine these details one should create a histogram and/or a stem and leaf display.

    Here are some other examples of box plots:

    Time to move the mouse over a target

    Draft lottery

    VARIATIONS ON BOX PLOTS

    Statistical analysis programs may offer options on how box plots are created. For example, the box plots in Figure 6 are constructed from our data but differ from the previous box plots in several ways.

    1. It does not mark outliers.

    2. The means are indicated by green lines rather than plus signs.

    3. The mean of all scores is indicated by a gray line.

    4. Individual scores are represented by dots. Since the scores have been rounded to the nearest second, any given dot might represent more than one score.

    5. The box for the women is wider than the box for the men because the widths of the boxes are proportional to the number of subjects of each gender (31 women and 16 men).

    Figure 6. Box plots showing the individual scores and the means.

    Each dot in Figure 6 represents a group of subjects with the same score (rounded to the nearest second). An alternative graphing technique is to jitter the points. This means spreading out different dots at the same horizontal position, one dot for each subject. The exact horizontal position of a dot is determined randomly (under the constraint that different dots don't overlap exactly). Spreading out the dots helps you to see multiple occurrences of a given score. However, depending on the dot size and the screen resolution, some points may be obscured even if the points are jittered. Figure 7 shows what jittering looks like.
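    Jittering can be sketched as adding a small random horizontal offset to each dot. The function name jitter, the width and seed parameters, and the sample scores below are all assumptions made for illustration; plotting programs implement this in their own ways.

```python
import random


def jitter(scores, width=0.3, seed=0):
    """Pair each score with a random horizontal offset in [-width, width]."""
    rng = random.Random(seed)  # fixed seed so the layout is repeatable
    used = set()
    points = []
    for score in scores:
        offset = rng.uniform(-width, width)
        # Redraw in the unlikely case that two dots would coincide exactly.
        while (offset, score) in used:
            offset = rng.uniform(-width, width)
        used.add((offset, score))
        points.append((offset, score))
    return points


# Hypothetical scores: three subjects share the score 18.
for x, y in jitter([18, 18, 18, 20]):
    print(f"x offset {x:+.3f}, score {y}")
```

    Each subject keeps its own dot, so the three scores of 18 appear as three separate points instead of one.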


    Figure 7. Box plots with the individual scores jittered.

    Different styles of box plots are best for different situations, and there are no firm rules for which to use. When exploring your data, you should try several ways of visualizing them. Which graphs you include in your report should depend on how well different graphs reveal the aspects of the data you consider most important.

    Question 1.

    What is the upper hinge?

    B

    C

    D

    F

    The upper hinge is the 75th percentile. It is the top of the box.

    Bar Charts

    Author(s): David M. Lane

    Prerequisites: Graphing Qualitative Variables

    Learning Objectives

    1. Create and interpret bar charts

    2. Judge whether a bar chart or another graph such as a box plot would be more appropriate

    In the section on qualitative variables, we saw how bar charts could be used to illustrate the frequencies of different categories. For example, the bar chart shown in Figure 1 shows how many purchasers of iMac computers were previous Macintosh users, previous Windows users, and new computer purchasers.

    Figure 1. iMac buyers as a function of previous computer ownership.

    In this section, we show how bar charts can be used to present other kinds of quantitative information, not just frequency counts. The bar chart in Figure 2 shows the percent increases in the Dow Jones, Standard and Poor 500 (S & P), and Nasdaq stock indexes from May 24th 2000 to May 24th 2001. Notice that both the S & P and the Nasdaq had "negative increases" which means that they decreased in value. In this bar chart, the Y-axis is not frequency but rather the signed quantity percentage increase.

    Figure 2. Percent increase in three stock indexes from May 24th 2000 to May 24th 2001.
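    The signed quantity plotted on the Y-axis can be written as a one-line formula. The sketch below uses made-up index values, not the actual Dow Jones, S & P, or Nasdaq figures, and the function name percent_increase is ours.

```python
def percent_increase(old, new):
    """Signed percent change from old to new; negative means a decrease."""
    return (new - old) * 100 / old


# Hypothetical index values, chosen only to show both signs.
print(percent_increase(1000.0, 1050.0))  # a rise gives a positive value
print(percent_increase(1000.0, 900.0))   # a fall gives a "negative increase"
```

    A bar for the second case would extend below zero, which is how a bar chart can show a decrease.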

    Bar charts are particularly effective for showing change over time. Figure 3, for example, shows the percent increase in the Consumer Price Index (CPI) over four three-month periods. The fluctuation in inflation is apparent in the graph.

    Figure 3. Percent change in the CPI over time. Each bar represents percent increase for the three months ending at the date indicated.

    Bar charts are often used to compare the means of different experimental conditions. Figure 4 shows the mean time it took one of us (DL) to move the mouse to either a small target or a large target. On average, more time was required for small targets than for large ones.

    Figure 4. Bar chart showing the means for the two conditions.

    Although bar charts can display means, we do not recommend them for this purpose. Box plots should be used instead since they provide more information than bar charts without taking up more space. For example, a box plot of the mouse-movement data is shown in Figure 5. You can see that Figure 5 reveals more about the distribution of movement times than does Figure 4.

    Figure 5. Box plots of times to move the mouse to the small and large targets.


    The section on qualitative variables presented earlier in this chapter discussed the use of bar charts for comparing distributions. Some common graphical mistakes were also noted. The earlier discussion applies equally well to the use of bar charts to display quantitative variables.

    Question 1.

    Bar charts can only be used for qualitative variables.

    True

    False

    Although bar charts can be used for qualitative variables, they can also portray quantitative variables.

    Line Graphs

    Author(s): David M. Lane

    Prerequisites: Bar Graphs

    Learning Objectives

    1. Create and interpret line graphs

    2. Judge whether a line graph would be appropriate for a given data set

    A line graph is a bar graph with the tops of the bars represented by points joined by lines (the rest of the bar is suppressed). For example, Figure 1 was presented in the section on bar charts and shows changes in the Consumer Price Index (CPI) over time.

    Figure 1. A bar chart of the percent change in the CPI over time. Each bar represents percent increase for the three months ending at the date indicated.

    A line graph of these same data is shown in Figure 2. Although the figures are similar, the line graph emphasizes the change from period to period.


    Figure 2. A line graph of the percent change in the CPI over time. Each point represents percent increase for the three months ending at the date indicated.

    Line graphs are appropriate only when both the X- and Y-axes display ordered (rather than qualitative) variables. Although bar graphs can also be used in this situation, line graphs are generally better at comparing changes over time. Figure 3, for example, shows percent increases and decreases in five components of the Consumer Price Index (CPI). The figure makes it easy to see that medical costs had a steadier progression than the other components. Although you could create an analogous bar chart, its interpretation would not be as easy.


    Figure 3. A line graph of the percent change in five components of the CPI over time.

    Let us stress that it is misleading to use a line graph when the X-axis contains merely qualitative variables. Figure 4 inappropriately shows a line graph of the card game data from Yahoo, discussed in the section on qualitative variables. The defect in Figure 4 is that it gives the false impression that the games are naturally ordered in a numerical way.


    Figure 4. A line graph, inappropriately used, depicting the number of people playing different card games on Sunday and Wednesday.

    Question 1.

    Line graphs are most similar to

    bar charts.

    histograms.

    stem and leaf displays.

    frequency polygons.

    A line graph is a bar graph with the tops of the bars represented by points joined by lines.

    Dot Plots

    Author(s): David M. Lane

    Prerequisites: Bar Charts

    Learning Objectives

    1. Create and interpret dot plots

    2. Judge whether a dot plot would be appropriate for a given data set

    Dot plots can be used to display various types of information. Figure 1 uses a dot plot to display the number of M & M's of each color found in a bag of M & M's. Each dot represents a single M & M. From the figure, you can see that there were 3 blue M & M's, 19 brown M & M's, etc.

    Figure 1. A dot plot showing the number of M & M's of various colors in a bag of M & M's.


    The dot plot in Figure 2 shows the number of people playing various card games on the Yahoo website on a Wednesday. Unlike Figure 1, the location rather than the number of dots represents the frequency.

    Figure 2. A dot plot showing the number of people playing various card games on a Wednesday.

    The dot plot in Figure 3 shows the number of people playing on a Sunday and on a Wednesday. This graph makes it easy to compare the popularity of the games separately for the two days, but does not make it easy to compare the popularity of a given game on the two days.

    Figure 3. A dot plot showing the number of people playing various card games on a Sunday and on a Wednesday.

    Figure 4. An alternate way of showing the number of people playing various card games on a Sunday and on a Wednesday.

    The dot plot in Figure 4 makes it easy to compare the days of the week for specific games while still portraying differences among games.

    Question 1: Dot plots are typically used to represent frequencies.

    True

    False

    Although dot plots could be used to represent statistics such as means, it is not recommended. They are typically used for frequencies.

    Statistical Literacy

    Author(s): Seyd Ercan and David Lane

    Are Commercial Vehicles in Texas Unsafe?

    Prerequisites: Graphing Distributions


    A news report on the safety of commercial vehicles in Texas stated that one out of five commercial vehicles have been pulled off the road in 2012 because they were unsafe. In addition, 12,301 commercial drivers have been banned from the road for safety violations.

    The author presents the bar chart below to provide information about the percentage of fatal crashes involving commercial vehicles in Texas since 2006. The author also quotes DPS director Steven McCraw:

    Commercial vehicles are responsible for approximately 15 percent of the fatalities in Texas crashes. Those who choose to drive unsafe commercial vehicles or drive a commercial vehicle unsafely pose a serious threat to the motoring public.

    What do you think?

    Based on what you have learned in this chapter, does this bar chart provide enough information to conclude that unsafe or unsafely driven commercial vehicles pose a serious threat to the motoring public? What might you conclude if 30 percent of all the vehicles on the roads of Texas in 2010 were commercial and accounted for 16 percent of fatal crashes?

    This bar chart does not provide enough information to draw such a conclusion because we don't know, on the average, in a given year what percentage of all vehicles on the road are commercial vehicles. For example, if 30 percent of all the vehicles on the roads of Texas in 2010 are commercial ones and only 16 percent of fatal crashes involved commercial vehicles, then commercial vehicles are safer than non-commercial ones. Note that in this case 70 percent of vehicles are non-commercial and they are responsible for 84 percent of the fatal crashes.
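The comparison in this answer can be checked with a few lines of arithmetic. The following is a minimal sketch that uses only the hypothetical 30 percent and 16 percent figures from the example; the function name `crash_rate_ratio` is an illustrative assumption, not something taken from the news report.

```python
def crash_rate_ratio(vehicle_share, crash_share):
    """Return a group's share of fatal crashes divided by its share of vehicles.

    A value below 1 means the group is involved in fewer fatal crashes
    than its share of the vehicles on the road would suggest.
    """
    return crash_share / vehicle_share


# Hypothetical figures from the example, as percentages of the total.
commercial = crash_rate_ratio(vehicle_share=30, crash_share=16)
non_commercial = crash_rate_ratio(vehicle_share=100 - 30, crash_share=100 - 16)

print(f"commercial:     {commercial:.2f}")
print(f"non-commercial: {non_commercial:.2f}")
```

Under these assumed figures the ratio for commercial vehicles is below 1 and the ratio for non-commercial vehicles is above 1, which is the sense in which commercial vehicles would be the safer group.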

    Linear By Design

    Prerequisites: Graphing Distributions

    Fox News aired the line graph below showing the number unemployed during four quarters between 2007 and 2010.

    What do you think?

    Does Fox News' line graph provide misleading information? Why or why not?

    There are major flaws with the Fox News graph. First, the title of the graph is misleading. Although the data show the number unemployed, Fox News' graph is titled "Job Loss by Quarter." Second, the intervals on the X-axis are misleading. Although there are 6 months between September 2008 and March 2009 and 15 months between March 2009 and June 2010, the intervals are represented in the graph by very similar lengths. This gives the false impression that unemployment increased steadily.

    The graph presented below is corrected so that distances on the X-axis are proportional to the number of days between the dates. This graph shows clearly that the rate of increase in the number unemployed is greater between September 2008 and March 2009 than it is between March 2009 and June 2010.
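To see what "proportional to the number of days" means in practice, the sketch below computes X-axis positions for the three dates named in the text. The exact days are not given, so the first day of each month is assumed here; the helper name `x_positions` is likewise only illustrative.

```python
from datetime import date

# Assumed dates: the first day of each month named in the text.
points = [date(2008, 9, 1), date(2009, 3, 1), date(2010, 6, 1)]


def x_positions(dates):
    """Return each date's distance in days from the first date."""
    start = dates[0]
    return [(d - start).days for d in dates]


positions = x_positions(points)
# Gap between each pair of consecutive dates, in days.
gaps = [b - a for a, b in zip(positions, positions[1:])]

print(positions)  # [0, 181, 638]
print(gaps)       # [181, 457]
```

With these assumed dates the second interval is roughly two and a half times as long as the first, so drawing the two intervals with similar lengths compresses the later period.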

    Learning Objectives

    • Construct a stem-and-leaf plot.

    • Understand the importance of a stem-and-leaf plot in statistics.

    • Construct and interpret a pie chart.

    • Construct and interpret a bar graph.

    • Create a frequency distribution chart.

    • Construct and interpret a histogram.

    • Use technology to create graphical representations of data.


    What is the puppet doing? She can't be cutting a pizza, because the pieces are all different colors and sizes. It seems like she is drawing some type of a display to show different amounts of a whole circle. The colors must represent different parts of the whole. As you proceed through this lesson, refer back to this picture so that you will be able to create a meaningful and detailed answer to the question, "What is the puppet doing?"

    Pie Charts

    Pie charts, or circle graphs, are used extensively in statistics. These graphs appear often in newspapers and magazines. A pie chart shows the relationship of the parts to the whole by visually comparing the sizes of the sections (slices). Pie charts can be constructed by using a hundreds disk or by using a circle. The hundreds disk is built on the concept that the whole of anything is 100%, while the circle is built on the concept that 360° is the whole of anything. Both methods of creating a pie chart are acceptable, and both will produce the same result. The sections have different colors to enable an observer to clearly see the differences in the sizes of the sections. The following example will first be done by using a hundreds disk and then by using a circle.

    Example 10 

    The Red Cross Blood Donor Clinic had a very successful morning collecting blood donations. Within 3 hours, 25 people had made donations, and the following is a table showing the blood types of the donations:

    Blood Type          A    B    O    AB
    Number of donors    7    5    9    4

    Construct a pie chart to represent the data.

    Solution:

    Step 1: Determine the total number of donors: $7 + 5 + 9 + 4 = 25$.

    Step 2: Express each donor number as a percent of the whole by using the formula $\text{Percent} = \frac{f}{n} \cdot 100\%$, where $f$ is the frequency and $n$ is the total number.

    $\frac{7}{25} \cdot 100\% = 28\%$, $\frac{5}{25} \cdot 100\% = 20\%$, $\frac{9}{25} \cdot 100\% = 36\%$, $\frac{4}{25} \cdot 100\% = 16\%$

    Step 3: Use a hundreds disk and simply count the correct number of lines for each blood type (1 line = 1 percent).

    Step 4: Graph each section. Write the name and correct percentage inside the section. Color each section a different color.

    The above pie chart was created by using a hundreds disk, which is a circle with 100 divisions in groups of 5. Each division (line) represents 1 percent. From the graph, you can see that more donations were of Type O than any other type. The fewest donations of blood collected were of Type AB. If the percentages had not been entered in each section, these same conclusions could have been made based simply on the size of each section.

    Solution:

    Step 1: Determine the total number of donors: $7 + 5 + 9 + 4 = 25$.

    Step 2: Express each donor number as the number of degrees of a circle that it represents by using the formula $\text{Degrees} = \frac{f}{n} \cdot 360^\circ$, where $f$ is the frequency and $n$ is the total number.

    $\frac{7}{25} \cdot 360^\circ = 100.8^\circ$, $\frac{5}{25} \cdot 360^\circ = 72^\circ$, $\frac{9}{25} \cdot 360^\circ = 129.6^\circ$, $\frac{4}{25} \cdot 360^\circ = 57.6^\circ$

    Step 3: Using a protractor, graph each section of the circle.

    Step 4: Write the name and correct percentage inside each section. Color each section a different color.


    The above pie chart was created by using a protractor and graphing each section of the circle according to the number of degrees needed. From the graph, you can see that more donations were of Type O than any other type. The fewest donations of blood collected were of Type AB. Notice that the percentages have been entered in each section of the graph and not the numbers of degrees. This is because degrees would not be meaningful to an observer trying to interpret the graph. In order to create a pie chart by using a circle, it is necessary to use the formula to calculate the number of degrees for each section, and in order to create a pie chart by using a hundreds disk, it is necessary to use the formula to determine the percentage for each section. In the end, however, both methods result in identical graphs.
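Both formulas from this example can be applied in one pass. The sketch below is only an illustration: the function name `pie_slices` is an assumption, and the blood-donor counts are the ones given in the example's table.

```python
def pie_slices(counts):
    """Map each category to (percent of the whole, degrees of the circle)."""
    total = sum(counts.values())
    return {
        name: (f * 100 / total, f * 360 / total)
        for name, f in counts.items()
    }


donors = {"A": 7, "B": 5, "O": 9, "AB": 4}
slices = pie_slices(donors)

for blood_type, (percent, degrees) in slices.items():
    print(f"Type {blood_type}: {percent:.0f}% of the disk, {degrees:.1f} degrees")
```

Because the percent and the degrees for a section are both proportional to the same fraction f/n, the hundreds disk and the circle necessarily give sections of the same size.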

    Example 11

    A new restaurant is opening in town, and the owner is trying very hard to complete the menu. He wants to include a choice of 5 salads and has presented his partner with the following pie chart to represent the results of a recent survey that he conducted of the town's people. The survey asked the question, "What is your favorite kind of salad?"

    Use the pie chart to answer the following questions:

    1. Which salad was the most popular choice?

    2. Which salad was the least popular choice?

    3. If 300 people were surveyed, how many people chose each type of salad?

    4. What is the difference between the number of people who chose the spinach salad and the number of people who chose the garden salad?

    Solution:

    1. The most popular salad was the caesar salad.

    2. The least popular salad was the taco salad.

    3. Caesar salad: $35\% = \frac{35}{100} = 0.35$, so $(300)(0.35) = 105$ people

    Taco salad: $10\% = \frac{10}{100} = 0.10$, so $(300)(0.10) = 30$ people

    Spinach salad: $17\% = \frac{17}{100} = 0.17$, so $(300)(0.17) = 51$ people

    Garden salad: $13\% = \frac{13}{100} = 0.13$, so $(300)(0.13) = 39$ people

    Chef salad: $25\% = \frac{25}{100} = 0.25$, so $(300)(0.25) = 75$ people

    4. The difference between the number of people who chose the spinach salad and the number of people who chose the garden salad is $51 - 39 = 12$ people.
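The conversion from pie chart percentages to head counts in this solution can be sketched as follows. The function name `counts_from_percentages` is an assumption made for illustration; the percentages and the survey size of 300 come from the example. Integer arithmetic is used so that the results are exact whole numbers.

```python
def counts_from_percentages(percentages, total):
    """Convert whole-number percentages into counts out of `total`.

    Uses integer division, so it is exact only when total * percent
    is a multiple of 100, as it is for every salad in this example.
    """
    return {name: total * pct // 100 for name, pct in percentages.items()}


survey = {"Caesar": 35, "Taco": 10, "Spinach": 17, "Garden": 13, "Chef": 25}
counts = counts_from_percentages(survey, total=300)

print(counts)
print("Spinach minus Garden:", counts["Spinach"] - counts["Garden"])
```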

    If we revisit the puppet who was introduced at the beginning of the lesson, you should now be able to create a story that details what she is doing. An example would be that she is in charge of the student body and is presenting to the students the results of a questionnaire regarding student activities for the first semester.


    28, 34, 23, 35, 16, 17, 47, 5, 60, 26, 39, 35, 47, 35, 38, 35, 55, 47, 54, 48

    Solution:

    Step 1: Create the stem-and-leaf plot.

    Some people prefer to arrange the data in order before the stems and leaves are created. This will ensure that the values of the leaves are in order. However, this is not necessary and can take a great deal of time if the data set is large. We will first create the stem-and-leaf plot, and then we will organize the values of the leaves.

    The leading digit of a data value is used as the stem, and the trailing digit is used as the leaf. The numbers in the stem column should be consecutive numbers that begin with the smallest class and continue to the largest class. If there are no values in a class, do not enter a value in the leaf; just leave it blank.

    Step 2:


    The number of values in the leaf column should equal the number of data values that were given in the table. The value that appears the most often in the same leaf row is the trailing digit of the mode of the data set. The mode of this data set is 35. For 7 of the 20 days, the number of animals receiving treatment was between 34 and 39. The veterinarian school treated a minimum of 5 animals and a maximum of 60 animals on any one day. The median of the data can be quickly calculated by using the values in the leaf column to locate the value in the middle position. In this stem-and-leaf plot, the median is the mean of the numbers represented by the 10th and 11th leaves: $\frac{35 + 35}{2} = \frac{70}{2} = 35$.
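The construction described in this example can be sketched in code. This is a minimal illustration for two-digit data only: the function name `stem_and_leaf` is an assumption, and the 20 values are the ones listed for this example, read as two-digit numbers.

```python
from collections import defaultdict
from statistics import median, mode


def stem_and_leaf(values):
    """Group values by tens digit (stem) with sorted units digits (leaves)."""
    leaves = defaultdict(list)
    for v in sorted(values):
        leaves[v // 10].append(v % 10)
    # Keep every stem from smallest to largest, even one with no leaves.
    return {s: leaves.get(s, []) for s in range(min(leaves), max(leaves) + 1)}


animals = [28, 34, 23, 35, 16, 17, 47, 5, 60, 26,
           39, 35, 47, 35, 38, 35, 55, 47, 54, 48]

plot = stem_and_leaf(animals)
for stem, leaf_row in plot.items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaf_row))

print("mode:", mode(animals), "median:", median(animals))
```

Sorting the values before splitting them is a design choice that produces ordered leaves in a single step; the text's two-step approach (build first, then order the leaves) gives the same plot.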

    Example 13

    The following numbers represent the growth (in centimeters) of some plants after 25 days.

    Construct a stem-and-leaf plot to represent the data, and list 3 facts that you know about the growth of the plants.

    18, 10, 37, 36, 61, 39, 41, 49, 50, 52, 57, 53, 51, 57, 39, 48, 56, 33, 36, 19, 30, 41, 51, 38, 60

    Solution:

    Answers will vary, but the following are some possible responses:

    • From the stem-and-leaf plot, the growth of the plants ranged from a minimum of 10 cm to a maximum of 61 cm.

    • The median of the data set is the value in the 13th position, which is 41 cm.

    • There was no growth recorded in the class of 20 cm, so there is no number in the leaf row.

    • The data set is multimodal.

    Bar Graphs

    The different types of graphs that you have seen so far are plots to use with quantitative variables. A qualitative variable can be plotted using a bar graph. A bar graph is a plot made of bars whose heights (vertical bars) or lengths (horizontal bars) represent the frequencies of each category. There is 1 bar for each category, with space between each bar, and the data that is plotted is discrete data. Each category is represented by intervals of the same width. When constructing a bar graph, the category is usually placed on the horizontal axis, and the frequency is usually placed on the vertical axis. These values can be reversed if the bar graph has horizontal bars.

    Example 14

    Construct a bar graph to represent the depth of the Great Lakes:

    Lake Superior – 1,333 ft.

    Lake Michigan – 923 ft.

    Lake Huron – 750 ft.

    Lake


    1. What type of show is watched the most?

    2. What type of show is watched the least?

    3. Approximately how many students participated in the survey?

    4. Does the graph show the differences between the preferences of males and females?

    Solution:

    1. Sit-coms are watched the most.

    2. Quiz shows are watched the least.

    3. Approximately $45 + 20 + 18 + 6 + 35 + 16 = 140$ students participated in the survey.

    4. No, the graph does not show the differences between the preferences of males and females.

    If bar graphs are constructed on grid paper, it is very easy to keep the intervals the same size and to keep the bars evenly spaced. In addition to helping in the appearance of the graph, grid paper also enables you to more accurately determine the frequency of each class.

    Example 16 

    The following bar graph represents the part-time jobs held by a group of students:

    Using the above bar graph, answer the following questions:

    1. What was the most popular part-time job?

    2. What was the part-time job held by the fewest students?

    3. Which part-time jobs employed 10 or more of the students?

    4. Is it possible to create a table of values for the bar graph? If so, construct the table of values.

    5. What percentage of the students worked as a delivery person?

    Solution:

    1. The most popular part-time job was in the fast food industry.

    2. The part-time job of tutoring was the one held by the fewest students.

    3. The part-time jobs that employed 10 or more students were in the fast food, delivery, lawn maintenance, and grocery store businesses.

    4. Yes, it's possible to create a table of values for the bar graph.

    Part-Time Job         Baby Sitting   Fast Food   Delivery   Lawn Care   Grocery Store   Tutoring
    Number of Students    8              14          12         13          10              5

    5. The percentage of the students who worked as a delivery person was approximately 19.4%.
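The percentage in answer 5 follows from the table of values in answer 4. The sketch below assumes the student counts as read from that table (8, 14, 12, 13, 10, and 5, for a total of 62); the function name `percent_of_total` is illustrative.

```python
# Counts assumed from the table of values for the bar graph.
jobs = {"Baby Sitting": 8, "Fast Food": 14, "Delivery": 12,
        "Lawn Care": 13, "Grocery Store": 10, "Tutoring": 5}


def percent_of_total(table, category):
    """Return the category's frequency as a percentage of all frequencies."""
    return 100 * table[category] / sum(table.values())


delivery_share = percent_of_total(jobs, "Delivery")
print(f"{delivery_share:.1f}%")
```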

    Histograms

    An extension of the bar graph is the histogram. A histogram is a type of vertical bar graph in which the bars represent grouped continuous data. While there are similarities between a bar graph and a histogram, such as each bar being the same width, a histogram has no spaces between the bars. The quantitative data is grouped according to a determined bin size, or interval. The bin size refers to the width of each bar, and the data is placed in the appropriate bin.

    The bins, or groups of data, are plotted on the x-axis, and the frequencies of the bins are plotted on the y-axis. A grouped frequency distribution is constructed for the numerical data, and this table is used to create the histogram. In most cases, the grouped frequency distribution is designed so there are no breaks in the intervals. The last value of one bin is actually the first value counted in the next bin. This means that if you had groups of data with a bin size of 10, the bins would be represented by the notation [0-10), [10-20), [20-30), etc. Each bin appears to contain 11 values, which is 1 more than the desired bin size of 10. Therefore, the last digit of each bin is counted as the first digit of the following bin.


    The first bin includes the values 0 through 9, and the next bin includes the values 10 through 19. This makes the bins the proper size. Bin sizes are written in this manner to simplify the process of grouping the data. The first bin can begin with the smallest number of the data set and end with the value determined by adding the bin width to this value, or the bin can begin with a reasonable value that is smaller than the smallest data value.
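The half-open bin notation can be made concrete with a small helper that assigns a value to its bin. This is a sketch under the assumption that bins start at 0 unless stated otherwise; the name `bin_label` and its parameters are illustrative.

```python
def bin_label(value, width, start=0):
    """Return the half-open bin [low-high) that contains `value`.

    The lower boundary belongs to the bin; the upper boundary belongs
    to the next bin.
    """
    low = start + ((value - start) // width) * width
    return f"[{low}-{low + width})"


for age in (9, 10, 19, 20):
    print(age, "->", bin_label(age, width=10))
```

A boundary value such as 10 or 20 is counted once, in the bin that it begins, which is why no value is ever placed in 2 bins.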

    Example 17 

    Construct a frequency distribution table with a bin size of 10 for the following data, which represents the ages of 30 lottery winners:

    38, 41, 29, 33, 40, 74, 66, 45, 60, 55, 25, 52, 54, 61, 46, 51, 59, 57, 66, 62, 32, 47, 65, 50, 39, 22, 35, 72, 77, 49

    Solution:

    Step 1: Determine the range of the data by subtracting the smallest value from the largest value.

    Range: $77 - 22 = 55$

    Step 2: Divide the range by the bin size to ensure that you have at least 5 groups of data. A histogram should have from 5 to 10 bins to make it meaningful: $\frac{55}{10} = 5.5 \approx 6$. Since you cannot have 0.5 of a bin, the result indicates that you will have at least 6 bins.

    Step 3: Construct the table.

    Bin        Frequency
    [20-30)    3
    [30-40)    5
    [40-50)    6
    [50-60)    8
    [60-70)    5
    [70-80)    3

    Step 4: Determine the sum of the frequency column to ensure that all the data has been grouped.

    $3 + 5 + 6 + 8 + 5 + 3 = 30$

    When data is grouped in a frequency distribution table, the actual data values are lost. The table indicates how many values are in each group, but it doesn't show the actual values.


    There are many different ways to create a distribution table and many different distribution tables that can be created. However, for the purpose of constructing a histogram, the method shown works very well, and it is not difficult to complete. When the number of data values is very large, another column is often inserted in the distribution table. This column is a tally column, and it is used to account for the number of values within a bin. A tally column facilitates the creation of the distribution table and usually allows the task to be completed more quickly.

    Example 18 

    The numbers of years of service for 75 teachers in a small town are listed below:

    1, 6, 11, 26, 21, 18, 2, 5, 27, 33, 7, 15, 22, 30, 8
    31, 5, 25, 20, 19, 4, 9, 19, 34, 3, 16, 23, 31, 10, 4
    2, 31, 26, 19, 3, 12, 14, 28, 32, 1, 17, 24, 34, 16, 1
    18, 29, 10, 12, 30, 13, 7, 8, 27, 3, 11, 26, 33, 29, 20
    7, 21, 11, 19, 35, 16, 5, 2, 19, 24, 13, 14, 28, 10, 31

    Using the above data, construct a frequency distribution table with a bin size of 5.

    Solution:

    Range: $35 - 1 = 34$

    $\frac{34}{5} = 6.8 \approx 7$

    You will have 7 bins.

    For each value that is in a bin, draw a stroke in the Tally column. To make counting the strokes easier, draw 4 strokes and cross them out with the fifth stroke. This process bundles the strokes in groups of 5, and the frequency can be readily determined.

    Bin        Tally             Frequency
    [0-5)      |||| |||| |       11
    [5-10)     |||| ||||         9
    [10-15)    |||| |||| ||      12
    [15-20)    |||| |||| ||||    14
    [20-25)    |||| ||           7
    [25-30)    |||| ||||         10
    [30-35)    |||| |||| ||      12

    $11 + 9 + 12 + 14 + 7 + 10 + 12 = 75$


    Now that you have constructed the frequency table, the grouped data can be used to draw a histogram. Like a bar graph, a histogram requires a title and properly labeled x- and y-axes.

    Example 19

    Use the data from Example 17 that displays the ages of the lottery winners to construct a histogram. The data is shown again below:

    Bin        Frequency
    [20-30)    3
    [30-40)    5
    [40-50)    6
    [50-60)    8
    [60-70)    5
    [70-80)    3

    Solution:

    Use the data as it is represented in the distribution table to construct the histogram.

    From looking at the tops of the bars, you can see how many winners were in each category, and by adding these numbers, you can determine the total number of winners. You can also determine how many winners were within a specific category. For example, you can see that 8 winners were 60 years of age or older. The graph can also be used to determine percentages. For example, it can answer the question, "What percentage of the winners were 50 years of age or older?" as follows:

    $\frac{16}{30} = 0.5\overline{3}$, and $(0.533)(100\%) \approx 53.3\%$.
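Questions of the form "how many values are at or above a given age?" can be answered directly from the frequency table. The sketch below stores the table from this example as a mapping from each bin's lower bound to its frequency; the function name `share_at_or_above` is an assumption for illustration.

```python
# Frequency table from the example: bin lower bound -> frequency.
winners = {20: 3, 30: 5, 40: 6, 50: 8, 60: 5, 70: 3}


def share_at_or_above(table, threshold):
    """Return (count, percent) for bins that start at or above `threshold`."""
    total = sum(table.values())
    count = sum(freq for low, freq in table.items() if low >= threshold)
    return count, 100 * count / total


count_60, _ = share_at_or_above(winners, 60)
count_50, percent_50 = share_at_or_above(winners, 50)

print(count_60)                         # winners aged 60 or older
print(count_50, f"{percent_50:.1f}%")   # winners aged 50 or older
```

This only works when the threshold coincides with a bin boundary, because the table no longer contains the individual values.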


    Example 20 

    a) Use the data and the distribution table that represent the years of service of the teachers from Example 18 to construct a histogram to display the data. The distribution table is shown again below:

    Bin        Tally             Frequency
    [0-5)      |||| |||| |       11
    [5-10)     |||| ||||         9
    [10-15)    |||| |||| ||      12
    [15-20)    |||| |||| ||||    14
    [20-25)    |||| ||           7
    [25-30)    |||| ||||         10
    [30-35)    |||| |||| ||      12

    b) Now use the histogram to answer the following questions.

    i) How many teachers teach in this small town?

    ii) How many teachers have worked for less than 5 years?

    iii) If teachers are able to retire when they have taught for 30 years or more, how many are eligible to retire?

    iv) What percentage of the teachers still have to teach for 10 years or fewer before they are eligible to retire?

    v) Do you think that the majority of the teachers are young or old? Justify your answer.

    Solution:


    a)

    b) i) $11 + 9 + 12 + 14 + 7 + 10 + 12 = 75$

    In this small town, 75 teachers are teaching.

    ii) 11 teachers have taught for less than 5 years.

    iii) 12 teachers are eligible to retire.

    iv) $\frac{17}{75} = 0.22\overline{6}$, and $(0.2266)(100\%) \approx 23\%$

    Approximately 23% of the teachers must teach for 10 years or fewer before they are eligible to retire.

    v) Answers will vary, but one possible answer is that the majority of the teachers are young, because 46 have taught for less than 20 years.

    Technology can also be used to plot a histogram. The TI-83 can be used to create a histogram by using STAT and STAT P


    Using the TRACE feature will give you information about the data in each bar of the histogram.

    The TRACE feature tells you that in the first bin, which is [6-7), there are 4 values.

    The TRACE feature tells you that in the second bin, which is [7-8), there are 6 values.

    To advance to the next bin, or bar, of the histogram, use the cursor and move to the right. The information obtained by using the TRACE feature will enable you to create a frequency table and to draw the histogram on paper.

    The shape of a histogram can tell you a lot about the distribution of the data, as well as provide you with information about the mean, median, and mode of the data set. The following are some typical histograms, with a caption below each one explaining the distribution of the data, as well as the characteristics of the mean, median, and mode. Distributions can have other shapes besides the ones shown below, but these represent the most common ones that you will see when analyzing data. In each of the graphs below, the distributions are not perfectly shaped, but are shaped enough to identify an overall pattern.


    a)

    Figure a represents a bell-shaped distribution, which has a single peak and tapers off to both the left and to the right of the peak. The shape appears to be symmetric about the center of the histogram. The single peak indicates that the distribution is unimodal. The highest peak of the histogram represents the location of the mode of the data set. The mode is the data value that occurs the most often in a data set. For a symmetric, single-peaked histogram, the values of the mean, median, and mode are approximately the same and are all located at the center of the distribution.

    b)

    Figure b represents a distribution that is approximately uniform and forms a rectangular, flat shape. The frequency of each class is approximately the same.

    c)

    Figure c represents a right-skewed distribution, which has a peak to the left of the distribution and data values that taper off to the right. This distribution has a single peak and is also unimodal. For a histogram that is skewed to the right, the mean is located to the right on the distribution and is the largest value of the measures of central tendency. The mean has the largest value because it is strongly affected by the outliers on the right tail that pull the mean to the right. The mode is the smallest value, and it is located to the left on the distribution. The mode always occurs at the highest point of the peak. The median is located between the mode and the mean.

    d)

    Figure d represents a left-skewed distribution, which has a peak to the right of the distribution and data values that taper off to the left. This distribution has a single peak and is also unimodal. For a histogram that is skewed to the left, the mean is located to the left on the distribution and is the smallest value of the measures of central tendency. The mean has the smallest value because it is strongly affected by the outliers on the left tail that pull the mean to the left. The median is located between the mode and the mean.

    e)

    Figure e has no shape that can be defined. The only defining characteristic about this distribution is that it has 2 peaks of the same height. This means that the distribution is bimodal.
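The ordering of the mode, median, and mean described for the skewed distributions can be checked on a small data set. The values below are hypothetical, chosen only so that most of them are small and a few are large; they do not come from the figures.

```python
from statistics import mean, median, mode

# Hypothetical right-skewed data: most values are small, a few are large.
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]
# Mirror the data around 15 to obtain a left-skewed set.
left_skewed = [15 - v for v in right_skewed]

for name, data in (("right-skewed", right_skewed), ("left-skewed", left_skewed)):
    print(name, "mode:", mode(data), "median:", median(data), "mean:", mean(data))
```

For the right-skewed set the mode is smallest and the mean is largest; mirroring the data reverses the order. This is a typical pattern rather than a guarantee, since unusual data sets can break the ordering.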

    Another type of graph that can be drawn to represent the same set of data as a histogram represents is a frequency polygon. A frequency polygon is a graph constructed by using lines to join points plotted at the midpoint of each interval, or bin. The heights of the points represent the frequencies. A frequency polygon can be created from the histogram or by calculating the midpoints of the bins from the frequency distribution table. The midpoint of a bin is calculated by adding the upper and lower boundary values of the bin and dividing the sum by 2.

    Example 22 


    The following histogram represents the marks made by 40 students on a math test.

    Use the histogram to construct a frequency polygon to represent the data.

    Solution:

    There is no data value greater than 0 and less than 20. The jagged line that is inserted on the x-axis is used to represent this fact. The area under the frequency polygon is the same as the area under the histogram and is, therefore, equal to the frequency values that would be displayed in a distribution table. The frequency polygon also shows the shape of the distribution of the data, and in this case, it resembles a bell curve.

    Example 23

    The following distribution table represents the number of miles run by 20 randomly selected runners during a recent road race:

    Bin            Frequency
    [5.5-10.5)     1
    [10.5-15.5)    3
    [15.5-20.5)    2
    [20.5-25.5)    4
    [25.5-30.5)    5
    [30.5-35.5)    3
    [35.5-40.5)    2

    Using this table, construct a frequency polygon.

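    Solution:

    Step 1: Calculate the midpoint of each bin by adding the 2 numbers of the interval and dividing the sum by 2.

    Midpoints:

    $\frac{5.5 + 10.5}{2} = \frac{16}{2} = 8$
    $\frac{10.5 + 15.5}{2} = \frac{26}{2} = 13$
    $\frac{15.5 + 20.5}{2} = \frac{36}{2} = 18$
    $\frac{20.5 + 25.5}{2} = \frac{46}{2} = 23$
    $\frac{25.5 + 30.5}{2} = \frac{56}{2} = 28$
    $\frac{30.5 + 35.5}{2} = \frac{66}{2} = 33$
    $\frac{35.5 + 40.5}{2} = \frac{76}{2} = 38$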

    Step 2: Plot the midpoints on a grid, making sure to number the x-axis with a scale that will include the bin sizes. Join the plotted midpoints with lines.

    A frequency polygon usually extends 1 unit below the smallest bin value and 1 unit beyond the greatest bin value. This extension gives the frequency polygon an appearance of having a starting point and an ending point, which provides a view of the distribution of data. If the data set were very large so that the number of bins had to be increased and the bin size decreased, the frequency polygon would appear as a smooth curve.
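The points of the frequency polygon in this example can be computed from the distribution table. In the sketch below, the bins come from the table, and the frequencies are assumed to be 1, 3, 2, 4, 5, 3, and 2, which sum to the 20 runners; the function name `midpoints` is illustrative.

```python
# Bin boundaries from the distribution table.
bins = [(5.5, 10.5), (10.5, 15.5), (15.5, 20.5), (20.5, 25.5),
        (25.5, 30.5), (30.5, 35.5), (35.5, 40.5)]
# Assumed frequencies, one per bin, totaling the 20 runners.
frequencies = [1, 3, 2, 4, 5, 3, 2]


def midpoints(intervals):
    """Return the midpoint of each (lower, upper) interval."""
    return [(low + high) / 2 for low, high in intervals]


# Each polygon point pairs a bin midpoint (x) with the bin frequency (y).
polygon_points = list(zip(midpoints(bins), frequencies))
print(polygon_points)
```

Joining these points in order with line segments gives the frequency polygon; a plotting library could draw it, but the coordinates are all that the construction requires.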

    Lesson Summary

    In this lesson, you learned how to represent data that was presented in various forms. Data that could be represented as percentages was displayed in a pie chart, or circle graph. Discrete data that was qualitative was displayed on a bar graph. Finally, continuous data that was grouped was graphed on a histogram or on a frequency polygon. You also learned to detect characteristics of a distribution by simply observing the shape of a histogram.


    color of cars, a person's status, and favorite vacation spots. The following flow chart should help you to better understand the above terms.

    Example 1

    Select the best descriptions for the following variables and indicate your selections by marking an 'x' in the appropriate boxes.

    Variable                           Quantitative   Qualitative   Discrete   Continuous
    Number of members in a family
    A person's marital status
    Length of a person's arm
    Color of cars
    Number of errors on a math test

    Solution:

    Variable                           Quantitative   Qualitative   Discrete   Continuous
    Number of members in a family      x                            x
    A person's marital status                         x
    Length of a person's arm           x                                       x
    Color of cars                                     x
    Number of errors on a math test    x                            x

    Variables can also be classified as dependent or independent. When there is a linear relationship between 2 variables, the values of one variable depend upon the values of the other variable. In a linear relation, the values of y depend upon the values of x. Therefore, the dependent variable is represented by the values that are plotted on the y-axis, and the independent variable is represented by the values that are plotted on the x-axis.

    Example 2 


    Sally works at the local ballpark stadium selling lemonade. She is paid $15.00 each time
    she works, plus $0.75 for each glass of lemonade she sells. Create a table of values to
    represent Sally’s earnings if she sells 8 glasses of lemonade. Use this table of values to
    represent her earnings on a graph.

    Solution:

    The first step is to write an equation to represent her earnings and then to use this
    equation to create a table of values.

    y = 0.75x + 15, where y represents her earnings and x represents the number of
    glasses of lemonade she sells.

    Number of Glasses of Lemonade    Earnings
    0                                $15.00
    1                                $15.75
    2                                $16.50
    3                                $17.25
    4                                $18.00
    5                                $18.75
    6                                $19.50
    7                                $20.25
    8                                $21.00

    The dependent variable is the money earned, and the independent variable is the
    number of glasses of lemonade sold. Therefore, money is on the y-axis, and the
    number of glasses of lemonade is on the x-axis.

    From the table of values, Sally will earn $21.00 if she sells 8 glasses of lemonade.
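    The table of values for Example 2 can also be generated programmatically. The following
    is a minimal sketch that evaluates y = 0.75x + 15 for 0 through 8 glasses; the names
    earnings and table are illustrative and do not come from the lesson.

```python
def earnings(glasses: int) -> float:
    """Sally's pay in dollars: $15.00 per shift plus $0.75 per glass sold."""
    return 0.75 * glasses + 15


# Table of values for 0 through 8 glasses (x is the independent variable).
table = [(x, earnings(x)) for x in range(9)]

for glasses, pay in table:
    print(f"{glasses:>2}  ${pay:.2f}")
```

    Only whole-number inputs are evaluated here, which mirrors the fact that glasses of
    lemonade are counted in whole units.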


    Now that the points have been plotted, the decision has to be made as to whether or not
    to join them. Between every 2 points plotted on the graph are an infinite number of
    values. If these values are meaningful to the problem, then the plotted points can be
    joined. This type of data is called continuous data. If the values between the 2 plotted
    points are not meaningful to the problem, then the points should not be joined. This type
    of data is called discrete data. Since glasses of lemonade are represented by whole
    numbers, and since fractions or decimals are not appropriate values, the points
    between 2 consecutive values are not meaningful in this problem. Therefore, the points
    should not be joined. The data is discrete.

    Now it is time to revisit the problem presented in the introduction.

    The local arena is trying to attract as many participants as possible to attend the
    community’s “Skate for Scoliosis” event. Participants pay a fee of $10.00 for registering,
    and, in addition, the arena will donate $3.00 for each hour a participant skates, up to a
    maximum of 6 hours. Create a table of values and draw a graph to represent a
    participant who skates for the entire 6 hours. How much money can a participant raise
    for the community if he/she skates for the maximum length of time?

    Solution:

    The equation for this scenario is y = 3x + 10, where y represents the money made by
    the participant, and x represents the number of hours the participant skates.

    Number of Hours Skating    Money Earned
    0                          $10.00
    1                          $13.00
    2                          $16.00
    3                          $19.00
    4                          $22.00
    5                          $25.00
    6                          $28.00

    The dependent variable is the money made, and the independent variable is the
    number of hours the participant skated. Therefore, money is on the y-axis, and time is
    on the x-axis as shown below:

    A participant who skates for the entire 6 hours can make $28.00 for the Skate for
    Scoliosis event. The points are joined, because the fractions and decimals between 2
    consecutive points are meaningful for this problem. A participant could skate for 30
    minutes, and the arena would pay that skater $1.50 for the time skating. The data is
    continuous.

    Linear graphs are important in statistics when several data sets are used to represent
    information about a single topic. An example would be data sets that represent different
    plans available for cell phone users. These data sets can be plotted on the same grid.
    The resulting graph will show intersection points for the plans. These intersection points
    indicate a coordinate where 2 plans are equal. An observer can easily interpret the
    graph to decide which plan is best, and when. If the observer is trying to choose a plan
    to use, the choice can be made easier by seeing a graphical representation of the data.

    Example 3

    The following graph represents 3 plans that are available to customers interested in
    hiring a maintenance company to tend to their lawn. Using the graph, explain when it
    would be best to use each plan for lawn maintenance.


    Solution:

    From the graph, the base fee that is charged for each plan is obvious. These values are
    found on the y-axis. Plan A charges a base fee of $200.00, Plan C charges a base fee
    of $100.00, and Plan B charges a base fee of $50.00. The cost per hour can be
    calculated by using the values of the intersection points and the base fee in the
    equation y = mx + b and solving for m. Plan B is the best plan to choose if the lawn
    maintenance takes less than 12.5 hours. At 12.5 hours, Plan B and Plan C both cost
    $150.00 for lawn maintenance. After 12.5 hours, Plan C is the best deal, until 50 hours
    of lawn maintenance is needed. At 50 hours, Plan A and Plan C both cost $300.00 for
    lawn maintenance. For more than 50 hours of lawn maintenance, Plan A is the best
    plan. All of the above information was obvious from the graph and would enhance the
    decision-making process for any interested client.
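    The step of solving y = mx + b for m can be sketched in a few lines. The numbers used
    below are assumptions made for illustration (a $50.00 base fee and a $100.00 base fee
    for two plans that both cost $150.00 at 12.5 hours), since the graph itself is not
    reproduced here; the function names hourly_rate and break_even are hypothetical.

```python
def hourly_rate(base_fee: float, hours: float, cost: float) -> float:
    """Solve cost = m * hours + base_fee for the slope m (cost per hour)."""
    return (cost - base_fee) / hours


def break_even(b1: float, m1: float, b2: float, m2: float) -> float:
    """Hours at which two plans cost the same; the lines must not be parallel."""
    return (b2 - b1) / (m1 - m2)


# Assumed readings, not taken from the original graph:
# both plans pass through the point (12.5 hours, $150.00).
rate_low_base = hourly_rate(50.0, 12.5, 150.0)    # plan with the $50.00 base fee
rate_high_base = hourly_rate(100.0, 12.5, 150.0)  # plan with the $100.00 base fee

# Recover the intersection from the two equations as a consistency check.
hours_equal = break_even(50.0, rate_low_base, 100.0, rate_high_base)
print(rate_low_base, rate_high_base, hours_equal)
```

    The plan with the lower base fee has the steeper slope, which is why it stops being
    the cheaper option once the lines cross.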

    The above graphs represent linear functions, and are called linear (line) graphs. Each of
    these graphs has a defined slope that remains constant when the line is plotted. A
    variation of this graph is a broken-line graph. This type of line graph is used when it is
    necessary to show change over time. A line is used to join the values, but the line has
    no defined slope. However, the points are meaningful, and they all represent an
    important part of the graph. Usually a broken-line graph is given to you, and you must
    interpret the given information from the graph.

    Example 4

    The following graph is an example of a broken-line graph, and it represents the time of a
    round-trip journey, driving from home to a popular campground and back.


    a) How far is it from home to the picnic park?
    b) How far is it from the picnic park to the campground?
    c) At what 2 places did the car stop?
    d) How long was the car stopped at the campground?
    e) When does the car arrive at the picnic park?
    f) How long did it take for the return trip?
    g) What was the speed of the car from home to the picnic park?
    h) What was the speed of the car from the campground to home?

    Solution:

    a) It is 40 miles from home to the picnic park.
    b) It is 60 miles from the picnic park to the campground.
    c) The car stopped at the picnic park and at the campground.
    d) The car was stopped at the campground for 15 minutes.

    e) The car arrived at the picnic park at 11:00 am.

    f) The return trip took 1 hour.
    g) The speed of the car from home to the picnic park was 40 mi/h.
    h) The speed of the car from the campground to home was 100 mi/h.

    Example 5

    Sam decides to spend some time with his friend Aaron. He hops on his bike and starts
    off to Aaron’s house, but on his way, he gets a flat tire and must walk the remaining
    distance.


    and then Sam returns home.

    a) How far is it from Sam’s house to Aaron’s house?
    b) How far is it from Aaron’s house to the mall?
    c) At what time did Sam have a flat tire?
    d) How long did Sam stay at Aaron’s house?
    e) At what speed did Sam travel from Aaron’s house to the mall and then from the mall
    to home?

    Solution:

    a) It is 25 km from Sam’s house to Aaron’s house.
    b) It is 15 km from Aaron’s house to the mall.

    c) Sam had a flat tire at 10:00 am.

    d) Sam stayed at Aaron’s house for 1 hour.
    e) Sam traveled at a speed of 30 km/h from Aaron’s house to the mall and then at a
    speed of 40 km/h from the mall to home.


    The connection is obvious: when the price of peaches was high, the sales were low,
    but when the price was low, the sales were high.

    The following scatter plot shows the sales of a weekly newspaper and the temperature:

    There is no connection between the number of newspapers sold and the temperature.

    Another term used to describe 2 sets of data that have a connection or a relationship
    is correlation. The correlation between 2 sets of data can be positive or negative, and it
    can be strong or weak. The following scatter plots will help to enhance this concept.

    If you look at the 2 sketches that represent a positive correlation, you will notice that the
    points are around a line that slopes upward to the right. When the correlation is
    negative, the line slopes downward to the right. The 2 sketches that show a strong
    correlation have points that are bunched together and appear to be close to a line that is
    in the middle of the points. When the correlation is weak, the points are more scattered
    and not as concentrated.
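    The lesson describes correlation visually. One common way to put a number on the same
    idea is Pearson’s correlation coefficient r, which is not introduced in this lesson and is
    used here only as an illustration. In the sketch below, the data points are hypothetical,
    and the cutoffs (0.7 for “strong”, 0.1 for “no correlation”) are assumptions rather than
    fixed rules.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def describe(r, strong=0.7, none=0.1):
    """Turn r into the wording used in the lesson (cutoffs are assumptions)."""
    if abs(r) < none:
        return "no correlation"
    strength = "strong" if abs(r) >= strong else "weak"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction}"


# Hypothetical points, chosen only to illustrate each case.
xs = [1, 2, 3, 4, 5, 6]
bunched_up = [2, 4, 5, 8, 10, 12]    # close to a line sloping upward
bunched_down = [12, 10, 8, 5, 4, 2]  # close to a line sloping downward
scattered = [3, 1, 4, 2, 5, 3]       # loosely rising, widely scattered

for ys in (bunched_up, bunched_down, scattered):
    print(describe(pearson_r(xs, ys)))
```

    The sign of r matches the direction in which the line slopes, and its size reflects how
    tightly the points bunch around that line.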

    In the sales of newspapers and the temperature, there was no connection between the
    2 data sets. The following sketches represent some other possible outcomes when
    there is no correlation between data sets:


    Example 6 

    Plot the following points on a scatter plot, with m as the independent variable and n as
    the dependent variable. Number both axes from 0 to 20. If a correlation exists between
    the values of m and n, describe the correlation (strong negative, weak positive, etc.).

    m    4    9    13    16    17    6     7     18    10
    n    5    3    11    18    6     11    18    12    16

    Solution:

    Example 7 

    Describe the correlation, if any, in the following scatter plot:

    Solution:

    In the above scatter plot, there is a strong positive correlation.

    You now know that a scatter plot can have either a positive or a negative correlation.
    When a correlation exists on a scatter plot, a line of best fit can be drawn on the graph.
    The line of best fit must be drawn so that the sums of the distances to the points on
    either side of the line are approximately equal and such that there are an equal number
    of points above and below the line. Using a clear plastic ruler makes it easier to meet all
    of these conditions when drawing the line. Another useful tool is a stick of spaghetti,
    since it can be easily rolled and moved on the graph until you are satisfied with its
    location. The edge of the spaghetti can be traced to produce the line of best fit. A line of
    best fit can be used to make estimations from the graph, but you must remember that
    the line of best fit is simply a sketch of where the line should appear on the graph. As a
    result, any values that you choose from this line are not very accurate; the values are
    more of a ballpark figure.

    Example 8 

    The following table consists of the marks achieved by 9 students on chemistry and math
    tests:

    Student            A     B     C     D     E     F     G     H     I
    Chemistry Marks    49    46    35    58    51    56    54    46    53
    Math Marks         29    23    10    41    38    36    31    24    ?

    Plot the above marks on a scatter plot, with the chemistry marks on the x-axis and the
    math marks on the y-axis. Draw a line of best fit, and use this line to estimate the mark
    that Student I would have made in math had he or she taken the test.

    Solution:

    If Student I had taken the math test, his or her mark would have been between 32 and
    37.

    Scatter plots and lines of best fit can also be drawn by using technology. The TI-83 is
    capable of both graphing a scatter plot and inserting the line of best fit onto the
    scatter plot.

    Example 9

    Using the data from Example 8, create a scatter plot and draw a line of best fit with the
    TI-83.

    Student            A     B     C     D     E     F     G     H     I
    Chemistry Marks    49    46    35    58    51    56    54    46    53
    Math Marks         29    23    10    41    38    36    31    24    ?

    Solution:

    The calculator can now be used to determine a linear regression equation for the given
    values. The equation can be entered into the calculator, and the line will be plotted on
    the scatter plot.

    From the line of best fit, the calculated value for Student I’s math test mark was 33.6.
    Remember that the mark that you estimated was between 32 and 37.
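    The same linear regression can be sketched without a calculator. The code below fits a
    least-squares line to the 8 complete pairs of marks and evaluates it at a chemistry mark
    of 53. It assumes the marks are chemistry 49, 46, 35, 58, 51, 56, 54, 46 and math 29, 23,
    10, 41, 38, 36, 31, 24, as read from the Example 8 table; the function name
    least_squares is illustrative.

```python
def least_squares(xs, ys):
    """Slope and intercept of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Marks for Students A through H (assumed readings of the Example 8 table).
chemistry = [49, 46, 35, 58, 51, 56, 54, 46]
math_marks = [29, 23, 10, 41, 38, 36, 31, 24]

slope, intercept = least_squares(chemistry, math_marks)

# Estimate for a chemistry mark of 53 using the unrounded line.
estimate = slope * 53 + intercept

# The same estimate using coefficients rounded to one decimal place.
rounded_estimate = round(slope, 1) * 53 + round(intercept, 1)

print(f"y = {slope:.2f}x + {intercept:.2f}")
print(f"estimate: {estimate:.1f}, with rounded coefficients: {rounded_estimate:.1f}")
```

    Under these assumptions the unrounded line gives an estimate of about 33.7, and rounding
    the slope and intercept to one decimal place first gives 33.6. Both values fall inside
    the range estimated by eye.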

    Lesson Summary

    In this lesson, you learned how to represent data by graphing a straight line of the
    form y = mx + b, and also by using a scatter plot and a line of best fit. Interpreting a
    broken-line graph was also presented in this lesson. You learned about correlation as it
    applies to a scatter plot and how to describe the correlation of a scatter plot. You also
    learned how to draw a line of best fit on a scatter plot and to use this line to make
    estimates from the graph. The final topic that was demonstrated in the lesson was how
    to use the TI-83 calculator to produce a scatter plot and how to graph a line of best fit by
    using linear regression.

    Points to Consider 

    • Can any of these graphs be used for comparing data?

    • Can the equation for the line of best fit be used to calculate values?

    • Is any other graphical representation of data used for estimations?

    Learning Objectives

    • Construct and interpret a box-and-whisker plot.

    • Use technology to create box-and-whisker plots.

    Box-and-Whisker Plots

    In traditional statistics, data is organized by using a frequency distribution. The results of
    the frequency distribution can then be used to create various graphs, such as a
    histogram or a frequency polygon, which indicate the shape or nature of the distribution.
    The shape of the distribution will allow you to confirm various conjectures about the
    nature of the data.

    To examine data in order to identify patterns, trends, or relationships, exploratory data
    analysis is used. In exploratory data analysis, organized data is displayed in order to
    make decisions or suggestions regarding further actions. A box-and-whisker
    plot (often called a box plot) can be used to graphically represent the data set, and the
    graph involves plotting 5 specific values. The 5 specific values are often referred to as
    a five-number summary of the organized data set. The five-number summary consists
    of the following:

    1. The lowest number in the data set (minimum value)
    2. The lower quartile, Q1 (median of the first half of the data set)
    3. The median of the entire data set (median)
    4. The upper quartile, Q3 (median of the second half of the data set)
    5. The highest number in the data set (maximum value)

    The display of the five-number summary produces a box-and-whisker plot as shown
    below:


    The above model of a box-and-whisker plot shows 2 horizontal lines (the whiskers) that
    each contain 25% of the data and are of the same length. In addition, it shows that the
    median of the data set is in the middle of the box, which contains 50% of the data. The
    lengths of the whiskers and the location of the median with respect to the center of the
    box are used to describe the distribution of the data. It’s important to note that this is just
    an example. Not all box-and-whisker plots have the median in the middle of the box and
    whiskers of the same size.

    Information about the data set that can be determined from the box-and-whisker plot
    with respect to the location of the median includes the following:

    a) If the median is located in the center or near the center of the box, the distribution is
    approximately symmetric.
    b) If the median is located to the left of the center of the box, the distribution is positively
    skewed.
    c) If the median is located to the right of the center of the box, the distribution is
    negatively skewed.

    Information about the data set that can be determined from the box-and-whisker plot
    with respect to the length of the whiskers includes the following:

    a) If the whiskers are the same or almost the same length, the distribution is
    approximately symmetric.
    b) If the right whisker is longer than the left whisker, the distribution is positively skewed.
    c) If the left whisker is longer than the right whisker, the distribution is negatively
    skewed.

    The length of the whiskers also gives you information about how spread out the data is.
    A box-and-whisker plot is often used when the number of data values is large. The
    center of the distribution, the nature of the distribution, and the range of the data are
    very obvious from the graph. The five-number summary divides the data into quarters
    by use of the medians of the upper and lower halves of the data. Remember that, unlike
    the mean, the median of the entire data set is not affected by outliers, so it is the
    measure of central tendency that is most often used in exploratory data analysis.


    Example 24

    For the following data sets, determine the five-number summaries:

    a) 12, 16, 36, 10, 31, 23, 58
    b) 144, 240, 153, 629, 540, 300

    Solution:

    a) The first step is to organize the values in the data set as shown below:

    12, 16, 36, 10, 31, 23, 58
    10, 12, 16, 23, 31, 36, 58

    Now complete the following list:

    Minimum value → 10
    Q1 → 12
    Median → 23
    Q3 → 36
    Maximum value → 58

    b) The first step is to organize the values in the data set as shown below:

    144, 240, 153, 629, 540, 300
    144, 153, 240, 300, 540, 629

    Now complete the following list:

    Minimum value → 144
    Q1 → 153
    Median → 270
    Q3 → 540
    Maximum value → 629
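    The procedure used in Example 24 (sort the values, find the median, then find the median
    of each half) can be sketched as follows. The function names are illustrative. This
    version leaves the middle value out of both halves when the number of values is odd,
    which matches the worked solutions above; other software may use a different quartile
    convention and give slightly different values for Q1 and Q3.

```python
def median(sorted_values):
    """Median of a list that is already sorted."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2 == 1:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum); needs at least 2 values."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]    # first half, excluding the middle value if n is odd
    upper = data[-half:]   # second half, excluding the middle value if n is odd
    return (data[0], median(lower), median(data), median(upper), data[-1])


summary_a = five_number_summary([12, 16, 36, 10, 31, 23, 58])
summary_b = five_number_summary([144, 240, 153, 629, 540, 300])
print(summary_a)
print(summary_b)
```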

    Example 25 


    Use the data set for Example 24 part a) and the five-number summary to construct a
    box-and-whisker plot to model the data set.

    Solution:

    The five-number summary can now be used to construct a box-and-whisker plot. Be
    sure to provide a scale on the number line that includes the range from the minimum
    value to the maximum value.

    a) Minimum value → 10
    Q1 → 12
    Median → 23
    Q3 → 36
    Maximum value → 58

    It is clearly visible that the right whisker is much longer than the left whisker. This
    indicates that the distribution is positively skewed.

    Example 26 

    For each box-and-whisker plot, list the five-number summary and describe the
    distribution based on the location of the median.

    Solution:

    a) Minimum value → 4
    Q1 → 6
    Median → 9
    Q3 → 10
    Maximum value → 12

    The median of the data set is located to the right of the center of the box, which
    indicates that the distribution is negatively skewed.

    b) Minimum value → 225
    Q1 → 250
    Median → 300
    Q3 → 325
    Maximum value → 350

    The median of the data set is located to the right of the center of the box, which
    indicates that the distribution is negatively skewed.

    c) Minimum value → 60
    Q1 → 70
    Median → 75
    Q3 → 95
    Maximum value → 100

    The median of the data set is located to the left of the center of the box, which indicates
    that the distribution is positively skewed.
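    The rule applied in Example 26 compares the median with the center of the box, that is,
    the midpoint between Q1 and Q3. A minimal sketch of that comparison is shown below.
    The function name and the tolerance parameter (how close to the center still counts as
    “near the center”) are assumptions added for illustration.

```python
def skew_from_median(q1, median, q3, tolerance=0.0):
    """Describe a distribution from where the median sits inside the box."""
    center = (q1 + q3) / 2
    if abs(median - center) <= tolerance:
        return "approximately symmetric"
    if median < center:
        return "positively skewed"   # median is left of the center of the box
    return "negatively skewed"       # median is right of the center of the box


# Q1, median, and Q3 from parts a), b), and c) of Example 26.
print(skew_from_median(6, 9, 10))
print(skew_from_median(250, 300, 325))
print(skew_from_median(70, 75, 95))
```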

    Example 27 

    The numbers of square feet (in 100s) of 10 of the largest museums in the world are
    shown below:

    650, 547, 204, 213, 343, 288, 222, 250, 287, 269

    Construct a box-and-whisker plot for the above data set and describe the distribution.

    Solution:

    The first step is to organize the data values as follows:

    20,400   21,300   22,200   25,000   26,900   28,700   28,800   34,300   54,700   65,000

    Now calculate the median, Q1, and Q3.

    Median → (26,900 + 28,700) / 2 = 55,600 / 2 = 27,800


    Q1 = 22,200
    Q3 = 34,300

    Next, complete the following list:

    Minimum value → 20,400
    Q1 → 22,200
    Median → 27,800
    Q3 → 34,300
    Maximum value → 65,000

    The right whisker is longer than the left whisker, which indicates that the distribution is
    positively skewed.
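    The whisker comparison in Example 27 can be checked with simple arithmetic: the left
    whisker runs from the minimum to Q1, and the right whisker runs from Q3 to the maximum.
    The sketch below applies the whisker-length rule; the function name and the tolerance
    parameter are illustrative assumptions.

```python
def skew_from_whiskers(minimum, q1, q3, maximum, tolerance=0.0):
    """Describe a distribution by comparing the lengths of the two whiskers."""
    left = q1 - minimum
    right = maximum - q3
    if abs(right - left) <= tolerance:
        return "approximately symmetric"
    if right > left:
        return "positively skewed"
    return "negatively skewed"


# Five-number summary values from Example 27 (square feet).
left_whisker = 22_200 - 20_400
right_whisker = 65_000 - 34_300
print(left_whisker, right_whisker)
print(skew_from_whiskers(20_400, 22_200, 34_300, 65_000))
```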

    The TI-83 or TI-84 can also be used to create a box-and-whisker plot. In the following
    examples, the TI-83 is used. In the next chapter, key strokes using the TI-84 will be
    presented to you. The five-number summary values can be determined by using the
    TRACE feature of the calculator or by using CALC and 1-Var Stats.

    Example 28 

    The following numbers represent the number of siblings in each family for 15 randomly
    selected students:

    4, 1, 2, 2, 5, 3, 4, 2, 6, 4, 6, 1, 7, 8, 4

    Use technology to construct a box-and-whisker plot to display the data. List the
    five-number summary values.

    Solution:


    Note that when creating a box-and-whisker plot with a TI calculator, you don’t have to
    actually sort the data. The calculator will sort the data automatically when creating the
    box-and-whisker plot.

    The five-number summary can be obtained from the calculator in 2 ways.

    1. The following results are obtained by simply using the TRACE feature and the left
    and right arrows:

    The values at the bottom of each screen are the five-number summary.

    2. The second method involves pressing STAT and using 1-Var Stats on the CALC
    menu for L1:

    Many data sets contain values that are either extremely high values or extremely low
    values compared to the rest of the data values. These values are called outliers. There
    are several reasons why a data set may contain an outlier. Some of these are listed
    below:


    1. The value may be the result of an error made in measurement or in observation.
    The researcher may have measured the variable incorrectly.
    2. The value may simply be an error made by the researcher in recording the value.
    The value may have been written or typed incorrectly.
    3. The value could be a result obtained from a subject not within the defined
    population. A rese