
Maths A - Chapter 10


10 Describing, exploring and comparing data

Syllabus reference
Strand: Statistics and probability
Core topic: Data collection and presentation

In this chapter
10A Calculating and interpreting the mean
10B Mean, from frequency distribution tables
10C Mean, from grouped data
10D Median and mode
10E Best summary statistics
10F Range and interquartile range
10G Standard deviation
10H Comparing sets of data

MQ Maths A Yr 11 - 10 Page 381 Wednesday, July 4, 2001 5:58 PM


Maths Quest Maths A Year 11 for Queensland

Introduction

Archie is an archeologist. He is passionate about his job, which involves digging for buried artefacts, classifying his findings and piecing them together to unravel and record the history of past civilizations. Imagine his excitement when he uncovered a site of buried skulls in Egypt!

Further investigation confirmed that these were male skulls which had originated from a race residing in Egypt. He was keen to place their existence in time. Delving into existing records, he uncovered measurements on male Egyptian skulls recorded for two time periods – one around 4000 BC and the other around AD 150. These measurements confirmed a change in skull shape over the time period and this was taken as evidence of interbreeding of the Egyptians with migrant populations over the years. If Archie compared the measurements on record with those he made on his recently excavated skulls, he could possibly identify a time in history when this race existed.

The measurements of male Egyptian skulls on record for 4000 BC and AD 150 were:
1. breadth of skull
2. height of skull and
3. length of skull.

The recorded data for the measurements (in mm) of 30 male Egyptian skulls are collated in the table below. Where should Archie start? Statistical techniques enable us to summarise sets of data, which can then be compared. If Archie can summarise these two data sets, he could then compare them with his own measurements.

In this chapter, we shall investigate the main methods available to describe data sets such as these. These methods employ measures of central tendency, in particular the mean, median and mode. We shall also examine the range and interquartile range, the standard deviation, and stem plots and boxplots. We shall then see how these measures can be used to compare sets of data.

In the previous chapter we investigated boxplots as a tool for comparing data sets. We now explore this tool further, endeavouring to place Archie’s skull at some period in history. Combining this with other statistical tools may enable us to provide a solution for Archie.

[Figure: skull with the Height, Breadth and Length measurements labelled]


Skull measurements (mm). The first three columns are for 4000 BC; the last three columns are for AD 150.

Breadth Height Length Breadth Height Length

131 138 89 137 123 91

125 131 92 136 131 95

131 132 99 128 126 91

119 132 96 130 134 92

136 143 100 138 127 86

138 137 89 126 138 101

139 130 108 136 138 97

125 136 93 126 126 92

131 134 102 132 132 99

134 134 99 139 135 92

129 138 95 143 120 95

134 121 95 141 136 101

126 129 109 135 135 95

132 136 100 137 134 93

141 140 100 142 135 96

131 134 97 139 134 95

135 137 103 138 125 99

132 133 93 137 135 96

129 136 96 133 125 92

132 131 101 145 129 89

126 133 102 138 136 92

135 135 103 131 129 97

134 124 93 143 126 88

128 134 103 134 124 91

130 130 104 132 127 97

138 135 100 137 125 85

128 132 93 129 128 81

127 129 106 140 135 103

131 136 114 147 129 87

124 138 101 136 133 97


You may not be familiar with some of the following statistical terms. We shall investigate them further in this chapter.

1 A set of test results is shown below.
  8, 3, 6, 4, 5, 4, 9, 7, 4, 6, 5
  a Arrange the scores in ascending order.
  b How many scores are in the set?
  c In what position does the middle score lie?
  d What is the value of the middle score (the median)?
  e What is the range of the data?
  f Calculate the average (mean).
  g How many scores are below the mean? How many above?
  h Give the most frequently occurring score (mode) of the set of data.
  i Comment on any difference in value between the mean, median and mode.
  j Determine values for the lower and upper quartiles.

2 The mean, median and mode are measures of ‘central tendency’. Explain what this term ‘central tendency’ means.

3 The spread of the scores can be determined using a number of statistical measures. Name some measures of ‘spread’ with which you are familiar.

4 What is the relationship between the median and the quartiles?

5 In a boxplot, which of the following are true?
  a The quartiles divide the data into four sections of equal length.
  b The median is the score with an equal number of data values above it and below it.
  c If the ‘whiskers’ are longer than the ‘box’, it means that there are more scores in the whiskers than there are in the box.
  d The whole ‘box’ contains the same number of scores as the two ‘whiskers’ together.
  e It is possible to calculate the mean of the set of data by observing the values in the boxplot.

6 For those statements in question 5 that are incorrect, explain why this is so. Adjust the statements to make them correct.

Calculating the mean

If you were to survey a group of people about what they believe is meant by the word ‘average’, you would find a variety of answers.

When looking at a set of statistics we are often asked for the average. The average is a figure that describes a typical score. In statistics, the correct term for the average is the mean. The mean is the first of three measures of central tendency that we shall be studying. The others are the median and the mode.

The statistical symbol for the mean is \(\bar{x}\). The formula for the mean is

\[ \bar{x} = \frac{\sum x}{n} \]


In mathematics, the symbol \(\Sigma\) (sigma) means sum or total, \(x\) represents each individual score in a list and \(\sum x\) is therefore the sum of the scores. The sum is divided by \(n\), which represents the number of scores.

A graphics calculator can be used to calculate and display many statistical functions. There are several brands of graphics calculator, but the Texas Instruments TI-83 will be the model referred to in illustrations. Other brands of calculator allow calculations and displays with similar instructions. Many of the exercises lend themselves to either manual working or graphics calculator use.

WORKED Example 1
Find the mean of the scores 17, 16, 13, 15, 16, 20, 10, 15.

THINK
1 Find the total of all scores.
2 Divide the total by 8 (the number of scores).

WRITE
1 Total = 17 + 16 + 13 + 15 + 16 + 20 + 10 + 15
  \(\sum x = 122\)
2 Mean \(= \frac{\sum x}{n} = \frac{122}{8}\)
  \(\bar{x} = 15.25\)
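The arithmetic in worked example 1 can be checked with a few lines of code. The following is a minimal Python sketch, not part of the original text; the function name `mean` is chosen here purely for illustration.

```python
def mean(scores):
    """Return the mean: the sum of the scores divided by the number of scores."""
    if not scores:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(scores) / len(scores)


scores = [17, 16, 13, 15, 16, 20, 10, 15]
print(sum(scores))   # sum of the scores: 122
print(len(scores))   # number of scores, n: 8
print(mean(scores))  # 15.25
```

The function follows the formula directly: `sum(scores)` plays the role of \(\sum x\) and `len(scores)` the role of \(n\).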

WORKED Example 2
Calculate the mean of the set of data below, using a graphics calculator.
10, 12, 15, 16, 18, 19, 22, 25, 27, 29

THINK
1 Enter the data in L1. (Press STAT, select 1:Edit... and press ENTER to access the screen.)
2 Calculate the mean.
  (a) Press STAT.
  (b) Highlight CALC in the top line.
  (c) Highlight 1:1–Var Stats and press ENTER.
  (d) Type L1 and press ENTER.
  (e) A number of values are given. The top entry \(\bar{x} = 19.3\) gives us the mean.

WRITE/DISPLAY
\(\bar{x} = 19.3\)


Interpreting the mean
When we use the mean, we are attempting to represent the central value of the data. Let us investigate what affects its value. Consider five scores: 1, 2, 3, 4 and 5. The value of the mean is the total (15), divided by the number of scores (5). The answer, 3, clearly lies in the centre. What would be the value of the mean if the last score had been 20 instead of 5? The answer of 6, where only one score lies above the mean and four lie below it, clearly demonstrates the influence of extreme values on the mean. Since the calculation takes into account the values of all scores, a check must be applied to determine whether the resulting value is a reasonable representation of the centre of the data.
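The check described above (counting the scores on each side of the mean) can be sketched in Python. This is an illustrative sketch only; the helper name `summarise` is an assumption made for this example.

```python
def summarise(scores):
    """Return the mean and how many scores lie below and above it."""
    centre = sum(scores) / len(scores)
    below = sum(1 for x in scores if x < centre)
    above = sum(1 for x in scores if x > centre)
    return centre, below, above


print(summarise([1, 2, 3, 4, 5]))   # (3.0, 2, 2)
print(summarise([1, 2, 3, 4, 20]))  # (6.0, 4, 1)
```

With the extreme score of 20, the mean moves to 6 and the counts become lopsided (four below, one above), which signals that the mean no longer sits near the centre of the data.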

Calculating and interpreting the mean

Use a graphics calculator or manual working for the following.

1 Copy and complete the following:
  Another word commonly used for ‘mean’ is __________. The mean is calculated by finding the __________ of the scores, then dividing by the __________ of scores. The mean is a measure of __________ tendency. Two other measures are __________ and __________.

2 Calculate the mean of each of the following sets of scores.
  a 4, 8, 3, 5, 5
  b 16, 24, 30, 35, 23, 11, 45, 28
  c 65, 92, 56, 84
  d 9.2, 9.7, 8.8, 8.1, 5.6, 7.5, 8.5, 6.4, 7.0, 6.4
  e 356, 457, 182, 316, 432, 611, 299, 355

3 Majid sits for five tests in mathematics. His percentages on the tests were 45%, 90%, 67%, 86% and 75%. Calculate Majid’s mean percentage on the five tests. How many of his percentages were above the mean, and how many below?

remember
1. The mean is the statistical term for ‘average’.
2. The mean is calculated by adding all scores then dividing by the number of scores. That is, \(\bar{x} = \frac{\sum x}{n}\)
3. As a measure of central tendency, the mean represents a value for the ‘centre’ of the scores.
4. Check to determine the number of scores above and below the mean.
5. The value of the mean is affected by extremes in scores.
6. Remember to include correct units in your final answer.


4 An oil company surveys the price of petrol in eight Brisbane suburbs. The results are listed below.
  Manly 76.9 c/L
  Kenmore 72.9 c/L
  Bardon 73.4 c/L
  Nundah 70.9 c/L
  Springwood 72.3 c/L
  Mansfield 75.8 c/L
  Oxley 73.9 c/L
  Boondall 71.1 c/L
  Based on these results, calculate the mean price of petrol in cents per litre in Brisbane. Is this mean a realistic representation of the central value? Explain.

5 The seven players on a netball team have the following heights: 1.65 m, 1.81 m, 1.75 m, 1.78 m, 1.88 m, 1.92 m and 1.86 m. Calculate the mean height of the players on this team, correct to 2 decimal places. How many of the players have heights above the mean height?

6 A golf ball manufacturer randomly tests the mass of 10 golf balls from a batch. The batch will be considered satisfactory if the average mass of the balls is between 44.8 g and 45.2 g. The masses, in grams, of those tested are:
  45.19, 45.06, 45.35, 44.78, 45.47, 44.68, 44.95, 45.32, 44.60, 44.95.
  a Will the batch be passed as satisfactory?
  b Which ball has a mass which is furthest from the mean: the lightest one or the heaviest one?

7 Consider the five values, 1, 2, 3, 4 and 5. The mean is calculated as 3.
  a What happens to the value of the mean if 10 is added to each score?
  b What effect does multiplying each score by 10 have on the mean’s value?

Investigation: Means of skull measurements
Refer to the table of skull measurements for 4000 BC and AD 150 displayed earlier in the chapter.
1 Using the breadth, height and length measurements (in mm) for 4000 BC, calculate the mean for each set of data.
2 Draw up the table shown below and include the means calculated above. The means for the corresponding measurements for AD 150 have been included for comparison.

          4000 BC                      AD 150
          Breadth  Height  Length      Breadth  Height  Length
  Mean                                 136.2    130.3   93.5

3 Note the difference between the means for the 4000 BC measurements and the corresponding ones for AD 150. Do you notice a trend?
4 Examine the breadth data for 4000 BC. How many scores are above the mean? How many scores are below the mean?
5 Examine the height and length data sets for 4000 BC and determine the number of scores above and below the mean in each set.
6 In your opinion, does the mean appear to represent a value close to the centre of each data set?


Frequency distribution tables
In the last section, we dealt with easily manageable quantities of data. However, more commonly we are confronted with the task of processing much larger data sets. Making sense of large quantities of data is best achieved by using a frequency distribution table. The headings for this table are Score (x), Tally (optional), Frequency (f) and a fourth column, (fx), which contains the score (x) multiplied by the frequency (f). The total of this fourth column indicates the total of all the scores. The mean is then calculated by dividing this total of all scores by the sum of the frequency column (which represents the total number of scores). Written as a formula, this is:

\[ \bar{x} = \frac{\sum fx}{\sum f} \]

WORKED Example 3
Complete the frequency table below, then calculate the mean.

  Score (x)   Tally               Frequency (f)   fx
  4           |||
  5           ||||| ||
  6           ||||| ||||| |
  7           ||||| ||||| |||
  8           ||||| |||||
  9           ||||| |
                                  Σf =            Σfx =

THINK
1 Complete the frequency column from the tally column.
2 Complete the fx column by multiplying each score by the frequency.
3 Sum the frequency column.
4 Sum the fx column.
5 Use the formula to calculate the mean.

WRITE
  Score (x)   Tally               Frequency (f)   fx
  4           |||                 3               12
  5           ||||| ||            7               35
  6           ||||| ||||| |       11              66
  7           ||||| ||||| |||     13              91
  8           ||||| |||||         10              80
  9           ||||| |             6               54
                                  Σf = 50         Σfx = 338

  \(\bar{x} = \frac{\sum fx}{\sum f}\)
  \(\bar{x} = \frac{338}{50}\)
  \(\bar{x} = 6.76\)
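The table method of worked example 3 translates directly into code. The sketch below is illustrative only: it assumes the table is stored as a list of (score, frequency) pairs, and the function name is hypothetical.

```python
def mean_from_frequency_table(table):
    """Return (sum of f, sum of fx, mean) for a list of (score, frequency) pairs."""
    total_f = sum(f for _, f in table)        # sum of the frequency column
    total_fx = sum(x * f for x, f in table)   # sum of the fx column
    return total_f, total_fx, total_fx / total_f


# Scores and frequencies from worked example 3
table = [(4, 3), (5, 7), (6, 11), (7, 13), (8, 10), (9, 6)]
total_f, total_fx, mean = mean_from_frequency_table(table)
print(total_f, total_fx)  # 50 338
print(round(mean, 2))     # 6.76
```

Returning both column totals as well as the mean makes it easy to compare each intermediate figure with the hand-completed table.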


Graphics Calculator tip! Calculating the mean
To enlist the aid of a graphics calculator in determining the mean in worked example 3:
1. Enter the data.
   (a) To clear any previous equations press Y= and clear any functions.
   (b) Press STAT, select 1:EDIT and press ENTER.
   (c) Enter the scores in L1 and the frequencies in L2.
2. Set up the calculator to calculate the mean.
   (a) Press STAT, select CALC, then the 1-Var Stats option. Type L1 and L2 separated by a comma.
   (b) Press ENTER to display a number of statistical measures.
   (c) Amongst other statistical data you can read off the number of scores, the sum of the scores and the mean.

remember
1. The mean for a large number of scores is generally calculated from a frequency distribution table. A graphics calculator can also be used.
2. The formula for the mean is \(\bar{x} = \frac{\sum fx}{\sum f}\)


Mean, from frequency distribution tables

1 Using our skull measurements for breadth for 4000 BC, draw up a frequency distribution table as shown below. The tallies for each score have been included. Copy and complete the frequency (f) column and the (fx) column; total the last two columns; then calculate the mean. Notice that its value is the same as that calculated before, using the individual scores.

  a
  Breadth (x)   Tally      Frequency (f)   fx
  119           |
  124           |
  125           | |
  126           | |
  127           |
  128           | |
  129           |
  130           |
  131           | | | |
  132           | | |
  134           | | |
  135           | |
  136           |
  138           | |
  139           | |
  141           |
                           Σf =            Σfx =
  (Σf has the same value as n. Σfx has the same value as the total of the scores.)

  b \(\bar{x} = \frac{\sum fx}{\sum f}\) = __________?


2 A class’s marks (out of 10) on a spelling test are recorded in the frequency table below.

  Score (x)   Tally              Frequency (f)   fx
  4           | |
  5           | | | |
  6           | | | |
  7           | | | | | | | |
  8           | | |
  9           | | | |
  10          | |
                                 Σf =            Σfx =

  a Copy and complete the table.
  b Use the formula Mean \(= \frac{\sum fx}{\sum f}\) to calculate the class’s mean.
  c How many scores are greater than the mean?

3 An electrical store records the number of television sets sold each week over a year. The results are shown in the table below.

  No. of television sets sold (x)   No. of weeks (f)   fx
  16                                4
  17                                4
  18                                3
  19                                6
  20                                7
  21                                12
  22                                8
  23                                2
  24                                4
  25                                2
                                    Σf =               Σfx =

  a Copy and complete the table.
  b Calculate the mean number of television sets sold each week over the year. Give your answer correct to one decimal place.


4 In a soccer season a team played 50 matches. The number of goals scored in each match is shown in the table below.

  No. of goals      0   1   2    3    4   5
  No. of matches    4   9   18   10   5   4

  a Redraw this table in the form of a frequency distribution table.
  b Use your table to calculate the mean number of goals scored each game.
  c By calculating the number of scores below and above the mean, decide whether its value is suitable as a measure of central tendency. Justify your decision.

5 A clothing store records the dress sizes sold during a day. The results are shown below.
  12 14 10 12 8 12 16 10 8 12
  10 12 18 10 12 14 16 10 12 12
  12 14 18 10 14 12 12 14 14 10
  a Present this information in a frequency table.
  b Calculate the mean dress size sold this day.
  c Comment on your answer.

6 (multiple choice) There are eight players in a Rugby forward pack. The mean mass of the players is 104 kg. The total mass of the forward pack is:
  A 13 kg   B 104 kg   C 112 kg   D 832 kg

7 (multiple choice) A small business employs five people on a mean wage of $380 per week. A manager is then employed and receives $500 per week. What is the mean wage of the six employees?
  A $380   B $400   C $480   D $2400

8 (multiple choice) The mean height of five starting players in a basketball match is 1.82 m. During a time out, a player who is 1.78 m tall is replaced by a player 1.88 m tall. What is the mean height of the players after the replacement has been made?
  A 1.78 m   B 1.82 m   C 1.84 m   D 1.88 m

Grouping data and using grouped data
In some cases, the range of data values is so great that grouping the data into classes makes the data more manageable. For example, consider the following data set of people with ages ranging from 25 to 49. We might group the ages in intervals of 5 in the form 25–29, 30–34 etc. This means that all the values (25, 26, 27, 28 and 29) would be grouped in one class. The centre of this class would be 27, and this is the value used as the score (x). This class centre is then multiplied by the frequency, (f). In this case, the value obtained for the mean is an estimate rather than an exact value. Sometimes the choice of the size of the class intervals also has an effect on the accuracy of the mean.

WORKED Example 4
Complete the frequency distribution table and use it to estimate the mean of the distribution.

  Class   Class centre (x)   Tally               Frequency (f)   fx
  25–29                      ||||
  30–34                      ||||| ||||
  35–39                      ||||| ||||| |||
  40–44                      ||||| ||||| ||
  45–49                      ||||| ||
                                                 Σf =            Σfx =

THINK
1 Calculate the class centres.
2 Complete the frequency column from the tally column.
3 Multiply each class centre by the frequency to complete the fx column.
4 Sum the frequency column.
5 Sum the fx column.
6 Use the formula to calculate the mean.

WRITE
  Class   Class centre (x)   Tally               Frequency (f)   fx
  25–29   27                 ||||                4               108
  30–34   32                 ||||| ||||          9               288
  35–39   37                 ||||| ||||| |||     13              481
  40–44   42                 ||||| ||||| ||      12              504
  45–49   47                 ||||| ||            7               329
                                                 Σf = 45         Σfx = 1710

  \(\bar{x} = \frac{\sum fx}{\sum f}\)
  \(\bar{x} = \frac{1710}{45}\)
  \(\bar{x} = 38\)
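The grouped-data estimate in worked example 4 can be reproduced with a short script. This is a minimal Python sketch under stated assumptions: each class is stored as a (lower limit, upper limit, frequency) triple, the class centre is taken as the midpoint of the two limits, and the function name is hypothetical.

```python
def estimate_mean_from_classes(classes):
    """Estimate the mean of grouped data using class centres as the scores."""
    total_f = 0
    total_fx = 0
    for lower, upper, freq in classes:
        centre = (lower + upper) / 2   # class centre, used as x
        total_f += freq
        total_fx += centre * freq
    return total_fx / total_f


# Classes and frequencies from worked example 4
classes = [(25, 29, 4), (30, 34, 9), (35, 39, 13), (40, 44, 12), (45, 49, 7)]
print(estimate_mean_from_classes(classes))  # 38.0
```

Because every score in a class is replaced by the class centre, the result is an estimate of the mean rather than an exact value.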


Graphics Calculator tip! Calculating the mean from grouped data
A graphics calculator can be used to calculate the mean from a grouped data frequency distribution. In such cases, the class centre can be entered as L1 and the frequency as L2. Remember to set up the 1-Var Stats to recognise these two lists (Xlist: L1 and Freq: L2).

remember
1. The mean is the statistical term for average.
2. The mean is calculated by adding all scores then dividing by the number of scores.
3. When calculating the mean from a frequency distribution table, a column for frequency × score (fx) is added. The mean is then calculated using the formula \(\bar{x} = \frac{\sum fx}{\sum f}\).
4. If the frequency distribution uses grouped data, the fx column is calculated using class centres for the x-value.
5. The mean can also be calculated using a graphics calculator.

Mean, from grouped data

1 a Using our skull measurements for breadth for 4000 BC (shown previously as an ungrouped frequency distribution), draw up the table below, using class intervals 119–121, 122–124 etc. Complete the columns and calculate the mean.

  Class     Class centre (x)   Tally   Frequency (f)   fx
  119–121   120
  122–124
  125–127
  128–130
  131–133
  134–136
  137–139
  140–142
                                       Σf =            Σfx =
  (Compare Σfx with Σfx from exercise 10B, question 1.)

  \(\bar{x} = \frac{\sum fx}{\sum f}\) = __________?

  b Does the mean differ from the two previous calculations? Explain any difference.

2 The table below shows a set of class marks on a test out of 100.

  Class    Class centre (x)   Tally                Frequency (f)   fx
  31–40                       |
  41–50                       | | |
  51–60                       | | | |
  61–70                       | | | | | |
  71–80                       | | | | | | | | |
  81–90                       | |
  91–100                      | |
                                                   Σf =            Σfx =

  a Copy and complete the frequency distribution table.
  b Use the table to calculate the mean class mark.
  c In which class interval does the mean lie?

3 In the heats of the 100-m freestyle at a swimming meet, the times of the swimmers were recorded in the table below.

  Time          Class centre (x)   No. of swimmers (f)   fx
  50.01–51.00                      4
  51.01–52.00                      12
  52.01–53.00                      23
  53.01–54.00                      38
  54.01–55.00                      15
  55.01–56.00                      3
                                   Σf =                  Σfx =

  a Copy and complete the frequency distribution table.
  b Use the table to calculate the mean time.
  c How many swimmers swam faster than this mean time?

4 A cricketer played 50 innings in test cricket for the following scores.

  23 65 8 112 54 0 84 12 21 425 105 74 40 1 15 33 45 21 4716 70 22 33 21 8 34 36 5 769 104 57 78 158 0 51 16 6 16
  0 49 0 14 28 52 21 3 3 7

  a Put the above information into a frequency distribution table using appropriate groupings.
  b Use the table to estimate the batting average for this player.
  c Repeat the exercise using a different size class interval. Compare your answers.


5 Use the statistics function on your calculator to find the mean of each of the following sets of scores, correct to 1 decimal place.
  a 11, 15, 13, 12, 21, 19, 8, 14
  b 2.8, 2.3, 3.6, 2.9, 4.5, 4.2
  c 41, 41, 41, 42, 43, 45, 45, 45, 45, 46, 49, 50

6 Use your calculator to find the mean from each of the following tables.

  a Score   Frequency        b Score   Frequency
    3       7                  28      5
    4       10                 29      18
    5       18                 30      25
    6       19                 31      25
    7       38                 32      14
    8       27                 33      10
    9       10                 34      3
    10      5

7 The table below shows the heights of a group of people.

  Height    Class centre   Frequency
  150–154   152            7
  155–159   157            14
  160–164   162            13
  165–169   167            23
  170–174   172            24
  175–179   177            12

  Calculate the mean of this distribution.

8 Seventy students were timed on a 100-m sprint during their P.E. class. The results are shown in the table below.

  Time (s)   12 to <13   13 to <14   14 to <15   15 to <16   16 to <17
  Number     13          17          25          15          10

  a Calculate the class centre for each group in the distribution.
  b Use your calculator to find the mean of the distribution.


9 A drink machine is installed near a quiet beach. The number of cans sold over the first 10 weeks after its installation is shown below.

  4 39 31 31 50 43 70 45 57 71 18 26 3 52
  51 59 33 51 27 62 30 90 3 30 97 59 33 44
  99 62 72 6 42 83 19 49 11 6 63 4 53 20
  45 58 1 9 79 41 2 33 97 71 52 97 69 83
  39 84 92 43 71 98 8 97 18 89 21 9 4 17

  a Put this information into a frequency distribution table using the classes 1–10, 11–20, 21–30 etc.
  b Calculate the mean number of cans sold per day over these 10 weeks.
  c Using the raw data above, calculate the number of days on which the sales were greater than the mean.

Median and mode
So far we have used the mean as a measure of the typical score in a data set. Consider the case of someone who is analysing the typical house price in an area. On a particular day, five houses are sold in the area for the following prices:

$175 000   $149 000   $160 000   $211 000   $850 000

For these five houses the mean price is $309 000. The mean is much greater than the price of most of the houses in the data set. This is because there is one score which is much greater than all the others. For such data sets, we need to use a different measure of central tendency.

Median
The median is the middle score in a data set (of n scores), when all scores are arranged in order. If the data set consists of an odd number of scores, there is one score which lies exactly in the middle. For a data set consisting of an even number of scores, the median will always occur half way between two scores.
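The definition of the median can be sketched in Python and applied to the five house prices above. This is an illustrative sketch, not part of the original text; the function name `median` and the eight-score list used for the even case are chosen here only for demonstration.

```python
def median(scores):
    """Return the middle score, or the average of the two middle scores."""
    if not scores:
        raise ValueError("the median of an empty data set is undefined")
    ordered = sorted(scores)           # scores must be arranged in order
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:                     # odd n: one score lies exactly in the middle
        return ordered[middle]
    # even n: half way between the two middle scores
    return (ordered[middle - 1] + ordered[middle]) / 2


prices = [175_000, 149_000, 160_000, 211_000, 850_000]
print(sum(prices) / len(prices))  # mean: 309000.0
print(median(prices))             # median: 175000

print(median([13, 13, 16, 12, 19, 18, 20, 18]))  # even number of scores: 17.0
```

For the house prices, the median of $175 000 is not pulled upwards by the single $850 000 sale, whereas the mean of $309 000 is.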


Using single scores
The position of the median can be found using the formula:

Median position = \(\frac{n+1}{2}\)th score

WORKED Example 5
For the scores 3, 4, 8, 2, 2, 6, 9, 1, 6 calculate the median.

THINK
1 Rewrite the scores in ascending order. There are 9 scores here.
2 The median is the middle score, that is, the \(\frac{n+1}{2}\)th score.

WRITE
1 1, 2, 2, 3, 4, 6, 6, 8, 9
2 Median = \(\frac{9+1}{2}\)th score
  Median = 5th score
  Median = 4

The median becomes more complicated when there is an even number of scores because there are two scores in the middle. When there is an even number of scores, the median is the average of the two middle scores.

WORKED Example 6
Find the median of the scores 13, 13, 16, 12, 19, 18, 20, 18.

THINK
1 Write the scores in ascending order.
2 There is an even number (8) of scores, so average the two middle scores.

WRITE
1 12, 13, 13, 16, 18, 18, 19, 20
2 Median = \(\frac{n+1}{2}\)th score
  Median = \(\frac{8+1}{2}\)th score
         = 4.5th score
  that is, half way between the 4th and 5th scores. The 4th score is 16. The 5th score is 18.
  Median = \(\frac{16+18}{2}\)
  Median = 17

Median from a frequency distribution table of ungrouped data
The median can be calculated from a frequency distribution table if we extend the table by adding a cumulative frequency column. This column ‘cumulates’ or totals the frequencies as we descend the rows. It is then possible to determine which scores are in each position. Consider the frequency distribution table following.


  Score   Frequency   Cumulative frequency
  4       1           1                      The 1st score is 4.
  5       6           7                      The 2nd–7th scores are 5.
  6       9           16                     The 8th–16th scores are 6.
  7       8           24                     The 17th–24th scores are 7.
  8       4           28                     The 25th–28th scores are 8.
  9       2           30                     The 29th and 30th scores are 9.

There are 30 scores in this distribution and so the middle two scores will be the 15th and 16th scores. By looking down the cumulative frequency column we can see that these scores are both 6. Therefore, 6 is the median of this distribution.

WORKED Example 7
Find the median for the frequency distribution below.

  Score   Frequency
  34      3
  35      8
  36      12
  37      9
  38      8
  39      5

THINK
1 Redraw the frequency table with a cumulative frequency column.
2 There are 45 scores and so the middle score is the 23rd score.
3 Look down the cumulative frequency column to see that the 23rd score is 36.

WRITE
1 Score   Frequency   Cumulative frequency
  34      3           3
  35      8           (3 + 8) 11
  36      12          (11 + 12) 23
  37      9           (23 + 9) 32
  38      8           (32 + 8) 40
  39      5           (40 + 5) 45
2 Median = \(\frac{n+1}{2}\)th score
  Median = \(\frac{45+1}{2}\)th score
  Median = 23rd score
3 Median = 36
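The cumulative frequency method of worked example 7 can be sketched in Python. The code below is illustrative only and assumes the table is held as a list of (score, frequency) pairs; the function names are hypothetical.

```python
def score_at_position(table, position):
    """Return the score occupying the given position (1st, 2nd, ...) in order."""
    cumulative = 0
    for score, freq in sorted(table):
        cumulative += freq             # running total of the frequencies
        if position <= cumulative:
            return score
    raise ValueError("position is beyond the number of scores")


def median_from_frequency_table(table):
    """Find the median by locating the middle position(s) in the cumulative column."""
    n = sum(freq for _, freq in table)
    if n % 2 == 1:                     # odd n: the (n + 1) / 2 th score
        return score_at_position(table, (n + 1) // 2)
    lower = score_at_position(table, n // 2)
    upper = score_at_position(table, n // 2 + 1)
    return (lower + upper) / 2         # even n: average the two middle scores


# Frequency distribution from worked example 7 (45 scores)
table_45 = [(34, 3), (35, 8), (36, 12), (37, 9), (38, 8), (39, 5)]
print(median_from_frequency_table(table_45))  # 36

# The 30-score distribution discussed before the worked example
table_30 = [(4, 1), (5, 6), (6, 9), (7, 8), (8, 4), (9, 2)]
print(median_from_frequency_table(table_30))  # 6.0
```

The running total inside `score_at_position` is the cumulative frequency column: the first row whose cumulative frequency reaches the required position holds that score.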


Mode
There are many examples where neither the mean nor the median is the appropriate measure of the typical score in a data set.

Using single scores
Consider the case of a clothing store. It needs to re-order a supply of dresses. To know what sizes to order it looks at past sales of this particular style and gathers the following data:

  8 12 14 12 16 10 12 14 16 18
  14 12 14 12 12 8 18 16 12 14

For this data set the mean dress size is 13.2. Dresses are not sold in size 13.2, so this has very little meaning. The median is 13, which also has little meaning as dresses are sold only in even-numbered sizes.

What is most important to the clothing store is the dress size that sells the most. In this case size 12 occurs most frequently. The score that has the highest frequency is called the mode.

WORKED Example 8
Find the mode of the scores below.
4, 5, 9, 4, 6, 8, 4, 8, 7, 6, 5, 4.

THINK
The score 4 occurs most often and so it is the mode.

WRITE
Mode = 4

When two scores share the ‘highest’ frequency, that is, occur an equal number of times, both scores are given as the mode. In this situation the scores are bimodal. If all scores occur an equal number of times, then the distribution has no mode.

Mode from a frequency distribution table
To find the mode from a frequency distribution table, we simply give the score that has the highest frequency.

WORKED Example 9
For the frequency distribution below state the mode.

  Score   Frequency
  14      3
  15      6
  16      11
  17      14
  18      10
  19      7

THINK
The highest frequency is 14 which belongs to the score 17 and so 17 is the mode.

WRITE
Mode = 17
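The rules for the mode, including the bimodal and no-mode cases, can be sketched in Python using the standard library's `collections.Counter`. The function name `modes` and the short bimodal and no-mode lists are illustrative assumptions, not data from the text.

```python
from collections import Counter


def modes(scores):
    """Return a sorted list of modes; an empty list means there is no mode."""
    counts = Counter(scores)           # score -> frequency
    if not counts:
        return []
    highest = max(counts.values())
    # If several different scores all occur equally often, there is no mode.
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []
    return sorted(score for score, c in counts.items() if c == highest)


print(modes([4, 5, 9, 4, 6, 8, 4, 8, 7, 6, 5, 4]))  # worked example 8: [4]
print(modes([1, 1, 2, 2, 3]))                       # bimodal: [1, 2]
print(modes([1, 2, 3]))                             # no mode: []

# Dress sizes from the clothing store example
sizes = [8, 12, 14, 12, 16, 10, 12, 14, 16, 18,
         14, 12, 14, 12, 12, 8, 18, 16, 12, 14]
print(modes(sizes))                                 # [12]
```

Returning a list rather than a single value lets the same function report one mode, two modes, or none.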


When a table is presented using grouped data, we do not have a single mode. In these cases, the class with the highest frequency is called the modal class.
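Reading the mode or the modal class from a table amounts to finding the largest entry in the frequency column. The sketch below assumes Python 3; the function name `highest_frequency_keys` is hypothetical. It uses the frequency table from Worked Example 9 and, for grouped data, the basketball heights table that appears in the section on range.

```python
# Frequency table from Worked Example 9: score -> frequency.
score_freq = {14: 3, 15: 6, 16: 11, 17: 14, 18: 10, 19: 7}

# Grouped heights (cm) from the basketball example: class -> frequency.
height_freq = {
    "170 to <175": 3,
    "175 to <180": 6,
    "180 to <185": 12,
    "185 to <190": 10,
    "190 to <195": 8,
    "195 to <200": 1,
}


def highest_frequency_keys(table):
    """Return every score (or class) that shares the highest frequency."""
    highest = max(table.values())
    return [key for key, freq in table.items() if freq == highest]


print(highest_frequency_keys(score_freq))   # [17]
print(highest_frequency_keys(height_freq))  # ['180 to <185']
```

For the grouped table the result is a class rather than a single score, which is why it is reported as the modal class.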

remember
1. The median is the middle score in a data set or the average of the two middle scores. The scores must be arranged in order.
2. The median can be found using the cumulative frequency column of a frequency table.
3. The mode is the score that occurs the most.
4. Remember to include units in the final answer.

10D Median and mode

1 Copy and complete the following:
The median score is the __________ one, when the scores are ____________________. The formula for the position of the median score is __________. For an even number of scores, it is the __________ of the two middle ones. When using a frequency distribution table, the median is obtained from the ____________________ column.

2 The scores of seven people on a spelling test are given below.
5 6 5 8 5 9 8
Calculate the median of these marks.

3 Below are the scores of eight people who played a round of golf.
75 80 81 76 84 83 81 82
Calculate the median for this set of scores.

4 Find the median for each of the following sets of scores.
a 3, 4, 5, 5, 5, 6, 9
b 5.6, 5.2, 5.4, 5.3, 5.8, 5.4, 5.3, 5.4
c 45, 62, 39, 88, 75
d 102, 99, 106, 108, 101, 103, 102, 105, 102, 101

5 A factory has 80 employees. Over a two-week period the number of people absent from work each day was recorded and the results are shown below.
3, 1, 5, 4, 3, 25, 4, 2, 4, 5
a Calculate the median number of people absent from work each day.
b Calculate the mean number of people absent from work each day.
c Does the mean or the median give a better measure of the typical number of people absent from work each day? Explain your answer.


6 The table below shows the number of cans of drink sold from a vending machine at a high school each day.

Score  Frequency  Cumulative frequency
17     4
18     9
19     6
20     12
21     8
22     5
23     4
24     2

a Copy and complete the frequency distribution table.
b Use the table to calculate the median number of cans of drink sold each day from the vending machine.

7 The table below shows the number of accidents a tow truck attends each day over a three-week period. Calculate the median number of accidents attended by the tow truck each day.

No. of accidents  No. of days
2                 4
3                 12
4                 3
5                 1
6                 1

8 The table below shows the number of errors made by a machine each day over a 50-day period. Calculate the median number of errors made by the machine each day.

No. of errors per day  Frequency
0                      9
1                      18
2                      13
3                      6
4                      3
5                      1

9 multiple choice
There are 25 scores in a distribution. The median score will be the:
A 12th score
B 12.5th score
C 13th score
D average of the 12th and 13th scores.


10 multiple choice
For the scores 4, 5, 5, 6, 7, 7, 9, 10 the median is:
A 5
B 6
C 6.5
D 7

11 multiple choice
Consider the frequency table below.

Score  Frequency
1      12
2      13
3      8
4      7
5      5

The median of these scores is:
A 2
B 3
C 8
D 13

12 The table below shows the number of sick days taken by each worker in a small business.

Days sickness  Frequency  Cumulative frequency
0–4            10
5–9            12
10–14          7
15–19          6
20–24          5
25–29          3
30–34          2

a Copy and complete the frequency distribution table.
b Calculate the median class for this distribution.

13 Copy and complete the following:
The mode is the __________ __________ score. If two scores occur most frequently an equal number of times, we have two modes, and this is termed a __________ distribution. In a frequency distribution table of grouped data, we generally do not attempt to find a single mode, but give the __________ __________.

14 For each of the following sets of scores find the mode.
a 2, 5, 3, 4, 5
b 8, 10, 7, 10, 9, 8, 8
c 11, 12, 11, 15, 14, 13
d 0.5, 0.4, 0.6, 0.3, 0.2, 0.4, 0.6, 0.9, 0.4
e 110, 113, 100, 112, 110, 113, 110

15 Find the mode for each of the following. (Hint: Some are bimodal and others have no mode.)
a 16, 17, 19, 15, 17, 19, 14, 16, 17
b 147, 151, 148, 150, 148, 152, 151
c 2, 3, 1, 9, 7, 6, 8
d 68, 72, 73, 72, 72, 71, 72, 68, 71, 68
e 2.6, 2.5, 2.9, 2.6, 2.4, 2.4, 2.3, 2.5, 2.6


16 Use the tables below to state the mode of the distribution.

a
Score  Frequency
1      2
2      4
3      5
4      6
5      3

b
Score  Frequency
5      1
6      3
7      5
8      8
9      5
10     3

c
Score  Frequency
38     2
39     4
40     1
41     5
42     6
43     3
44     6
45     2

17 Use the frequency histogram below to state the mode of the distribution.

[Figure: frequency histogram with Score (12 to 20) on the horizontal axis and Frequency (0 to 40) on the vertical axis]

18 For each of the following grouped distributions, state the modal class.

a
Class  Frequency
1–4    6
5–8    12
9–12   30
13–16  23
17–20  46
21–24  27
25–28  9

b
Class  Frequency
1–7    3
8–14   8
15–21  9
22–28  25
29–35  12
36–42  11
43–49  2

19 The weekly wage (in dollars) of 40 people is shown below.

376 592 299 501 375 366 204 359 382 274
223 295 232 325 311 513 348 235 329 203
556 419 226 494 205 307 417 204 528 487
543 532 435 415 540 260 318 593 592 393

a Use the classes $200–$249, $250–$299, $300–$349 etc. to display the information in a frequency distribution table.
b From your table, calculate the median class.


Class  Class centre  Frequency  Cumulative frequency
1–10                 5
11–20                15
21–30                29
31–40                37
41–50                11

1 Copy the frequency table above and complete the class centre column.
2 Complete the cumulative frequency column.
3 How many scores in the data set were above 30?
4 How many scores in the data set were 40 or less?
5 Is the data set an example of grouped or ungrouped data?
6 Draw a frequency histogram for the data set.
7 On your histogram draw a frequency polygon for this data set.
8 Calculate the mean of the data.
9 In which class would the median lie?
10 Which is the modal class?


Best summary statistics

Having now examined all three summary statistics, it is important to recognise when it is appropriate to use each one. In some circumstances, one summary statistic may be more appropriate than the others. For example, a shoe manufacturer notes that in a new style of sporting footwear:

mean size sold is 8.63
median size is 8.75
mode size is 9.

In this case, the mode is the most useful measure as the manufacturer needs to know which size sells the most. The mean and median are of less use to the manufacturer.

Investigation: Summary statistics for skull measurements

Looking back at the data on Egyptian skulls, we are now in a position to summarise the measurements with respect to the mean, median and mode for each set.

1 Draw the table below. (The values for AD 150 have been included for comparison.)

Statistic              Measurement  4000 BC  AD 150  Comment
Mean (x̄)               Breadth               136.2
                       Height                130.3
                       Length                93.5
Median (15.5th score)  Breadth               137
                       Height                130
                       Length                94
Mode                   Breadth               137
                       Height                135
                       Length                92

2 For the time period 4000 BC:
a enter the values for the means calculated previously
b calculate the median for each set
c determine the mode for each set.

3 Compare the figures you obtained for 4000 BC with the corresponding values for AD 150. Jot down comments in the final column.

4 Write a paragraph indicating what you feel has happened to the shape of the Egyptian skulls over the time period 4000 BC to AD 150.

The term average is often used indiscriminately, being interpreted sometimes as the mean, sometimes as the median and sometimes as the mode. The figure that best supports the cause of the author is the one which (unfortunately) tends to be promoted. We need to be aware of this, particularly when interpreting statistics. When we summarise and report statistical information, we need to act in a responsible manner and report figures that are not misleading.

For each of these examples you will need to think carefully about the relevance of each summary statistic in terms of the particular example.

WORKED Example 10

Below are the wages of ten employees in a small business.

$220 $230 $290 $275 $265 $250 $1500 $220 $220 $240

a Calculate the mean wage.
b Calculate the median wage.
c Calculate the mode wage.
d Does the mean, median or mode give the best measure of a typical wage in this business?

a
1 THINK: Total all the wages.
  WRITE: Total = $3710
2 THINK: Divide the total by 10.
  WRITE: Mean = $3710 ÷ 10 = $371

b
1 THINK: Write the wages in ascending order.
  WRITE: $220 $220 $220 $230 $240 $250 $265 $275 $290 $1500
2 THINK: Average the 5th and 6th scores to find the median.
  WRITE: \(\text{Median} = \dfrac{\$240 + \$250}{2} = \$245\)

c
THINK: $220 is the score that occurs most often and so this is the mode.
WRITE: Mode = $220

d
THINK: The mean is larger than what is typical because of one very large wage, and the mode is the lowest wage and so this is not typical. Therefore, the median is the best measure.
WRITE: The median is the best measure of the typical wage as the mode is the lowest score, which is not typical, and the mean is inflated by the $1500 wage.
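The three summary statistics for the ten wages can be reproduced with a few lines of code. This is a minimal sketch, assuming Python 3.8 or later (for `statistics.multimode`); the variable names, and the extra step of leaving out the $1500 wage, are illustrative additions that show how strongly one extreme score affects the mean compared with the median.

```python
from statistics import mean, median, multimode

wages = [220, 230, 290, 275, 265, 250, 1500, 220, 220, 240]

print(mean(wages))       # 371
print(median(wages))     # 245.0
print(multimode(wages))  # [220]

# Leave out the one very large wage to see its effect on each statistic.
typical_wages = [w for w in wages if w != 1500]
print(round(mean(typical_wages), 2))  # 245.56
print(median(typical_wages))          # 240
```

Dropping a single score moves the mean from $371 to about $245.56, while the median only moves from $245 to $240.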


remember
1. The three summary statistics are:
   mean: calculated by adding all scores, then dividing by the number of scores
   median: the middle score or average of the two middle scores (when scores are arranged in order)
   mode: the score with the highest frequency.
2. Be careful when using the mean. One or two extreme scores can greatly increase or decrease its value.
3. When the mean is not a good measure of central tendency, the median is used.
4. The mode is the best measure in some examples where discrete data mean that the mean and median may have very little meaning.

10E Best summary statistics

1 There are ten houses in a street. A real estate agent values each house with the following results.

$150 000 $190 000 $175 000 $150 000 $650 000 $150 000 $165 000 $180 000 $160 000 $180 000

a Calculate the mean house valuation.
b Calculate the median house valuation.
c Calculate the mode house valuation.
d Which of the above is the best measure of central tendency?

2 The table below shows the number of shoes of each size that were sold over a week at a shoe store.

Size  Frequency
4     5
5     7
6     19
7     24
8     16
9     8
10    7

a Calculate the mean shoe size sold.
b Calculate the median shoe size sold.
c Calculate the mode of the data set.
d Which measure of central tendency has the most meaning to the shoe store proprietor?


3 The table below shows the crowds at football matches over a season.

Crowd              Class centre  Frequency
10 000 to <20 000  15 000        95
20 000 to <30 000  25 000        64
30 000 to <40 000  35 000        22
40 000 to <50 000  45 000        15
50 000 to <60 000  55 000        3
60 000 to <70 000  65 000        0
70 000 to <80 000  75 000        1

a Calculate the mean crowd over the season.
b Calculate the median class.
c Calculate the modal class.
d Which measure of central tendency would best describe the typical crowd at football matches over the season?

4 multiple choice
Mr and Mrs Yousef research the typical price of a large family car. At one car yard they find six family cars. Five of the cars are priced between $30 000 and $40 000, while the sixth is priced at $80 000. What would be the best measure of the price of a typical family car?
A Mean
B Median
C Mode
D All are equally important.

5 Thirty men were asked to reveal the number of hours they spent doing housework each week. The results are given below.

1 5 2 12 2 6 2 8 14 18
0 1 1 8 20 25 3 0 1 2
7 10 12 1 5 1 18 0 2 2

a Represent the data in a frequency distribution table. (Use classes 0–4, 5–9, 10–14 etc.)
b Find the mean number of hours that the men spend doing housework.
c Find the median class for hours spent by the men at housework.
d Find the modal class for hours spent by the men at housework.

6 The resting pulse rates of 20 female athletes were measured. The results are shown below.

50 62 48 52 71
61 30 45 42 48
43 47 51 52 34
61 44 54 38 40

a Represent the data in a frequency distribution table using appropriate groupings.
b Find the mean of the data.
c Find the median class of the data.
d Find the modal class of the data.
e Comment on the similarities and differences between the three values.


7 The following data give the age of 25 patients admitted to the emergency ward of a hospital.

18 16 6 75 24
23 82 74 25 21
43 19 84 72 31
74 24 20 63 79
80 20 23 17 19

a Represent the data in a frequency distribution table. (Use classes 1–15, 16–30, 31–45, etc.)
b Find the mean age of patients admitted.
c Find the median class of age of patients admitted.
d Find the modal class for age of patients admitted.
e Do any of your statistics (mean, median or mode) give a clear representation of the typical age of an emergency ward patient?
f Give some reasons that could explain the pattern of the distribution of data in this question.

8 The batting scores for two cricket players over six innings are as follows:

Player A 31, 34, 42, 28, 30, 41
Player B 0, 0, 1, 0, 250, 0

a Find the mean score for each player.
b Which player appears to be better if the mean result is used?
c Find the median score for each player.
d Which player appears to be better when the decision is based on the median result?
e Which player do you think would be more useful to have in a cricket team and why? How can the mean result sometimes lead to a misleading conclusion?

9 The following frequency table gives the number of employees in different salary brackets for a small manufacturing plant.

Position                 Salary ($)  No. of employees
Machine operator         18 000      50
Machine mechanic         20 000      15
Floor steward            24 000      10
Manager                  62 000      4
Chief Executive Officer  80 000      1

a Workers are arguing for a pay rise but the management of the factory claims that workers are well paid because the mean salary of the factory is $22 100. Is this a sound argument?
b Suppose that you were representing the factory workers and had to write a short submission in support of the pay rise. How could you explain the management’s claim? Provide some other statistics to support your case.


Investigation: Wage rise

The workers in an office are trying to obtain a wage rise. In the previous year, the ten people who work in the office received a 2% rise while the company CEO received a 42% rise.
1 What was the mean wage rise received in the office last year?
2 What was the median wage rise received in the office last year?
3 What was the modal wage rise received in the office last year?
4 The company is trying to avoid paying the rise. What statistic do you think they would quote about last year’s wage rises? Why?
5 What statistic do you think the trade union would quote about wage rises? Why?
6 Which statistic do you think is the most ‘honest’ reflection of last year’s wage rises? Explain your answer.

Investigation: Summary statistics for house prices

Quoting different averages can give different impressions about what is normal. Try the following task.
1 Visit a local real estate agent and study the properties for sale in the window. Alternatively, retrieve the for-sale ads for a real estate company from the newspaper.
2 Calculate the mean, median and mode price for houses in the area.
3 If you were a real estate agent and a person wanting to sell his/her home asked what the typical property sold for in the area, which figure would you quote?
4 Which figure would you quote to a person who wanted to buy a house in the area?

Investigation: Best summary statistics and comparison of samples

For this investigation, work in groups of 3 to 6 students.
Examine each of the following statistics.
• The typical mark in maths among Year 11 students.
• The number of attempts taken by Years 11 and 12 students to get their driver’s licence.
• The typical number of days taken off school by Year 11 students so far this year.
1 For each of the above, gather your data by selecting a random sample.
2 Calculate the mean, median and mode for each topic.
3 Compare your results with the results of other students who will have selected their samples from the same population.
4 In each case, state the best summary statistic and explain your answer to the other groups in your class.


Measures of dispersion or spread

Once a set of scores has been collected and tabulated, we are ready to make some conclusions about the data. Two key concepts are the range and the interquartile range, which are used to measure the spread of a set of scores.

Range

The range is the difference between the highest and the lowest score.

Range = highest score − lowest score

Range from single scores

WORKED Example 11

There are 17 players in the squad for a State of Origin match. The number of State of Origin matches played by each member of the squad is shown below.

2 6 12 8 1 4 8 9 24
4 5 11 14 6 11 15 10

What is the range of this distribution?

1 THINK: The lowest number of matches played is 1.
  WRITE: Lowest score = 1 match
2 THINK: The highest number of matches played is 24.
  WRITE: Highest score = 24 matches
3 THINK: Calculate the range by subtracting the lowest score from the highest score.
  WRITE: Range = 24 − 1 = 23 matches

A smaller range will usually represent a more consistent set of scores. Exceptions to this are when one or two scores are much higher or lower than most.

Graphics Calculator tip! Calculating the range of a distribution

A graphics calculator can also be used to determine the range of a distribution. The 1-Var Stats displays min X (lowest score) and max X (highest score). The difference between these two values indicates the range of the data.

Range from a frequency distribution table

When we are calculating the range from a frequency distribution table, we find the highest and lowest scores from the score column. We do not use any information from the frequency column in calculating the range. When the data are presented in grouped form, the range is found by taking the highest score from the highest class and the lowest score from the lowest class.
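The range calculation for single scores and for grouped data can be sketched in a few lines. The example below assumes Python 3; the function name `value_range` and the tuple layout used for the classes are illustrative choices. It uses the State of Origin data from Worked Example 11 and the class boundaries of the basketball heights table in Worked Example 12.

```python
matches = [2, 6, 12, 8, 1, 4, 8, 9, 24, 4, 5, 11, 14, 6, 11, 15, 10]


def value_range(scores):
    """Range = highest score - lowest score."""
    return max(scores) - min(scores)


print(value_range(matches))  # 23

# Grouped data: each class is (lower bound, upper bound, frequency).
# The frequencies play no part in calculating the range.
height_classes = [(170, 175, 3), (175, 180, 6), (180, 185, 12),
                  (185, 190, 10), (190, 195, 8), (195, 200, 1)]

lowest = min(lower for lower, _, _ in height_classes)
highest = max(upper for _, upper, _ in height_classes)
print(highest - lowest)  # 30
```

For grouped data the result is the range implied by the class boundaries, since the individual scores are not known.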


WORKED Example 12

The frequency distribution table below shows the heights (in cm) of boys competing for a place on a basketball team. Find the range of these data.

Height       Frequency
170 to <175  3
175 to <180  6
180 to <185  12
185 to <190  10
190 to <195  8
195 to <200  1

1 THINK: The lowest score is at the bottom of the 170 to <175 class.
  WRITE: Lowest score = 170 cm
2 THINK: The highest score is at the top of the 195 to <200 class.
  WRITE: Highest score = 200 cm
3 THINK: Range = highest score − lowest score.
  WRITE: Range = 200 − 170 = 30 cm

Interquartile range

In many cases, the range is not a good indicator of the overall spread of scores. Consider the two sets of scores below, showing the wages of people in two small businesses.

A: $240, $240, $240, $245, $250, $250, $260, $800
B: $180, $200, $240, $290, $350, $400, $500, $600

The range for business A = $800 − $240 = $560
The range for business B = $600 − $180 = $420

While the range for business A is greater, by looking at the wages in the two businesses, we can see that the wages in business B are generally more spread. The range uses only two scores in its calculation. The interquartile range is usually a better measure of dispersion (spread). We looked at this in the previous chapter.

The quartiles are found by dividing the data into quarters. The lower quartile is the score below which the lowest 25% of scores lie; the upper quartile is the score above which the highest 25% of scores lie.

Before we can calculate an interquartile range we must be able to calculate the median. To calculate the median, we must first arrange the scores in ascending order. The median is the middle score if there is an odd number of scores, or the average of the two middle scores if there is an even number of scores. Remember that the median position is the \(\frac{n+1}{2}\)th score.


The interquartile range is the difference between the upper quartile and the lower quartile. To find the lower and upper quartiles we arrange the scores in ascending order. The lower quartile is \(\frac{1}{4}\) of the way through the distribution and the upper quartile is \(\frac{3}{4}\) of the way through the distribution.

To find the interquartile range we follow the steps below.
1. Arrange the data in ascending order.
2. Divide the data into halves by finding the median.
   (a) If there is an odd number of scores the median score should not be included in either half of the scores.
   (b) If there is an even number of scores the middle will be half way between two scores and this will divide the data neatly into two sets.
3. The lower quartile will be the median of the lower half of the data.
4. The upper quartile will be the median of the upper half of the data.
5. The interquartile range will be the difference between the medians of the two halves of the data.

WORKED Example 13

Calculate the median of:
a 2, 5, 8, 8, 8, 11, 12
b 45, 69, 69, 87, 88, 92, 99, 100.

a
THINK: These scores are already arranged in ascending order, so there is no need to reorder. There are 7 scores, so the median is the \(\frac{7+1}{2}\) = 4th score.
WRITE: Median = 8

b
THINK: There are 8 scores, so the median is the \(\frac{8+1}{2}\) = 4.5th score, that is, the average of the 4th score and the 5th score.
WRITE: \(\text{Median} = \frac{87+88}{2} = 87.5\)
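The position rule for the median translates directly into code. The following is a minimal sketch, assuming Python 3; the function name `median_by_position` is hypothetical. It finds the \(\frac{n+1}{2}\)th position and averages the two neighbouring scores when that position falls half way between them.

```python
def median_by_position(scores):
    """Find the median using the (n + 1) / 2 position rule."""
    ordered = sorted(scores)
    n = len(ordered)
    position = (n + 1) / 2              # 1-based position of the median
    if position.is_integer():           # odd number of scores
        return ordered[int(position) - 1]
    lower = ordered[int(position) - 1]  # score just below the position
    upper = ordered[int(position)]      # score just above the position
    return (lower + upper) / 2


print(median_by_position([2, 5, 8, 8, 8, 11, 12]))            # 8
print(median_by_position([45, 69, 69, 87, 88, 92, 99, 100]))  # 87.5
```

The scores are sorted inside the function, so the data do not need to be in ascending order beforehand.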

WORKED Example 14

Find the interquartile range of the following data which shows the number of home runs scored in a series of baseball matches.
12, 9, 4, 6, 5, 8, 9, 4, 10, 2

1 THINK: Write the data in ascending order.
  WRITE: 2, 4, 4, 5, 6, 8, 9, 9, 10, 12
2 THINK: Divide the data into two equal halves.
  WRITE: 2, 4, 4, 5, 6     8, 9, 9, 10, 12
3 THINK: The lower quartile will be the median of the lower half.
  WRITE: Lower quartile = 4 runs
4 THINK: The upper quartile will be the median of the upper half.
  WRITE: Upper quartile = 9 runs
5 THINK: The interquartile range will be the upper quartile minus the lower quartile.
  WRITE: Interquartile range = 9 − 4 = 5 runs
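The five steps for the interquartile range can be sketched as a short program. The code below assumes Python 3; the function names `quartiles` and `interquartile_range` are illustrative. It follows the median-of-halves method described in this section, leaving the median out of both halves when the number of scores is odd, and applies it to the home runs data and to the wages of businesses A and B.

```python
from statistics import median


def quartiles(scores):
    """Return (lower quartile, upper quartile) using the median of each half."""
    ordered = sorted(scores)
    half = len(ordered) // 2
    lower_half = ordered[:half]
    # With an odd number of scores the median is left out of both halves.
    upper_half = ordered[half + 1:] if len(ordered) % 2 else ordered[half:]
    return median(lower_half), median(upper_half)


def interquartile_range(scores):
    q1, q3 = quartiles(scores)
    return q3 - q1


home_runs = [12, 9, 4, 6, 5, 8, 9, 4, 10, 2]
print(quartiles(home_runs))            # (4, 9)
print(interquartile_range(home_runs))  # 5

business_a = [240, 240, 240, 245, 250, 250, 260, 800]
business_b = [180, 200, 240, 290, 350, 400, 500, 600]
print(interquartile_range(business_a))  # 15.0
print(interquartile_range(business_b))  # 230.0
```

Business A has the larger range ($560 against $420) but a much smaller interquartile range ($15 against $230), which matches the observation that the wages in business B are generally more spread. Other software may use a different quartile convention and give slightly different values.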


The interquartile range can also be calculated using a graphics calculator.

WORKED Example 15

The data below give the amount spent (to the nearest whole dollar) by each child in a group that was taken on an excursion to the Brisbane Exhibition.

15 12 17 23 21 19 16 11 17 18 23 24 25 21 20 37 17 25 22 21 19

Calculate the interquartile range for these data.

THINK
1 Enter the data.
  (a) Press STAT.
  (b) Select 1:Edit by pressing ENTER.
  (c) Enter the data in L1.
  Note: There is no need to organise the data into ascending order first.
2 Obtain the values of the quartiles.
  (a) Press STAT.
  (b) Select CALC. Make sure that 1-Var Stats is set up as Xlist: L1 and Freq: 1.
  (c) Select 1:1-Var Stats by pressing ENTER.
  (d) Type L1. Press ENTER.
3 A list of statistics appears. Locate the first and third quartiles. Scroll down the screen using the down arrow key.

WRITE
Q1 = 17 and Q3 = 23
So, IQR = $23 − $17 = $6

remember
1. Measures of dispersion are used to measure the spread of a set of scores.
2. The range is calculated by subtracting the lowest score from the highest score.
3. A single outlying score can enlarge the range. The interquartile range is therefore a better measure of dispersion.
4. The interquartile range is found by subtracting the lower quartile from the upper quartile.
5. The lower and upper quartiles are found by dividing the scores into two equal halves. The median of the lower half is the lower quartile and the median of the upper half is the upper quartile.
6. Remember to show units in your final answer.


10F Range and interquartile range

1 Copy and complete the following:
The range is a measure of __________ or __________ of a set of scores. It can be calculated by subtracting the __________ score from the __________ score. The value of the range can be affected by a single __________ score. For this reason, the __________ range is sometimes a better measure of the spread of the scores. It can be calculated as the difference between the __________ quartile and the __________ quartile. The __________ divides the scores in half; the lower quartile represents a score below which lies __________ of the scores; the upper quartile represents a score above which __________ of the scores lie. The lower quartile, median and upper quartile divide the distribution into __________ equal parts. In each of these parts there is the same number of __________.

2 Find the range of each of the following sets of data.
a 2, 5, 4, 5, 7, 4, 3
b 103, 108, 111, 102, 111, 107, 110
c 2.5, 2.8, 3.4, 2.7, 2.6, 2.4, 2.9, 2.6, 2.5, 2.8
d 3.20, 3.90, 4.25, 7.29, 1.45, 2.77, 8.39
e 45, 23, 7, 47, 76, 89, 96, 48, 87, 76, 66

3 Use the frequency distribution tables below to find the range for each of the following sets of scores.

a
Score  Frequency
1      2
2      6
3      12
4      10
5      7

b
Score  Frequency
38     23
39     46
40     52
41     62
42     42
43     45

c
Score  Frequency
89     12
90     25
91     36
92     34
93     11
94     9
95     4


4 For the grouped distributions below, state the range.

a
Class   Frequency
51–60   2
61–70   8
71–80   15
81–90   7
91–100  1

b
Class        Frequency
150 to <155  12
155 to <160  25
160 to <165  38
165 to <170  47
170 to <175  39
175 to <180  20

c
Class  Frequency
40–43  48
44–47  112
48–51  254
52–55  297
56–59  199
60–63  84

5 The scores below show the number of points scored by two AFL teams over the first 10 games of the season.

Sydney: 110 95 74 136 48 168 120 85 99 65
Collingwood: 125 112 89 111 96 113 85 90 87 92

a Calculate the range of the scores for each team.
b Based on the results above, which team would you say is the more consistent?

6 Two machines are used to put approximately 100 Smarties into boxes. A check is made on the operation of the two machines. Ten boxes filled by each machine have the number of Smarties in them counted. The results are shown below.

Machine A: 100, 99, 99, 101, 100, 101, 100, 100, 101, 108
Machine B: 98, 104, 96, 97, 103, 96, 102, 100, 97, 104

a What is the range in the number of Smarties from the first machine?
b What is the range in the number of Smarties from the second machine?
c Ralph is the quality control officer and he argues that machine A is more consistent in its distribution of Smarties. Explain why.

7 Find the median for each of the data sets below.
a 3, 4, 4, 5, 7, 9, 10
b 17, 20, 19, 25, 29, 27, 28, 25, 29
c 52, 55, 53, 53, 54, 55, 52, 53, 54, 52
d 12, 14, 15, 12, 14, 19, 17, 15, 18, 20
e 56, 75, 83, 47, 93, 35, 84, 83, 73, 20, 66, 90


8 For each of the data sets in question 7, calculate the interquartile range.

9 multiple choice
For the frequency table below, what is the range?

Score  Frequency
25     14
26     12
27     19
28     25
29     19

A 4
B 5
C 6
D 17

10 multiple choice
Calculate the interquartile range of the following data.
17, 18, 18, 19, 20, 21, 21, 23, 25
A 3
B 4
C 5
D 8

11 multiple choice
The interquartile range is considered to be a better measure of the variability of a set of scores than the range because it:
A takes into account more scores
B is the difference between the upper and lower quartiles
C is easier to calculate
D is not affected by extreme values.

12 multiple choice
The distribution below shows the ranges in the heights of 25 members of a football squad.

Height (cm)  Class centre  Frequency  Cumulative frequency
140–149      144.5         2          2
150–159      154.5         5          7
160–169      164.5         10         17
170–179      174.5         7          24
180–189      184.5         1          25

Which of the statements below is correct?
A The range of the distribution is 40.
B The range of the distribution is 49.
C The range of the distribution is 9.
D The range can be estimated only by using the cumulative frequency.


Standard deviation

We have already discussed using the range and the interquartile range as measures of the spread of a data set. However, the most commonly used measure of spread is the standard deviation. The standard deviation is a measure of how much a typical score in a data set differs from the mean.

Standard deviation from single scores

The standard deviation may be found by entering a set of scores into your calculator, just as you do when you are finding the mean. Your calculator will have a statistical function that gives the standard deviation.

There are two standard deviation functions on your calculator. The first, σn, is the population standard deviation. This function is used when the statistical analysis is conducted on the entire population.

When the statistical analysis is done using a sample of the population, a slightly different standard deviation function is used. Called the sample standard deviation, this value will be slightly higher than the population standard deviation.

The sample standard deviation will be found on your calculator using the σn − 1 or the sn function.

WORKED Example 16

Below are the scores out of 100 achieved by a class of 20 students on a science exam. Calculate the mean and the standard deviation.

87 69 95 73 88 47 95 63 91 66 59 70 67 83 71 57 82 65 84 69

1 THINK: Enter the data set into your calculator.
2 THINK: Retrieve the mean using the x̄ function.
  WRITE: x̄ = 74.05 marks
3 THINK: Retrieve the standard deviation using the σn function.
  WRITE: σn = 13.07 marks

Worked example 17

Ian surveys twenty Year 11 students and asks how much money they earn from part-time work each week. The results are given below.

$65 $82 $47 $78 $108 $94 $60 $79 $88 $91
$50 $73 $68 $95 $83 $76 $79 $72 $69 $97

Calculate the mean and standard deviation.

1 THINK: Enter the statistics into your calculator.
2 THINK: Retrieve the mean using the x̄ function.
  WRITE: x̄ = $77.70
3 THINK: Retrieve the standard deviation using the σn − 1 function, as a sample has been used.
  WRITE: σn − 1 = $15.56
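The answers to worked examples 16 and 17 can be checked without a calculator. The sketch below uses Python's standard `statistics` module, where `pstdev` is the population standard deviation and `stdev` is the sample standard deviation; the variable names are chosen here for illustration only.

```python
from statistics import mean, pstdev, stdev

# Worked example 16: all 20 students in the class (a whole population).
exam_scores = [87, 69, 95, 73, 88, 47, 95, 63, 91, 66,
               59, 70, 67, 83, 71, 57, 82, 65, 84, 69]
# Worked example 17: 20 surveyed students (a sample of Year 11).
weekly_earnings = [65, 82, 47, 78, 108, 94, 60, 79, 88, 91,
                   50, 73, 68, 95, 83, 76, 79, 72, 69, 97]

exam_mean = mean(exam_scores)
exam_sd = pstdev(exam_scores)          # population: divides by n
earnings_mean = mean(weekly_earnings)
earnings_sd = stdev(weekly_earnings)   # sample: divides by n - 1

print(round(exam_mean, 2), round(exam_sd, 2))          # 74.05 13.07
print(round(earnings_mean, 2), round(earnings_sd, 2))  # 77.7 15.56
```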


For most examples, you will need to read the question carefully to decide whether to use the population or the sample standard deviation.

Standard deviation using a graphics calculator

A graphics calculator can be used to determine a population or sample standard deviation. Using the TI–83, the population standard deviation is displayed under 1-Var Stats as σx and the sample standard deviation is shown as Sx.

Worked example 18

The price (in cents) per litre of petrol at a service station is recorded each Friday over a 15-week period and the data are given below.

76.2 80.1 79.8 84.3 80.7 78.3 82.4 81.3
80.5 78.2 79.5 80.1 81.3 84.2 83.4

Calculate the sample standard deviation for this set of data using a graphics calculator.

1 THINK: Enter the data as L1 in the graphics calculator.
2 THINK: Calculate the standard deviation.
  (a) Press STAT.
  (b) Highlight CALC.
  (c) Select 1–Var Stats.
  (d) Type L1 or the name of the list into which you have entered the data. A list of statistics is produced with the mean at the top. So, the standard deviation is given by Sx = 2.257 959 467, or Sx = 2.26 (correct to 2 decimal places).
  WRITE/DISPLAY: s = 2.257 959 467
  s = 2.26 cents/L

As mentioned previously, the sample standard deviation is slightly higher than the population standard deviation (compare the values for Sx and σx in the above example).

Standard deviation from a frequency distribution table

The standard deviation can also be calculated when the data are presented in table form. This is done by entering the data in the same way as they were when calculating the mean earlier in this chapter. A graphics calculator can also be used.
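When scores come with frequencies, each score simply counts as many times as its frequency says. A minimal sketch of that idea, using the spelling-test table from worked example 19 and the population formula (dividing by the total number of students), might look like this; the dictionary layout and names are illustrative choices, not part of the text.

```python
import math

# Spelling-test table from worked example 19: score -> frequency.
frequency_table = {4: 1, 5: 2, 6: 4, 7: 9, 8: 6, 9: 7, 10: 1}

n = sum(frequency_table.values())  # total number of students
total = sum(score * f for score, f in frequency_table.items())
mean = total / n
# Each squared deviation is counted once per student with that score.
squared_deviations = sum(f * (score - mean) ** 2
                         for score, f in frequency_table.items())
population_sd = math.sqrt(squared_deviations / n)

print(n, round(mean, 1), round(population_sd, 1))  # 30 7.4 1.4
```

This mirrors the Xlist and Freq set-up on a graphics calculator: one list of scores, one list of frequencies.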


Worked example 19

The table below shows the scores of a class of thirty Year 3 students on a spelling test. Calculate the mean and standard deviation.

Score | Frequency
4 | 1
5 | 2
6 | 4
7 | 9
8 | 6
9 | 7
10 | 1

1 THINK: Enter the data into your calculator using score × frequency.
2 THINK: Retrieve the mean by using the x̄ function.
  WRITE: x̄ = 7.4
3 THINK: Retrieve the standard deviation using the σn function, as the whole population is included in the statistics.
  WRITE: σn = 1.4

Graphics Calculator tip! Standard deviation

If you are using a graphics calculator to determine the mean and standard deviation in example 19, enter the scores as L1 and their corresponding frequencies as L2. Remember to set up 1-Var Stats as Xlist: L1 and Freq: L2. The mean will then be displayed as x̄ and the standard deviation as σx.

Once we have calculated the standard deviation we can make conclusions about the reliability and consistency of the data set. The lower the standard deviation, the less spread out the data set is. By using the standard deviation we can determine whether a set of scores is more or less consistent (or reliable) than another set. The standard deviation is the best measure of this because, unlike the range or interquartile range as a measure of dispersion, the standard deviation considers the distance of every score from the mean.

A higher standard deviation means that scores are less clustered around the mean and less dependable. For example, consider the following two students’ results over a number of assessment pieces:

Student A: x̄ = 60, σn = 5
Student B: x̄ = 60, σn = 15


Both students have the same mean. However, student A has a standard deviation of 5 and student B has a standard deviation of 15. Student A is far more consistent and can confidently be expected to score around 60 in any future exam. Student B is more inconsistent but is probably capable of scoring a mark higher than student A’s.

Worked example 20

Two brands of light globe are tested to see how long they will burn (in hours).

Brand X: 850 950 1400 875 1200 1150 1000 900 850 825
Brand Y: 975 1100 1050 1000 975 950 1075 1025 950 900

Which of the two brands of light globe is more reliable?

1 THINK: Enter both sets of data into your calculator.
2 THINK: Choose the sample standard deviation because a sample of each light globe brand has been chosen.
3 THINK: Write down the sample standard deviation for each brand.
  WRITE: Brand X: sample standard deviation = 190.4 h
  Brand Y: sample standard deviation = 62.4 h
4 THINK: The brand with the lower standard deviation is the more reliable.
  WRITE: Brand Y is the more reliable as it has a lower standard deviation.
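The comparison in worked example 20 amounts to computing one spread value per brand and picking the smaller. A short sketch of that decision rule, with names chosen here purely for illustration:

```python
from statistics import stdev

brand_x = [850, 950, 1400, 875, 1200, 1150, 1000, 900, 850, 825]
brand_y = [975, 1100, 1050, 1000, 975, 950, 1075, 1025, 950, 900]

# Each list is a sample of the brand, so use the sample standard deviation.
spread = {"Brand X": stdev(brand_x), "Brand Y": stdev(brand_y)}
# The more reliable brand is the one with the lowest spread.
more_reliable = min(spread, key=spread.get)

for brand, sd in spread.items():
    print(brand, round(sd, 1))  # Brand X 190.4, then Brand Y 62.4
print(more_reliable)            # Brand Y
```

Both brands happen to have a mean life of 1000 hours here, so the standard deviation alone decides the comparison.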

Remember

1. The standard deviation is a measure of the spread of a data set.
2. Standard deviation is found on your calculator by entering the data set using the calculator’s statistical mode.
3. The population standard deviation is used when an entire population is considered in the statistical analysis and can be found on the calculator using the σn function (or σx on the graphics calculator).
4. The sample standard deviation is used when a sample of the population is used in the analysis and can be found using the σn − 1 function (Sx on the graphics calculator).
5. A set of data is considered to be more consistent or reliable if it has a low standard deviation.


10G Standard deviation

1 Copy and complete the following:
Standard deviation is a measure of the __________ of the scores. It represents how much a typical score differs from the __________ of the data set. The statistical function used on the calculator for a population standard deviation is __________, while for a sample of the population, the function used is __________. A low value for the standard deviation indicates that the set of scores is more __________ while a high value indicates __________ consistency or reliability.

2 For each of the sets of scores below, calculate the standard deviation. Assume that the scores represent an entire population and answer correct to 2 decimal places.
a 3, 5, 8, 2, 7, 1, 6, 5
b 11, 8, 7, 12, 10, 11, 14
c 25, 15, 78, 35, 56, 41, 17, 24
d 5.2, 4.7, 5.1, 12.6, 4.8
e 114, 12, 3.6, 42.8, 0.5

3 For each of the sets of scores below, calculate the sample standard deviation, correct to 2 decimal places.
a 25, 36, 75, 85, 6, 49, 77, 80, 37, 66
b 4.8, 9.3, 7.1, 9.9, 7.0, 4.1, 6.2
c 112, 25, 56, 81, 0, 5, 178, 99, 41
d 0.3, 0.3, 0.3, 0.4, 0.5, 0.6, 0.8, 0.8, 0.8, 0.9, 1.0
e 56, 1, 258, 45, 23, 58, 48, 35, 246

4 For each of the following, state whether it is appropriate to use the population standard deviation or the sample standard deviation.
a A quality control officer tests the life of 50 batteries from a batch of 1000.
b The weight of every bag of potatoes is checked and recorded before being sold.
c The number of people who attend every football match over a season is analysed.
d A survey of 100 homes records the number of cars in each household.
e The score of every Year 11 student in mathematics is recorded.

5 The band ‘Aquatron’ is to release a new CD. The recording company needs to predict the number of copies that will be sold at various music stores throughout Australia. To do so, a sample of 10 music stores supplied information about the sales of the previous CD released by Aquatron, as shown below.

580 695 547 236 458 620 872 364 587 1207

a Calculate the mean number of sales at each store.
b Should the population or sample standard deviation be used in this case?
c What is the value of the appropriate standard deviation?


6 A supermarket chain is analysing its sales over a week. The chain has 15 stores and the sales for each store for the past week were (in $million):

1.5 2.1 2.4 1.8 1.1 0.8 0.9 1.1 1.4 1.6 2.0 0.7 1.2 1.7 1.3

a Calculate the mean sales for the week.
b Should the population or sample standard deviation be used in this case?
c What is the value of the appropriate standard deviation?

7 Use the statistical function on your calculator to find the mean and standard deviation (correct to 1 decimal place) for the information presented in the following tables. In each case, use the population standard deviation.

a
Score | Frequency
3 | 12
4 | 24
5 | 47
6 | 21
7 | 7

b
Score | Frequency
45 | 1
46 | 16
47 | 39
48 | 61
49 | 52
50 | 36

c
Score | Frequency
75 | 22
76 | 17
77 | 8
78 | 10
79 | 12
80 | 21
81 | 29

8 Copy and complete the class centre column for each of the following distributions and hence use your calculator to find an estimate for the mean and standard deviation (correct to 2 decimal places). In each case use the population standard deviation.

a
Class | Class centre | Frequency
10–12 |  | 12
13–15 |  | 16
16–18 |  | 25
19–21 |  | 28
22–24 |  | 13

b
Class | Class centre | Frequency
31–40 |  | 15
41–50 |  | 28
51–60 |  | 36
61–70 |  | 19
71–80 |  | 8
81–90 |  | 7
91–100 |  | 2

c
Class | Class centre | Frequency
0–4 |  | 15
5–9 |  | 24
10–14 |  | 31
15–19 |  | 33
20–24 |  | 29
25–29 |  | 17


9 Below are the marks achieved by two students in five tests.

Brianna: 75, 80, 70, 72, 78
Katie: 50, 95, 90, 80, 55

a Calculate the mean and standard deviation for each student.
b Which of the two students is more consistent? Explain your answer.

10 Multiple choice
From Year 11, 21 students are chosen to complete a test. The scores are shown in the table below.

Class | Frequency
10 to <20 | 1
20 to <30 | 6
30 to <40 | 9
40 to <50 | 4
50 to <60 | 1

When preparing an analysis of the typical performance of Year 11 students on the test, the standard deviation used is:
A 9.209   B 9.437   C 21   D 34.048

11 Multiple choice
The results below are Ian’s marks in four exams for each subject that he studies.

English: 63 85 78 50
Maths: 69 71 32 97
Biology: 45 52 60 41
Geography: 65 78 59 61

In which subject does Ian achieve the most consistent results?
A English   B Maths   C Biology   D Geography

12 The following frequency distribution gives the prices paid by a car wrecking yard for a sample of 40 car wrecks.

Price ($) | Frequency
0 to <500 | 2
500 to <1000 | 4
1000 to <1500 | 8
1500 to <2000 | 10
2000 to <2500 | 7
2500 to <3000 | 6
3000 to <3500 | 3

Find the mean and standard deviation of the price paid for these wrecks.


13 The table below shows the life of a sample of 175 household light globes.

Life (hours) | Frequency
200 to <250 | 2
250 to <300 | 5
300 to <350 | 12
350 to <400 | 25
400 to <450 | 42
450 to <500 | 38
500 to <550 | 26
550 to <600 | 15
600 to <650 | 7
650 to <700 | 3

a Find the range of the data.
b Use the class centres to find the mean and standard deviation in the lifetimes of this sample of light globes.

14 Crunch and Crinkle are two brands of potato crisps. Each is sold in packets nominally of the same size and for the same price. Upon investigation of a sample of packets of each, it is found that Crunch and Crinkle packets have the same mean mass (25 g). The standard deviation of the masses of Crunch packets is, however, 5 g and the standard deviation of the masses of Crinkle packets is 2 g. Which brand do you think represents better value for money under these circumstances? Why?

WorkSHEET 10.2


For the set of scores 23, 45, 24, 19, 22, 16, 16, 27, 20, 21, find:
1 the mean
2 the median
3 the mode
4 the range
5 the lower quartile
6 the upper quartile
7 the interquartile range
8 the population standard deviation.
9 Which measure of central tendency is the best measure of location in this data set?
10 Explain why the interquartile range is a better measure of spread than the range.

Investigation: Displaying statistical data and statistical graphs

1 As a class, collect information on:
  a the number of people that live in each student’s household
  b the number of pets in each student’s household.
2 Use your graphics calculator to enter the data as two separate lists, L1 and L2.
3 Use the statistics function on the calculator to find the following information for each data set.
  a mean
  b median
  c minimum value
  d maximum value
  e lower quartile
  f upper quartile
4 Use the statistical plotting function on your calculator to draw a boxplot of the data you have entered.


Investigation: Archie’s new skull site

Having considered some of the statistics on skull measurements from 4000 BC and AD 150, we are now in a position to collate all the figures from these two time periods and compare them with the statistics from Archie’s new discovery. We must approach this task in a methodical manner.

Step 1
Organise the statistics we already know. Copy the table below and fill in any data already calculated for 4000 BC and AD 150.

Statistic | Measurement | 4000 BC | AD 150 | New site
Mean | Breadth |  |  |
 | Height |  |  |
 | Length |  |  |
Median | Breadth |  |  |
 | Height |  |  |
 | Length |  |  |
Mode | Breadth |  |  |
 | Height |  |  |
 | Length |  |  |
Standard deviation | Breadth |  |  |
 | Height |  |  |
 | Length |  |  |
Range | Breadth |  |  |
 | Height |  |  |
 | Length |  |  |

Step 2
Calculate and fill in any statistics missing for 4000 BC and AD 150 (that is, standard deviation and range values).

Step 3
Consider the data Archie has collated from measurements on the skulls from his new site.


Breadth | Height | Length | Breadth | Height | Length
124 | 138 | 101 | 131 | 128 | 98
133 | 134 | 97 | 138 | 129 | 107
138 | 134 | 98 | 123 | 131 | 101
148 | 129 | 104 | 130 | 129 | 105
126 | 124 | 95 | 134 | 130 | 93
135 | 136 | 98 | 137 | 136 | 106
132 | 145 | 100 | 126 | 131 | 100
133 | 130 | 102 | 135 | 136 | 97
131 | 134 | 96 | 129 | 126 | 91
133 | 125 | 94 | 134 | 139 | 101
133 | 136 | 103 | 131 | 134 | 90
131 | 139 | 98 | 132 | 130 | 104
131 | 136 | 99 | 130 | 132 | 93
138 | 134 | 98 | 135 | 132 | 98
130 | 136 | 104 | 130 | 128 | 101

Calculate the mean, median, mode, standard deviation and range for the data from the new site. Add these to the table.

Step 4
Compare the values obtained for Archie’s new site with those for 4000 BC and AD 150.

Step 5
Now comes the decision phase. To which era does the new site appear closer? Is there consistency here?

Step 6
The final phase requires a report of the findings. In doing so, you must remember that any conclusions drawn must be backed by providing substantial evidence.

Write a paragraph advising Archie of the results of this study. Describe the changes you have observed in the shape of the skulls from 4000 BC to AD 150 and indicate which time period you feel his new data most closely matches. Back your recommendation with ample statistical evidence.

These types of project are undertaken every day in a variety of situations. It is vital that we realise the importance of reporting statistical information accurately and in an unbiased manner.


Comparing sets of data

Back-to-back stem-and-leaf plots

Some of the most useful and interesting statistical investigations involve the comparison of two sets of data. In the previous chapter, we drew stem-and-leaf plots of single sets of data. We shall now consider using such plots to compare two data sets.

Back-to-back stem-and-leaf plots are useful to compare the distribution of two similar sets of data. This is particularly useful in the situation of controlled experiments. The two sets of data use the same central stem. One set of leaves is set to the right of the stem and the other to the left. Care must be taken when arranging the data of the left set. Place the smallest numeral closest to the central margin, then range outwards as the data size increases. The key generally relates to data which are presented on the right of the plot.

An example of such a plot is presented below. The data show the lifetimes of a sample of 40 batteries of each of two brands when fitted into a standard children’s toy. Some of the toys are fitted with an ordinary brand battery and some with Brand X. Which brand is better?

Key: 6 | 9 = 69 hours

Ordinary brand (leaf) | Stem | Brand X (leaf)
8 6 2 0 0 | 6 | 9
9 9 9 8 8 6 4 0 | 7 | 3 5
8 8 7 5 3 1 1 1 0 | 8 | 2 4 8
9 6 6 4 2 2 2 0 0 | 9 | 0 1 4 5 5 9
8 7 5 3 1 1 1 | 10 | 0 0 2 5 8 8 9 9
4 2 | 11 | 0 0 1 1 3 3 6 7 9
 | 12 | 1 4 6 6 6 7 8 8
 | 13 | 3 5
 | 14 | 6

The spread of each set of data can be seen graphically from the stem-and-leaf plot. In this case it can be seen that, although Brand X showed a little more variability than the ordinary brand, it generally gave a longer lasting performance.

Side-by-side or parallel boxplots

Two or more sets of data may be compared by using side-by-side boxplots. The boxplots share a common scale. Numerical comparisons can be made between the sets of data based upon the size and position of the range, interquartile range and median. This is a strong feature of a boxplot.

In general, a histogram or stem-and-leaf plot is better than a boxplot at giving the reader information about the distribution of a set of scores (because boxplots do not show individual scores), but boxplots have greater scope for making quantitative comparisons. In the case of the battery test data above, the following side-by-side boxplot would result. (Quartiles and medians are found in the usual way.)


[Figure: side-by-side boxplots of Brand X and the ordinary brand on a common scale from 50 to 150 hours]

We can make the following comparisons between the sets of data:
1. Brand X showed more variability in its performance (that is, its lifetime) than the ordinary brand. (Brand X range = 77, ordinary brand range = 54; Brand X interquartile range = 27.5, ordinary brand interquartile range = 19.)
2. The longest lifetime recorded was that of a Brand X battery (146).
3. The shortest lifetime recorded was that of an ordinary brand battery (60).
4. Brand X batteries’ median lifetime was better than that of the ordinary brand (Brand X median = 109.5, ordinary brand median = 87.5).
5. Over one-quarter of Brand X batteries were better performers than the best ordinary brand battery; that is, had longer lifetimes than the longest of the ordinary brand batteries’ lifetimes. (Remember that the four sections of a boxplot each represent one-quarter of the scores.)
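The figures quoted in these comparisons can be reproduced from the battery stem-and-leaf data. The sketch below assumes the "usual way" of finding quartiles means taking the medians of the lower and upper halves of the ordered scores; the helper names are hypothetical, and the leaves are typed in ascending order for each stem.

```python
from statistics import median

def five_number_summary(scores):
    """Return (min, Q1, median, Q3, max), taking the quartiles as the
    medians of the lower and upper halves of the ordered scores."""
    ordered = sorted(scores)
    half = len(ordered) // 2
    lower = ordered[:half]    # scores below the median position
    upper = ordered[-half:]   # scores above the median position
    return (ordered[0], median(lower), median(ordered),
            median(upper), ordered[-1])

def from_stem_plot(rows):
    """Turn {stem: 'leaf digits'} into a list of scores."""
    return [10 * stem + int(leaf) for stem, leaves in rows.items()
            for leaf in leaves]

ordinary = from_stem_plot({6: "00268", 7: "04688999", 8: "011135788",
                           9: "002224669", 10: "1113578", 11: "24"})
brand_x = from_stem_plot({6: "9", 7: "35", 8: "248", 9: "014559",
                          10: "00258899", 11: "001133679",
                          12: "14666788", 13: "35", 14: "6"})

print(five_number_summary(ordinary))  # (60, 78.5, 87.5, 97.5, 114)
print(five_number_summary(brand_x))   # (69, 95.0, 109.5, 122.5, 146)
```

Subtracting the first entry from the last gives the ranges (54 and 77), and subtracting the second from the fourth gives the interquartile ranges (19 and 27.5).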

Worked example 21

The stem-and-leaf plot below shows the weights of two samples of chickens 3 months after hatching. One group of chickens (Group A) had been given a special growth hormone. The other group (Group B) was kept under identical conditions but was not given the hormone. Prepare side-by-side boxplots of the data and draw conclusions about the effectiveness of the growth hormone.

Key: 0* | 8 = 0.8 kg
     1 | 3 = 1.3 kg

Group B (leaf) | Stem | Group A (leaf)
4 4 4 | 0 |
9 8 8 7 7 5 5 | 0* | 8
4 4 3 0 0 0 | 1 | 3
 | 1* | 5 7 7 9
 | 2 | 0 0 0 1 1 3 3
 | 2* | 5 8 8

1 THINK: First locate the medians of each group. There are 16 observations in each group. The median of each group is the $\frac{16 + 1}{2}$th score, that is, the 8.5th score, or between the 8th and 9th scores.
2 THINK: The median divides each group into two halves; the quartiles are the medians of the upper and lower halves. There are 8 scores in each half. The position of the quartiles is given by the $\frac{8 + 1}{2}$th score, that is, the 4.5th score, halfway between the 4th and 5th scores in each half.


3 THINK: Find the quartiles and medians on the stem-and-leaf plot. Be careful to count from the centre out with each set of data.
  WRITE: Group A: Q1 = 1.7, median = 2.0, Q3 = 2.3
  Group B: Q1 = 0.5, median = 0.8, Q3 = 1.0
4 THINK: Write a five-number summary for each group.
  WRITE: Group A: 0.8, 1.7, 2.0, 2.3, 2.8
  Group B: 0.4, 0.5, 0.8, 1.0, 1.4
5 THINK: Draw the boxplots using a common scale.
  WRITE: [Figure: side-by-side boxplots of Group A and Group B on a common scale from 0 to 3.0 kg]
6 THINK: Compare the data. Consider central score, highest and lowest scores, variability in scores, etc.
  WRITE:
  • The biggest of all chickens was from Group A (hormone group).
  • The smallest of all chickens was from Group B (no hormone).
  • The Group A data showed a little more variability (Group A interquartile range = 0.6, Group B interquartile range = 0.5).
  • The median size of chickens in Group A was larger (Group A median = 2, Group B median = 0.8).
  • Over three-quarters of the Group A chickens were bigger than all of the Group B chickens!
  Conclusion: The growth hormone proved to be effective.

The graphics calculator can also be used to display parallel boxplots.
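Two steps of worked example 21 lend themselves to a quick numerical check: locating the median at position (n + 1)/2, and the claim that over three-quarters of Group A outweigh every Group B chicken. The sketch below reads the weights off the stem-and-leaf plot; `score_at` is a hypothetical helper written for this illustration.

```python
group_a = [0.8, 1.3, 1.5, 1.7, 1.7, 1.9, 2.0, 2.0,
           2.0, 2.1, 2.1, 2.3, 2.3, 2.5, 2.8, 2.8]  # hormone
group_b = [0.4, 0.4, 0.4, 0.5, 0.5, 0.7, 0.7, 0.8,
           0.8, 0.9, 1.0, 1.0, 1.0, 1.3, 1.4, 1.4]  # no hormone

def score_at(ordered, position):
    """Score at a 1-based position; a position such as 8.5 means
    halfway between the 8th and 9th scores."""
    below = ordered[int(position) - 1]
    if position == int(position):
        return below
    return (below + ordered[int(position)]) / 2

n = len(group_a)
median_position = (n + 1) / 2  # 8.5 for 16 scores
median_a = score_at(sorted(group_a), median_position)
median_b = score_at(sorted(group_b), median_position)

# Count Group A chickens heavier than the heaviest Group B chicken.
heavier = sum(1 for w in group_a if w > max(group_b))

print(median_position, median_a, median_b)  # 8.5 2.0 0.8
print(heavier, "of", n)                     # 14 of 16
```

Fourteen of sixteen is 87.5%, which agrees with the boxplot reading that Group A's lower quartile (1.7 kg) sits above Group B's maximum (1.4 kg).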


Worked example 22

The four Year 11 Maths A classes at Western Secondary College complete the same end-of-year maths test. The marks, expressed as percentages for each of the students in the four classes, are given below.

11A | 11B | 11C | 11D
40 | 60 | 50 | 40
43 | 62 | 51 | 42
45 | 63 | 53 | 43
47 | 64 | 55 | 45
50 | 70 | 57 | 50
52 | 73 | 60 | 53
53 | 74 | 63 | 55
54 | 76 | 65 | 59
57 | 77 | 67 | 60
60 | 77 | 69 | 61
63 | 78 | 70 | 69
63 | 82 | 72 | 73
63 | 85 | 73 | 74
68 | 87 | 74 | 75
70 | 89 | 76 | 80
75 | 90 | 80 | 81
80 | 92 | 82 | 82
85 | 95 | 82 | 83
89 | 97 | 85 | 84
90 | 97 | 89 | 90

Display the data using a parallel boxplot and use this to describe any similarities or differences in the distributions of the marks among the four classes.

1 THINK: Create the first boxplot (for class 11A) on a graphics calculator using 2nd [STAT PLOT] and appropriate WINDOW settings. Using TRACE to show key values, sketch the first boxplot using pen and paper, leaving room for three additional plots.
2 THINK: Repeat step 1 for the other three classes.
3 THINK: All four boxplots share the common scale.
  WRITE/DISPLAY: [Figure: parallel boxplots for classes 11A, 11B, 11C and 11D on a common scale, Maths mark (%), from 30 to 100]


4 THINK: Describe the similarities and differences between the four distributions.
  WRITE: Class 11B had the highest median mark and the range of the distribution was only 37. The lowest mark in 11B was 60.

  We notice that the median of 11A’s marks is approximately 60. So, 50% of students in 11A received less than 60. This means that half of 11A had scores that were less than the lowest score in 11B.

  The range of marks in 11A was about the same as that of 11D with the highest scores in each about equal, and the lowest scores in each about equal. However, the median mark in 11D was higher than the median mark in 11A so, despite a similar range, more students in 11D received a higher mark than in 11A.

  While 11D had a top score that was higher than that of 11C, the median score in 11C was higher than that of 11D and the bottom 25% of scores in 11D were less than the lowest score in 11C. In summary, 11B did best, followed by 11C then 11D and finally 11A.

It is important to remember that, in a boxplot, each section represents one-quarter of the scores. If one section of a boxplot is longer compared with another section of the same boxplot, it can be interpreted that the scores are more spread out in that section, not that the longer section contains a greater number of scores.
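The ranking of the four classes in worked example 22 rests on their medians and ranges, which can be computed directly from the marks. The sketch below assumes each class's 20 marks are the values listed under its heading in the example's table; the dictionary and variable names are illustrative.

```python
from statistics import median

marks = {
    "11A": [40, 43, 45, 47, 50, 52, 53, 54, 57, 60,
            63, 63, 63, 68, 70, 75, 80, 85, 89, 90],
    "11B": [60, 62, 63, 64, 70, 73, 74, 76, 77, 77,
            78, 82, 85, 87, 89, 90, 92, 95, 97, 97],
    "11C": [50, 51, 53, 55, 57, 60, 63, 65, 67, 69,
            70, 72, 73, 74, 76, 80, 82, 82, 85, 89],
    "11D": [40, 42, 43, 45, 50, 53, 55, 59, 60, 61,
            69, 73, 74, 75, 80, 81, 82, 83, 84, 90],
}

# For each class: (median mark, range of marks).
summary = {name: (median(m), max(m) - min(m)) for name, m in marks.items()}
# Order the classes from highest to lowest median.
ranking = sorted(summary, key=lambda name: summary[name][0], reverse=True)

for name in ranking:
    print(name, summary[name])
# 11B (77.5, 37)
# 11C (69.5, 39)
# 11D (65.0, 50)
# 11A (61.5, 50)
```

The exact median for 11A is 61.5, which the worked example reads from the boxplot as approximately 60.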

Remember

1. Back-to-back stem plots are useful for comparing distributions of two similar sets of data. When completing back-to-back stem plots:
   (a) use a common stem
   (b) distinguish between the two sets of data by labelling them clearly
   (c) the key generally relates to data on the right-hand side of the central stem
   (d) when organising the data to the left of the central stem, the smallest piece of data goes closest to the central stem, then outwards as the data increases.
2. Side-by-side or parallel boxplots:
   (a) share a common scale
   (b) allow us to make quantitative comparisons between sets of data, based upon the size and position of the range, quartiles, interquartile range and medians
   (c) allow us to compare more than two sets of data.


10H Comparing sets of data

1 The boxplots below show the results of testing two brands of transistor by applying increasing voltage.

[Figure: boxplots for Brand A and Brand B on a common scale from 5 to 30 volts]

a Which brand had the transistor which could withstand the highest voltage?
b Which brand had the transistor which could withstand the least voltage?
c Which brand gave the better median performance?
d Which brand showed most variability in terms of range?
e Which brand showed most variability in terms of interquartile range?
f Which brand would you recommend to a manufacturer of electronic equipment if they requested a transistor that reliably worked at its expected voltage?
g Which brand would you recommend to a manufacturer of electronic equipment if they requested a transistor that was likely to withstand higher voltages?

2 The boxplots below were drawn by a teacher who was trying to assess the effects of allowing students the privilege of an ‘open book’ exam. Plot A gives information about the results of students who were allowed to use their textbook in an end-of-unit test. Plot B gives information about the results of students who did the test without the aid of the textbook.

[Figure: boxplots for Plot A (Text) and Plot B (No Text) on a common scale from 10 to 100, Test result]

a Which group had the student who gained the best test result?
b Which group did best on median result?
c Compare the variability in the performance of each group.
d Does the use of the textbook get a better test performance? Explain.
e What other things need to be taken into account when drawing these conclusions?

3 Draw side-by-side boxplots for the following pair of five-number summaries.

Group X: 14, 18.5, 21.5, 27.5, 33
Group Y: 11, 17.5, 21, 26.5, 35

4 The following stem-and-leaf plots give the age at marriage of a group of 10 women and a group of 10 men.

Key: 1 | 8 = 18 years old

Men (leaf) | Stem | Women (leaf)
8 7 | 1 | 8 8
9 8 7 5 1 | 2 | 0 2 3 4 4 5
6 3 | 3 | 0 1
0 | 4 |

a Draw side-by-side boxplots of the data.
b Make comparisons about the distribution of the sets of data.


5 The number of words in each of the first 12 sentences is counted in each of 3 different types of book: a children’s book, a Year 12 geography text, and a major daily newspaper. The results are as follows:

Children’s book: 6 8 12 15 6 8 10 8 5 11 10 8
Geography text: 16 18 25 13 10 25 29 18 7 22 28 22
Newspaper: 12 6 8 14 18 7 12 10 21 17 16 8

a Draw side-by-side boxplots of the data.
b Make comparisons about the sentence length of each type of publication. Use statistics in your answer.

6 The stem-and-leaf plot below gives the batting scores of two cricket players, Smith and Jones, who share the responsibility of ‘opening the batting’ for their side.

Key: 1 | 2 = 12

a Derive a five-number summary for each player.
b Draw side-by-side boxplots of the data.
c Make comparisons between the two sets of data. Use statistics in your answer.
d Which player do you consider to be the best ‘opening bat’ and why?

Questions 7 and 8 refer to the following stem-and-leaf plot.
Key: 12 | 2 = 122

7 Multiple choice
The lower quartile of Group B is:

8 Multiple choice
Which of the following statements is untrue?
A Data from Group A show less consistency than the data from Group B.
B Data from Group B have a lower interquartile range.
C Group B has a greater median.
D Group A shows a greater amount of variability.
E None of the above. (All of the statements are true.)

Jones SmithLeaf

3

8 7 4 29 9 8 7 7 5

8 4 4 2 05 2 0

1

Stem012345678

Leaf0 0 12 6 9

6 6 87 8 8 9 90 4 62 45

Group B Group ALeaf

68 5 4 2 2

8 5 5 3 0 07 4 4 1 0

1 1

Stem12131415161718

Leaf23 80 4 4 62 3 5 7 82 4 4 52 61

A 156.5 B 144 C 155 D 152 E none of the above.


Questions 9 and 10 refer to the boxplots below.

[Figure: boxplots for Group X and Group Y on a common scale from 15 to 40]

9 Multiple choice
Which of the following statements is a correct comparison of the data?
A Group X has a higher median and shows more variability than Group Y.
B Group X has a lower median and shows more variability than Group Y.
C Group X has a higher median and shows less variability than Group Y.
D Group X has a lower median and shows less variability than Group Y.
E It is impossible to make comparisons like this without seeing the data displayed on a stem-and-leaf plot.

10 Multiple choice
Which of the following statements is untrue of the boxplots?
A One-quarter of all Group X data is greater than any of Group Y data.
B The median of Group X is 25.
C The interquartile range of Group X is 25.
D The range of Group Y is 9.
E None of the above. (All the statements are true.)

11 A packing machine is meant to pack sacks of flour in 20.0 kg weights. A quality control manager notices that the machine appears to be too generous in the amount that it is putting into each sack. After checking a sample of sacks for weight he adjusts the machine. After some time he selects a second sample of sacks and checks their weights. The results are detailed on the stem-and-leaf plot below.

Key: 20 | 3 = 20.3 kg
     20* | 5 = 20.5 kg

After adjustment (leaf) | Stem | Before adjustment (leaf)
4 3 | 19 |
9 8 8 7 7 6 5 5 | 19* |
4 3 1 1 0 0 | 20 | 3 4 4
9 8 6 | 20* | 5 5 7 8 8 8 9
4 3 0 | 21 | 0 0 1 1 2 3 3 3 4
6 6 | 21* | 5 5 6 7 7 8
2 | 22 |

a Derive a five-number summary for each set of data.
b Draw side-by-side boxplots of the data.
c Compare the performance of the machine before and after it was adjusted. Use statistics in your answer.
d Should the manager have adjusted the machine? Why (not)?

12 A new spray treatment has been developed to improve the budding of apple trees. Twenty-five trees are subjected to the spray treatment while 25 others are kept as a control. The number of apples that form on each tree is recorded below.

Group A (sprayed)
35 52 71 21 34 42 76 45 48 32
29 85 73 28 34 59 52 56 27 29
33 38 54 42 51


Group B (unsprayed)
44 42 55 41 39 68 63 62 58 51
43 47 49 45 40 52 56 50 71 39
35 37 38 52 58

a Detail the data on a back-to-back stem-and-leaf plot. Use a class size of 10.
b Prepare side-by-side boxplots of the data.
c Make comparisons of the data. Use statistics in your answer.
d Comment on the overall effect of the spray. Would you recommend that orchardists use the spray?

13 Twenty different flashlight bulbs of each of two brands are tested until they burn out. The lifetime of each (in hours) is recorded below.

Glow-worm
23 45 31 38 39 41 48 47 54 23
28 35 42 49 50 41 52 48 27 35

Starlet
28 16 24 36 47 18 59 32 64 68
72 35 46 72 54 31 29 36 55 43

a Detail the data on a back-to-back stem-and-leaf plot. Use a class size of 5.
b Prepare side-by-side boxplots of the data.
c Make comparisons of the data. Use statistics in your answer.
d Which brand would you recommend as the better? Why?

Drug test analysis

A new drug for the relief of cold symptoms has been developed. To test the drug, 40 people were exposed to a cold virus. Twenty patients were then given a dose of the drug while another 20 patients were given a placebo. (In medical tests a control group is often given a ‘placebo’ drug. The subjects in this group believe that they have been given the real drug but in fact their dose contains no drug at all.) All participants were then asked to indicate the time when they first felt relief of symptoms. The number of hours from the time the dose was administered to the time when the patients first felt relief of symptoms are detailed below.

Group A (drug)
25 29 32 45 18 21 37 42 62 13
42 38 44 42 35 47 62 17 34 32

Group B (placebo)
25 17 35 42 35 28 20 32 38 35
34 32 25 18 22 28 21 24 32 36

1 Detail the data on a back-to-back stem-and-leaf plot.
2 Prepare side-by-side boxplots of the data.
3 Make comparisons of the data. Use statistics in your answer.
4 Does the drug work? Justify your answer.
5 What other considerations should be taken into account when trying to draw conclusions from an experiment of this type?


Using a spreadsheet or graphics calculator to obtain summary statistics

Task 1 Using a spreadsheet

Consider the data below for the average daily sales for the fast food outlets McDonald’s, KFC and Pizza Hut.

1 Set up the spreadsheet as indicated, entering the sales figures as numeric values, then formatting them to ‘currency’ with zero decimal places.

2 The Excel formula that calculates the mean is the average formula. Its format is =AVERAGE(range). In cell B12 enter the formula =AVERAGE(B4:B10) to calculate the mean daily sales for McDonald’s.

3 Copy this formula across to cells C12 and D12 to calculate the mean daily sales for KFC and Pizza Hut.

4 The formula for standard deviation is =STDEV(range). Enter the appropriate formula into cell B13, then copy it across to cells C13 and D13.

5 Consider the mean and standard deviation values for the three companies. Whose sales are the best? Which company experiences the most consistent sales throughout the week?


6 Select the range A4 to D10. Enter the Sort facility of the Data option and sort the data in Column B (McDonald’s data) into ascending order. From this sorted list, determine the five-number summary data for McDonald’s.

7 Sort the data in KFC sales and determine the five-number summary figures for KFC.

8 Similarly, determine the five-number summary figures for Pizza Hut.

9 The spreadsheet does not offer the facility of graphing a boxplot. From your data collected above, draw parallel boxplots on the same scale, then compare the performances of the three companies.

10 Write a paragraph reporting the results of your findings. Support your conclusions by specific reference to your spreadsheet and boxplots.
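The spreadsheet steps above have close counterparts in most programming languages. As a rough sketch in Python (not part of the original task), `statistics.mean` plays the role of =AVERAGE and `statistics.stdev` the role of =STDEV, which gives a sample standard deviation. The sales table is not reproduced here, so the list below borrows the Glow-worm bulb lifetimes from question 13 purely as stand-in data; the variable names are our own.

```python
import statistics

# Stand-in data: Glow-worm bulb lifetimes (hours) from question 13,
# used only because the sales table is not reproduced here.
lifetimes = [23, 45, 31, 38, 39, 41, 48, 47, 54, 23,
             28, 35, 42, 49, 50, 41, 52, 48, 27, 35]

mean_value = statistics.mean(lifetimes)   # counterpart of =AVERAGE(range)
sample_sd = statistics.stdev(lifetimes)   # counterpart of =STDEV(range), a sample SD
ordered = sorted(lifetimes)               # counterpart of sorting a column ascending

print(f"mean = {mean_value:.1f}")
print(f"sample standard deviation = {sample_sd:.2f}")
print("lowest =", ordered[0], "highest =", ordered[-1])
```

Sorting first, as in steps 6 to 8, makes the lowest score, the highest score and the middle scores easy to read off the list.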

Task 2 Using a graphics calculator

1 Using the average daily sales figures for McDonald’s, KFC and Pizza Hut indicated in the spreadsheet in Task 1, enter these figures into your graphics calculator as three separate lists, L1, L2 and L3.

2 Use the statistical function on the calculator to determine the following for each set of data:
a mean
b standard deviation
c five-number summary data.

3 Use the statistical plotting function on your calculator to draw parallel boxplots of the three sets of data.

4 Compare these results with those you obtained on the spreadsheet.

Concluding the Egyptian skulls study

In a previous investigation, you made recommendations to Archie regarding the male Egyptian skulls he discovered at a new site. Your conclusions were based on numerical calculations of mean, median, mode, standard deviation and range. It is now appropriate to check whether the conclusions drawn at that stage would be consistent with those which might be made on the basis of graphical comparisons.

We will now compare five-number summaries of the new site data with those of the 4000 BC and AD 150 measurements. Values for the two known eras are shown in the following table. The five-number values have also been included for the breadth of the skulls at the New site.

1 Consult the previous investigation, which tabled data from the New site. Copy and complete the table.


2 Boxplots comparing the breadth data for the time periods 4000 BC and AD 150, and at the New site are shown on the same scale at right.

3 When we are comparing boxplots, we should keep in mind that measurements obtained from the ‘box’ are more relevant than those from the ‘whiskers’, since unusually high or low scores can influence the length of the whiskers, and so the range of the scores. With this in mind, what conclusion would you make concerning the age of the skulls found at the New site?

4 Draw boxplots on the same scale for the height measurements of the three sites.

5 Repeat the boxplot drawings for the length measurements for the three sites.

6 Referring to each of your three graphs, what conclusion do you reach regarding the age of the male skulls from the New site?

7 Write a paragraph to Archie outlining your conclusions. Are they consistent with your previous recommendations? Provide statistical evidence to support your claims.

                          4000 BC   AD 150   New site
Breadth  Lowest score       119       126       123
         Lower quartile     128       132       130
         Median             131       137       132
         Upper quartile     135       139       135
         Highest score      141       147       148
Height   Lowest score       121       120
         Lower quartile     131       126
         Median             134       130
         Upper quartile     136       135
         Highest score      143       138
Length   Lowest score        89        81
         Lower quartile      95        91
         Median             100        94
         Upper quartile     103        97
         Highest score      114       103

[Figure: boxplots of the breadth measurements for 4000 BC, AD 150 and the New site, drawn on a common scale from 115 to 150.]
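As a small illustration of reading 'box' measurements from a five-number summary, the sketch below takes the breadth rows of the table and works out the width of each box (the interquartile range) alongside the full range. It is only one possible way of laying out the arithmetic; the dictionary and variable names are our own.

```python
# Breadth five-number summaries copied from the table:
# (lowest score, lower quartile, median, upper quartile, highest score)
breadth = {
    "4000 BC": (119, 128, 131, 135, 141),
    "AD 150": (126, 132, 137, 139, 147),
    "New site": (123, 130, 132, 135, 148),
}

box_widths = {}
for site, (low, q1, med, q3, high) in breadth.items():
    box_widths[site] = q3 - q1   # width of the 'box' (interquartile range)
    full_range = high - low      # length of the plot including the whiskers
    print(f"{site:9s} median={med} box width={box_widths[site]} range={full_range}")
```

The same loop could be reused for the height and length rows once the New site values have been filled in.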


Summary

The mean
• For a small number of scores, the mean is calculated using the formula: $\bar{x} = \dfrac{\sum x}{n}$
• When the data are presented in a frequency table the mean can be calculated using the formula $\bar{x} = \dfrac{\sum fx}{\sum f}$.
• The mean can also be calculated using the statistical function on your calculator.
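The frequency-table formula can be checked by hand or with a few lines of code. The following is a minimal sketch in Python; the scores and frequencies are invented for illustration and do not come from any exercise in this chapter.

```python
# Illustrative frequency table (values invented for this sketch).
scores = [1, 2, 3]
frequencies = [2, 3, 5]

sum_f = sum(frequencies)                                   # total of the f column
sum_fx = sum(f * x for x, f in zip(scores, frequencies))   # total of the fx column
mean = sum_fx / sum_f                                      # mean = sum of fx / sum of f

print(sum_f, sum_fx, mean)
```

Here the fx column is 2, 6 and 15, so the totals are 10 and 23 and the mean is 2.3.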

Median and mode
• The median is the middle score of a data set, or the average of the two middle scores, when the scores are arranged in numerical order.
• The median is available through the statistical function of a graphics calculator.
• The mode is the score with the highest frequency.
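For readers who like to check results by computer, here is a short Python sketch using the standard `statistics` module. The scores are invented for illustration; with an even number of scores, `statistics.median` averages the two middle values, matching the definition above.

```python
import statistics

scores = [3, 1, 4, 1, 5, 9, 2, 6]      # illustrative scores, not from the text

ordered = sorted(scores)               # scores arranged in numerical order
median = statistics.median(scores)     # even count: average of the two middle scores
modes = statistics.multimode(scores)   # every score tied for the highest frequency

print(ordered, median, modes)
```

`multimode` returns a list, so a bimodal data set would show both modes.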

Summary statistics
• The summary statistics are the mean, median and mode.
• Each summary statistic must be examined in the context of the statistical analysis to determine which is the most relevant.

Range and interquartile range
• The range is the difference between the highest score and the lowest score.
• The interquartile range is the difference between the scores at the lower quartile and the upper quartile.
• The range, quartiles and interquartile range are available from a graphics calculator.
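The sketch below shows one way these measures might be computed in Python. It assumes the quartiles are found as the medians of the lower and upper halves of the ordered scores, leaving out the middle score when the count is odd; calculators and software sometimes use slightly different quartile rules, so results can differ a little. The function name and the scores are our own.

```python
import statistics

def five_number_summary(data):
    """Return (lowest, lower quartile, median, upper quartile, highest).

    Assumption: quartiles are the medians of the lower and upper halves,
    with the middle score left out of both halves when the count is odd.
    """
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower_half = ordered[:half]
    upper_half = ordered[half + 1:] if n % 2 else ordered[half:]
    return (ordered[0], statistics.median(lower_half),
            statistics.median(ordered), statistics.median(upper_half),
            ordered[-1])

scores = [7, 3, 9, 4, 12, 6, 10, 5, 8]   # illustrative scores
low, q1, med, q3, high = five_number_summary(scores)
data_range = high - low                   # highest score minus lowest score
iqr = q3 - q1                             # upper quartile minus lower quartile

print(low, q1, med, q3, high, data_range, iqr)
```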

Standard deviation
• The standard deviation is a measure of the spread of a data set.
• The smaller the standard deviation, the smaller the spread of the data set.
• The standard deviation is found using the statistical function on your calculator.
• When the analysis is conducted on the entire population, the population standard deviation is used.
• When the analysis is conducted on a sample of the population, the sample standard deviation is used.
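The difference between the two standard deviations can be seen in a short Python sketch. The standard `statistics` module offers `pstdev` for a whole population and `stdev` for a sample; the scores below are invented for illustration.

```python
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]   # illustrative scores, not from the text

population_sd = statistics.pstdev(scores)  # divides by n: data are the whole population
sample_sd = statistics.stdev(scores)       # divides by n - 1: data are a sample

print(f"population SD = {population_sd:.3f}")
print(f"sample SD     = {sample_sd:.3f}")
```

For the same scores the sample value is always a little larger than the population value, because it divides by n - 1 rather than n.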

Stem-and-leaf plot
• The first part of the data forms the stem.
• The last part of the data forms the leaves.
• The data in the leaves must be in ascending order.
• The five-number summary can be obtained from the stem-and-leaf plot.

Boxplots
• A five-number summary is a list consisting of the lowest score, lower quartile, median, upper quartile and the greatest score (in that order) of the data.
• A boxplot is a graph of the five-number summary.
• The boxplot is a powerful tool to show the spread of the data.
• Boxplots are always drawn to scale.


• The box spans the interquartile range; the median is marked by the vertical line inside the box; the whiskers extend to the lowest and greatest scores.

[Figure: a boxplot with labels indicating the lowest score, the lower quartile, the median, the upper quartile and the greatest score.]

Comparing sets of data
• Back-to-back stem-and-leaf plots:
1. Useful to compare the distribution of 2 similar sets of data.
2. Share the same stem.
3. Contain a key, which usually relates to the data on the right.
4. The data on the left are arranged outwards from the stem, as they increase.
• Side-by-side boxplots:
1. Useful for quantitative comparisons.
2. Share a common scale.
3. Compare two or more sets of data.
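The rules for a back-to-back stem-and-leaf plot can also be expressed as a short program. The Python sketch below is one possible layout, assuming whole-number data with the tens digit as the stem and the units digit as the leaf (a class size of 10); the function names and the two small data sets are invented for illustration.

```python
from collections import defaultdict

def leaves_by_stem(data):
    """Map each stem (tens part) to its leaves (units digit) in ascending order."""
    plot = defaultdict(list)
    for value in sorted(data):
        plot[value // 10].append(value % 10)
    return plot

def back_to_back(left, right):
    """Return the rows of a back-to-back plot that shares one stem column."""
    left_plot, right_plot = leaves_by_stem(left), leaves_by_stem(right)
    stems = range(min(min(left), min(right)) // 10,
                  max(max(left), max(right)) // 10 + 1)
    rows = []
    for stem in stems:
        # Left leaves are reversed so that they increase outwards from the stem.
        left_leaves = " ".join(str(d) for d in reversed(left_plot.get(stem, [])))
        right_leaves = " ".join(str(d) for d in right_plot.get(stem, []))
        rows.append(f"{left_leaves:>12} | {stem} | {right_leaves}")
    return rows

group_a = [12, 15, 21, 24, 24, 30]   # illustrative data, shown on the right
group_b = [18, 19, 22, 35, 37]       # illustrative data, shown on the left

rows = back_to_back(group_b, group_a)
for row in rows:
    print(row)
print("Key: 1 | 2 = 12")   # the key relates to the data on the right
```

Every stem between the smallest and largest values is printed, even when one side has no leaves, so the two distributions stay lined up.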


Chapter review

1 Calculate the mean of each of the following sets of scores.
a 4, 9, 5, 3, 5, 6, 2, 7, 1, 10
b 65, 67, 87, 45, 90, 92, 50, 23
c 7.2, 7.9, 7.0, 8.1, 7.5, 7.5, 8.7
d 5, 114, 23, 12, 25

2 Use the statistics function on your calculator to find the mean of each of the following sets of scores.
a 2, 18, 26, 121, 96, 32, 14, 2, 0, 0
b 2, 2, 12, 12, 12, 32, 32, 47, 58
c 0.2, 0.3, 0.6, 0.4, 0.3, 0.7, 0.8, 0.6, 0.5, 0.4, 0.1

3 Copy and complete the tables below and then use them to calculate the mean.

a
Score (x)   Frequency (f)   fx
5           11
6           15
7           24
8           21
9           9
            Σf =            Σfx =

b
Score (x)   Frequency (f)   fx
9.2         36
9.3         48
9.4         74
9.5         65
9.6         51
9.7         32
9.8         14
9.9         2
            Σf =            Σfx =

4 Complete the frequency distribution table below and use it to estimate the mean of the distribution.

Class    Class centre (x)   Frequency (f)   fx
21–24                       3
25–28                       9
29–32                       17
33–36                       31
37–40                       29
41–44                       25
45–48                       19
49–52                       10
                            Σf =            Σfx =

5 Use the statistics function on your calculator to find the mean of the following distributions. Where necessary, give your answers correct to 1 decimal place.

a
Score   Frequency
10      23
20      47
30      68
40      56
50      17

b
Score   Frequency
24      45
25      89
26      124
27      102
28      78
29      46

c
Class    Class centre   Frequency
10–12    11             18
13–15    14             32
16–18    17             34
19–21    20             40
22–24    23             28
25–27    26             14
28–30    29             6

6 For each of the following sets of scores, find the median.
a 25, 26, 26, 27, 27, 28, 30, 32, 35
b 4, 5, 8, 5, 8, 6, 7, 10, 4, 8, 4
c 3.2, 3.1, 3.0, 3.5, 3.2, 3.2, 3.2, 3.6
d 2, 3, 7, 4, 4, 8, 5, 7, 7, 6
e 121, 135, 111, 154, 147, 165, 101, 108

7 Copy and complete each of the following frequency tables and then use them to find the median. Alternatively, a graphics calculator could be used.

a
Score   Frequency   Cumulative frequency
0       2
1       6
2       11
3       7
4       6
5       3

b
Score   Frequency   Cumulative frequency
54      2
55      5
56      14
57      11
58      6
59      1
60      1

c
Score   Frequency   Cumulative frequency
66      8
67      10
68      12
69      14
70      7
71      5
72      4

8 a Copy and complete the frequency distribution table below.

Class    Class centre   Frequency   Cumulative frequency
30–39                   18
40–49                   34
50–59                   39
60–69                   45
70–79                   29
80–89                   10
90–99                   5

b What is the median class of this distribution?

9 For each set of scores below state the mode.
a 2, 3, 6, 8, 4, 2, 4, 2, 6, 5, 2
b 23, 24, 19, 23, 27, 25, 31, 24, 23, 27, 27
c 1.2, 5.6, 4.7, 6.8, 4.5, 2.1

10 For each of the frequency tables below, state the mode.

a
Score   Frequency
1       23
2       35
3       21
4       19
5       8

b
Score   Frequency
14      9
15      15
16      8
17      12
18      15
19      7
20      1

11 Use the frequency table below to state the modal class.

Class    Class centre   Frequency
30–33    31.5           12
34–37    35.5           26
38–41    39.5           34
42–45    43.5           45
46–49    47.5           52
50–53    51.5           23

12 Below are the number of goals scored by a netball team in ten matches in a tournament.
25 26 19 24 28 67 21 22 28 18
a Calculate the mean.
b Calculate the median.
c Calculate the mode.
d Which of the above is the best summary statistic? Explain your answer.

13 Give an example of a statistical analysis where the best summary statistic is:
a the mean
b the median
c the mode.

14 Find the range of each of the following sets of scores.
a 28 24 26 24 25 29 22 27 25
b 118 2 56 45 72 43 69 84 159 0
c 1.9 0.7 0.5 0.8 1.1 1.5 1.4

15 The marks of 30 students in a Geography test are shown below.
66 47 43 80 42 92 92 90 92 77
67 87 75 72 42 60 86 53 95 78
46 87 49 70 82 92 93 71 62 67
a Should the population or sample standard deviation be used in this case?
b Write the value of the appropriate standard deviation.

16 To find the number of attempts most people make to get their driver’s licence, a sample of twenty Year 12 students is chosen. The results are shown below.
1 2 3 3 1 2 1 2 4 1
1 1 2 2 2 3 1 2 2 3
a Should the population or sample standard deviation be used in this case?
b Write the value of the appropriate standard deviation.

17 Use the statistics function on your calculator to find the mean and population standard deviation of each of the following distributions. Give each answer correct to 3 decimal places.
a 0.7, 1.2, 0.5, 0.9, 1.3, 1.5, 0.1, 1.0, 0.4, 0.5
b 23, 254, 12, 89, 74, 15, 26, 45

c
Score   Frequency
26      12
27      25
28      29
29      28
30      14

d
Class    Class centre   Frequency
10–14    12             8
15–19    17             12
20–24    22             32
25–29    27             45
30–34    32             40
35–39    37             19
40–44    42             6

18 Which of the statements below is true for this back-to-back stem-and-leaf plot?

Key: 23 | 0 = 230

      Group B          Group A
         Leaf | Stem | Leaf
        3 2 1 | 23   | 0
7 7 7 6 5 4 0 | 24   | 1 5
    8 6 5 2 2 | 25   | 0 2 5 8
          7 4 | 26   | 3 5 6 6 8
            9 | 27   | 4 7 8
            6 | 28   | 2 6
            2 | 29   | 0 1

A Data from Group A is bimodal.
B Data from Group B is bimodal.
C Group A has the larger IQR.
D Group B has the larger range.
E Group A has the smaller median.

19 Which of the statements below is true of these boxplots?

[Figure: side-by-side boxplots of Group B and Group A, drawn on a common scale from 0 to 50.]

A Group A has higher median and shows more variability than Group B.
B Group B has higher median and shows more variability than Group A.
C Data from Group A are spread.
D Group B has the smaller interquartile range.
E One-quarter of all Group A data is of higher value than any of Group B data.

20 The following data give the amount of cut meat (in kg) obtained from each of 20 lambs.

4.5 6.2 5.8 4.7 4.0 3.9 6.2 6.8 5.5 6.1
5.9 5.8 5.0 4.3 4.0 4.6 4.8 5.3 4.2 4.8

a Detail the data on a stem-and-leaf plot. (Use a class size of 0.5 kg.)
b Prepare a five-number summary of the data.
c Draw a boxplot of the data.

21 The back-to-back stem-and-leaf plot below shows the number of lessons per week given by two piano teachers, Victoria and Elena, over the period of 30 consecutive weeks.

a Derive a five-number summary for each set of data.
b Draw a side-by-side boxplot of the data.
c Make comparisons between the 2 sets of data. Use statistics in your answer.

22 In order to compare two textbooks a teacher recommends one book to one of his classes and another book to another class. At the end of the year the classes are each tested. The results are detailed below:

Text A (25 students)
44 52 95 76 13 94 83 72 55 81
22 25 64 72 35 48 56 59 84 98
84 21 35 69 28

Text B (28 students)
65 72 48 63 68 59 68 62 75 79
81 72 64 53 58 59 64 66 68 42
37 39 55 58 52 82 79 55

a Prepare a back-to-back stem-and-leaf plot of the data.
b Prepare a five-number summary for each group. (Note that the groups are different sizes.)
c Prepare side-by-side boxplots of the data.
d Compare the performance of each of the classes.
e Which textbook do you think would be best? Why?
f What other things would you need to take into account before drawing final conclusions?

Key: 1 3 = 13 lessonsElena Victoria

Leaf4 3 2 2

9 8 8 7 6 6 5 54 4 3 2 2 2 1 0 0

7 6 6 5 52 2 1

5

Stem00*11*22*33*

Leaf3 46 8 90 2 4 45 5 7 8 90 0 1 2 3 4 45 5 6 7 93 45 5
