

HANDOUT

ELEMENTARY STATISTICS

Kismiantini

NIP. 19790816 200112 2 001

Mathematics Education Department Faculty of Mathematics and Natural Sciences

Yogyakarta State University 2011

Topic 1 : Fundamental Concepts of Statistics

Several concepts are fundamental to the study of statistics: the definition of statistics, data, parameter and statistic, population and sample, scale of measurement, and variables.

Definition of Statistics

Statistics is the study of how to collect, organize, analyze, and interpret numerical information from data. Descriptive statistics characterizes or describes a set of data by displaying the information graphically or by describing its central tendency and how it is distributed. Inferential statistics tries to infer information about a population by using information gathered by sampling.

What is data?

Data is a collection of numbers or facts that is used as a basis for making conclusions.

Types of Data

We can classify data into two types:

a. Numerical or quantitative data is data where the observations are numbers. It is further classified as either discrete or continuous.

- Discrete data are numeric data that take a finite or countable number of possible values. For example: a classic example of discrete data is a finite subset of the counting numbers, {1, 2, 3, 4, 5}, perhaps corresponding to {Strongly Disagree, ..., Strongly Agree}.

- Continuous data can take any value in an interval, so there are infinitely many possible values. For example: 1.4, 1.41, 1.414, 1.4142, 1.41421, ...

b. Categorical or qualitative data is data where the observations are non-numerical. For example: economic social status {Poor, Fair, Good, Better, Best}, colors (ignoring any physical causes), and types of material {straw, sticks, bricks}.

Data can be put into one of several categories called scale of measurement:

a. Nominal. Nominal data only classify: the values are names or labels for the categories. For example: the gender of a person (1: man, 0: woman).

b. Ordinal. Ordinal data have order, but the differences between values are not meaningful. For example: Likert scales, or a rank from 1 to 5 for degree of satisfaction.

c. Interval. Interval data have order and a constant scale, but there is no true natural zero. For example: temperature, date. Temperature scales are interval data: 25°C is warmer than 20°C, and a 5°C difference has some physical meaning. Note that 0°C is arbitrary, so it does not make sense to say that 20°C is twice as hot as 10°C.

d. Ratio. Ratio data are the highest level of measurement. Ratio data have order, a constant scale, and a natural zero. For example: height, weight, age, length. It is now meaningful to say that 10 m is twice as long as 5 m. This ratio holds regardless of the unit in which the object is measured (e.g. meters or yards), because there is a natural zero.

Variable

A variable is the characteristic measured or observed when an experiment is carried out or an observation is made. Variables may be non-numerical or numerical.

Population versus Sample

A population is the total set of individuals that we are interested in. A sample is a subset of the individuals, selected from the population in a prescribed manner for study. Typically, population data is very hard or even impossible to gather, so statisticians and researchers extract data from a sample instead.

Parameter versus Statistic

Parameter: A parameter is a characteristic of the whole population.

Statistic: A statistic is a characteristic of a sample, presumably measurable.

Why is a sample taken?

a. By studying the sample it is hoped to draw valid conclusions about the larger group.

b. A sample is generally selected for study because the population is too large to study in its entirety.

c. The sample should be representative of the general population.

d. This is often best achieved by random sampling.

e. Also, before collecting the sample, it is important that the researcher carefully and completely defines the population, including a description of the members to be included.

Topic 2 : Visualizing Data

One of the first tasks in an analysis of a dataset is to look at the distribution of values, i.e. examine the variation in the data values. This is often accomplished using a graph. The type of graph often depends on the scale of measurement of the variable.

Table 1. Graphs used to examine the distribution of data

Scale                 Summary table and graph
Nominal or Ordinal    Frequency chart
                      Bar graph (also called a histogram)
                      Segmented bar chart (also called a divided bar chart)
Interval or Ratio     Frequency distribution table
                      Dot plot
                      Histogram
                      Ogive
                      Stem-and-leaf plot
                      Box plot

Line Graphs

A line graph is a way to summarize how two pieces of information are related and how they vary depending on one another. The numbers along a side of the line graph are called the scale.

Figure 1. Line graph for John's weight

The graph above shows how John's weight varied from the beginning of 1991 to the beginning of 1995. The weight scale runs vertically, while the time scale is on the horizontal axis. Following the gridlines up from the beginning of the years, we see that John's weight was 68 kg in 1991, 70 kg in 1992, 74 kg in 1993, 74 kg in 1994, and 73 kg in 1995. Examining the graph also tells us that John's weight increased during 1991 and 1992, stayed the same during 1993, and fell during 1994.

Pie Chart

A pie chart is a circle graph divided into pieces, each displaying the size of some related piece of information. Pie charts are used to display the sizes of parts that make up some whole.

Figure 2. Pie chart for the ingredients of a pizza

The pie chart above shows the ingredients used to make a sausage and mushroom pizza. The fraction of each ingredient by weight is given as a percent. We see that half of the pizza's weight, 50%, comes from the crust. Note that the percent sizes of the slices sum to 100%. Always be aware of how any chart or graph is labeled.

Bar Graphs

Bar graphs consist of an axis and a series of labeled horizontal or vertical bars that show different values for each bar. The numbers along a side of the bar graph are called the scale.

Figure 3. Bar chart for weight of some fruits

The bar chart above shows the weight in kilograms of some fruit sold one day by a local market. We can see that 52 kg of apples were sold, 40 kg of oranges were sold, and 8 kg of star fruit were sold.

A double bar graph is similar to a regular bar graph, but gives 2 pieces of information for each item on the vertical axis, rather than just 1. The bar chart below shows the weight in kilograms of some fruit sold on two different days by a local market. This lets us compare the sales of each fruit over a 2 day period, not just the sales of one fruit compared to another. We can see that the sales of star fruit and apples stayed most nearly the same. The sales of oranges increased from day 1 to day 2 by 10 kilograms. The same amount of apples and oranges was sold on the second day.

Figure 4. Double bar chart for weight of some fruits

Dot Plot

A dot plot represents a set of data by placing dots over a number line.

Figure 5. Dot plot for a pair of dice

Ogive

The ogive is a frequency polygon (line plot) of the cumulative frequency or the relative cumulative frequency. The horizontal axis is marked with the class boundaries and the vertical axis shows the cumulative frequency. For example:

Marks      Frequency   Cumulative Frequency
 1 – 10        2            2
11 – 20        8           10
21 – 30       12           22
31 – 40       18           40
41 – 50       28           68
51 – 60       22           90
61 – 70        6           96
71 – 80        4          100

Figure 6. Ogive for marks

Stem-and-Leaf Plots

A stem-and-leaf plot is a plot where each data value is split into a "leaf" (usually the last digit) and a "stem" (the other digits). For example, "32" would be split into "3" (stem) and "2" (leaf). The "stem" values are listed down, and the "leaf" values go right (or left) from the stem values. The "stem" is used to group the scores and each "leaf" indicates the individual scores within each group. This stem-and-leaf plot shows grades that students received on a math quiz, where the stem unit is 10 and the leaf unit is 1.

Stem Leaves 3 1 4 5 6 6 2 7 7 0 5 8 3 2 9 5 7 0 9 2 8 4 1 6 9

10 0

Figure 7. Stem-and-leaf plot for test scores

Box Plot

A box plot is a way of summarizing a set of data measured on an interval scale. It is often used in exploratory data analysis. It is a type of graph which is used to show the shape of the distribution, its central value, and spread. The picture produced consists of the most extreme values in the data set (maximum and minimum values), the lower and upper quartiles, and the median.

For example, consider the following dataset: 52, 57, 60, 63, 71, 72, 73, 76, 98, 110, 120


Figure 8. Box plot for dataset

Exercises

1. A computer retailer collected data on the number of computers sold during 29 consecutive Saturdays during the year. The results are as follows: 12, 14, 14, 17, 21, 24, 24, 25, 25, 26, 26, 27, 29, 31, 34, 35, 36, 39, 40, 42, 42, 45, 46, 47, 49, 49, 56, 59, 62. Make a stem-and-leaf plot of this data.

2. We asked the students what country their car is from (or no car) and made a tally of the answers. Then we computed the frequency and relative frequency of each category. The relative frequency is computed by dividing the frequency by the total number of respondents. The following table summarizes the results.

Country   Frequency   Relative Frequency
US            6            0.3
Japan         7            0.35
Europe        2            0.1
Korea         1            0.05
None          4            0.2

Make a bar chart for this data.

Topic 3 : Sigma Notation

Sigma notation is a concise and convenient way to write out a long sum. For example, we often wish to sum a number of terms such as

1 + 2 + 3 + 4 + 5

or

1 + 4 + 9 + 16 + 25 + 36

where there is an obvious pattern to the numbers involved. The first of these is the sum of the first five whole numbers, and the second is the sum of the first six square numbers. More generally, if we take a sequence of numbers $x_1, x_2, x_3, \ldots, x_n$ then we can write the sum of these numbers as

\[ x_1 + x_2 + x_3 + \cdots + x_n . \]

A shorter way of writing this is to let $x_i$ represent the general term of the sequence and put

\[ \sum_{i=1}^{n} x_i \]

Here, the symbol Σ is the Greek capital letter Sigma corresponding to our letter 'S', and refers to the initial letter of the word 'Sum'. The two sums above can be written as

\[ \sum_{k=1}^{5} k = 1 + 2 + 3 + 4 + 5 \]

\[ \sum_{k=1}^{6} k^2 = 1 + 4 + 9 + 16 + 25 + 36 \]

Rules for use with sigma notation

There are a number of useful results that we can obtain when we use sigma notation.

If a and c are constants and if f(k) and g(k) are functions of k, then

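1. \[ \sum_{k=1}^{n} c = nc \]

2. \[ \sum_{k=1}^{n} ck = c \sum_{k=1}^{n} k \]

3. \[ \sum_{k=1}^{n} (k + c) = nc + \sum_{k=1}^{n} k \]

4. \[ \sum_{k=1}^{n} \bigl( a\,g(k) + c \bigr) = nc + a \sum_{k=1}^{n} g(k) \]

5. \[ \sum_{k=1}^{n} \bigl( f(k) + g(k) \bigr) = \sum_{k=1}^{n} f(k) + \sum_{k=1}^{n} g(k) \]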
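The five rules can be checked numerically for particular values. The following is a minimal sketch, not a proof: the values of n, a and c and the functions f and g are arbitrary choices made only for this illustration.

```python
# Numerical check of the five sigma-notation rules.
# n, a, c, f and g are illustrative assumptions, not taken from the handout.
n, a, c = 6, 3, 2
ks = range(1, n + 1)          # k = 1, 2, ..., n


def f(k):
    return k * k


def g(k):
    return 2 * k + 1


# Each flag compares the left-hand side of a rule with its right-hand side.
rule1 = sum(c for k in ks) == n * c
rule2 = sum(c * k for k in ks) == c * sum(ks)
rule3 = sum(k + c for k in ks) == n * c + sum(ks)
rule4 = sum(a * g(k) + c for k in ks) == n * c + a * sum(g(k) for k in ks)
rule5 = (sum(f(k) + g(k) for k in ks)
         == sum(f(k) for k in ks) + sum(g(k) for k in ks))

print(rule1, rule2, rule3, rule4, rule5)
```

Changing n, a, c, f or g should leave all five flags true, since the rules hold for any constants and any functions of k.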


Exercises

1. Express each of the following in sigma notation

a. $\frac{1}{1} + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \frac{1}{6}$

b. $(x_1 - \mu)^2 + (x_2 - \mu)^2 + (x_3 - \mu)^2 + (x_4 - \mu)^2$

2. Evaluate

a. ∑=

5

1

3n

n b. ( )∑=

−4

1

31r

rr c. ∑

=

4

1

2

k

k d. ( )∑=

+6

1

12

1

r

rr

3. Write out what is meant by

a. ∑=

n

i

ix1

2 b. ∑=

n

i

ii xf1

c. ∑=

m

j jx2

1

4. Writing out the terms explicitly,

a. ∑=

5

1

3k

k b. ( )∑

= +

5

1 12

1

k kk c. ∑

=

5

1

5i

d. ∑=

10

1k

c





Topic 4 : Frequency Distribution

A frequency distribution is a tabulation of the values that one or more variables take. We divide an interval containing all the data into a small number of segments, usually of equal width. These segments are called classes (or class intervals).

For example, the weights (in kg) of 50 pieces of luggage are presented in a grouped frequency distribution with class intervals as follows:

Weights (kg)   Number of pieces
 7 – 9               2
10 – 12              8
13 – 15             14
16 – 18             19
19 – 21              7
Total               50

From the above frequency distribution we note the following:

a. The intervals of weights, i.e. 7 – 9, 10 – 12, …, 19 – 21, are known as class intervals.

b. 7, 10, …, 19 are called the lower limits of the respective classes.

c. 9, 12, …, 21 are called the upper limits of the respective classes.

d. 6.5 – 9.5, 9.5 – 12.5, 12.5 – 15.5, 15.5 – 18.5 and 18.5 – 21.5 are known as class boundaries. These class boundaries are obtained by

Lower class boundary = lower class limit – d/2

Upper class boundary = upper class limit + d/2

where d = difference between the upper limit of a class and the lower limit of the next class.

For the above example, d = 10 – 9 = 1 ⇒ d/2 = 1/2 = 0.5

e. 2, 8, 14, 19 and 7 are called class frequencies.

f. The class width is the difference between the upper and lower class boundaries of a class interval. Thus, the class width for the class interval 13 – 15 is

Class width = 15.5 – 12.5 = 3

g. The class mark (or midpoint), $x_m$, of a class interval is obtained by

\[ x_m = \frac{\text{lower boundary} + \text{upper boundary}}{2} \]

or, equivalently,

\[ x_m = \frac{\text{lower limit} + \text{upper limit}}{2} \]
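The definitions above can be applied mechanically to the luggage example. This is a small sketch under the assumption that the class limits are whole numbers with a constant gap d between consecutive classes; the variable names are chosen for this illustration only.

```python
# Class limits of the luggage example (kg).
classes = [(7, 9), (10, 12), (13, 15), (16, 18), (19, 21)]

# d: gap between the upper limit of one class and the lower limit of the next.
d = classes[1][0] - classes[0][1]

# Class boundaries: lower limit - d/2 and upper limit + d/2.
boundaries = [(lo - d / 2, hi + d / 2) for lo, hi in classes]

# Class width: upper boundary minus lower boundary.
widths = [ub - lb for lb, ub in boundaries]

# Class mark (midpoint): average of the two limits.
class_marks = [(lo + hi) / 2 for lo, hi in classes]

for (lo, hi), (lb, ub), w, m in zip(classes, boundaries, widths, class_marks):
    print(f"{lo:>2} - {hi:<2}  boundaries {lb} - {ub}  width {w}  mark {m}")
```

The midpoint computed from the boundaries is the same as the one computed from the limits, because d/2 is subtracted at one end and added at the other.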

Constructing Frequency Distribution. To construct a frequency distribution, follow these steps:

1. Decide on the number of classes you want (c). Sturges' formula may help to decide the number of classes:

c = 1 + 3.3 log n

where c is the number of classes, n is the number of observations in the data set, and log is the base-10 logarithm.

2. Calculate the range:

Range = x[n] – x[1]

where x[1] is the lowest score and x[n] is the highest score in the data.

3. Divide the range by the number of classes to get the class width:

w = range/c

Round this result to get a convenient number, usually rounding up.

4. Choose the starting point (first lower limit), i.e. a number for the lower limit of the first class.

5. Determine the first lower class boundary:

lb = ll – ½ d

where ll = lower limit, lb = lower boundary, and d = difference between the upper limit of a class and the lower limit of the next class.

6. Determine the first upper class boundary:

ub = lb + w

where ub = upper boundary, lb = lower boundary, w = width.

7. Determine the first upper limit:

ul = ub – ½ d

where ub = upper boundary, ul = upper limit.

8. List all the lower and upper limits by repeatedly adding the width to the first lower and upper limits.

9. Determine the frequency for each class.

Example: The following are the marks of final exam of Elementary statistics

23 60 79 32 57 74 52 70 82 36

80 77 81 95 41 65 92 85 55 76

52 12 64 75 78 25 80 98 81 67

41 71 83 54 64 72 88 62 74 43

60 78 89 76 84 48 84 90 15 79

34 67 17 82 69 74 63 80 85 61

Make a frequency distribution with 9 classes, using 10 as the lowest class limit.

Solution:

1. Class = 9

2. Range = 98 – 12 = 86

3. Width = w = 86/9 = 9.6 ≈ 10

4. Lower limit = ll = 10

5. Lower boundaries = lb = 10 – 0.5(1) = 9.5

6. Upper boundaries = ub = 9.5 + 10 = 19.5

7. Upper limit = ul = 19.5 – 0.5 = 19

Marks      Tally marks          Frequency   Cumulative frequency
10 – 19    ///                      3            3
20 – 29    //                       2            5
30 – 39    ///                      3            8
40 – 49    ////                     4           12
50 – 59    /////                    5           17
60 – 69    ///// ///// /           11           28
70 – 79    ///// ///// ////        14           42
80 – 89    ///// ///// ////        14           56
90 – 99    ////                     4           60
∑                                  60
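The tallying in this example can be reproduced with a short script. This is a sketch under the assumptions of the example (whole-number marks, so d = 1, and a first lower limit of 10); the function name `frequency_table` is chosen for this illustration.

```python
import math

# The 60 final-exam marks from the example.
marks = [
    23, 60, 79, 32, 57, 74, 52, 70, 82, 36,
    80, 77, 81, 95, 41, 65, 92, 85, 55, 76,
    52, 12, 64, 75, 78, 25, 80, 98, 81, 67,
    41, 71, 83, 54, 64, 72, 88, 62, 74, 43,
    60, 78, 89, 76, 84, 48, 84, 90, 15, 79,
    34, 67, 17, 82, 69, 74, 63, 80, 85, 61,
]


def frequency_table(data, first_lower_limit, width, n_classes):
    """Return rows (lower limit, upper limit, frequency, cumulative frequency)."""
    rows = []
    cumulative = 0
    for j in range(n_classes):
        lower = first_lower_limit + j * width
        upper = lower + width - 1      # limits are whole numbers, so d = 1
        freq = sum(1 for x in data if lower <= x <= upper)
        cumulative += freq
        rows.append((lower, upper, freq, cumulative))
    return rows


n_classes = 9
data_range = max(marks) - min(marks)           # 98 - 12
width = math.ceil(data_range / n_classes)      # range / c, rounded up
table = frequency_table(marks, 10, width, n_classes)

for lower, upper, freq, cum in table:
    print(f"{lower:>2} - {upper:<2} {freq:>3} {cum:>3}")
```

The last cumulative frequency equals the number of observations, which is a quick check that no mark fell outside the classes.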


Exercises

1. The following are the numbers of credit reports prepared by a credit reporting agency on 110 business days.

62 60 43 64 58 52 52 67 59 60 51 62 56 63 61 68 57 51 59 47 42 64 43

67 52 58 47 59 64 58 52 63 48 65 60 61 59 63 56 62 56 62 57 59 62 56

63 55 73 60 69 53 66 54 52 54 61 55 65 55 61 59 74 62 49 63 63 53 71

59 46 64 41 60 51 55 64 46 64 56 59 49 64 60 57 58 66 53 65 62 58 65

61 50 55 57 61 45 55 60 66 63 58 78 65 61 57 67 54 53

Make a frequency distribution with 8 classes and the lowest value is 40.

2. The following frequency distribution gives the lengths of 15 cucumbers.

Length (cm)   Frequency
 6 – 10           3
11 – 15           4
16 – 20           5
21 – 25           2
26 – 30           1

(a) What is the upper class limit of the class interval 16-20?

(b) What is the lower class boundary of the class interval 16-20?

(c) What is the class width of the class interval 16-20?

(d) What is the class mark of the class interval 16-20?

3. A household appliance service and repair company received the following numbers of orders for service and repair daily during eight weeks of six working days each.

19 27 24 18 26 24 21 18 30 32 16 28 25 18 22 22 25 31 22 28 26 25 20

29 22 34 22 28 25 25 19 24 30 23 27 15 25 24 28 21 29 23 22 26 31 24

25 26

Make a frequency distribution.

4. The following are the marks on a monthly test taken by 100 participants.

6.4 4.7 5.3 5.1 6.5 5.1 7.1 5.6 6.3 4.9 5.4 6.5 6.3 7.1 4.7 5.7 6.9 7.0 6.3 5.9

7.1 4.8 5.8 7.3 5.6 5.0 6.6 5.7 6.5 6.0 6.5 5.5 5.7 6.2 4.7 6.4 7.3 5.7 5.2 5.0

6.9 7.0 5.5 6.2 5.6 6.5 5.4 5.6 6.1 6.6 7.1 4.8 6.7 5.5 5.4 5.9 6.8 6.3 5.1 6.9

5.6 6.3 4.7 6.4 5.1 4.9 5.1 7.2 5.6 7.3 5.7 6.6 7.1 4.8 6.7 5.5 5.2 5.0 6.9 5.4

5.6 4.7 6.6 7.1 4.8 6.7 5.5 5.4 5.9 5.2 6.3 5.1 4.5 5.7 6.2 4.7 6.4 7.3 5.7 5.2

Make a frequency distribution whose lowest class limit is 4.2.




Topic 5 : Measures of Central Tendency (The Arithmetic Mean and The Geometric Mean)

A measure of central tendency is a one-number description of a distribution or data set, and

we focus on three:

a. Mean or average: the sum of the numbers divided by the number of numbers.

b. Median or 50th percentile: a real number that separates the lower 50% and upper 50%

of the numbers.

c. Mode: the number that occurs most frequently in the set. There can be more than one

mode.

Several types of mean can be defined i.e. the arithmetic mean, the geometric mean, the

harmonic mean, the weighted mean, and the overall mean.

Sample data

Sample data can be presented in three forms.

Ungrouped Data: $x_1, x_2, \ldots, x_n$

Grouped Data:

x_i      f_i
x_1      f_1
x_2      f_2
.        .
.        .
.        .
x_k      f_k
Total    $\sum_{i=1}^{k} f_i = n$

Frequency Distribution:

Score          f_i
a_1 – b_1      f_1
a_2 – b_2      f_2
.              .
.              .
.              .
a_k – b_k      f_k
Total          $\sum_{i=1}^{k} f_i = n$

The Arithmetic Mean (Mean)

The measure of central tendency most commonly used by statisticians is the same measure

most people have in mind when they use the word average. This is the arithmetic average,

which is called by statisticians the arithmetic mean, or simply the mean. It is obtained by

adding together all the scores or values and dividing the resulting sum by the number of cases.

The mean is the mathematical average of a set of numbers. The average is calculated by

adding up two or more scores and dividing the total by the number of scores.

For ungrouped data

For population data ($x_1, x_2, \ldots, x_N$), the mean is defined by

\[ \mu = \frac{\sum_{i=1}^{N} x_i}{N} \]

For sample data ($x_1, x_2, \ldots, x_n$), the mean is defined by

\[ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} \]

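For grouped data

For sample data, the mean is defined by

\[ \bar{x} = \frac{\sum_{i=1}^{k} f_i x_i}{n} \]

x_i      f_i                         f_i x_i
x_1      f_1                         f_1 x_1
x_2      f_2                         f_2 x_2
.        .                           .
.        .                           .
.        .                           .
x_k      f_k                         f_k x_k
Total    $\sum_{i=1}^{k} f_i = n$    $\sum_{i=1}^{k} f_i x_i$

For frequency distribution

For sample data, the mean is defined by

\[ \bar{x} = \frac{\sum_{i=1}^{k} f_i x_i^*}{n} \]

where $x_i^*$ is the midpoint of the i-th class:

\[ x_i^* = \frac{a_i + b_i}{2} \]

Score        f_i                         x_i*     f_i x_i*
a_1 – b_1    f_1                         x_1*     f_1 x_1*
a_2 – b_2    f_2                         x_2*     f_2 x_2*
.            .                           .        .
.            .                           .        .
.            .                           .        .
a_k – b_k    f_k                         x_k*     f_k x_k*
Total        $\sum_{i=1}^{k} f_i = n$             $\sum_{i=1}^{k} f_i x_i^*$

If the class width is the same for all class intervals, the mean can also be computed by

\[ \bar{x} = x_0^* + w \left( \frac{\sum_{i=1}^{k} f_i c_i}{n} \right) \]

where $x_0^*$ is the midpoint of the class coded $c_i = 0$, $w$ is the class width, and $c_i$ is the code of the i-th class (the number of classes it lies above or below the class coded 0).

Example:

Scores      f_i    x_i*    f_i x_i*    c_i    f_i c_i
31 – 40       4    35.5      142.0      -4      -16
41 – 50       3    45.5      136.5      -3       -9
51 – 60      11    55.5      610.5      -2      -22
61 – 70      21    65.5     1375.5      -1      -21
71 – 80      33    75.5     2491.5       0        0
81 – 90      15    85.5     1282.5       1       15
91 – 100      3    95.5      286.5       2        6
Σ            90             6325.0              -47

\[ \bar{x} = x_0^* + w \left( \frac{\sum_{i=1}^{k} f_i c_i}{n} \right) = 75.5 + 10 \times \left( \frac{-47}{90} \right) = 70.278 \]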
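The example can be checked by computing the mean both ways. This is a minimal sketch: the direct formula uses the class midpoints, and the coded formula assumes, as in the table, that the class 71 – 80 is coded 0.

```python
# Frequency distribution from the example.
classes = [(31, 40), (41, 50), (51, 60), (61, 70), (71, 80), (81, 90), (91, 100)]
freqs = [4, 3, 11, 21, 33, 15, 3]
n = sum(freqs)

# Direct method: mean = sum(f_i * x_i*) / n, with x_i* the class midpoint.
midpoints = [(a + b) / 2 for a, b in classes]
mean_direct = sum(f * x for f, x in zip(freqs, midpoints)) / n

# Coded method: mean = x0* + w * sum(f_i * c_i) / n.
width = 10          # class width (equal for all classes)
x0 = 75.5           # midpoint of the class coded 0 (assumed: 71 - 80)
codes = [round((x - x0) / width) for x in midpoints]
mean_coded = x0 + width * sum(f * c for f, c in zip(freqs, codes)) / n

print(round(mean_direct, 3), round(mean_coded, 3))
```

Both methods give the same value; the coded method only replaces large midpoints by small integers, which made hand calculation easier.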

The Geometric Mean

The geometric mean is a measure of central tendency which is calculated by multiplying a series of numbers and taking the nth root of the product, where n is the number of items in the series.

For Ungrouped Data

The geometric mean $\bar{x}_G$ of a set of n positive numbers $x_1, x_2, \ldots, x_n$ is the nth root of the product of the numbers:

\[ \bar{x}_G = \sqrt[n]{x_1 x_2 x_3 \cdots x_n} \;\Rightarrow\; \log \bar{x}_G = \frac{1}{n} \sum_{i=1}^{n} \log x_i = \Delta \;\Rightarrow\; \bar{x}_G = 10^{\Delta} \]

Example:

The geometric mean of the numbers 2, 4, and 8 is $\sqrt[3]{(2)(4)(8)} = \sqrt[3]{64} = 4$.

For Grouped Data

The geometric mean for grouped data is obtained by calculating the frequency-weighted mean of the logarithms of the values, then converting this mean back from a base-10 logarithm to a number.

\[ \bar{x}_G = \sqrt[n]{\prod_{i=1}^{k} x_i^{f_i}} \;\Rightarrow\; \log \bar{x}_G = \frac{1}{n} \sum_{i=1}^{k} f_i \log x_i = \Delta \;\Rightarrow\; \bar{x}_G = 10^{\Delta} \]

Example:

Find the geometric mean of this sample data.

Mark    Frequency
61          5
64         18
67         42
70         27
73          8
Σ         100

Solution:

Mark (x_i)    Frequency (f_i)    f_i log x_i
61                  5             8.926649175
64                 18            32.51123953
67                 42            76.69514171
70                 27            49.81764708
73                  8            14.90658288
Σ                 100           182.8572604

The geometric mean is

\[ \log \bar{x}_G = \frac{1}{n} \sum_{i=1}^{k} f_i \log x_i = \frac{1}{100} \times 182.8572604 = 1.828572604 \;\Rightarrow\; \bar{x}_G = 10^{1.828572604} = 67.386 \]
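The same computation can be sketched in a few lines, using base-10 logarithms as in the solution above. The variable names are chosen for this illustration.

```python
import math

# Grouped data from the example: marks and their frequencies.
marks = [61, 64, 67, 70, 73]
freqs = [5, 18, 42, 27, 8]
n = sum(freqs)

# Sum of f_i * log10(x_i), the last column of the solution table.
log_sum = sum(f * math.log10(x) for f, x in zip(freqs, marks))

# log(x_G) = log_sum / n, so x_G = 10 ** (log_sum / n).
geometric_mean = 10 ** (log_sum / n)

print(round(log_sum, 7), round(geometric_mean, 3))
```

Working with logarithms avoids forming the product of 100 factors directly, which would be an extremely large number.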

For Frequency Distribution

The geometric mean for a frequency distribution is obtained by calculating the product of all the values in the data set (each class midpoint raised to the power of its frequency), then taking the nth root of the product, with n being equal to the total frequency.

\[ \bar{x}_G = \sqrt[n]{\prod_{i=1}^{k} \left( x_i^* \right)^{f_i}} \;\Rightarrow\; \log \bar{x}_G = \frac{1}{n} \sum_{i=1}^{k} f_i \log x_i^* = \Delta \;\Rightarrow\; \bar{x}_G = 10^{\Delta} \]

Example:

Calculate the geometric mean for the given below:

Marks 30 – 39 40 – 49 50 – 59 60 – 69 70 – 79 80 – 89 90 – 99

Frequency 2 3 11 20 32 25 7

Solution:

The necessary calculations are given below:

Marks      f_i    x_i*    f_i log x_i*
30 – 39      2    34.5
40 – 49      3    44.5
50 – 59     11    54.5
60 – 69     20    64.5
70 – 79     32    74.5
80 – 89     25    84.5
90 – 99      7    94.5
Σ          100

Now we will find the geometric mean as

\[ \log \bar{x}_G = \frac{1}{n} \sum_{i=1}^{k} f_i \log x_i^* = \frac{1}{100} \times \ldots = \ldots \]




Topic 6 : Measures of Central Tendency (The Harmonic Mean, The Relation Between The Arithmetic Mean, The Geometric Mean and The Harmonic Mean, The Weighted Mean and The Overall Mean)

The Harmonic Mean

The harmonic mean is another way to calculate the average of a set of numbers. For positive numbers, the harmonic mean is never larger than the geometric mean or the arithmetic mean.

For Ungrouped Data

The harmonic mean $\bar{x}_H$ of a set of numbers $x_1, x_2, \ldots, x_n$ is the number of elements divided by the sum of the reciprocals of the elements.

\[ \bar{x}_H = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}} \]

For Grouped Data

The harmonic mean $\bar{x}_H$ for grouped data is

\[ \bar{x}_H = \frac{n}{\sum_{i=1}^{k} \frac{f_i}{x_i}} \]

Example:

The following table gives the ages of first year students of a particular college. Calculate the harmonic mean.

Age (years)           13   14   15   16   17
Number of students     2    5   13    7    3

Solution:

The given table is grouped data: the variable is the age of the first year students, and the numbers of students are the frequencies.

Age (years) x_i    Number of students f_i    f_i / x_i
13                        2
14                        5
15                       13
16                        7
17                        3
Σ

Now we will find the harmonic mean as

\[ \bar{x}_H = \frac{n}{\sum_{i=1}^{k} \frac{f_i}{x_i}} = \frac{\ldots}{\ldots} = \ldots \]


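For Frequency Distribution

The harmonic mean $\bar{x}_H$ for a frequency distribution uses the class midpoints $x_i^*$:

\[ \bar{x}_H = \frac{n}{\sum_{i=1}^{k} \frac{f_i}{x_i^*}} \]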

Example:

Calculate the harmonic mean for the given below:

Marks 30 – 39 40 – 49 50 – 59 60 – 69 70 – 79 80 – 89 90 – 99

Frequency 2 3 11 20 32 25 7

Solution:

The necessary calculations are given below:

Marks      f_i    x_i*    f_i / x_i*
30 – 39      2    34.5     0.0580
40 – 49      3    44.5     0.0674
50 – 59     11    54.5     0.2018
60 – 69     20    64.5     0.3101
70 – 79     32    74.5     0.4295
80 – 89     25    84.5     0.2959
90 – 99      7    94.5     0.0741
Σ          100             1.4368

Now we will find the harmonic mean as

\[ \bar{x}_H = \frac{n}{\sum_{i=1}^{k} \frac{f_i}{x_i^*}} = \frac{100}{1.4368} = 69.60 \]
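The calculation above can be reproduced as follows. This is a small sketch; the midpoints are computed from the class limits, and the names are chosen for this illustration.

```python
# Frequency distribution from the example.
classes = [(30, 39), (40, 49), (50, 59), (60, 69), (70, 79), (80, 89), (90, 99)]
freqs = [2, 3, 11, 20, 32, 25, 7]
n = sum(freqs)

# Class midpoints x_i*.
midpoints = [(a + b) / 2 for a, b in classes]

# Sum of f_i / x_i*, the last column of the table.
reciprocal_sum = sum(f / x for f, x in zip(freqs, midpoints))

# Harmonic mean: n divided by the sum of f_i / x_i*.
harmonic_mean = n / reciprocal_sum

print(round(reciprocal_sum, 4), round(harmonic_mean, 2))
```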

The Relation Between The Arithmetic, Geometric, And Harmonic Means

The geometric mean of a set of positive numbers $x_1, x_2, \ldots, x_n$ is less than or equal to their arithmetic mean but is greater than or equal to their harmonic mean. In symbols,

\[ \bar{x}_H \le \bar{x}_G \le \bar{x} \]

The equality signs hold only if all the numbers $x_1, x_2, \ldots, x_n$ are identical.

Example:

The set 2, 4, 8 has arithmetic mean 4.67; geometric mean 4; and harmonic mean 3.43.
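The three means of this example can be computed side by side. This is a minimal sketch that assumes Python 3.8 or later for `math.prod`.

```python
import math

values = [2, 4, 8]
n = len(values)

arithmetic = sum(values) / n                   # (2 + 4 + 8) / 3
geometric = math.prod(values) ** (1 / n)       # cube root of 2 * 4 * 8
harmonic = n / sum(1 / x for x in values)      # 3 / (1/2 + 1/4 + 1/8)

print(round(arithmetic, 2), round(geometric, 2), round(harmonic, 2))
```

The cube root is evaluated in floating point, so the geometric mean is rounded before it is compared with the exact value 4.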

The Weighted Mean

The weighted mean is a mean where there are some variations in the relative contribution of individual data values to the mean. Each data value ($x_i$) has a weight assigned to it ($w_i$). Data values with larger weights contribute more to the weighted mean and data values with smaller weights contribute less to the weighted mean. The formula is

\[ \bar{x}_w = \frac{\sum_{i=1}^{k} w_i x_i}{\sum_{i=1}^{k} w_i} \]

There are several reasons why you might want to use a weighted mean.

1. Each individual data value might actually represent a value that is used by multiple

people in your sample. The weight, then, is the number of people associated with that

particular value.

2. Your sample might deliberately over represent or under represent certain segments of

the population. To restore balance, you would place less weight on the over

represented segments of the population and greater weight on the under represented

segments of the population.

3. Some values in your data sample might be known to be more variable (less precise)

than other values. You would place greater weight on those data values known to have

greater precision.

The Overall Mean

The overall mean (also called the grand mean, pooled mean, or common mean) is the appropriate way to combine arithmetic means from several samples. The formula for the overall mean is

\[ \bar{x}_O = \frac{\sum_{i=1}^{k} n_i \bar{x}_i}{\sum_{i=1}^{k} n_i} \]

where $n_i$ is the size of the i-th sample, $\bar{x}_i$ is the mean of the i-th sample, and k is the number of samples being considered. The numerator of the overall mean formula is therefore the sum of all the data values in all the samples. As the denominator is the sum of the sample sizes, the overall mean is really the sum of all the data values divided by the number of values.

Example:

The effects of a new blood-pressure drug are being studied in three different hospitals. One measurement taken from groups of female patients in each hospital before and after treatment is resting heart rate in beats per minute. The results for this measurement when taken before treatment are: Hospital 1, $n_1$ = 30 patients, $\bar{x}_1$ = 76.2 beats/min; Hospital 2, $n_2$ = 25 patients, $\bar{x}_2$ = 79.3 beats/min; Hospital 3, $n_3$ = 16 patients, $\bar{x}_3$ = 80.1 beats/min. Combine these three arithmetic means to get an overall mean for this pretreatment measurement.

Solution:

\[ \bar{x}_O = \frac{\sum_{i=1}^{3} n_i \bar{x}_i}{\sum_{i=1}^{3} n_i} = \frac{(30 \times 76.2) + (25 \times 79.3) + (16 \times 80.1)}{30 + 25 + 16} = 78.2 \text{ beats/min} \]
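The overall mean is a weighted mean in which the sample sizes act as the weights. The sketch below recomputes the hospital example; the function name `weighted_mean` is chosen for this illustration.

```python
def weighted_mean(values, weights):
    """Sum of weight * value divided by the sum of the weights."""
    return sum(w * x for w, x in zip(weights, values)) / sum(weights)


sample_sizes = [30, 25, 16]           # n_1, n_2, n_3
sample_means = [76.2, 79.3, 80.1]     # mean heart rate per hospital (beats/min)

overall_mean = weighted_mean(sample_means, sample_sizes)
simple_average = sum(sample_means) / len(sample_means)

print(round(overall_mean, 1), round(simple_average, 1))
```

The unweighted average of the three hospital means is slightly larger, because it gives the smallest hospital the same influence as the largest one.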




Topic 7 : Measures of Central Tendency (Median and Mode)

Median

The median is the middle value in the list of numbers. To find the median, your numbers have

to be listed in numerical order, so you may have to rewrite your list first.

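For Ungrouped Data and Grouped Data

Let the ordered data be $x_{[1]}, x_{[2]}, \ldots, x_{[n]}$.

If n is an odd number, the formula of the median is

\[ Me = x_{\left[ \frac{n+1}{2} \right]} \]

If n is an even number, the formula of the median is

\[ Me = \frac{x_{\left[ \frac{n}{2} \right]} + x_{\left[ \frac{n}{2} + 1 \right]}}{2} \]

We can use the following formula whether n is odd or even, interpolating between neighbouring values when the position is not a whole number:

\[ Me = x_{\left[ \frac{1}{2}(n+1) \right]} \]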

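Example 1. The set of numbers 3, 4, 4, 5, 6, 8, 8, 8, and 10. Find the median.

Solution:

The data is ordered and n = 9.

\[ Me = x_{\left[ \frac{n+1}{2} \right]} = x_{\left[ \frac{9+1}{2} \right]} = x_{[5]} = 6 \]

Median is 6.

Example 2. The set of numbers 5, 5, 7, 9, 11, 12, 15, and 18. Find the median.

Solution:

The data is ordered and n = 8.

\[ Me = x_{\left[ \frac{1}{2}(n+1) \right]} = x_{\left[ \frac{1}{2}(8+1) \right]} = x_{[4.5]} = x_{[4]} + 0.5 \left( x_{[5]} - x_{[4]} \right) = 9 + 0.5 (11 - 9) = 10 \]

Or

\[ Me = \frac{x_{\left[ \frac{n}{2} \right]} + x_{\left[ \frac{n}{2} + 1 \right]}}{2} = \frac{x_{\left[ \frac{8}{2} \right]} + x_{\left[ \frac{8}{2} + 1 \right]}}{2} = \frac{x_{[4]} + x_{[5]}}{2} = \frac{9 + 11}{2} = 10 \]

Median is 10.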

Example 3.

Given the following frequency distribution of first year students of a particular college. Calculate the median.

Age (years)           13   14   15   16   17
Number of students     2    5   13    7    3

Solution:

The data is ordered and n = 30 is even.

Age (years) x_i                  13   14   15   16   17
Number of students (f_i)          2    5   13    7    3
Cumulative frequency (f_cum)      2    7   20   27   30

\[ Me = x_{\left[ \frac{1}{2}(n+1) \right]} = x_{\left[ \frac{1}{2}(30+1) \right]} = x_{[15.5]} = x_{[15]} + 0.5 \left( x_{[16]} - x_{[15]} \right) = 15 + 0.5 (15 - 15) = 15 \]

The median for first year students of a particular college is 15.

For a frequency distribution, the median, obtained by interpolation, is given by

\[ Me = l + w \left( \frac{\frac{n}{2} - F}{f} \right) \]

where

l = lower boundary of the interval containing the median

w = width of the interval containing the median

n = total number of frequencies

F = cumulative frequency before the median class

f = number of cases in the interval containing the median

See the frequency distribution below,

Score      f_i    f_cum
 1 – 10     40      40
11 – 20     70     110
21 – 30     30     140
31 – 40    100     240
41 – 50     60     300
∑          300

The class interval of the median is the class which contains $x_{\left[ \frac{1}{2} n \right]}$:

\[ x_{\left[ \frac{1}{2} n \right]} = x_{\left[ \frac{1}{2}(300) \right]} = x_{[150]} \rightarrow 31 - 40 \]

The median class interval is 31 – 40, so the lower boundary of this interval is 30.5. The width of this interval is 40.5 – 30.5 = 10.

\[ Me = 30.5 + 10 \left( \frac{150 - 140}{100} \right) = 31.5 \]

Mode

The mode is the value that occurs most often. If no number is repeated, then there is no mode for the list.

For Ungrouped Data

The mode of a set of data is the value in the set that occurs most often.

Example 1. The set of numbers 3, 4, 4, 5, 6, 8, 8, 8, and 10. The mode is 8.

Example 2. The set of numbers 51, 55, 56, 59, 63, 65, 69, 74, 81, 85, 91. There is no mode.

Example 3. In a crash test, 11 cars were tested to determine what impact speed was required to obtain minimal bumper damage. Find the mode of the speeds given in miles per hour below.

24, 15, 18, 20, 18, 22, 24, 26, 18, 26, 24.

Since both 18 and 24 occur three times, the modes are 18 and 24 miles per hour. This data set is bimodal.

For Grouped Data

The mode for grouped data is the value with the largest frequency.

The mode for a frequency distribution lies in the class with the largest class frequency (the mode class). It can be taken as the midpoint of that class or, more precisely, obtained by interpolation:

\[ Mo = l + w \left( \frac{f_1}{f_1 + f_2} \right) \]

where

l = lower boundary of the interval containing the mode

w = width of the interval containing the mode

f_1 = frequency of the mode class – frequency before the mode class

f_2 = frequency of the mode class – frequency after the mode class

See the frequency distribution below,

Score      f_i    f_cum
 1 – 10     40      40
11 – 20     70     110
21 – 30     30     140
31 – 40    100     240
41 – 50     60     300
∑          300

The mode class interval is 31 – 40, so the lower boundary of this interval is 30.5. The width of this interval is 40.5 – 30.5 = 10.

\[ f_1 = 100 - 30 = 70 ; \quad f_2 = 100 - 60 = 40 \]

\[ Mo = l + w \left( \frac{f_1}{f_1 + f_2} \right) = 30.5 + 10 \left( \frac{70}{70 + 40} \right) = 36.864 \approx 37 \]




Topic 8 : Measures of Location (Quartile, Decile, Percentile)

Measures of location indicate the position of particular values within the overall data set. These measures are the quartiles, deciles, percentiles, and the median. The data set has to be sorted first; the formulas below assume the data are sorted from lowest to highest.

Quartile

The quartiles divide the sorted set of data into four equal parts. There are three quartiles, the

first quartile Q1, the second quartile Q2 is the median, and the third quartile Q3.


The formula for the quartiles is

$$Q_i = x_{\left(\frac{i(n+1)}{4}\right)}$$

where i = 1, 2, 3 and n is the total number of data values.

Example 1: The set of numbers 3, 4, 4, 5, 6, 8, 8, 8, and 10. Find the first quartile.

Solution: The data are already ordered and n = 9.

$$Q_1 = x_{\left(\frac{1(9+1)}{4}\right)} = x_{(2.5)} = x_{(2)} + 0.5\left(x_{(3)} - x_{(2)}\right) = 4 + 0.5(4 - 4) = 4$$

The first quartile is 4.
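The positional rule used in Example 1 can be sketched in Python. This is only an illustration: the function name `quantile_by_position` and the handling of positions that fall outside 1..n are assumptions, not part of the handout.

```python
import math

def quantile_by_position(data, i, parts=4):
    """i-th quantile at position i(n+1)/parts, with linear interpolation.

    parts=4 gives quartiles, parts=10 deciles, parts=100 percentiles.
    """
    xs = sorted(data)                 # the rule assumes ascending order
    n = len(xs)
    pos = i * (n + 1) / parts         # 1-based position, may be fractional
    if pos <= 1:                      # assumption: clamp to the smallest value
        return xs[0]
    if pos >= n:                      # assumption: clamp to the largest value
        return xs[-1]
    lower = math.floor(pos)           # index of x_(lower)
    frac = pos - lower                # fractional part of the position
    return xs[lower - 1] + frac * (xs[lower] - xs[lower - 1])

data = [3, 4, 4, 5, 6, 8, 8, 8, 10]
print(quantile_by_position(data, 1))  # first quartile of Example 1
```

The same function with `parts=10` or `parts=100` follows the decile and percentile position formulas of this topic.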

Example 2.

Given the following frequency table of the ages of first-year students at a particular college, calculate the third quartile.

Age (years)           13   14   15   16   17
Number of students     2    5   13    7    3

Solution:

The data are ordered and n = 30.

Age (years), xi                 13   14   15   16   17
Number of students (fi)          2    5   13    7    3
Cumulative frequency (fcum)      2    7   20   27   30

$$Q_3 = x_{\left(\frac{3(30+1)}{4}\right)} = x_{(23.25)} = x_{(23)} + 0.25\left(x_{(24)} - x_{(23)}\right) = 16 + 0.25(16 - 16) = 16$$

The third quartile for first year students of a particular college is 16.



For grouped data, the i-th quartile is

$$Q_i = l + w\left(\frac{\frac{in}{4} - F}{f}\right)$$

where

l = lower boundary of the interval containing the i-th quartile

w = width of the interval containing the i-th quartile

F = cumulative frequency corresponding to the lower boundary l (the total frequency of the classes below that interval)

f = number of cases in the interval containing the i-th quartile

Example: Find the first quartile for the frequency distribution below.

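Score      fi     fcum
1 – 10     40     40
11 – 20    70     110
21 – 30    30     140
31 – 40    100    240
41 – 50    60     300
∑          300

The class interval of the first quartile is the class that contains $x_{\left(\frac{1}{4}n\right)}$:

$$x_{\left(\frac{1}{4}n\right)} = x_{\left(\frac{1}{4}(300)\right)} = x_{(75)} \rightarrow 11 - 20$$

The lower boundary of this interval is 10.5.

$$Q_1 = l + w\left(\frac{\frac{1n}{4} - F}{f}\right) = 10.5 + 10\left(\frac{\frac{1 \times 300}{4} - 40}{70}\right) = 15.5$$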

The first quartile for this frequency distribution is 15.5.
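The grouped-data formula can be checked with a short Python sketch. The function name `grouped_quantile` and its argument layout are illustrative assumptions; the logic is the interpolation l + w(in/parts − F)/f, with the quantile class taken as the first class whose cumulative frequency reaches in/parts.

```python
def grouped_quantile(lower_boundaries, width, freqs, i, parts=4):
    """Interpolated i-th quantile of a frequency distribution.

    lower_boundaries: lower class boundaries (e.g. 0.5, 10.5, ...)
    width: common class width w
    freqs: class frequencies f_i
    parts: 4 for quartiles, 10 for deciles, 100 for percentiles
    """
    n = sum(freqs)
    target = i * n / parts            # in/4, in/10 or in/100
    cum = 0                           # F: cumulative frequency below the class
    for l, f in zip(lower_boundaries, freqs):
        if cum + f >= target:         # first class reaching the target count
            return l + width * (target - cum) / f
        cum += f
    raise ValueError("i must satisfy 0 < i <= parts")

boundaries = [0.5, 10.5, 20.5, 30.5, 40.5]
freqs = [40, 70, 30, 100, 60]
print(grouped_quantile(boundaries, 10, freqs, 1))  # 15.5, as in the example
```

Calling it with `parts=10` or `parts=100` gives the grouped decile and percentile formulas of this topic.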

Decile

A decile of a sorted data set is any of the 9 values that divide the data set into 10

approximately equal parts.



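For ungrouped data,

$$D_i = x_{\left(\frac{i(n+1)}{10}\right)}$$

where i = 1, 2, 3, 4, 5, 6, 7, 8, 9 and n is the total number of data values.

For grouped data,

$$D_i = l + w\left(\frac{\frac{in}{10} - F}{f}\right)$$

where

l = lower boundary of the interval containing the i-th decile

w = width of the interval containing the i-th decile

F = cumulative frequency corresponding to the lower boundary l

f = number of cases in the interval containing the i-th decile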

Percentile

A percentile of a sorted data set is any of the 99 values that divide the data set into 100

approximately equal parts.



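For ungrouped data,

$$P_i = x_{\left(\frac{i(n+1)}{100}\right)}$$

where i = 1, 2, 3, …, 99 and n is the total number of data values.

For grouped data,

$$P_i = l + w\left(\frac{\frac{in}{100} - F}{f}\right)$$

where

l = lower boundary of the interval containing the i-th percentile

w = width of the interval containing the i-th percentile

F = cumulative frequency corresponding to the lower boundary l

f = number of cases in the interval containing the i-th percentile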




Topic 9 : Measures of Variation (Range, Variance, and Standard Deviation)

A measure of variation describes how spread out or scattered a set of data is. Such measures are also known as measures of dispersion or measures of spread. There are three measures of variation: the range, the variance, and the standard deviation.

Range

The range of a data set is a measure of the spread or the dispersion of the observations. It is

the difference between the largest and the smallest data value.

Example:

The heights in cm of ten students are: 157, 152, 165, 151, 160, 156, 155, 162, 158, 163. Find

the range of the data.

Solution: Maximum height = 165, minimum height = 151, so the range = 165 - 151 = 14.

Variance

For Ungrouped Data

Population Variance

Population variance is the arithmetic mean of the squared deviations from the population mean. The formula for the population variance of a set of population data ($x_1, x_2, \ldots, x_N$) is

$$\sigma^2 = \frac{\sum_{i=1}^{N}(x_i - \mu)^2}{N} \;\Rightarrow\; \sigma^2 = \frac{N\sum_{i=1}^{N}x_i^2 - \left(\sum_{i=1}^{N}x_i\right)^2}{N^2}$$

where µ is the population mean and N is the number of population data values.

Sample Variance

Sample variance is a measure of the spread within a set of sample data ($x_1, x_2, \ldots, x_n$). The formula for the sample variance is

$$s^2 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n - 1} \;\Rightarrow\; s^2 = \frac{n\sum_{i=1}^{n}x_i^2 - \left(\sum_{i=1}^{n}x_i\right)^2}{n(n - 1)}$$

where $\bar{x}$ is the sample mean and n is the number of sample data values.

For Grouped Data

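The formula for the sample variance of grouped data is

$$s^2 = \frac{\sum_{i=1}^{k}f_i(x_i - \bar{x})^2}{n - 1}$$

where $f_i$ is the frequency of the value $x_i$ and $n = \sum_{i=1}^{k}f_i$. Remember that the sample mean for grouped data is $\bar{x} = \frac{\sum_{i=1}^{k}f_ix_i}{n}$, so we get

$$s^2 = \frac{n\sum_{i=1}^{k}f_ix_i^2 - \left(\sum_{i=1}^{k}f_ix_i\right)^2}{n(n - 1)}$$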


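The formula for the sample variance of a frequency distribution, with $x_i^*$ the midpoint of the i-th class, is

$$s^2 = \frac{\sum_{i=1}^{k}f_i(x_i^* - \bar{x})^2}{n - 1}$$

Remember that the sample mean for a frequency distribution is $\bar{x} = \frac{\sum_{i=1}^{k}f_ix_i^*}{n}$, so we get

$$s^2 = \frac{n\sum_{i=1}^{k}f_i{x_i^*}^2 - \left(\sum_{i=1}^{k}f_ix_i^*\right)^2}{n(n - 1)}$$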

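If the class width w is the same for all class intervals, the sample variance can be computed from the coded values $c_i$ of the classes by

$$s^2 = w^2\left(\frac{n\sum_{i=1}^{k}f_ic_i^2 - \left(\sum_{i=1}^{k}f_ic_i\right)^2}{n(n - 1)}\right)$$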

Standard Deviation

Standard deviation is calculated by taking the square root of the variance.

Population standard deviation is $\sigma = \sqrt{\sigma^2}$.

Sample standard deviation is $s = \sqrt{s^2}$.
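As a quick numerical check of the two equivalent forms of the sample variance, the sketch below applies both to the ten student heights from the range example. The function names are illustrative assumptions, not notation from the handout.

```python
import math

def sample_variance_definition(xs):
    """s^2 = sum (x_i - mean)^2 / (n - 1)."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

def sample_variance_shortcut(xs):
    """s^2 = (n * sum x_i^2 - (sum x_i)^2) / (n (n - 1))."""
    n = len(xs)
    return (n * sum(x * x for x in xs) - sum(xs) ** 2) / (n * (n - 1))

heights = [157, 152, 165, 151, 160, 156, 155, 162, 158, 163]
s2 = sample_variance_shortcut(heights)
s = math.sqrt(s2)                      # sample standard deviation
print(s2, sample_variance_definition(heights), s)
```

Both functions should agree up to floating-point rounding; the shortcut form avoids computing the mean first, which is why it is convenient for hand calculation.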

Exercises

1. The following stem-and-leaf plot summarizes the exam scores of a sample of 32

statistics students.

4 0

5 8

6 379

7 2445788

8 0013346777899

9 0111268

Determine the following: maximum value, minimum value, mean, standard deviation, median, mode, harmonic mean, geometric mean, the value of the first quartile, and the value of the ninth decile.

2. The table shows the lengths of fish in cm (sample data).

Length (cm)    Number of fish
5 – 9          8
10 – 14        17
15 – 19        20
20 – 24        20
25 – 29        18
30 – 34        11
35 – 39        6

Calculate the mean and variance of the length of the fish. Also find the mode, the median, and the third quartile.

3. The table shows the number of magazines read by students in a month (sample data).

Magazines    0    1    2    3    4
Frequency    8    11   9    7    5

Find the mean, standard deviation, and fifteenth percentile of the data.




Topic 10 : Combinatorics

In many experiments with finite possible results, such as tossing one die, it may be reasonable

to assume that all the possible results are equally likely. In that case, a realistic probability

model should be solved by simply counting the number of different ways that a certain event

can occur. The mathematical theory of counting is formally known as combinatorial analysis.

There are three counting rules: the multiplication principle, permutations, and combinations.

Any arrangement of the outcomes in a unique and defined order is a permutation of the

outcomes. Any arrangement without regard to order is a combination of the outcomes. The

fundamental tool for deriving the formulas for the number of permutations and the number of combinations is the multiplication principle.

Multiplication Principle

If r experiments that are to be performed are such that the first one may result in any of n1 possible outcomes, and if for each of these n1 possible outcomes there are n2 possible outcomes of the second experiment, and if for each of the possible outcomes of the first two experiments there are n3 possible outcomes of the third experiment, and so on, then there are a total of n1 × n2 × … × nr possible outcomes of the r experiments.

Example 1

How many different 7-place license plates are possible if the first 3 places are to be occupied

by letters and the final 4 by numbers? How many license plates would be possible if repetition

among letters or numbers is prohibited?

Solution: (a) 26×26×26×10×10×10×10 = 175760000.

(b) 26×25×24×10×9×8×7 = 78624000.
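The two counts in Example 1 can be reproduced with Python's standard library. This is a small sketch; the variable names are illustrative, and `math.perm` requires Python 3.8 or later.

```python
import math

# (a) repetition allowed: 26 choices for each letter, 10 for each digit
with_repetition = 26 ** 3 * 10 ** 4

# (b) repetition prohibited: ordered selections without replacement
without_repetition = math.perm(26, 3) * math.perm(10, 4)

print(with_repetition)     # 175760000
print(without_repetition)  # 78624000
```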

Example 2

How many three-letter words can be formed from the last four letters of the alphabet (W, X, Y,

and Z), if each letter can be used more than once in a word?

Solution:

Consider this experiment as drawing three times from a group of four letters. After a letter is

drawn, it is returned to the group and thus is available again for the next draw. The first trial

has four possible outcomes (W, X, Y, and Z), the second trial has the same four possible

outcomes, as does the third trial. Thus, n1 = 4, n2 = 4, n3 = 4, and

# sample points = n1 × n2 × n3 = 4×4×4 = 4^3 = 64

Factorials

The symbol n! (which is read "n factorial") represents the product of all positive integers from n down to 1,

$$n! = n \times (n-1) \times (n-2) \times (n-3) \times \cdots \times 1$$

where the dots indicate that not all of the multiplications are shown.

Permutation

A permutation of a set of objects is a listing of the objects in some specified order. The

number of different permutations of n different objects is given by n!.


Example 3

A baseball team has nine players. How many different possible batting orders are there once it

has been decided who the starting players will be?

Solution:

There are 9 players. The number of ways to write down a batting order is 9! = 362,880.

Example 4

With three different letters, ABC, then n = 3 and 3! = 6, corresponding to the six possible

arrangements which are: ABC, ACB, BAC, BCA, CAB, CBA.

Example 5

3 boys and 4 girls have bought tickets for a row of 7 seats at a movie.

a. In how many ways can they arrange themselves in the seats?

b. In how many ways can they arrange themselves if the boys all sit together and the girls

sit together?

c. In how many ways can they arrange themselves if no one sits beside a person of the

same sex?

Solution:

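a. 7! = 5040

b. The boys form one block and the girls another, for example B B B G G G G. The two blocks can be ordered in 2! ways, the boys within their block in 3! ways, and the girls within theirs in 4! ways:

2! × 3! × 4! = 288

c. With 4 girls and 3 boys, the only alternating pattern is G B G B G B G. The girls can fill their seats in 4! ways and the boys in 3! ways:

3! × 4! = 144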
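The three answers of Example 5 can be verified by brute force, since there are only 7! = 5040 seatings to enumerate. The labels `B1`, `G1`, and so on, and the helper names, are hypothetical and used only for this sketch.

```python
from itertools import permutations

people = ["B1", "B2", "B3", "G1", "G2", "G3", "G4"]

def pattern(seating):
    """Reduce a seating to its boy/girl pattern, e.g. 'GBGBGBG'."""
    return "".join(person[0] for person in seating)

def count(predicate):
    """Count the seatings whose boy/girl pattern satisfies the predicate."""
    return sum(1 for s in permutations(people) if predicate(pattern(s)))

total = count(lambda pat: True)
blocks = count(lambda pat: pat in ("BBBGGGG", "GGGGBBB"))
alternating = count(lambda pat: all(a != b for a, b in zip(pat, pat[1:])))

print(total, blocks, alternating)  # 5040 288 144
```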

An ordered arrangement of r objects chosen from n objects is called a permutation of r objects chosen from n objects (written ${}^nP_r$, ${}_nP_r$, or $P_r^n$).

$$P_r^n = \frac{n!}{(n - r)!}$$

Example 6

How many different arrangements can be made by taking 5 letters of the word "numbers"?

Solution: The word has 7 different letters, so the number of permutations is $P_5^7 = \frac{7!}{2!} = 2520$.

Example 7

If you have five clean shirts and are going to pick one to wear on Saturday and another

(different) one to wear on Sunday, how many possible ways can you make your choice?

Solution:

$P_2^5 = \frac{5!}{3!} = 20$ ways.

Combination

A set of r objects chosen from a set of n objects is called a combination of r objects chosen from n objects (written $\binom{n}{r}$, $C_r^n$, ${}_nC_r$, or ${}^nC_r$). The number of different combinations of r objects that may be chosen from n given objects is

$$C_r^n = \frac{n!}{(n - r)!\,r!}$$

Example 8

A class consists of 14 boys and 17 girls. Four students from the class are to be selected to go

on a trip.

a. How many different possibilities are there for the 4 students selected to make the trip?

b. If it has been decided that 2 boys and 2 girls will make the trip, then in how many

different ways could the 4 students be selected?

Solution:

a. $C_4^{31} = \frac{31!}{4!\,27!} = 31465$.

b. $C_2^{14} \times C_2^{17} = 91 \times 136 = 12376$

Example 9

How many different committees of 3 could be formed from 8 people? If Anne is one of the 8

people, how many different committees could be formed with Anne as a committee member?

Solution: There are $C_3^8 = 56$ possible committees.

With Anne as a member, the other 2 members are chosen from the remaining 7 people, so the answer is $C_1^1 \times C_2^7 = 21$ possible committees.
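The combination counts in Examples 8 and 9 can be checked with `math.comb` (available from Python 3.8). The variable names below are illustrative only.

```python
import math

# Example 8: a class of 14 boys and 17 girls, 4 students selected
any_four = math.comb(31, 4)
two_boys_two_girls = math.comb(14, 2) * math.comb(17, 2)

# Example 9: committees of 3 from 8 people, then with one fixed member
committees = math.comb(8, 3)
committees_with_anne = math.comb(1, 1) * math.comb(7, 2)

print(any_four, two_boys_two_girls, committees, committees_with_anne)
```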

Binomial Expansion

A general expression for $(x + y)^n$, where n is any positive integer, is of the form

$$(x + y)^n = \sum_{r=0}^{n}\binom{n}{r}x^ry^{n-r} = \binom{n}{0}x^0y^n + \binom{n}{1}x^1y^{n-1} + \binom{n}{2}x^2y^{n-2} + \cdots + \binom{n}{n}x^ny^0$$

where the binomial coefficients $\binom{n}{0}$, $\binom{n}{1}$, $\binom{n}{2}$, …, $\binom{n}{n}$ represent the number of ways in which the corresponding terms $x^0y^n$, $x^1y^{n-1}$, $x^2y^{n-2}$, …, $x^ny^0$ can be formed in the expansion.

Example 10

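Find the binomial coefficient of $x^4y^6$ in the expansion of $(x + y)^{10}$.

Solution:

$$(x + y)^{10} = \sum_{i=0}^{10}\binom{10}{i}x^iy^{10-i} \;\rightarrow\; \binom{10}{4}x^4y^6$$

So the coefficient is $\binom{10}{6}$ or, equivalently, $\binom{10}{4}$, which equals 210.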

Exercises

1. In how many ways can 5 differently colored marbles be arranged in a row?

2. It is required to seat 5 men and 4 women in a row so that the women occupy the even

places. How many such arrangements are possible?

3. Four different mathematics books, 6 different physics books, and 2 different chemistry

books are to be arranged on a shelf. How many different arrangements are possible if

(a) the books in each particular subject must all stand together and

(b) only the mathematics books must stand together?

4. In how many ways can 7 people be seated at a round table if

(a) they can sit anywhere and

(b) 2 particular people must not sit next to each other?

5. Out of 5 mathematicians and 7 physicists, a committee consisting of 2 mathematicians

and 3 physicists is to be formed. In how many ways can this be done if

(a) any mathematician and any physicist can be included,

(b) one particular physicist must be on the committee, and

(c) two particular mathematicians cannot be on the committee?

6. From 7 consonants and 5 vowels, how many words can be formed consisting of 4

different consonants and 3 different vowels? The words need not have meaning.





Topic 11 : Probability

To work with probability, we need the notions of event and sample space. An event is any collection of results or outcomes of a procedure, and is denoted by a capital letter A, B, C, etc.

The sample space for a procedure consists of all possible simple events. That is, the sample space consists of all outcomes that cannot be broken down any further, and is denoted by S.

Definition of Classic Probability

Suppose that an event A can happen in a ways out of a total of n possible equally likely ways. Then the probability of occurrence of the event is

$$P(A) = \frac{a}{n}$$

Example 1

Find the probability that when a couple has 3 children, they will have exactly 2 boys. Assume

that boys and girls are equally likely and that the gender of any child is not influenced by the

gender of any other child.

Solution:

The sample space consists of the 8 different ways in which 3 children can be born.

1st     2nd     3rd
boy     boy     boy
boy     boy     girl    (exactly 2 boys)
boy     girl    boy     (exactly 2 boys)
girl    boy     boy     (exactly 2 boys)
boy     girl    girl
girl    boy     girl
girl    girl    boy
girl    girl    girl

From 8 different possible outcomes, 3 correspond to exactly 2 boys, so

P(2 boys in 3 births) = 3/8 = 0.375.
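The classical definition P(A) = a/n can be illustrated by enumerating the sample space of Example 1 in Python. The variable names are illustrative; `Fraction` keeps the result exact.

```python
from fractions import Fraction
from itertools import product

# All 2^3 equally likely birth orders for three children
sample_space = list(product(["boy", "girl"], repeat=3))

# Event A: exactly two of the three children are boys
event = [outcome for outcome in sample_space if outcome.count("boy") == 2]

p_two_boys = Fraction(len(event), len(sample_space))
print(p_two_boys, float(p_two_boys))  # 3/8 0.375
```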

The Properties of Probabilities

These are some properties of probabilities:

1. For the sample space S, $P(S) = 1$

2. For the empty event ∅ in S, $P(\emptyset) = 0$

3. For any event A in S, $0 \le P(A) \le 1$

4. For an event A and its complement $A^c$, $P(A) + P(A^c) = 1$

5. For events A and B in S, $P(A \cup B) = P(A) + P(B) - P(A \cap B)$

6. If events A and B in S are mutually exclusive, then $P(A \cup B) = P(A) + P(B)$

7. If events $A_1, A_2, \ldots, A_k$ in S are all mutually exclusive, then

$$P(A_1 \cup A_2 \cup \cdots \cup A_k) = P(A_1) + P(A_2) + \cdots + P(A_k)$$

Example 2

A ball is drawn at random from a box containing 6 red balls, 4 white balls, and 5 blue balls.

Determine the probability that the ball drawn is (a) red, (b) white, (c) blue, (d) not red, and (e)

red or white.

Solution:

Let R, W, and B denote the events of drawing a red ball, white ball, and blue ball,

respectively. Then

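a. $P(R) = \frac{n(R)}{n(S)} = \frac{\text{ways of choosing a red ball}}{\text{total ways of choosing a ball}} = \frac{6}{6 + 4 + 5} = \frac{6}{15} = \frac{2}{5}$.

b. $P(W) = \frac{4}{6 + 4 + 5} = \frac{4}{15}$

c. $P(B) = \frac{5}{6 + 4 + 5} = \frac{5}{15} = \frac{1}{3}$

d. $P(R^c) = 1 - P(R) = 1 - \frac{2}{5} = \frac{3}{5}$

e. $P(R \cup W) = P(R) + P(W) = \frac{6}{15} + \frac{4}{15} = \frac{10}{15} = \frac{2}{3}$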

Exercises

1. Three cards are drawn from a deck of 52 cards. Find the probability that

(a) two are jacks and one is a king,

(b) all cards are of one suit,

(c) all cards are of different suits, and

(d ) at least two aces are drawn.

2. It happens that 4 hotels in a certain large city have the same name, e.g., Grand Hotel.

Four persons make an appointment to meet at the Grand Hotel. If each one of the 4

persons chooses the hotel at random, calculate the following probabilities:

a. All 4 choose the same hotel.

b. All 4 choose different hotels.

3. A and B play 12 games of chess, of which 6 are won by A, 4 are won by B, and 2 end

in a draw. They agree to play a match consisting of 3 games. Find the probability that

a. A wins all 3 games,

b. 2 games end in a draw,

c. A and B win alternately, and

d. B wins at least 1 game.

4. Find the probability of boys and girls in families with three children, assuming equal

probabilities for boys and girls.





Topic 12 : Conditional Probability, Independent Event, and Bayes Theorem

Conditional Probability

If A and B are events, then the probability of A given B is

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}, \quad P(B) > 0$$

If all outcomes are equally likely, then we can also use the alternative formula

$$P(A \mid B) = \frac{n(A \cap B)}{n(B)}$$

(Recall that n(B) means the number of outcomes in the event B)

Example 1

Your neighbor has 2 children. You learn that he has a son, Joe. What is the probability that

Joe’s sibling is a brother?

Solution:

The “obvious” answer, that Joe’s sibling is equally likely to have been born male or female, suggests that the probability the other child is a boy is ½. This is not correct.

Consider the experiment of selecting a random family having two children and recording whether they are boys or girls. Then the sample space is $S = \{BB, BG, GB, GG\}$, where the outcome “BG” means that the first-born child is a boy and the second-born is a girl. Assuming boys and girls are equally likely to be born, the 4 elements of S are equally likely.

The event E, that the neighbor has a son, is the set $E = \{BB, BG, GB\}$. The event F, that the neighbor has two boys (i.e. Joe has a brother), is the set $F = \{BB\}$.

We want to compute

$$P(F \mid E) = \frac{P(F \cap E)}{P(E)} = \frac{P(\{BB\})}{P(\{BB, BG, GB\})} = \frac{1/4}{3/4} = \frac{1}{3}$$
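The counting form P(F|E) = n(F ∩ E)/n(E) for this example can be sketched by enumerating the four equally likely families. The set names follow the example; the rest is illustrative.

```python
from fractions import Fraction
from itertools import product

# Sample space: first-born and second-born child, each B or G
S = {"".join(children) for children in product("BG", repeat=2)}

E = {family for family in S if "B" in family}   # at least one son
F = {"BB"}                                      # two boys

p_F_given_E = Fraction(len(F & E), len(E))
print(p_F_given_E)  # 1/3
```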

Example 2

Consider a population of individuals, none of whom are aware that they have diabetes. Suppose that 10000 individuals live in the population under study: 1000 have undiagnosed diabetes and 9000 do not have diabetes. Of the 1000 with undiagnosed diabetes, 950 would be expected to test positive. Of the 9000 without diabetes, 900 would be expected to test positive. An individual is then selected at random from the 10000 individuals to take the diabetes test.

Let D be the event that the selected individual has undiagnosed diabetes, and T the event that the test indicates diabetes.

Find (a) P(D), (b) P(D^c), (c) P(T), (d) P(T^c), (e) P(D|T), (f) P(D|T^c), (g) P(D|D^c), (h) P(T|D), (i) P(T|D^c), (j) P(T|T^c)

          D       D^c     Total
T         950     900     1850
T^c       50      8100    8150
Total     1000    9000    10000
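The handout leaves Example 2 unsolved; the sketch below shows one way to read the probabilities off the two-way table by counting. The dictionary layout and helper names (`n`, `p`, `p_d_given_t`, `p_t_given_d`) are assumptions made for illustration.

```python
from fractions import Fraction

# Cell counts of the two-way table: (disease status, test result)
counts = {("D", "T"): 950, ("Dc", "T"): 900,
          ("D", "Tc"): 50, ("Dc", "Tc"): 8100}
total = sum(counts.values())

def n(d=None, t=None):
    """Number of individuals matching the given row and/or column."""
    return sum(c for (di, ti), c in counts.items()
               if (d is None or di == d) and (t is None or ti == t))

def p(d=None, t=None):
    """Marginal or joint probability from the counts."""
    return Fraction(n(d, t), total)

def p_d_given_t(d, t):
    """P(disease status d | test result t) = n(d and t) / n(t)."""
    return Fraction(n(d, t), n(t=t))

def p_t_given_d(t, d):
    """P(test result t | disease status d) = n(d and t) / n(d)."""
    return Fraction(n(d, t), n(d=d))

print(p(d="D"), p(t="T"), p_d_given_t("D", "T"), p_t_given_d("T", "D"))
```

Parts (g) and (j) condition an event on its own complement, so both probabilities are 0 without any calculation.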

Independent Events

The events A and B are independent if any one of the following three equivalent conditions holds.

$P(A \cap B) = P(A)\,P(B)$

$P(A \mid B) = P(A)$ (B has no effect on A)

$P(B \mid A) = P(B)$ (A has no effect on B)

Intuitively, two events are independent if the occurrence of one has no effect on the

probability of the other. If two events E and F are not independent, then they are dependent.

Example 3

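Let A and B be independent events with $P(A) = \frac{1}{4}$ and $P(A \cup B) = 2P(B) - P(A)$. Find (a) $P(B)$, (b) $P(A \mid B)$, and (c) $P(B^c \mid A)$.

Solution:

a) We know that $P(A \cup B) = P(A) + P(B) - P(A \cap B) = P(A) + P(B) - P(A)P(B)$ (since A and B are independent). Thus,

$$\begin{aligned}
2P(B) - P(A) &= P(A) + P(B) - P(A)P(B) \\
2P(B) - \tfrac{1}{4} &= \tfrac{1}{4} + P(B) - \tfrac{1}{4}P(B) \\
\tfrac{5}{4}P(B) &= \tfrac{2}{4} \\
P(B) &= \tfrac{2}{5}
\end{aligned}$$

So P(B) = 2/5.

b) Since A and B are independent, $P(A \mid B) = P(A) = \frac{1}{4}$.

c) $P(B \cap A) + P(B^c \cap A) = P(A) = \frac{1}{4}$, so

$$P(B^c \mid A) = \frac{P(B^c \cap A)}{P(A)} = \frac{P(A) - P(B \cap A)}{P(A)} = \frac{\frac{1}{4} - \frac{2}{5} \cdot \frac{1}{4}}{\frac{1}{4}} = \frac{3}{5}$$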

Bayes Theorem

Given two dependent events A and B, the previous formulas for conditional probability allow

one to find P(A and B) or P(B|A). Related to these formulas is a rule developed by the English

Presbyterian minister Thomas Bayes (1702 – 61). The rule is known as Bayes’ theorem.

The probability of event A, given that event B has subsequently occurred, is

$$P(A \mid B) = \frac{P(A)\,P(B \mid A)}{P(A)\,P(B \mid A) + P(A^c)\,P(B \mid A^c)}$$

In general, assume that $A_i$ represents one of n possible mutually exclusive and exhaustive events and that the conditional probability for the occurrence of $A_i$ given that B has occurred is $P(A_i \mid B)$. In this case, the total probability for the occurrence of B is

$$P(B) = \sum_{i=1}^{n}P(B \mid A_i)\,P(A_i)$$

and the conditional probability that event $A_i$ has occurred given that event B has been observed to occur is given by

$$P(A_i \mid B) = \frac{P(B \mid A_i)\,P(A_i)}{P(B)} = \frac{P(B \mid A_i)\,P(A_i)}{\sum_{j=1}^{n}P(B \mid A_j)\,P(A_j)}$$

Example 4

Suppose that a test for a particular disease has a very high success rate. If a tested patient has

the disease, the test accurately reports this, a 'positive', 99% of the time (or, with probability

0.99), and if a tested patient does not have the disease, the test accurately reports that, a

'negative', 95% of the time (i.e. with probability 0.95). Suppose also, however, that only 0.1%

of the population have that disease (i.e. with probability 0.001). What is the probability that a patient has the disease, given that the test returns a positive result?

Solution:

Let A be the event that the patient has the disease, and B be the event that the test returns a

positive result.

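$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{P(A)\,P(B \mid A)}{P(A)\,P(B \mid A) + P(A^c)\,P(B \mid A^c)} = \frac{0.001 \times 0.99}{0.001 \times 0.99 + 0.999 \times 0.05} = 0.019$$

The probability that a patient has the disease, given that the test returns a positive result, is 0.019.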
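The calculation in Example 4 can be sketched as a small Python function. The name `posterior` and the parameter names `sensitivity` and `specificity` are illustrative labels for P(B|A) and P(B^c|A^c); they do not appear in the handout.

```python
def posterior(prior, sensitivity, specificity):
    """P(disease | positive test) by Bayes' theorem for two events.

    prior       = P(A)
    sensitivity = P(B | A)
    specificity = P(B^c | A^c), so P(B | A^c) = 1 - specificity
    """
    p_pos_given_healthy = 1 - specificity
    # total probability of a positive result, P(B)
    p_pos = prior * sensitivity + (1 - prior) * p_pos_given_healthy
    return prior * sensitivity / p_pos

print(round(posterior(0.001, 0.99, 0.95), 3))  # 0.019
```

Even with an accurate test, the posterior stays small because the disease is rare: most positive results come from the large healthy group.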

[Tree diagram: first branches P(A) = 0.001 and P(A^c) = 0.999; from A, P(B|A) = 0.99; from A^c, P(B^c|A^c) = 0.95 and P(B|A^c) = 0.05]

Exercises

1. Pregnancy test results are summarized in the table below.

                            Positive test result         Negative test result
                            (Pregnancy is indicated)     (Pregnancy is not indicated)
Subject is pregnant         80                           5
Subject is not pregnant     3                            11

a. If one of the 99 test subjects is randomly selected, what is the probability of

getting a subject who is pregnant?

b. A test subject is randomly selected and is given a pregnancy test. What is the

probability of getting a subject who is pregnant, given that the test result is

positive?

c. One of the 99 test subjects is randomly selected. What is the probability of getting

a subject who is not pregnant?

d. A test subject is randomly selected and is given a pregnancy test. What is the

probability of getting a subject who is not pregnant, given that the test result is

negative?

2. Suppose that Bob can decide to go to work by one of three modes of transportation,

car, bus, or commuter train. Because of high traffic, if he decides to go by car, there is

a 50% chance he will be late. If he goes by bus, which has special reserved lanes but is

sometimes overcrowded, the probability of being late is only 20%. The commuter train

is almost never late, with a probability of only 1%, but is more expensive than the bus.

a. Suppose that Bob is late one day, and his boss wishes to estimate the probability

that he drove to work that day by car. Since he does not know which mode of

transportation Bob usually uses, he gives a prior probability of 1/3 to each of the

three possibilities. What is the boss’ estimate of the probability that Bob drove to

work?

b. Suppose that a coworker of Bob’s knows that he almost always takes the

commuter train to work, never takes the bus, but sometimes, 10% of the time,

takes the car. What is the coworker's probability that Bob drove to work that day,

given that he was late?





Topic 13 : Random Variables

A random variable is a real-valued function defined over a sample space S. Random

variables can be classified into two categories i.e. discrete or continuous.

Discrete Random Variables

A random variable X is said to be discrete if it can assume only a finite number of distinct values $\{x_1, x_2, \ldots, x_n\}$ or a countably infinite number of distinct values $\{x_1, x_2, \ldots\}$.

Example 1

Suppose a coin is tossed three times. Let H = head and T = tail. The sample space is S = {HHH, HHT, HTH, THH, HTT, THT, TTH, TTT}. Let X denote the number of heads that turn up. Then X takes the values {0, 1, 2, 3}.

HHH → 3 heads

HHT, HTH, and THH → 2 heads

HTT, THT, and TTH → 1 head

TTT → 0 head

The probability mass function of X is defined by

$$f(x) = P(X = x)$$

$f(x)$ is a probability mass function of a discrete random variable if it satisfies

1) $f(x) \ge 0, \ \forall x$

2) $\sum_{x} f(x) = 1$

Example 2

From Example 1, we get the probability mass function

x      f(x) = P(X = x)
0      1/8
1      3/8
2      3/8
3      1/8

that is,

$$f(x) = \begin{cases} \frac{1}{8}, & x = 0, 3 \\ \frac{3}{8}, & x = 1, 2 \end{cases}$$

Show that $f(x)$ is a probability mass function.

Solution:

(i) The first property:

For x = 0, 1, 2, 3 → $f(x) > 0$

For all other x → $f(x) = 0$

and so $f(x) \ge 0, \ \forall x$.

(ii) The second property:

$$\sum_{x} f(x) = \frac{1}{8} + \frac{3}{8} + \frac{3}{8} + \frac{1}{8} = 1$$

From (i) and (ii), we can conclude that $f(x)$ is a probability mass function.
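The probability mass function of the number of heads in three tosses can also be built by enumerating the sample space, which makes both properties easy to check. The variable names are illustrative, and a fair coin is assumed so that all eight outcomes are equally likely.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space of three tosses: HHH, HHT, ..., TTT
outcomes = list(product("HT", repeat=3))

# X = number of heads; count how many outcomes give each value of X
counts = Counter(outcome.count("H") for outcome in outcomes)
pmf = {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}

print(pmf)                                   # values 1/8, 3/8, 3/8, 1/8
print(all(p >= 0 for p in pmf.values()))     # property 1
print(sum(pmf.values()) == 1)                # property 2
```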

Continuous Random Variables

A continuous random variable takes values from an uncountable set, and the probability of

any one value is zero, but a set of values can have positive probability.

A random variable X is said to be a continuous random variable if there is a function $f(x)$ (the probability density function or pdf) mapping the real line ℜ into [0,∞) such that for any open interval (a,b),

$$P(X \in (a, b)) = P(a < X < b) = \int_a^b f(x)\,dx$$

The function $f(x)$ is a probability density function of a continuous random variable X if

1) $f(x) \ge 0, \ \forall x \in \Re$

2) $\int_{-\infty}^{\infty} f(x)\,dx = 1$

Example 3

Suppose that the error in the reaction temperature, in °C, for a controlled laboratory

experiment is a continuous random variable X having the probability density function

$$f(x) = \begin{cases} \dfrac{x^2}{3}, & -1 < x < 2 \\ 0, & \text{elsewhere} \end{cases}$$

a) Show that $f(x)$ is a probability density function

b) Find $P(0 < X \le 1)$

Solution:

a) We examine,

(i) For −1 < x < 0 ⇒ f(x) > 0

x = 0 ⇒ f(x) = 0

0 < x < 2 ⇒ f(x) > 0

For all other x ⇒ f(x) = 0

and so we conclude that $f(x) \ge 0, \ \forall x$

(ii) $$\int_{-\infty}^{\infty} f(x)\,dx = \int_{-\infty}^{-1} 0\,dx + \int_{-1}^{2} \frac{x^2}{3}\,dx + \int_{2}^{\infty} 0\,dx = 0 + \left[\frac{x^3}{9}\right]_{-1}^{2} + 0 = \frac{8}{9} + \frac{1}{9} = 1$$

From (i) and (ii), we can conclude that $f(x)$ is a probability density function

b) $$P(0 < X \le 1) = \int_0^1 \frac{x^2}{3}\,dx = \left[\frac{x^3}{9}\right]_0^1 = \frac{1}{9}$$
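Both integrals in Example 3 can be checked numerically. The sketch below uses a simple midpoint rule; the helper `integrate` and the number of steps are assumptions chosen for illustration, not a recommended numerical method.

```python
def f(x):
    """Density from Example 3: x^2 / 3 on (-1, 2), zero elsewhere."""
    return x * x / 3 if -1 < x < 2 else 0.0

def integrate(g, a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (k + 0.5) * h) for k in range(steps))

total_area = integrate(f, -1, 2)     # should be close to 1
p_0_to_1 = integrate(f, 0, 1)        # should be close to 1/9
print(total_area, p_0_to_1)
```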





Topic 14 : Discrete Probability Distribution (Bernoulli and Binomial distributions)

There are two types of probability distribution: discrete probability distributions and continuous probability distributions.

A handful of distributions describe many real-life random phenomena. For instance, in a study testing the effectiveness of a new drug, the number of cured patients among all the patients who use the drug approximately follows a binomial distribution. In an industrial example, when a sample of items selected from a batch of production is tested, the number of defective items in the sample can usually be modeled as a hypergeometric random variable. The number of white cells in a fixed amount of an individual's blood sample is usually random and may be described by a Poisson distribution.

Discrete probability distributions include:

1. Bernoulli distribution

2. Binomial distribution

3. Poisson distribution

4. Hypergeometric distribution

Continuous probability distributions include:

1. Normal distribution

2. Student’s t distribution

3. Chi-square distribution

4. F distribution

Bernoulli Distribution

A Bernoulli trial has the following properties:

1. A single trial with two possible outcomes (success or failure).

2. The probability of success is denoted by p, and the probability of failure is denoted by

q = 1 – p.

The random variable X is the number of successful trials (zero or one).

The probability mass function of X as a Bernoulli random variable is

$$f(x) = p^x(1 - p)^{1-x}, \quad x = 0, 1$$

Binomial Distribution

The number X of successes in n Bernoulli trials is called a binomial random variable. The

experiment obeys:

1. n repeated trials

2. each trial has two possible outcomes (success and failure)

3. P(i-th trial is successful) = p for all i

4. The trials are independent

The probability mass function of X as a binomial random variable is defined by

$$f(x) = \binom{n}{x}p^x(1 - p)^{n-x}, \quad x = 0, 1, 2, \ldots, n$$

Example

The probability that a patient recovers from a rare blood disease is 0.4. If 15 people are known

to have contracted this disease, what is the probability that (a) at least 10 survive, (b) from 3

to 8 survive, and (c) exactly 5 survive?

Solution:

Let X be the number of patients who survive.

$$X \sim BIN(15, 0.4) \Leftrightarrow f(x) = \binom{15}{x}(0.4)^x(0.6)^{15-x}, \quad x = 0, 1, 2, \ldots, 15$$

a) $$\begin{aligned}
P(X \ge 10) &= P(X = 10) + P(X = 11) + P(X = 12) + P(X = 13) + P(X = 14) + P(X = 15) \\
&= f(10) + f(11) + f(12) + f(13) + f(14) + f(15) \\
&= \binom{15}{10}(0.4)^{10}(0.6)^{5} + \binom{15}{11}(0.4)^{11}(0.6)^{4} + \binom{15}{12}(0.4)^{12}(0.6)^{3} \\
&\quad + \binom{15}{13}(0.4)^{13}(0.6)^{2} + \binom{15}{14}(0.4)^{14}(0.6)^{1} + \binom{15}{15}(0.4)^{15}(0.6)^{0} \\
&= 0.0338
\end{aligned}$$

b) $$\begin{aligned}
P(3 \le X \le 8) &= P(X = 3) + P(X = 4) + \cdots + P(X = 8) \\
&= f(3) + f(4) + f(5) + f(6) + f(7) + f(8) \\
&= 0.8779
\end{aligned}$$

c) $$P(X = 5) = f(5) = \binom{15}{5}(0.4)^{5}(0.6)^{10} = 0.1859$$
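The three probabilities of this example can be computed directly from the binomial probability mass function. The function name `binom_pmf` is an illustrative assumption; part (b) is taken to mean from 3 to 8 survivors inclusive.

```python
from math import comb

def binom_pmf(x, n, p):
    """f(x) = C(n, x) p^x (1 - p)^(n - x)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 15, 0.4
at_least_10 = sum(binom_pmf(x, n, p) for x in range(10, n + 1))
from_3_to_8 = sum(binom_pmf(x, n, p) for x in range(3, 9))  # 3..8 inclusive
exactly_5 = binom_pmf(5, n, p)

print(round(at_least_10, 4), round(from_3_to_8, 4), round(exactly_5, 4))
```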

Exercises

1. A factory finds that, on average, 20% of the bolts produced by a given machine will be

defective for certain specified requirements. If 10 bolts are selected at random from

the day’s production of this machine, find the probability (a) that exactly 2 will be

defective, (b) that 2 or more will be defective, and (c) that more than 5 will be

defective.

2. A laser production facility is known to have a 75% yield; that is, 75% of the lasers manufactured by the facility

pass the quality test. Suppose that today the facility is scheduled to produce 15 lasers.

What is the probability that at least 14 lasers will pass the quality test?





Topic 15 : Discrete Probability Distribution (Poisson and Hypergeometric distributions)

Poisson Distribution

The Poisson distribution is a discrete probability distribution for the counts of events that occur randomly in a given interval of time (or space). If we let X be the number of events in a given interval, then the probability mass function of X as a Poisson random variable is

$$f(x) = \frac{e^{-\mu}\mu^x}{x!}, \quad x = 0, 1, 2, 3, \ldots$$

If X is a binomial random variable with probability mass function

$$f(x) = \binom{n}{x}p^x(1 - p)^{n-x}, \quad x = 0, 1, 2, \ldots, n$$

then, when $n \to \infty$ and $p \to 0$ with $np$ held fixed, the distribution of X approaches the Poisson distribution with $\mu = np$.

Example 1

If the number of arrivals is 10 per hour on average, determine the probability that, in any hour

there will be

a. 0 arrivals

b. 6 arrivals

c. more than 6 arrivals

Solution:

X is the number of arrivals, with µ = 10.

$$X \sim POI(10) \Leftrightarrow f(x) = \frac{e^{-10}10^x}{x!}, \quad x = 0, 1, 2, \ldots$$

a. $P(X = 0) = f(0) = \frac{e^{-10}10^0}{0!} = 0.0000454$

b. $P(X = 6) = f(6) = \frac{e^{-10}10^6}{6!} = 0.06305$

c. $P(X > 6) = 1 - P(X \le 6)$

Example 2

If the probability that an individual suffers a bad reaction from injection of a given serum is

0.001, determine the probability that out of 2000 individuals (a) exactly 3 and (b) more than 2

individuals will suffer a bad reaction.

Solution:

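X is the number of individuals who suffer a bad reaction.

p = 0.001 and n = 2000 so µ = 2000×0.001 = 2.

$$X \sim POI(2) \Leftrightarrow f(x) = \frac{e^{-2}2^x}{x!}, \quad x = 0, 1, 2, \ldots$$

a. $P(X = 3) = f(3) = \frac{e^{-2}2^3}{3!} = 0.18045$

b. $P(X > 2) = 1 - P(X \le 2) = 1 - [P(X = 0) + P(X = 1) + P(X = 2)] = 1 - [f(0) + f(1) + f(2)] = 1 - 5e^{-2} \approx 0.3233$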
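The Poisson probabilities in Examples 1 and 2 can be reproduced with a few lines of Python. The helper names `poisson_pmf` and `poisson_cdf` are illustrative assumptions.

```python
from math import exp, factorial

def poisson_pmf(x, mu):
    """f(x) = e^(-mu) mu^x / x!."""
    return exp(-mu) * mu ** x / factorial(x)

def poisson_cdf(x, mu):
    """P(X <= x), summing the pmf from 0 to x."""
    return sum(poisson_pmf(k, mu) for k in range(x + 1))

# Example 1: arrivals with mu = 10
p_zero = poisson_pmf(0, 10)
p_six = poisson_pmf(6, 10)
p_more_than_six = 1 - poisson_cdf(6, 10)

# Example 2: bad reactions with mu = 2000 * 0.001 = 2
p_three = poisson_pmf(3, 2)
p_more_than_two = 1 - poisson_cdf(2, 2)

print(p_zero, p_six, p_more_than_six, p_three, p_more_than_two)
```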


Hypergeometric Distribution

The characteristics of hypergeometric distribution are

1. The population or set to be sampled consists of N individuals, objects, or elements (a

finite population).

2. Each individual can be characterized as a success (S) or a failure (F), and there are M

successes in the population.

3. A sample of n individuals is selected without replacement in such a way that each subset

of size n is equally likely to be chosen.

Let the random variable X be the total number of “successes” in a sample of n elements drawn

from a population of N elements with a total number of M “successes.” Then, the probability

mass function of X, called hypergeometric distribution, is given by:

$$X \sim HYP(n, M, N) \Leftrightarrow f(x) = \frac{\binom{M}{x}\binom{N - M}{n - x}}{\binom{N}{n}}, \quad x = 0, 1, 2, \ldots, \min(n, M)$$

Example 3

A batch of 100 computer chips contains 10 defective chips. Five chips are chosen at random,

without replacement.

a. Compute the probability mass function of the number of defective chips.

b. Find the probability that the computer chips contain at least one defective chip.

Solution:

X is the number of defective chips in the sample.

a. The probability mass function of the number of defective chips is

$$X \sim HYP(5, 10, 100) \Leftrightarrow f(x) = \frac{\binom{10}{x}\binom{90}{5 - x}}{\binom{100}{5}}, \quad x = 0, 1, 2, 3, 4, 5$$

b. The probability that the sample contains at least one defective chip is

$$P(X \ge 1) = 1 - P(X < 1) = 1 - P(X = 0) = 1 - f(0) = 1 - \frac{\binom{10}{0}\binom{90}{5}}{\binom{100}{5}} = 1 - 0.58375 = 0.41625$$
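The hypergeometric calculation in Example 3 can be sketched with `math.comb`. The function name `hypergeom_pmf` is an illustrative assumption; its arguments follow the HYP(n, M, N) ordering used above.

```python
from math import comb

def hypergeom_pmf(x, n, M, N):
    """f(x) = C(M, x) C(N - M, n - x) / C(N, n)."""
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

# 100 chips, 10 defective, sample of 5 without replacement
pmf = [hypergeom_pmf(x, 5, 10, 100) for x in range(6)]
at_least_one = 1 - pmf[0]

print(round(pmf[0], 5), round(at_least_one, 5))
```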


Exercises

1. The average rate of telephone calls in a busy reception is 4 per minute. If it can be

assumed that the number of telephone calls per minute interval is Poisson distributed,

calculate the probability that

a. at least 2 telephone calls will be received in any minute.

b. any minute will be free of telephone calls.

c. no more than one telephone call will be received in any one minute interval.

2. If 3% of the electric bulbs manufactured by a company are defective, find the probability

that in a sample of 100 bulbs (a) 0, (b) 1, (c) 2, (d) 3, (e) 4, and (f) 5 bulbs will be

defective.

3. A club contains 50 members; 20 are men and 30 are women. A committee of 10 members

is chosen at random.

a. Compute the probability mass function of the number of women on the committee.

b. Find the probability that the committee members are all the same gender.

4. A life insurance salesman sells on the average 3 life insurance policies per week.

Calculate the probability that in a given week he will sell

a. some policies

b. 2 or more policies but less than 5 policies

c. assuming that there are 5 working days per week, what is the probability that in a

given day he will sell one policy?





Topic 16 : Continuous Probability Distribution (Normal and Student’s t Distribution)

Normal Distribution

The normal distribution or Gauss distribution is defined as the distribution with the density

\[ f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}, \quad -\infty < x < \infty,\ -\infty < \mu < \infty,\ \sigma > 0 \]

The Normal curve is symmetrical about the line x = µ (because f(µ + x) = f(µ − x) for all real x) and bell-shaped. Since a normal variable assumes values ranging from −∞ to ∞, its curve is asymptotic to the x-axis and the total area under its density curve is equal to 1.

Given any Normal distribution $X \sim N(\mu, \sigma^{2})$, a transformation known as standardisation is applied to map it onto the standard Normal distribution so that probabilities (areas) can be read from a table. The standardisation formula (z score) is given by

\[ Z = \frac{X - \mu}{\sigma} \]

The random variable Z then follows the standard normal distribution, with mean 0 and variance 1. The probability density function of Z is

\[ f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}z^{2}}, \quad -\infty < z < \infty \]

The standard normal distribution is a normal probability distribution that has a mean of 0 and a standard deviation of 1, and the total area under its density curve is equal to 1.

Finding Probabilities When Given z Scores

We can find the probabilities from z scores with a standard normal table.

Figure 1. Normal probability density function

Example 1

Let X have a normal distribution with mean 60 and standard deviation 12. Find the probability that X is less than 76.

Solution:

See the standard normal distribution table below.
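Where a printed table is not at hand, the probability in Example 1 can be approximated with Python's standard library. The sketch below uses `statistics.NormalDist`; the variable names are illustrative only. Rounding z to two decimals mimics a table lookup, so the two results differ slightly in the fourth decimal.

```python
from statistics import NormalDist

# X ~ N(60, 12^2); we want P(X < 76)
x_dist = NormalDist(mu=60, sigma=12)

# Standardise: z = (x - mu) / sigma
z = (76 - 60) / 12

# Probability without rounding z
p_exact = x_dist.cdf(76)

# Probability as read from a table, where z is rounded to two decimals
p_table = NormalDist().cdf(round(z, 2))

print(round(z, 2), round(p_table, 4), round(p_exact, 4))
```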

Student’s t Distribution

Student's t-distribution (or simply the t-distribution) is a probability distribution that arises in

the problem of estimating the mean of a normally distributed population when the sample size is

small (n < 30). The derivation of the t-distribution was first published in 1908 by William Sealy

Gosset, while he worked at a Guinness Brewery in Dublin. Due to proprietary issues, the paper

was written under the pseudonym Student. The t-test and the associated theory became well-

known through the work of R.A. Fisher, who called the distribution "Student's distribution".

Student's t-distribution has the probability density function

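\[ f(x) = \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\sqrt{\nu\pi}\,\Gamma\left(\frac{\nu}{2}\right)} \left(1 + \frac{x^{2}}{\nu}\right)^{-\frac{\nu+1}{2}}, \quad -\infty < x < \infty \]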

where ν = n-1 is the number of degrees of freedom, n is sample size, and Γ is the Gamma

function.

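For Example 1 of the normal distribution (X normal with mean 60 and standard deviation 12):

\[ z = \frac{76 - 60}{12} = 1.33, \qquad P(X < 76) = P(Z < 1.33) = 0.9082 \]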

Figure 2. Student’s t probability density function

Example 2

Let T follow the Student's t distribution. Find the value of k if P(T > k) = 0.025 with n = 6.

Solution:

n = 6 so ν = 5.

From the t distribution table, we get k equal to 2.571.

Exercises

1. Given X~ N(41, 36) and that P[X > b] = 0.05, find the value of b.

2. Given X ~ N(20, σ²) and that P[X > 12] = 0.75, find the value of σ.

3. Given X ~ N(µ, σ²), P[X < 63] = 0.975 and P[X > 46] = 0.6, find the values of µ and σ².

4. Assume that Z scores are normally distributed with a mean of 0 and a standard deviation of 1.

a. If P(0 < Z < a) = 0.3907, find a.

b. If P(−b < Z < b) = 0.8664, find b.

c. If P(Z > c) = 0.0643, find c.

d. If P(Z > d) = 0.9922, find d.

e. If P(Z < e) = 0.4500, find e.

5. The sitting height (from seat to top of head) of drivers must be considered in the design of

a new car model. Men have sitting heights that are normally distributed with a mean of 36

inches and a standard deviation of 1.4 inches (based on anthropometric survey data from

Gordon, Clauser, et al.). Engineers have provided plans that can accommodate men with

sitting heights up to 38.8 inches, but taller men cannot fit. If a man is randomly selected,

find the probability that he has a sitting height less than 38.8 inches.

6. Compute the probabilities below

a. P(T < 2.365) with n = 8

b. P(T > 1.318) with ν = 24

c. P(-1.356 < T < 2.179) with ν = 12

d. P(-t0.005 < T < t0.01) with n = 10

7. Let a random sample of size 24 be taken from a normal population. Find the value of k if

a. P(-2.069 < T < k) = 0.965

b. P (k< T < 2.807) = 0.095

c. P(-k < T < k) = 0.90





Topic 17 : Continuous Probability Distribution (Chi-Square and F Distribution)

Chi-Square Distribution

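The probability density function of a random variable X with the chi-square distribution is

\[ f(x) = \frac{x^{\nu/2 - 1}\, e^{-x/2}}{2^{\nu/2}\,\Gamma\left(\frac{\nu}{2}\right)}, \quad x \ge 0 \]

where ν = n − 1 is the shape parameter and Γ is the gamma function. The formula for the gamma function is

\[ \Gamma(a) = \int_{0}^{\infty} t^{a-1} e^{-t}\, dt \]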

Example 1

Find the value of $\chi^{2}_{0.99}$ with ν = 4.

Solution:

See chi-square table below

So the value is 0.30.

F Distribution

The F-distribution is a continuous probability distribution. It is also known as Snedecor's F distribution or the Fisher-Snedecor distribution.

The probability density function of a random variable X with the F distribution is


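\[ f(x) = \frac{\Gamma\left(\frac{\nu_1+\nu_2}{2}\right)\left(\frac{\nu_1}{\nu_2}\right)^{\nu_1/2} x^{\nu_1/2 - 1}}{\Gamma\left(\frac{\nu_1}{2}\right)\Gamma\left(\frac{\nu_2}{2}\right)\left(1 + \frac{\nu_1 x}{\nu_2}\right)^{(\nu_1+\nu_2)/2}}, \quad x \ge 0 \]

where ν1 = n1 − 1 and ν2 = n2 − 1 are the shape parameters and Γ is the gamma function.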

Example 2

Find the value of $f_{0.99}$ if ν1 = 4 and ν2 = 3.

Solution:

From the F distribution table with α = 0.01,

$f_{0.99}(4, 3) = 1 / f_{0.01}(3, 4) = 1/16.69 = 0.05992$.

Exercises

1. Find the values of

a. $\chi^{2}_{0.975}$ with n = 30

b. $\chi^{2}_{\alpha}$ so that $P(X^{2} < \chi^{2}_{\alpha}) = 0.99$ with n = 5

2. A random variable F has an F distribution with ν1 = 10 and ν2 = 20. Find the value of

a. a so that P(F > a) = 0.05

b. b so that P(F ≤ b) = 0.01

Note:

\[ f_{1-\alpha}(\nu_1, \nu_2) = \frac{1}{f_{\alpha}(\nu_2, \nu_1)}, \quad \nu_1 = n_1 - 1,\ \nu_2 = n_2 - 1 \]




Topic 18 : Sampling Distribution (Sampling Distribution of Mean)

A sample statistic used to estimate an unknown population parameter is called an estimate.

The discrepancy between the estimate and the true parameter value is known as sampling

error. A statistic is a random variable with a probability distribution, called the sampling

distribution, which is generated by repeated sampling. We use the sampling distribution of a

statistic to assess the sampling error in an estimate.

Sampling Distribution of Mean

The distribution of sample means is the set of values of sample means obtained from all possible samples of the same size (n) from a given population.

For example, we have a population: 2, 3, 4. That population has a mean (µ) of 3, a variance (σ²) of 0.67, and a standard deviation (σ) of 0.82. Taking every possible sample of n = 2 from that population (sampling with replacement) gives 9 (= 3²) sample means.

\[ \mu_{\bar{X}} = \sum \bar{x}\, f(\bar{x}) = (2)\tfrac{1}{9} + (2.5)\tfrac{2}{9} + (3)\tfrac{3}{9} + (3.5)\tfrac{2}{9} + (4)\tfrac{1}{9} = \tfrac{27}{9} = 3 = \mu \]

\[ \sigma_{\bar{X}}^{2} = \sum (\bar{x} - \mu_{\bar{X}})^{2} f(\bar{x}) = \sum (\bar{x} - 3)^{2} f(\bar{x}) = \tfrac{1}{3} \]
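The small population 2, 3, 4 makes it possible to list every sample and check these two results by brute force. The sketch below is illustrative (the variable names are not from the handout); it uses exact fractions so that the mean of the sample means and their variance can be compared with µ and σ²/n without rounding.

```python
from fractions import Fraction
from itertools import product

population = [2, 3, 4]
n = 2  # sample size, sampling with replacement

# Mean of every ordered sample of size n (3**2 = 9 samples)
means = [Fraction(sum(sample), n) for sample in product(population, repeat=n)]

# Mean and variance of the distribution of sample means
mu_xbar = sum(means) / len(means)
var_xbar = sum((m - mu_xbar) ** 2 for m in means) / len(means)

# Population mean and variance for comparison
mu = Fraction(sum(population), len(population))
var = sum((x - mu) ** 2 for x in population) / len(population)

print(mu_xbar, mu)           # both equal the population mean
print(var_xbar, var / n)     # variance of the means equals sigma^2 / n
```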

The Central Limit Theorem

The central limit theorem states that for a random sample of size n (n large) selected from a population with mean µ and standard deviation σ:

1. The distribution of sample means $\bar{x}$ is approximately normal regardless of whether the population distribution is normal.

2. The mean of the distribution of sample means is equal to the mean of the population distribution, that is, $\mu_{\bar{X}} = \mu$.

3. The standard deviation of the distribution of sample means is equal to the standard deviation of the population divided by the square root of the sample size, that is, $\sigma_{\bar{X}} = \sigma/\sqrt{n}$.

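Distribution of the sample means for the population 2, 3, 4 with n = 2:

$\bar{x}$    f    $f(\bar{x})$    $\bar{x} f(\bar{x})$    $(\bar{x} - \mu)^{2} f(\bar{x})$
2      1    1/9     2/9     1/9
2.5    2    2/9     5/9     1/18
3      3    3/9     9/9     0
3.5    2    2/9     7/9     1/18
4      1    1/9     4/9     1/9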

i Sample Mean

1 {2,2} 2

2 {2,3} 2.5

3 {2,4} 3

4 {3,2} 2.5

5 {3,3} 3

6 {3,4} 3.5

7 {4,2} 3

8 {4,3} 3.5

9 {4,4} 4


Table 1. Characteristics of a population distribution and its distribution of sample means

Characteristic        Population Distribution        Distribution of Sample Means
Mean                  µ                              $\mu_{\bar{X}} = \mu$
Standard Deviation    σ                              $\sigma_{\bar{X}} = \sigma/\sqrt{n}$
Z score               $Z = (X - \mu)/\sigma$         $Z = (\bar{X} - \mu)/(\sigma/\sqrt{n})$
t statistic                                          $T = (\bar{X} - \mu)/(s/\sqrt{n})$

Example 1

The blood glucose from the entire Honolulu Heart Study population is approximately

normally distributed with a mean of 161.52 and a standard deviation of 58.15. Suppose we

select samples of size 25 from this population:

a. What proportion of sample means would have values of 170 or greater?

b. What proportion of sample means would have values of 155 or lower?

Solution:

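X is blood glucose, $X \sim N(161.52,\ 58.15^{2})$, n = 25.

a. \[ P(\bar{X} \ge 170) = P\left(Z \ge \frac{170 - 161.52}{58.15/\sqrt{25}}\right) = P(Z \ge 0.73) = 0.2327 \]

b. \[ P(\bar{X} \le 155) = P\left(Z \le \frac{155 - 161.52}{58.15/\sqrt{25}}\right) = P(Z \le -0.56) = 0.2877 \]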
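The two proportions in this example can be approximated without a table by treating the sample mean as normal with standard deviation σ/√n. The sketch below assumes Python's `statistics.NormalDist`; because z is not rounded to two decimals, the results differ from the table values in the third or fourth decimal.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 161.52, 58.15, 25

# Sampling distribution of the mean: N(mu, (sigma / sqrt(n))^2)
xbar_dist = NormalDist(mu, sigma / sqrt(n))

# (a) proportion of sample means of 170 or greater
p_at_least_170 = 1 - xbar_dist.cdf(170)

# (b) proportion of sample means of 155 or lower
p_at_most_155 = xbar_dist.cdf(155)

print(round(p_at_least_170, 4), round(p_at_most_155, 4))
```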

Sampling Distribution of Difference Between Two Means

The sampling distribution of the difference between means can be thought of as the distribution that would result if we repeated the following three steps over and over again: (1) sample n1 scores from Population 1 and n2 scores from Population 2, (2) compute the means of the two samples ($\bar{X}_1$ and $\bar{X}_2$), (3) compute the difference between means $\bar{X}_1 - \bar{X}_2$. The mean of the sampling distribution of the difference between means is:

\[ \mu_{\bar{X}_1 - \bar{X}_2} = \mu_1 - \mu_2 \]

which says that the mean of the distribution of differences between sample means is equal to

the difference between population means. From the variance sum law, we know that:

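\[ \sigma_{\bar{X}_1 - \bar{X}_2}^{2} = \sigma_{\bar{X}_1}^{2} + \sigma_{\bar{X}_2}^{2} = \frac{\sigma_1^{2}}{n_1} + \frac{\sigma_2^{2}}{n_2} \]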

which says that the variance of the sampling distribution of the difference between means is

equal to the variance of the sampling distribution of the mean for Population 1 plus the

variance of the sampling distribution of the mean for Population 2. Recall the formula for the

variance of the sampling distribution of the mean:

\[ \sigma_{\bar{X}}^{2} = \frac{\sigma^{2}}{n} \]

Since the standard error of a sampling distribution is the standard deviation of that distribution, the standard error of the difference between means is:

\[ \sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{\sigma_1^{2}}{n_1} + \frac{\sigma_2^{2}}{n_2}} \]

Extending the Central Limit Theorem, the sampling distribution of $\bar{X}_1 - \bar{X}_2$ is approximately normal (if the two variables are normal then $\bar{X}_1 - \bar{X}_2$ is exactly normally distributed) with mean $\mu_1 - \mu_2$ and standard deviation

\[ \sqrt{\frac{\sigma_1^{2}}{n_1} + \frac{\sigma_2^{2}}{n_2}} \]

Thus,

\[ Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^{2}}{n_1} + \frac{\sigma_2^{2}}{n_2}}} \]

is either normally distributed or approximately normally distributed.

For example, we have two populations, population 1: 3, 5, and 7, and population 2: 0 and 3. Population 1 has a mean of 5 and a variance of 8/3. Population 2 has a mean of 3/2 and a variance of 9/4.

\[ \mu_1 = \frac{3 + 5 + 7}{3} = 5, \qquad \sigma_1^{2} = \frac{(3-5)^{2} + (5-5)^{2} + (7-5)^{2}}{3} = \frac{8}{3} \]

\[ \mu_2 = \frac{0 + 3}{2} = \frac{3}{2}, \qquad \sigma_2^{2} = \frac{\left(0 - \frac{3}{2}\right)^{2} + \left(3 - \frac{3}{2}\right)^{2}}{2} = \frac{9}{4} \]

Samples of size n1 = 2 are taken with replacement from population 1, samples of size n2 = 3 are taken with replacement from population 2, and the mean of each sample is calculated.

Population 1                      Population 2
No   Sample   $\bar{x}_1$         No   Sample     $\bar{x}_2$
1    3, 3     3                   1    0, 0, 0    0
2    3, 5     4                   2    0, 0, 3    1
3    3, 7     5                   3    0, 3, 0    1
4    5, 3     4                   4    3, 0, 0    1
5    5, 5     5                   5    0, 3, 3    2
6    5, 7     6                   6    3, 0, 3    2
7    7, 3     5                   7    3, 3, 0    2
8    7, 5     6                   8    3, 3, 3    3
9    7, 7     7

Distribution of the difference between the sample means:

$\bar{x}_1 - \bar{x}_2$    f     $f(\bar{x}_1 - \bar{x}_2)$
0     1     1/72
1     5     5/72
2     12    12/72
3     18    18/72
4     18    18/72
5     12    12/72
6     5     5/72
7     1     1/72

\[ \mu_{\bar{X}_1 - \bar{X}_2} = 5 - 1.5 = 3.5 \]

\[ \sigma_{\bar{X}_1 - \bar{X}_2}^{2} = \frac{8/3}{2} + \frac{9/4}{3} = \frac{25}{12} \]
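The same enumeration can be done programmatically for the two small populations (3, 5, 7 and 0, 3). The sketch below is an illustration with hypothetical names; it forms all 9 × 8 = 72 differences between sample means and confirms, in exact fractions, the mean and variance of the difference.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def sample_means(population, n):
    """Means of all ordered samples of size n drawn with replacement."""
    return [Fraction(sum(s), n) for s in product(population, repeat=n)]


means1 = sample_means([3, 5, 7], 2)   # 9 sample means from population 1
means2 = sample_means([0, 3], 3)      # 8 sample means from population 2

# Every pairing of a mean from population 1 with a mean from population 2
diffs = [m1 - m2 for m1 in means1 for m2 in means2]

mean_diff = sum(diffs) / len(diffs)
var_diff = sum((d - mean_diff) ** 2 for d in diffs) / len(diffs)
frequencies = Counter(diffs)

print(len(diffs), mean_diff, var_diff)
print(sorted(frequencies.items()))
```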

Example 2

Suppose that we draw random samples of size 5 from two normal populations. The mean and

standard deviation of population 1 are 100 and 25. The mean and standard deviation of

population 2 are 90 and 40. Find the probability that the mean of sample 1 exceeds the mean

of sample 2.

Solution:

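We want to determine $P(\bar{X}_1 - \bar{X}_2 > 0)$. The mean and standard deviation of the sampling distribution are

\[ \mu_1 - \mu_2 = 100 - 90 = 10 \]

\[ \sqrt{\frac{\sigma_1^{2}}{n_1} + \frac{\sigma_2^{2}}{n_2}} = \sqrt{\frac{25^{2}}{5} + \frac{40^{2}}{5}} = 21.1 \]

Thus,

\[ P(\bar{X}_1 - \bar{X}_2 > 0) = P\left(Z > \frac{0 - 10}{21.1}\right) = P(Z > -0.47) = 0.6808 \]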

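The difference between means $\bar{x}_1 - \bar{x}_2$ for every pair of samples in the example with populations 3, 5, 7 and 0, 3 (columns: $\bar{x}_1$; rows: $\bar{x}_2$):

$\bar{x}_2$ \ $\bar{x}_1$    3   4   5   4   5   6   5   6   7
0                            3   4   5   4   5   6   5   6   7
1                            2   3   4   3   4   5   4   5   6
1                            2   3   4   3   4   5   4   5   6
1                            2   3   4   3   4   5   4   5   6
2                            1   2   3   2   3   4   3   4   5
2                            1   2   3   2   3   4   3   4   5
2                            1   2   3   2   3   4   3   4   5
3                            0   1   2   1   2   3   2   3   4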

Exercises

1. Suppose that a random sample of n = 12 is obtained from a Normal population with µ

= 64 and σ = 17. Determine $P(\bar{X} < 67.3)$.

2. The length of human pregnancies is approximately normally distributed with mean µ

= 266 days and standard deviation σ = 16 days.

a. What is the probability that a randomly selected pregnancy will last less than 260

days?

b. What is the probability that a random sample of 20 pregnancies has mean gestation

period of 260 days or less?

c. What is the probability that a random sample of 50 has a mean gestation period of

260 or less?

d. What is the probability that a random sample of size 15 will have a mean gestation

period within 10 days of the mean?

3. Suppose a simple random sample of size n = 36 is obtained from a population with

mean µ = 64 and σ = 18.

a. Describe the sampling distribution of $\bar{X}$.

b. What is $P(\bar{X} < 62.6)$?

c. What is $P(\bar{X} \ge 68.7)$?

d. What is $P(59.8 < \bar{X} < 65.9)$?

4. The assistant dean of a business school claims that the number of job offers received

by MBA's whose major is finance is normally distributed with a mean of 12 and a

standard deviation of 2.5. Furthermore he states that job offers to marketing majors is

normally distributed with a mean of 10 and a standard deviation of 3. Find the

probability that in a random sample of 10 finance and 10 marketing majors the

average finance major receives more job offers than the average marketing major.





Topic 19 : Sampling Distribution (Sampling Distribution of Proportion and Variance)

Sampling Distribution of Proportion

The sampling distribution of a sample proportion is actually based on the binomial distribution. However, the primary purpose of creating the sampling distribution is for inference, and the binomial distribution, which is discrete, makes inference somewhat difficult. Consequently, we use the normal approximation to the binomial distribution. The sampling distribution of $\hat{p}$ is approximately normal with mean p and variance p(1 − p)/n. Thus

\[ z = \frac{\hat{p} - p}{\sqrt{\frac{p(1-p)}{n}}} \]

is approximately standard normally distributed.

Example 1

A fair coin is flipped 400 times. Find the probability that the proportion of heads falls

between 0.48 and 0.52.

Solution:

We wish to find $P(0.48 < \hat{p} < 0.52)$. We employ the approximate normal sampling distribution. Because the coin is fair, p = 0.5.

\[ P(0.48 < \hat{p} < 0.52) = P\left(\frac{0.48 - 0.5}{\sqrt{(0.5)(0.5)/400}} < Z < \frac{0.52 - 0.5}{\sqrt{(0.5)(0.5)/400}}\right) = P(-0.8 < Z < 0.8) = 0.5762 \]
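The coin example can be reproduced with the normal approximation in a few lines. This is a sketch under the same assumption as the text (the sample proportion is treated as normal with mean p and variance p(1 − p)/n); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.5, 400

# Approximate sampling distribution of the sample proportion
phat_dist = NormalDist(p, sqrt(p * (1 - p) / n))

# z scores of the two bounds
z_low = (0.48 - p) / phat_dist.stdev
z_high = (0.52 - p) / phat_dist.stdev

# P(0.48 < p-hat < 0.52)
prob = phat_dist.cdf(0.52) - phat_dist.cdf(0.48)

print(round(z_low, 2), round(z_high, 2), round(prob, 4))
```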

Sampling Distribution of Difference Between Two Proportions

When the samples are large, the sampling distribution of $\hat{p}_1 - \hat{p}_2$ is approximately normal with

a. The mean of the sampling distribution of $\hat{p}_1 - \hat{p}_2$ is $p_1 - p_2$.

b. The standard deviation of the sampling distribution of $\hat{p}_1 - \hat{p}_2$ is

\[ \sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}} \]

Sampling Distribution of Variance

Since s² cannot be negative we should suspect that this sampling distribution is not a normal curve. It is called the chi-square distribution.

Theorem

If s² is the variance of a random sample of size n taken from a normal population having the variance σ², then

\[ \chi^{2} = \frac{(n-1)s^{2}}{\sigma^{2}} \]

is the value of a random variable having the chi-square distribution with the parameter ν = n − 1.


Example 2

The claim that the variance of a normal population is σ² = 21.3 is rejected if the variance of a random sample of size 15 exceeds 39.74. What is the probability that the claim will be rejected even though σ² = 21.3?

Solution:

The claim is rejected when the sample variance s² exceeds 39.74. With n = 15, so ν = 14,

\[ P(s^{2} > 39.74) = P\left(\chi^{2} > \frac{(15-1)(39.74)}{21.3}\right) = P(\chi^{2} > 26.120) = 0.025 \]

Sampling Distribution of Ratio Between Two Variances

For the sampling distribution of the ratio between two variances, the populations from which the samples were obtained must be normally distributed and the samples must be independent of each other.

Theorem

Let $X_1, \ldots, X_{n_1}$ be a random sample of size n1 from a normal distribution with variance $\sigma_1^{2}$. Let $Y_1, \ldots, Y_{n_2}$ be another random sample of size n2, independent of the Xi's, from a normal distribution with variance $\sigma_2^{2}$. Let $s_1^{2}$ and $s_2^{2}$ denote the two sample variances. Then the random variable

\[ F = \frac{s_1^{2}/\sigma_1^{2}}{s_2^{2}/\sigma_2^{2}} \]

has an F distribution with ν1 = n1 − 1 and ν2 = n2 − 1.

Exercises

1. The proportion of defective units coming off a production line is 5%. Find the

probability that in a random sample of 100 units more than 10% are defective?

2. In the last election a local counselor received 52% of the vote. If her popularity level is

unchanged what is the probability that in a random sample of 200 voters less than 50%

would vote for her?

3. The mean sitting height of adult males may be assumed to be normally distributed,

with mean 35” and standard deviation 1.2”. For a sample size of n = 30 men,

determine the probabilities for a possible level of the sample standard deviation

s ≤ 1.1’’.





Topic 20 : Confidence Interval of Mean

A confidence interval estimate for µ is an interval of the form l ≤ µ ≤ u, where the endpoints l and u are computed from the sample data. Because different samples will produce different values of l and u, these end-points are values of random variables L and U, respectively. Suppose that we can determine values of L and U such that the following probability statement is true:

\[ P(L \le \mu \le U) = 1 - \alpha \]

where 0 ≤ α ≤ 1. There is a probability of 1 − α of selecting a sample for which the confidence interval will contain the true value of µ. Once we have selected the sample, so that X1 = x1, X2 = x2, …, Xn = xn, and computed l and u, the resulting confidence interval for µ is l ≤ µ ≤ u.

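Recall that for the sampling distribution of the mean, $Z = (\bar{X} - \mu)/(\sigma/\sqrt{n})$ has a standard normal distribution, so we may write

\[ P\left(-z_{\alpha/2} \le Z \le z_{\alpha/2}\right) = 1 - \alpha \;\Rightarrow\; P\left(-z_{\alpha/2} \le \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \le z_{\alpha/2}\right) = 1 - \alpha \]

Now manipulate the quantities inside the brackets by (1) multiplying through by $\sigma/\sqrt{n}$, (2) subtracting $\bar{X}$ from each term, and (3) multiplying through by −1. This results in

\[ P\left(\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha \]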

Definition 1 (Confidence Interval of Mean, Variance Known)

If $\bar{x}$ is the sample mean of a random sample of size n from a normal population with known variance σ², a 100(1 − α)% confidence interval on µ is given by

\[ \bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \]

Example 1

A manufacturer produces piston rings for an automobile engine. It is known that ring diameter

is normally distributed with standard deviation 0.001 millimeters. A random sample of 15

rings has a mean diameter 74.036 millimeters. Construct a 95% confidence interval on the

mean piston ring diameter.

Solution:

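n = 15, $\bar{x}$ = 74.036, σ = 0.001, $z_{0.025}$ = 1.96

A 95% confidence interval for the mean piston ring diameter is

\[ \bar{x} - z_{0.025}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{0.025}\frac{\sigma}{\sqrt{n}} \]

\[ \Rightarrow 74.036 - 1.96\,\frac{0.001}{\sqrt{15}} \le \mu \le 74.036 + 1.96\,\frac{0.001}{\sqrt{15}} \]

\[ \Rightarrow 74.036 - 0.00051 \le \mu \le 74.036 + 0.00051 \]

\[ \Rightarrow 74.0355 \le \mu \le 74.0365 \]

With 95% confidence the population mean piston ring diameter is between 74.0355 and 74.0365 millimeters.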
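The interval of Definition 1 is straightforward to compute. The sketch below is a hedged illustration: `z_interval` is a hypothetical helper name, and the critical value is obtained from `statistics.NormalDist().inv_cdf` rather than a table, so it uses z ≈ 1.95996 instead of the rounded 1.96.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, sigma, n, confidence=0.95):
    """Two-sided confidence interval for the mean when sigma is known
    (hypothetical helper following Definition 1)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    margin = z * sigma / sqrt(n)
    return xbar - margin, xbar + margin


# Piston ring example: n = 15, x-bar = 74.036 mm, sigma = 0.001 mm
low, high = z_interval(xbar=74.036, sigma=0.001, n=15, confidence=0.95)
print(round(low, 4), round(high, 4))
```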

Definition 2 (Choice of Sample Size)

If $\bar{x}$ is used as an estimate of µ, we can be 100(1 − α)% confident that the error $|\bar{x} - \mu|$ will not exceed a specified amount E when the sample size is

\[ n = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^{2} \]

Example 2

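The life in hours of a 75-watt light bulb is known to be normally distributed with standard deviation 25 hours. Suppose that we wanted the total width of the two-sided confidence interval on mean life to be six hours at 95% confidence. What sample size should be used?

Solution: $z_{0.025}$ = 1.96. A total width of six hours corresponds to an error bound of E = 6/2 = 3 hours.

\[ n = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^{2} = \left(\frac{1.96 \times 25}{3}\right)^{2} = 266.78 \]

The sample size that should be used is 267 bulbs.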

Definition 3 (Confidence Interval of Mean, Variance Unknown but n is large)

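When n is large (n ≥ 30) but the variance σ² is unknown, the statistic

\[ \frac{\bar{X} - \mu}{S/\sqrt{n}} \]

has an approximate standard normal distribution. Consequently,

\[ \bar{x} - z_{\alpha/2}\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\frac{s}{\sqrt{n}} \]

is a large-sample 100(1 − α)% confidence interval for µ.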

Example 3

Thirty randomly selected students took the calculus final. If the sample mean was 82 and the

standard deviation was 12.2, construct a 99% confidence interval for the mean score of all

students.

Solution:

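n = 30, $\bar{x}$ = 82, s = 12.2, $z_{0.005}$ = 2.575

A 99% confidence interval for the mean score of all students is

\[ \bar{x} - z_{0.005}\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + z_{0.005}\frac{s}{\sqrt{n}} \]

\[ \Rightarrow 82 - 2.575\,\frac{12.2}{\sqrt{30}} \le \mu \le 82 + 2.575\,\frac{12.2}{\sqrt{30}} \]

\[ \Rightarrow 82 - 5.736 \le \mu \le 82 + 5.736 \]

\[ \Rightarrow 76.264 \le \mu \le 87.736 \]

With 99% confidence the population mean score of all students is between 76 and 88.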

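For a 100(1 − α)% confidence interval on the mean of a normal distribution with unknown variance, recall that the distribution of $T = (\bar{X} - \mu)/(S/\sqrt{n})$ is t with n − 1 degrees of freedom. Thus,

\[ P\left(-t_{\alpha/2;\,n-1} \le T \le t_{\alpha/2;\,n-1}\right) = 1 - \alpha \]

or

\[ P\left(-t_{\alpha/2;\,n-1} \le \frac{\bar{X} - \mu}{S/\sqrt{n}} \le t_{\alpha/2;\,n-1}\right) = 1 - \alpha \]

Rearranging this last equation yields

\[ P\left(\bar{X} - t_{\alpha/2;\,n-1}\frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2;\,n-1}\frac{S}{\sqrt{n}}\right) = 1 - \alpha \]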

Definition 4 (Confidence Interval of Mean, Variance Unknown)

If $\bar{x}$ and s are the mean and standard deviation of a random sample from a normal population with unknown variance σ², a 100(1 − α)% confidence interval on µ is given by

\[ \bar{x} - t_{\alpha/2;\,n-1}\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + t_{\alpha/2;\,n-1}\frac{s}{\sqrt{n}} \]

Example 4

Ten randomly selected automobiles were stopped, and the tread depth of a tire was measured.

The mean was 0.32 inches, and the standard deviation was 0.08 inches. Find the 95%

confidence interval of the mean tread depth. Assume the variable is normally distributed.

Solution:

σ is unknown and n < 30 so use the t-distribution. n = 10, $\bar{x}$ = 0.32, s = 0.08.

ν = 10 − 1 = 9. 1 − α = 0.95, so $t_{0.025}(9)$ = 2.262.

The 95% CI for the mean is

\[ \bar{x} - t_{\alpha/2;\,n-1}\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + t_{\alpha/2;\,n-1}\frac{s}{\sqrt{n}} \]

\[ 0.32 - 2.262\,\frac{0.08}{\sqrt{10}} \le \mu \le 0.32 + 2.262\,\frac{0.08}{\sqrt{10}} \]

\[ \Rightarrow 0.26 \le \mu \le 0.38 \]

With 95% confidence the population mean tire tread depth is between 0.26 and 0.38 inches.
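The t interval of Definition 4 can be sketched in the same way. Python's standard library has no t quantile function, so the hypothetical helper `t_interval` below takes the critical value read from a t table as an argument (here 2.262 for ν = 9, as in Example 4).

```python
from math import sqrt


def t_interval(xbar, s, n, t_crit):
    """Confidence interval for the mean with unknown variance.
    t_crit is t_{alpha/2; n-1} taken from a t table (assumed input)."""
    margin = t_crit * s / sqrt(n)
    return xbar - margin, xbar + margin


# Example 4: n = 10, x-bar = 0.32, s = 0.08, t_{0.025}(9) = 2.262
low, high = t_interval(xbar=0.32, s=0.08, n=10, t_crit=2.262)
print(round(low, 2), round(high, 2))
```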

Exercises

1. Among a sample of 65 students selected at random from one college, the mean

number of siblings is 1.3 with a standard deviation of 1.1. Find a 95% confidence

interval for the mean number of siblings for all students at this college.

2. The following data represent the number of house fires started by candles for the past

twenty years. Find the 99% confidence interval for the mean number of home fires

started by candles each year. Assume the variable is normally distributed.

Year                    1980   1982   1984   1986   1990   1992   1994   1996   1998    2000    2002
Number of home fires    8200   7300   6700   6700   5500   6100   7200   9900   12500   15700   18000

3. Fifteen randomly selected female athletes were asked to take a stress test. After three

minutes, their pulses were measured and the following data collected. Find the 95%

confidence interval about the true population mean µ. Assume the variable is normally

distributed.

117, 102, 98, 100, 116, 113, 91, 92, 96, 136, 134, 126, 104, 113, 102

4. The superintendent of a golf course believes that the mean score for professional

golfers on his course is above 72. He randomly samples the scores of 40 professional

golfers from the last tournament.

Here are the scores.

69 67 69 71 75 78 70 71 72 74

72 76 69 76 70 77 78 79 72 71

75 75 75 78 71 64 71 68 72 71

73 72 73 71 71 75 72 72 78 75

Construct a 95% confidence interval for the mean score for all professional golfers on

this course.

5. A civil engineer is analyzing the compressive strength of concrete. Compressive

strength is normally distributed with variance 1000 (psi)². Suppose it is desired to

estimate the compressive strength with an error that is less than 15 psi at 99%

confidence. What sample size is required?

6. A college president asked the statistics teacher to estimate the average age of the students at their college. How large a sample is necessary? The statistics teacher would like to be 99% confident that the estimate is accurate within one year. From a previous study, the standard deviation of the ages was known to be 3 years.

7. A certain medication is known to increase the pulse rate of its user. The standard

deviation of the pulse rate is known to be 5 beats per minute. A sample of 30 users had

an average pulse rate of 104 beats per minute. Find the 99% confidence interval of the

true mean.

8. The data below represent a sample of the number of home fires started by a candle for

the past several years. Find the 99% confidence interval for the mean number of home

fires started by a candle each year.

54 60 59 61 63 72 84 99





Topic 21 : Confidence Interval of Two Means

Confidence Interval of Two Means, Population Variances are Known

To find the 100(1 − α)% confidence interval on the difference in two means $\mu_1 - \mu_2$ when the variances are known, recall that $X_{11}, X_{12}, \ldots, X_{1n_1}$ is a random sample of n1 observations from the first population and $X_{21}, X_{22}, \ldots, X_{2n_2}$ is a random sample of n2 observations from the second population. The difference in sample means $\bar{X}_1 - \bar{X}_2$ is a point estimator of $\mu_1 - \mu_2$, and

\[ Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^{2}}{n_1} + \frac{\sigma_2^{2}}{n_2}}} \]

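has a standard normal distribution if the two populations are normal, or is approximately standard normal if the conditions of the central limit theorem apply. This implies that $P(-z_{\alpha/2} \le Z \le z_{\alpha/2}) = 1 - \alpha$, or

\[ P\left(-z_{\alpha/2} \le \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^{2}}{n_1} + \frac{\sigma_2^{2}}{n_2}}} \le z_{\alpha/2}\right) = 1 - \alpha \]

This can be arranged as

\[ P\left((\bar{X}_1 - \bar{X}_2) - z_{\alpha/2}\sqrt{\frac{\sigma_1^{2}}{n_1} + \frac{\sigma_2^{2}}{n_2}} \le \mu_1 - \mu_2 \le (\bar{X}_1 - \bar{X}_2) + z_{\alpha/2}\sqrt{\frac{\sigma_1^{2}}{n_1} + \frac{\sigma_2^{2}}{n_2}}\right) = 1 - \alpha \]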

Definition 1

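If $\bar{x}_1$ and $\bar{x}_2$ are the means of independent random samples of sizes n1 and n2 from two independent normal populations with known variances $\sigma_1^{2}$ and $\sigma_2^{2}$, respectively, a 100(1 − α)% confidence interval for $\mu_1 - \mu_2$ is

\[ (\bar{x}_1 - \bar{x}_2) - z_{\alpha/2}\sqrt{\frac{\sigma_1^{2}}{n_1} + \frac{\sigma_2^{2}}{n_2}} \le \mu_1 - \mu_2 \le (\bar{x}_1 - \bar{x}_2) + z_{\alpha/2}\sqrt{\frac{\sigma_1^{2}}{n_1} + \frac{\sigma_2^{2}}{n_2}} \]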

Example 1

An experiment was conducted to compare the efficacies of two drugs in the prevention of tapeworms in the stomachs of a new breed of sheep. Samples of size 5 and 8 were given the two drugs, respectively, and the two sample means were 28.6 and 40.0 worms/sheep. From previous studies, it is known that the variances in the two groups are 198 and 232, respectively, and that the number of worms in the stomachs has an approximate normal distribution. Construct a 95% confidence interval for the difference in the mean number of worms per sheep.

Solution:

n1 = 5, n2 = 8, $\bar{x}_1$ = 28.6, $\bar{x}_2$ = 40.0, $\sigma_1^{2}$ = 198, $\sigma_2^{2}$ = 232, $z_{0.025}$ = 1.96.

A 95% confidence interval for the difference in the mean number of worms per sheep is

\[ (28.6 - 40) - 1.96\sqrt{\frac{198}{5} + \frac{232}{8}} \le \mu_1 - \mu_2 \le (28.6 - 40) + 1.96\sqrt{\frac{198}{5} + \frac{232}{8}} \]

\[ \Rightarrow -11.4 - 16.234 \le \mu_1 - \mu_2 \le -11.4 + 16.234 \]

\[ \Rightarrow -27.634 \le \mu_1 - \mu_2 \le 4.834 \]
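The interval of Definition 1 for two means with known variances can be sketched as a small function. The name `two_mean_z_interval` and its parameters are assumptions made for illustration; the critical value 1.96 is passed in as read from the standard normal table.

```python
from math import sqrt


def two_mean_z_interval(xbar1, xbar2, var1, var2, n1, n2, z_crit):
    """Confidence interval for mu1 - mu2 with known population variances
    (hypothetical helper; z_crit is z_{alpha/2} from a table)."""
    diff = xbar1 - xbar2
    margin = z_crit * sqrt(var1 / n1 + var2 / n2)
    return diff - margin, diff + margin


# Tapeworm example: means 28.6 and 40.0, variances 198 and 232, n1 = 5, n2 = 8
low, high = two_mean_z_interval(28.6, 40.0, 198, 232, 5, 8, z_crit=1.96)
print(round(low, 3), round(high, 3))
```

Because the resulting interval contains zero, the data in this example do not show a difference between the two drugs at the 95% level.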

Confidence Interval of Two Means, Population Variances are Unknown But n1 & n2 are

Large

Definition 2

If $\bar{x}_1$ and $\bar{x}_2$ are the means of independent random samples of sizes n1 and n2 from two independent normal populations with unknown variances, and both sample sizes are large (ni ≥ 30), a 100(1 − α)% confidence interval for $\mu_1 - \mu_2$ is

\[ (\bar{x}_1 - \bar{x}_2) - z_{\alpha/2}\sqrt{\frac{s_1^{2}}{n_1} + \frac{s_2^{2}}{n_2}} \le \mu_1 - \mu_2 \le (\bar{x}_1 - \bar{x}_2) + z_{\alpha/2}\sqrt{\frac{s_1^{2}}{n_1} + \frac{s_2^{2}}{n_2}} \]

Example 2

Do women tend to spend more time on housework than men? Based on data, the following summary

data was reported regarding the number of hours spent in housework per week. For women,

the sample size is 6764, with mean of 32.6 hours per week with a standard deviation of 18.2

hours per week. For men, the sample size is 4252, with mean 18.1 hours per week and

standard deviation of 12.9 hours. Do women spend more time on housework than men?

Address this question with a 95% confidence interval.

Solution: Let 1: women, 2: men

n1 = 6764, $\bar{x}_1$ = 32.6, $s_1$ = 18.2, n2 = 4252, $\bar{x}_2$ = 18.1, $s_2$ = 12.9, $z_{0.025}$ = 1.96.

A 95% confidence interval for the difference in the mean number of hours spent in housework per week is

\[ (32.6 - 18.1) - 1.96\sqrt{\frac{18.2^{2}}{6764} + \frac{12.9^{2}}{4252}} \le \mu_1 - \mu_2 \le (32.6 - 18.1) + 1.96\sqrt{\frac{18.2^{2}}{6764} + \frac{12.9^{2}}{4252}} \]

\[ \Rightarrow 14.5 - 0.5818 \le \mu_1 - \mu_2 \le 14.5 + 0.5818 \]

\[ \Rightarrow 13.92 \le \mu_1 - \mu_2 \le 15.08 \]

All plausible values for $\mu_1 - \mu_2$ are positive. This suggests that the mean for women is greater than that for men. This is evidence that women spend more time on housework than men.

Confidence Interval of Two Means, Population Variances are Unknown but Assumed

Equal Variances

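To develop the confidence interval for the difference in means $\mu_1 - \mu_2$ when both variances are equal, note that the distribution of the statistic

\[ T = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \]

is the t distribution with n1 + n2 − 2 degrees of freedom.

Therefore $P\left(-t_{\alpha/2;\,n_1+n_2-2} \le T \le t_{\alpha/2;\,n_1+n_2-2}\right) = 1 - \alpha$, so

\[ P\left(-t_{\alpha/2;\,n_1+n_2-2} \le \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \le t_{\alpha/2;\,n_1+n_2-2}\right) = 1 - \alpha \]

This can be arranged as

\[ P\left((\bar{X}_1 - \bar{X}_2) - t_{\alpha/2;\,n_1+n_2-2}\, S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \le \mu_1 - \mu_2 \le (\bar{X}_1 - \bar{X}_2) + t_{\alpha/2;\,n_1+n_2-2}\, S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}\right) = 1 - \alpha \]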

Definition 3

If $\bar{x}_1$, $\bar{x}_2$, $s_1^{2}$ and $s_2^{2}$ are the sample means and variances of two random samples of sizes n1 and n2, respectively, from two independent normal populations with unknown but equal variances, then a 100(1 − α)% confidence interval for $\mu_1 - \mu_2$ is

\[ (\bar{x}_1 - \bar{x}_2) - t_{\alpha/2;\,n_1+n_2-2}\, s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \le \mu_1 - \mu_2 \le (\bar{x}_1 - \bar{x}_2) + t_{\alpha/2;\,n_1+n_2-2}\, s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \]

where $s_p^{2}$ is a "pooled" estimator of σ² given by:

\[ s_p^{2} = \frac{(n_1 - 1)s_1^{2} + (n_2 - 1)s_2^{2}}{n_1 + n_2 - 2} \]

Example 3

Independent random samples of students from two classes in one middle school produced the following scores for a state social-studies exam:

Class A: 78, 84, 81, 78, 76, 83, 79, 75, 85, 81

Class B: 85, 75, 83, 87, 80, 79, 88, 94, 87, 82

Assume that the scores in both classes follow normal populations with the same variance. Is there any difference in scores between Class A and Class B? Construct a 95% confidence interval for the difference of the mean scores.

Solution: Let 1: Class A and 2: Class B.

n1 = n2 = 10, $\bar{x}_1$ = 80, $s_1$ = 3.3665, $\bar{x}_2$ = 84, $s_2$ = 5.3955, ν = 10 + 10 − 2 = 18, $t_{0.025}(18)$ = 2.101

\[ s_p = \sqrt{\frac{(10-1)(3.3665)^{2} + (10-1)(5.3955)^{2}}{10 + 10 - 2}} = 4.4969 \]

\[ (80 - 84) - (2.101)(4.4969)\sqrt{\frac{1}{10} + \frac{1}{10}} \le \mu_1 - \mu_2 \le (80 - 84) + (2.101)(4.4969)\sqrt{\frac{1}{10} + \frac{1}{10}} \]

\[ \Rightarrow -4 - 4.2253 \le \mu_1 - \mu_2 \le -4 + 4.2253 \]

\[ \Rightarrow -8.2253 \le \mu_1 - \mu_2 \le 0.2253 \]

A 95% confidence interval for the difference of the mean scores between Class A and Class B is −8.2253 to 0.2253. We cannot conclude that there is a difference in scores between Class A and Class B because the confidence interval contains zero.
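The pooled interval for the two classes can be recomputed directly from the raw scores. The sketch below is an illustration only: `pooled_t_interval` is a hypothetical helper, and the critical value 2.101 (ν = 18) is supplied from the t table because the standard library has no t quantile function.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_interval(sample1, sample2, t_crit):
    """Pooled-variance confidence interval for mu1 - mu2.
    Returns (s_p, lower, upper); t_crit is read from a t table."""
    n1, n2 = len(sample1), len(sample2)
    # Pooled variance: weighted average of the two sample variances
    sp2 = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / (n1 + n2 - 2)
    sp = sqrt(sp2)
    diff = mean(sample1) - mean(sample2)
    margin = t_crit * sp * sqrt(1 / n1 + 1 / n2)
    return sp, diff - margin, diff + margin


class_a = [78, 84, 81, 78, 76, 83, 79, 75, 85, 81]
class_b = [85, 75, 83, 87, 80, 79, 88, 94, 87, 82]

sp, low, high = pooled_t_interval(class_a, class_b, t_crit=2.101)
print(round(sp, 4), round(low, 4), round(high, 4))
```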

Confidence Interval of Two Means, Population Variances are Unknown but Assumed

Unequal Variances

When the variances of the populations are unequal, $\sigma_1^{2} \ne \sigma_2^{2}$, we may still find a 100(1 − α)% confidence interval for $\mu_1 - \mu_2$ using the fact that

\[ T^{*} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{S_1^{2}}{n_1} + \frac{S_2^{2}}{n_2}}} \]

is distributed approximately as t with degrees of freedom ν.

Definition 4

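If $\bar{x}_1$, $\bar{x}_2$, $s_1^{2}$ and $s_2^{2}$ are the sample means and variances of two random samples of sizes n1 and n2, respectively, from two independent normal populations with unknown and unequal variances, an approximate 100(1 − α)% confidence interval for $\mu_1 - \mu_2$ is

\[ (\bar{x}_1 - \bar{x}_2) - t_{\alpha/2;\,\nu}\sqrt{\frac{s_1^{2}}{n_1} + \frac{s_2^{2}}{n_2}} \le \mu_1 - \mu_2 \le (\bar{x}_1 - \bar{x}_2) + t_{\alpha/2;\,\nu}\sqrt{\frac{s_1^{2}}{n_1} + \frac{s_2^{2}}{n_2}} \]

where

\[ \nu = \frac{\left(\frac{s_1^{2}}{n_1} + \frac{s_2^{2}}{n_2}\right)^{2}}{\frac{\left(s_1^{2}/n_1\right)^{2}}{n_1 - 1} + \frac{\left(s_2^{2}/n_2\right)^{2}}{n_2 - 1}} \]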

Example 4

Using the data from Example 3, assume now that the two populations are normal with unequal variances. Is there a difference in mean scores between Class A and Class B? Construct a 95% confidence interval for the difference of the mean scores.

Solution:

$n_1 = n_2 = 10$, $\bar{x}_1 = 80$, $s_1 = 3.3665$, $\bar{x}_2 = 84$, $s_2 = 5.3955$, $t_{0.025}(15) = 2.131$

$$\nu = \frac{\left(\dfrac{3.3665^2}{10}+\dfrac{5.3955^2}{10}\right)^2}{\dfrac{\left(3.3665^2/10\right)^2}{10-1}+\dfrac{\left(5.3955^2/10\right)^2}{10-1}} \approx 15.08 \approx 15$$

$$(80-84) - (2.131)\sqrt{\frac{3.3665^2}{10}+\frac{5.3955^2}{10}} \le \mu_1-\mu_2 \le (80-84) + (2.131)\sqrt{\frac{3.3665^2}{10}+\frac{5.3955^2}{10}}$$

$$\Rightarrow -4 - 4.2856 \le \mu_1-\mu_2 \le -4 + 4.2856$$

$$\Rightarrow -8.2856 \le \mu_1-\mu_2 \le 0.2856$$

A 95% confidence interval for the difference of the mean scores between Class A and Class B is -8.2856 to 0.2856. Because the confidence interval contains zero, we cannot conclude that there is a difference in mean scores between Class A and Class B.
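The degrees of freedom and the interval for the unequal-variance case can be checked with a short script. This is a sketch in standard-library Python; the critical value 2.131 is read from the t table for 15 degrees of freedom, and the names `v1`, `v2` and `ci_welch` are illustrative.

```python
import math
import statistics

class_a = [78, 84, 81, 78, 76, 83, 79, 75, 85, 81]
class_b = [85, 75, 83, 87, 80, 79, 88, 94, 87, 82]
n1, n2 = len(class_a), len(class_b)

# v_i = s_i^2 / n_i, the estimated variance of each sample mean
v1 = statistics.variance(class_a) / n1
v2 = statistics.variance(class_b) / n2

# Approximate degrees of freedom from the definition above
nu = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

t_crit = 2.131  # t_{0.025}(15), taken from the t table (assumption: not computed here)
margin = t_crit * math.sqrt(v1 + v2)
diff = statistics.mean(class_a) - statistics.mean(class_b)
ci_welch = (diff - margin, diff + margin)

print(f"nu = {nu:.3f}, rounded to {round(nu)}")
print(f"95% CI for mu1 - mu2: ({ci_welch[0]:.4f}, {ci_welch[1]:.4f})")
```

Because the two samples have equal sizes, the standard error here equals the pooled one; only the critical value changes, which is why the interval is slightly wider.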


Confidence Interval on µ1-µ2 for Paired Observations

To construct the confidence interval for $\mu_D = \mu_1-\mu_2$, note that

$$T = \frac{\bar{d}-\mu_D}{S_d/\sqrt{n}}$$

follows a t distribution with n - 1 degrees of freedom. Then, since

$$P\left(-t_{\alpha/2;\,n-1} \le T \le t_{\alpha/2;\,n-1}\right) = 1-\alpha,$$

we can substitute for T in the above expression and perform the necessary steps to isolate $\mu_D = \mu_1-\mu_2$ between the inequalities. This leads to the following 100(1-α)% confidence interval on $\mu_D = \mu_1-\mu_2$.

Definition 5

If $\bar{d}$ and $s_d$ are the sample mean and standard deviation of the differences of n random pairs of normally distributed measurements, a 100(1-α)% confidence interval for µD = µ1-µ2 is

$$\bar{d} - t_{\alpha/2;\,n-1}\frac{s_d}{\sqrt{n}} \;\le\; \mu_D \;\le\; \bar{d} + t_{\alpha/2;\,n-1}\frac{s_d}{\sqrt{n}}$$

Example 5

A researcher wanted to find out the effect of a special diet on systolic blood pressure. She selected a sample of 7 adults and put them on this dietary plan for 3 months. The following table gives the systolic blood pressures of these 7 adults before and after the completion of this plan.

Before 210 180 195 220 231 199 224

After 193 186 186 223 220 183 233

Construct a 95% confidence interval for the difference of the mean systolic blood pressures

before and after the completion of the dietary plan.

Solution:

Before       210  180  195  220  231  199  224
After        193  186  186  223  220  183  233
Difference   -17    6   -9    3  -11  -16    9

where difference = after - before.

So $\bar{d} = -5$, $s_d = 10.79$, $\nu = 7-1 = 6$, $t_{0.025}(6) = 2.447$

A 95% confidence interval for µD is

$$-5 - (2.447)\frac{10.79}{\sqrt{7}} \le \mu_D \le -5 + (2.447)\frac{10.79}{\sqrt{7}}$$

$$\Rightarrow -5 - 9.979 \le \mu_D \le -5 + 9.979$$

$$\Rightarrow -14.979 \le \mu_D \le 4.979$$

A 95% confidence interval for the difference of the mean systolic blood pressures before and

after the completion of the dietary plan is -14.979 to 4.979.
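The paired interval can be reproduced from the raw data. The sketch below uses only the Python standard library and takes the critical value 2.447 from the t table; because it does not round $s_d$ to 10.79 before multiplying, its limits differ from the hand calculation in the third decimal place.

```python
import math
import statistics

before = [210, 180, 195, 220, 231, 199, 224]
after = [193, 186, 186, 223, 220, 183, 233]

# difference = after - before, one value per adult
diffs = [a - b for b, a in zip(before, after)]
n = len(diffs)

d_bar = statistics.mean(diffs)
s_d = statistics.stdev(diffs)  # sample standard deviation (n-1 divisor)

t_crit = 2.447  # t_{0.025}(6), taken from the t table (assumption: not computed here)
margin = t_crit * s_d / math.sqrt(n)
ci_paired = (d_bar - margin, d_bar + margin)

print(f"d_bar = {d_bar}, s_d = {s_d:.2f}")
print(f"95% CI for mu_D: ({ci_paired[0]:.3f}, {ci_paired[1]:.3f})")
```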


Exercises

1. A study of iron deficiency in infants compared samples of infants whose mothers chose different ways of feeding their babies. One group contained breast-fed infants. The babies in another group were fed a standard baby formula milk powder without iron supplements. The results on blood hemoglobin levels (iron levels) for babies 12 months of age are as follows:

Group         Sample Size   Sample S.D.   Sample Mean
Breast-fed    25            3.1           15.3
Milk powder   19            1.8           12.4

Is there evidence that the mean hemoglobin level is different between the two groups of babies? Assume the two groups have the same population variance. Construct a 95% confidence interval to support your claim.

2. In a study to test whether there is a difference between the average heights of adult females born in two different countries, random samples yielded the following results:

$n_1 = 120$, $\bar{x}_1 = 62.7$, $s_1 = 2.50$
$n_2 = 150$, $\bar{x}_2 = 61.8$, $s_2 = 2.62$

where the measurements are in inches. Find the 95% confidence interval for the difference of the means. What do you conclude?

3. The manager of a fleet of automobiles is testing two brands of radial tires and assigns one tire of each brand at random to the two rear wheels of eight cars and runs the cars until the tires wear out. The data (in kilometers) follow. Find a 99% confidence interval on the difference in mean life. Which brand would you prefer, based on this calculation?

Car Brand 1 Brand 2

1 36925 34318

2 45300 42280

3 36240 35500

4 32100 31950

5 37210 38015

6 48360 47800

7 38200 37810

8 33500 33215

4. Two machines are used for filling plastic bottles with a net volume of 16.0 ounces. The fill volume can be assumed normal, with standard deviations σ1 = 0.020 and σ2 = 0.025 ounces. A member of the quality engineering staff suspects that both machines fill to the same mean net volume, whether or not this volume is 16.0 ounces. A random sample of 10 bottles is taken from the output of each machine.

Machine 1: 16.03  16.04  16.05  16.05  16.02  16.01  15.96  15.98  16.02  15.99
Machine 2: 16.02  15.97  15.96  16.01  15.99  16.03  16.04  16.02  16.01  16.00

Find a 95% confidence interval on the difference in means. Provide a practical interpretation of this interval.





Topic 22 : Confidence Interval of Proportion

It is often necessary to construct confidence intervals on a population proportion. For example, suppose that a random sample of size n has been taken from a large (possibly infinite) population and that X (≤ n) observations in this sample belong to a class of interest. Then $\hat{P} = X/n$ is a point estimator of the proportion p of the population that belongs to this class.

Note that n and p are the parameters of a binomial distribution. We know that the sampling distribution of $\hat{P}$ is approximately normal with mean p and variance p(1-p)/n, if p is not too close to either 0 or 1 and if n is relatively large. Typically, to apply this approximation we require that np and n(1 - p) be greater than or equal to 5.

Definition 1

If n is large, the distribution of

$$Z = \frac{X-np}{\sqrt{np(1-p)}} = \frac{\hat{P}-p}{\sqrt{\dfrac{p(1-p)}{n}}}$$

is approximately standard normal.

To construct the confidence interval for p, note that

$$P\left(-z_{\alpha/2} \le Z \le z_{\alpha/2}\right) = 1-\alpha$$

so

$$P\left(-z_{\alpha/2} \le \frac{\hat{P}-p}{\sqrt{\dfrac{p(1-p)}{n}}} \le z_{\alpha/2}\right) = 1-\alpha$$

This may be rearranged as

$$P\left(\hat{P} - z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} \le p \le \hat{P} + z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}}\right) \cong 1-\alpha$$

The quantity $\sqrt{p(1-p)/n}$ is called the standard error of the point estimator $\hat{P}$. Unfortunately, the upper and lower limits of the confidence interval contain the unknown parameter p. However, a satisfactory solution is to replace p by $\hat{P}$ in the standard error, which results in

$$P\left(\hat{P} - z_{\alpha/2}\sqrt{\frac{\hat{P}(1-\hat{P})}{n}} \le p \le \hat{P} + z_{\alpha/2}\sqrt{\frac{\hat{P}(1-\hat{P})}{n}}\right) \cong 1-\alpha$$

This leads to the approximate 100(1-α)% confidence interval on p.


Definition 2

If $\hat{p}$ is the proportion of observations in a random sample of size n that belong to a class of interest, an approximate 100(1-α)% confidence interval on the proportion p of the population that belongs to this class is

$$\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \;\le\; p \;\le\; \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

Example

In a random sample of 1000 homes in a certain city, it is found that 228 are heated by oil. Find a

99% confidence interval for the proportion of homes from this city that are heated by oil.

Solution:

X is the number of homes that are heated by oil.

$\hat{p} = \dfrac{x}{n} = \dfrac{228}{1000} = 0.228$, $1-\hat{p} = \hat{q} = 0.772$, $z_{0.005} = 2.575$

A 99% confidence interval for the proportion of homes from this city that are heated by oil is

$$0.228 - 2.575\sqrt{\frac{(0.228)(0.772)}{1000}} \le p \le 0.228 + 2.575\sqrt{\frac{(0.228)(0.772)}{1000}}$$

$$\Rightarrow 0.228 - 0.034 \le p \le 0.228 + 0.034$$

$$\Rightarrow 0.194 \le p \le 0.262$$
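The same interval can be computed programmatically. The sketch below is one way to do it with Python's standard library, using `statistics.NormalDist` for the normal quantile; the function name `proportion_ci` is illustrative and not part of the handout.

```python
import math
from statistics import NormalDist


def proportion_ci(x, n, confidence=0.95):
    """Approximate large-sample confidence interval for a proportion."""
    p_hat = x / n
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# 228 of 1000 sampled homes are heated by oil; 99% confidence level
lo, hi = proportion_ci(228, 1000, confidence=0.99)
print(f"99% CI for p: ({lo:.3f}, {hi:.3f})")
```

The computed quantile is about 2.576 rather than the table value 2.575, which does not change the limits at three decimal places.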

Exercises

1. Of 346 items tested, 12 are found to be defective. Construct a 98% confidence interval for

the percentage of all such items that are defective.

2. Of 81 adults selected randomly from one town, 64 have health insurance. Construct a

90% confidence interval for the percentage of all adults in the town who have health

insurance.

3. A study involves 634 randomly selected deaths, with 29 of them caused by accidents.

Construct a 98% confidence interval for the percentage of all deaths that are caused by

accidents.





Topic 23 : Confidence Interval of The Difference Between Two Proportions

The confidence interval for $p_1 - p_2$ can be found directly, since we know that

$$Z = \frac{(\hat{P}_1-\hat{P}_2)-(p_1-p_2)}{\sqrt{\dfrac{p_1(1-p_1)}{n_1}+\dfrac{p_2(1-p_2)}{n_2}}}$$

is approximately a standard normal random variable. Thus $P\left(-z_{\alpha/2} \le Z \le z_{\alpha/2}\right) \cong 1-\alpha$, so we can substitute for Z in this last expression and use an approach similar to the one employed previously to find an approximate 100(1-α)% two-sided confidence interval for $p_1 - p_2$.

Definition

If $\hat{p}_1$ and $\hat{p}_2$ are the sample proportions of observations in two independent random samples of sizes $n_1$ and $n_2$ that belong to a class of interest, an approximate two-sided 100(1-α)% confidence interval on the difference in the true proportions $p_1 - p_2$ is

$$(\hat{p}_1-\hat{p}_2) - z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1}+\frac{\hat{p}_2(1-\hat{p}_2)}{n_2}} \;\le\; p_1-p_2 \;\le\; (\hat{p}_1-\hat{p}_2) + z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1}+\frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$$

Example

A survey was carried out to study the relationship between the incidence of heart disease (the proportion of people who have suffered from heart disease) and the smoking habit. A researcher randomly selected a sample of 200 men who are 60 years old and asked them if they are smokers and if they have ever suffered from heart disease. The results are given below.

             Suffered from heart disease   Did not suffer from heart disease   Total
Smoker       19                            37                                  56
Nonsmoker    25                            119                                 144
Total        44                            156                                 200

a. Find the proportion of people who suffered from heart disease for the smoker and nonsmoker groups in the sample.

b. An insurance company decided to offer discounts on its life insurance policies to nonsmokers if the incidence of heart disease for smokers is higher than that for nonsmokers. Construct a 95% confidence interval.

Solution:

Let 1: the smokers, 2: the nonsmokers

a. $\hat{p}_1 = 19/56 = 0.3393$, $\hat{p}_2 = 25/144 = 0.1736$

b. A 95% confidence interval for the difference of proportions between the smokers and nonsmokers is

$$(0.3393-0.1736) - 1.96\sqrt{\frac{(0.3393)(0.6607)}{56}+\frac{(0.1736)(0.8264)}{144}} \le p_1-p_2 \le (0.3393-0.1736) + 1.96\sqrt{\frac{(0.3393)(0.6607)}{56}+\frac{(0.1736)(0.8264)}{144}}$$

$$\Rightarrow 0.1657 - 0.1386 \le p_1-p_2 \le 0.1657 + 0.1386$$

$$\Rightarrow 0.0271 \le p_1-p_2 \le 0.3043$$

Zero is not inside the 95% confidence interval, and both the lower and upper limits are positive, so we can conclude that the incidence of heart disease for smokers is higher than that for nonsmokers.
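The interval for the difference of two proportions can be checked with a few lines of code. This is a sketch using Python's standard library only; the function name `two_proportion_ci` is an illustrative choice.

```python
import math
from statistics import NormalDist


def two_proportion_ci(x1, n1, x2, n2, confidence=0.95):
    """Approximate confidence interval for p1 - p2 from two independent samples."""
    p1, p2 = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # z_{alpha/2}
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se


# Smokers: 19 of 56 with heart disease; nonsmokers: 25 of 144
lower, upper = two_proportion_ci(19, 56, 25, 144)
print(f"95% CI for p1 - p2: ({lower:.4f}, {upper:.4f})")
print("Interval excludes zero:", lower > 0 or upper < 0)
```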

Exercise

In a winter of an epidemic flu, babies were surveyed by a well-known pharmaceutical

company to determine if the company’s new medicine was effective after two days. Among

120 babies who had the flu and were given the medicine, 29 were cured within two days.

Among 280 babies who had the flu but were not given the medicine, 56 were cured within

two days. Construct and interpret a 99% confidence interval for (p1 − p2).





Topic 24 : Confidence Interval of Variance

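The construction of the 100(1-α)% confidence interval for $\sigma^2$ is straightforward. Because

$$X^2 = \frac{(n-1)S^2}{\sigma^2}$$

is chi-square with n – 1 degrees of freedom, we may write

$$P\left(\chi^2_{1-\alpha/2;\,n-1} \le X^2 \le \chi^2_{\alpha/2;\,n-1}\right) = 1-\alpha$$

so that

$$P\left(\chi^2_{1-\alpha/2;\,n-1} \le \frac{(n-1)S^2}{\sigma^2} \le \chi^2_{\alpha/2;\,n-1}\right) = 1-\alpha$$

This last equation can be rearranged as

$$P\left(\frac{(n-1)S^2}{\chi^2_{\alpha/2;\,n-1}} \le \sigma^2 \le \frac{(n-1)S^2}{\chi^2_{1-\alpha/2;\,n-1}}\right) = 1-\alpha$$

This leads to the following definition of the confidence interval for $\sigma^2$.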

Definition

If $s^2$ is the sample variance from a random sample of n observations from a normal distribution with unknown variance $\sigma^2$, then a 100(1-α)% confidence interval on $\sigma^2$ is

$$\frac{(n-1)s^2}{\chi^2_{\alpha/2;\,n-1}} \;\le\; \sigma^2 \;\le\; \frac{(n-1)s^2}{\chi^2_{1-\alpha/2;\,n-1}}$$

Example

A manufacturer of soft drink beverages is interested in the uniformity of the machine used to

fill cans. Specifically, it is desirable that the standard deviation, σ, of the filling process be

less than 0.2 fluid ounces; otherwise there will be a higher than allowable percentage of cans

that are under filled. We will assume that fill volume is approximately normally distributed. A

random sample of 20 cans results in a sample variance of 0.0225 (fluid ounces)². Find a 95%

confidence interval on the variance and the standard deviation.

Solution:

$n = 20$, $s^2 = 0.0225$, $\chi^2_{0.025;\,19} = 32.852$, $\chi^2_{0.975;\,19} = 8.907$.

A 95% confidence interval on the variance is

$$\frac{(20-1)(0.0225)}{32.852} \le \sigma^2 \le \frac{(20-1)(0.0225)}{8.907}$$

$$\Rightarrow 0.0130 \le \sigma^2 \le 0.0480$$

A 95% confidence interval on the standard deviation is

$$\sqrt{0.0130} \le \sigma \le \sqrt{0.0480}$$

$$\Rightarrow 0.114 \le \sigma \le 0.219$$
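As a numerical check, the definition can be applied directly to the sample variance $s^2 = 0.0225$ (the formula takes $s^2$ itself, not its square). The sketch below uses standard-library Python; the chi-square quantiles 32.852 and 8.907 are copied from the table for 19 degrees of freedom because the standard library has no chi-square quantile function.

```python
import math

n = 20
s2 = 0.0225  # sample variance, in (fluid ounces)^2

# Chi-square table values for 19 degrees of freedom (assumption: read from a table)
chi2_upper = 32.852  # chi^2_{0.025; 19}
chi2_lower = 8.907   # chi^2_{0.975; 19}

var_lo = (n - 1) * s2 / chi2_upper
var_hi = (n - 1) * s2 / chi2_lower

# Limits for sigma are the square roots of the limits for sigma^2
sd_lo, sd_hi = math.sqrt(var_lo), math.sqrt(var_hi)

print(f"95% CI for sigma^2: ({var_lo:.4f}, {var_hi:.4f})")
print(f"95% CI for sigma:   ({sd_lo:.4f}, {sd_hi:.4f})")
```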


Exercises

1. Seventeen observations of breakdown voltage were collected, giving n = 17 and s² = 137324.3. Find a 95% confidence interval for σ².

2. The sugar content of the syrup in canned peaches is normally distributed. A random

sample of n = 10 cans yields a sample standard deviation of s = 4.8 milligrams. Find a

95% two-sided confidence interval for σ.





Topic 25 : Confidence Interval on The Ratio of Two Variances

To find the confidence interval on $\sigma_1^2/\sigma_2^2$, recall that the sampling distribution of

$$F = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2}$$

is an F distribution with $\nu_1 = n_1-1$ and $\nu_2 = n_2-1$ degrees of freedom.

Therefore, $P\left(f_{1-\alpha/2;\,\nu_1,\nu_2} \le F \le f_{\alpha/2;\,\nu_1,\nu_2}\right) = 1-\alpha$. Substitution for F and manipulation of the inequalities will lead to the 100(1-α)% confidence interval for $\sigma_1^2/\sigma_2^2$.

Definition

If $s_1^2$ and $s_2^2$ are the sample variances of random samples of sizes $n_1$ and $n_2$, respectively, from two independent normal populations with unknown variances $\sigma_1^2$ and $\sigma_2^2$, then a 100(1-α)% confidence interval on the ratio $\sigma_1^2/\sigma_2^2$ is

$$\frac{s_1^2}{s_2^2}\,\frac{1}{f_{\alpha/2;\,\nu_1,\nu_2}} \;\le\; \frac{\sigma_1^2}{\sigma_2^2} \;\le\; \frac{s_1^2}{s_2^2}\, f_{\alpha/2;\,\nu_2,\nu_1}$$

where $\nu_1 = n_1-1$ and $\nu_2 = n_2-1$ are the numerator and denominator degrees of freedom, respectively.

Example

In a batch chemical process used for etching printed circuit boards, two different catalysts are being compared to determine whether they require different immersion times for removal of identical quantities of photoresist material. Twelve batches were run with catalyst 1, resulting in a sample mean immersion time of 24.6 minutes and a sample standard deviation of 0.85 minutes. Fifteen batches were run with catalyst 2, resulting in a mean immersion time of 22.1 minutes and a standard deviation of 0.98 minutes. Find a 90% confidence interval on the ratio of variances.

Solution:

Let 1: catalyst 1, 2: catalyst 2

$n_1 = 12$, $s_1 = 0.85$, $n_2 = 15$, $s_2 = 0.98$, $\nu_1 = 11$, $\nu_2 = 14$, $f_{0.05}(11,14) = 2.565$, $f_{0.05}(14,11) = 2.755$.

A 90% confidence interval on the ratio of variances is

$$\frac{0.85^2}{0.98^2}\cdot\frac{1}{2.565} \le \frac{\sigma_1^2}{\sigma_2^2} \le \frac{0.85^2}{0.98^2}\,(2.755)$$

$$\Rightarrow 0.2933 \le \frac{\sigma_1^2}{\sigma_2^2} \le 2.0726$$
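The arithmetic of this example is easy to verify in code. The following sketch uses plain Python; the two F quantiles are the values quoted in the solution (table values, not computed), since the standard library has no F quantile function.

```python
n1, s1 = 12, 0.85  # catalyst 1: sample size and sample standard deviation
n2, s2 = 15, 0.98  # catalyst 2

# F table values quoted in the solution (assumption: read from a table)
f_11_14 = 2.565  # f_{0.05}(11, 14)
f_14_11 = 2.755  # f_{0.05}(14, 11)

ratio = s1 ** 2 / s2 ** 2  # point estimate of sigma1^2 / sigma2^2
ratio_lo = ratio / f_11_14
ratio_hi = ratio * f_14_11

print(f"s1^2 / s2^2 = {ratio:.4f}")
print(f"90% CI for sigma1^2 / sigma2^2: ({ratio_lo:.4f}, {ratio_hi:.4f})")
```

Since the interval contains 1, these data do not show that the two variances differ.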


Exercise

Suppose we had the grade-point averages (GPAs) of random samples of computer science

majors (Population 1) and other engineering majors (Population 2) at a university. Suppose

we know

$\bar{x}_1 = 3.25$, $s_1^2 = 0.0625$, $\bar{x}_2 = 3.36$, $s_2^2 = 0.01$, $n_1 = 100$, $n_2 = 144$

a. Find a 98% confidence interval for the ratio of the variance in GPAs of computer science

majors to the variance in GPAs of other engineering majors based on this data.

b. What can we say from the confidence interval in a) about which group has the larger

variance?

c. What can we say from the confidence interval in a) about which group has the larger

standard deviation?





Topic 26 : Hypothesis Testing (Hypothesis Testing for Mean)

The Basic Concepts of Hypothesis Testing

Often we want to decide whether to accept or reject a statement about a parameter. The statement is called a hypothesis, and the decision-making procedure about the hypothesis is called hypothesis testing.

A statistical hypothesis is a statement about the parameters of one or more populations.

Every hypothesis test begins with a mathematical statement of the claim. This claim must then be identified as either the null or alternate hypothesis. The remaining hypothesis statement (its complement) must also be given. Whichever of the two the claim turns out to be, the null hypothesis (H0) always includes the condition of equality. There are three possible combinations for the null and alternate hypothesis statements.

Null Hypothesis – H0   Alternate Hypothesis – H1
=                      ≠
≤                      >
≥                      <

Initial Conclusion

Based on the location of test statistics, we will draw the initial conclusion. There are only two

possible statements for the initial conclusion.

1. Reject H0

2. Fail to reject H0

Final Conclusion

The null hypothesis is always assumed true in the beginning of a hypothesis test. The equality

in this hypothesis determines where to center the sampling distribution. This assumption is

much like the justice system in which the defendant is always assumed innocent at the

beginning of a trial.

Null Hypothesis (H0) : defendant is innocent

Alternate Hypothesis (H1) : defendant is guilty

There are four possible outcomes to a trial, depending on the status of the defendant and the court decision.

H0 is Rejected (H1 is accepted, defendant is found guilty)
• H0 is True (defendant is innocent): Incorrect Conclusion – Type I Error. Innocent found guilty; innocent sent to jail (?); Prob (Type I Error) = α
• H0 is False (H1 is True, defendant is guilty): Correct Conclusion. Guilty found guilty

H0 is Not Rejected (defendant is found not guilty)
• H0 is True (defendant is innocent): Correct Conclusion. Innocent found not guilty
• H0 is False (H1 is True, defendant is guilty): Incorrect Conclusion – Type II Error. Guilty found not guilty; guilty set free; Prob (Type II Error) = β


In a trial, the probability of a Type I error is kept as small as possible. We do not want to send innocent people to prison. So, juries are instructed that they must believe "beyond all reasonable doubt" that the defendant is guilty. As a trade-off, we increase the chance that a guilty person is set free. That is, for a fixed amount of evidence, as the probability of a Type I error, α, is decreased, the probability of a Type II error, β, is increased. In a hypothesis test, we also try to keep the probability of a Type I error small.

Based on the relationship between the claim, the null hypothesis and the initial conclusion, we reach the final conclusion regarding the claim.

Claim is H0
• Reject H0: Reject Claim. There is sufficient evidence to reject the claim that …
• Fail to Reject H0: Don’t Reject Claim. There is insufficient evidence to reject the claim that …

Claim is H1
• Reject H0: Support Claim. There is sufficient evidence to support the claim that …
• Fail to Reject H0: Don’t Support Claim. There is insufficient evidence to support the claim that …

Procedures in Hypothesis Testing

From the problem context, identify the parameter of interest and then follow the procedure for hypothesis testing:

1. Formulate the null and alternative hypotheses.

2. Choose a level of significance (α).

3. Determine an appropriate test statistic.

4. State the rejection region for the statistic.

5. Calculate any necessary sample quantities, substitute these into the equation for the test statistic, and calculate that value.

6. Decide whether or not H0 should be rejected, and state whether there is enough evidence to support the alternative hypothesis in the context of the problem.

Hypothesis Testing For Mean, Variance Known

Suppose that we wish to test the hypotheses

H0 : µ = µ0
H1 : µ ≠ µ0

where µ0 is a specified constant. We have a random sample X1, X2, … , Xn from a normal population. If the null hypothesis is true, $\bar{X}$ has a normal distribution (i.e., the sampling distribution of $\bar{X}$ is normal) with mean µ0 and standard deviation $\sigma/\sqrt{n}$, so we could construct a critical region based on the computed value of the sample mean. It is usually more convenient to standardize the sample mean and use a test statistic based on the standard normal distribution. That is, the test procedure for H0 : µ = µ0 uses the test statistic

$$Z = \frac{\bar{X}-\mu_0}{\sigma/\sqrt{n}}$$

Test statistic (variance known): $z = \dfrac{\bar{x}-\mu_0}{\sigma/\sqrt{n}}$

Null Hypothesis               Alternative Hypothesis   Decision Criteria
H0 : µ = µ0                   H1 : µ ≠ µ0              Reject H0 if z > zα/2 or z < -zα/2
H0 : µ = µ0 or H0 : µ ≥ µ0    H1 : µ < µ0              Reject H0 if z < -zα
H0 : µ = µ0 or H0 : µ ≤ µ0    H1 : µ > µ0              Reject H0 if z > zα


Example 1

We believe that systolic blood pressure averages 120 mmHg in the population with a standard deviation of 10 mmHg. If we randomly sampled 50 people and found that their mean systolic blood pressure was 130 mmHg, what could we say about the assumption that the population mean is 120?

Solution:

$\mu_0 = 120$, $\sigma = 10$, $n = 50$, $\bar{x} = 130$

Hypothesis:
H0 : µ = 120
H1 : µ ≠ 120

Significance Level : α = 0.05

Test Statistic:

$$z = \frac{\bar{x}-\mu_0}{\sigma/\sqrt{n}}$$

Decision Criteria: z0.025 = 1.96
Reject H0 if z > 1.96 or z < -1.96

Calculate:

$$z = \frac{130-120}{10/\sqrt{50}} = 7.071$$

Conclusion:
Because z = 7.071 > 1.96, H0 is rejected. We can conclude that the population mean of systolic blood pressure is not equal to 120 mmHg.
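The decision rule for the mean with known variance can be written as a small function. This is a sketch in standard-library Python; the function name `z_test_mean` and the `alternative` labels are illustrative choices, not notation from the handout.

```python
import math
from statistics import NormalDist


def z_test_mean(x_bar, mu0, sigma, n, alpha=0.05, alternative="two-sided"):
    """One-sample z test for a mean when sigma is known.

    Returns the test statistic, the positive critical value, and the decision.
    """
    z = (x_bar - mu0) / (sigma / math.sqrt(n))
    std_normal = NormalDist()
    if alternative == "two-sided":      # H1: mu != mu0
        crit = std_normal.inv_cdf(1 - alpha / 2)
        reject = abs(z) > crit
    elif alternative == "less":         # H1: mu < mu0
        crit = std_normal.inv_cdf(1 - alpha)
        reject = z < -crit
    elif alternative == "greater":      # H1: mu > mu0
        crit = std_normal.inv_cdf(1 - alpha)
        reject = z > crit
    else:
        raise ValueError("alternative must be 'two-sided', 'less' or 'greater'")
    return z, crit, reject


# Example 1: x_bar = 130, mu0 = 120, sigma = 10, n = 50, alpha = 0.05
z, crit, reject = z_test_mean(130, 120, 10, 50)
print(f"z = {z:.3f}, critical value = {crit:.3f}, reject H0: {reject}")
```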

Hypothesis Testing For Mean, Variance Unknown, Large Sample

The preceding test procedure for the mean assumed that the population is normally distributed and that $\sigma^2$ is known. In many if not most practical situations $\sigma^2$ will be unknown. Furthermore, we may not be certain that the population is well modeled by a normal distribution. In these situations, if n is large (say n ≥ 30), the sample standard deviation s can be substituted for σ in the test procedures with little effect.

Test statistic (variance unknown, n ≥ 30): $z = \dfrac{\bar{x}-\mu_0}{s/\sqrt{n}}$

Null Hypothesis               Alternative Hypothesis   Decision Criteria
H0 : µ = µ0                   H1 : µ ≠ µ0              Reject H0 if z > zα/2 or z < -zα/2
H0 : µ = µ0 or H0 : µ ≥ µ0    H1 : µ < µ0              Reject H0 if z < -zα
H0 : µ = µ0 or H0 : µ ≤ µ0    H1 : µ > µ0              Reject H0 if z > zα

Example 2

The weights of a fish in a certain pond that is regularly stocked are considered to be normally

distributed with a mean of 3.1 pounds. A random sample of size 50 is selected from the pond

and the sample mean is found to be 2.4 pounds and the sample standard deviation is 1.1

pounds. Is there sufficient evidence to indicate that the mean weight of the fish is less than 3.1

pounds? Use a 10% level of significance.

Solution:

$\mu_0 = 3.1$, $n = 50$, $\bar{x} = 2.4$, $s = 1.1$

Hypothesis:
H0 : µ = 3.1
H1 : µ < 3.1

Significance Level : α = 0.10

Test Statistic:

$$z = \frac{\bar{x}-\mu_0}{s/\sqrt{n}}$$

Decision Criteria: z0.10 = 1.28
Reject H0 if z < -1.28

Calculate:

$$z = \frac{2.4-3.1}{1.1/\sqrt{50}} = -4.500$$

Conclusion:
Because z = -4.500 < -1.28, H0 is rejected. We can conclude that there is sufficient evidence to indicate that the mean weight of the fish is less than 3.1 pounds.

Hypothesis Testing For Mean, Variance Unknown

For hypothesis testing on the mean of a population with unknown variance $\sigma^2$, we will assume that the population distribution is at least approximately normal. If $X_1, X_2, \ldots, X_n$ is a random sample from a normal distribution with mean µ and variance $\sigma^2$, the random variable

$$T = \frac{\bar{X}-\mu}{S/\sqrt{n}}$$

has a t distribution with n-1 degrees of freedom.

Now consider testing the hypotheses

H0 : µ = µ0
H1 : µ ≠ µ0

We will use the test statistic

$$T = \frac{\bar{X}-\mu_0}{S/\sqrt{n}}$$

If the null hypothesis is true, T has a t distribution with n-1 degrees of freedom.

Test statistic (variance unknown): $t = \dfrac{\bar{x}-\mu_0}{s/\sqrt{n}}$

Null Hypothesis               Alternative Hypothesis   Decision Criteria
H0 : µ = µ0                   H1 : µ ≠ µ0              Reject H0 if t > tα/2;n-1 or t < -tα/2;n-1
H0 : µ = µ0 or H0 : µ ≥ µ0    H1 : µ < µ0              Reject H0 if t < -tα;n-1
H0 : µ = µ0 or H0 : µ ≤ µ0    H1 : µ > µ0              Reject H0 if t > tα;n-1

Example 3

Average systolic blood pressure of a normal male is supposed to be about 129 mmHg.

Measurements of systolic blood pressure on a sample of 12 adult males from a community whose

dietary habits are suspected of causing high blood pressure are listed below:

115 134 131 143 130 154 119 137 155 130 110 138

Is it reasonable to claim that the mean of systolic blood pressure of a normal male is greater

than 129 mmHg? Using α = 0.05.

Solution:

$\mu_0 = 129$, $n = 12$, $\bar{x} = 133$, $s = 13.941$

Hypothesis:
H0 : µ = 129
H1 : µ > 129

Significance Level : α = 0.05

Test Statistic:

$$t = \frac{\bar{x}-\mu_0}{s/\sqrt{n}}$$

Decision Criteria: t0.05(11) = 1.796
Reject H0 if t > 1.796

Calculate:

$$t = \frac{133-129}{13.941/\sqrt{12}} = 0.994$$

Conclusion:
Because t = 0.994 < 1.796, we fail to reject H0. There is insufficient evidence to support the claim that the mean systolic blood pressure is greater than 129 mmHg.
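The sample quantities and the test statistic of Example 3 can be recomputed from the raw measurements. The sketch below uses Python's standard library; the critical value 1.796 is read from the t table for 11 degrees of freedom, since the standard library provides no t quantile function.

```python
import math
import statistics

pressures = [115, 134, 131, 143, 130, 154, 119, 137, 155, 130, 110, 138]
mu0 = 129

n = len(pressures)
x_bar = statistics.mean(pressures)
s = statistics.stdev(pressures)  # sample standard deviation (n-1 divisor)

t_stat = (x_bar - mu0) / (s / math.sqrt(n))

t_crit = 1.796  # t_{0.05}(11), taken from the t table (assumption: not computed here)
reject_h0 = t_stat > t_crit  # upper-tailed test, H1: mu > 129

print(f"x_bar = {x_bar}, s = {s:.3f}, t = {t_stat:.3f}")
print("Reject H0:", reject_h0)
```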

Exercises

1. A random sample of 12 second-year university students enrolled in a business

statistics course was drawn. At the course's completion, each student was asked how

many hours he or she spent doing homework in statistics. The data are listed here. It is

known that the population standard deviation is 10. The instructor has recommended

that students devote 3 hours per week for the duration of the 12-week semester, for a

total of 36 hours. Test to determine whether there is evidence that the average student

spent less than the recommended amount of time. Use a α = 0.01.

31 45 35 30 36 38

29 40 38 30 35 38

2. A manufacturer of light bulbs advertises that, on average, its long-life bulb will last

more than 5000 hours. To test the claim, a statistician took a random sample of 100

bulbs and measured the amount of time until each bulb burned out and get the mean is

5200 hours. If we assume that the lifetime of this type of bulb has a standard deviation

of 400 hours, can we conclude at the 5% significance level that the claim is true?

3. The company claims that customers calling in to make trades on the stock market are

left on hold for an average of about 22 seconds. Suppose that an account executive

was concerned about the satisfaction level of the clients and selected a random sample

of 10 traders who phoned. The length of time that a client was left on hold was

recorded. Assume that the sample mean was 23.5 seconds with a standard deviation of

8 seconds. Is there sufficient evidence to indicate that customers are left on hold for an

average length of time greater than 22 seconds? Use a 5% level of significance.

4. In an advertisement, a pizza shop claims that its mean delivery time is less than 30

minutes. A random selection of 36 delivery times yields a sample mean of 28.5

minutes and a standard deviation of 3.5 minutes. Does this provide sufficient evidence

to support the claim at a significance level of α = 0.01?





Topic 27 : Hypothesis Testing (Hypothesis Testing for Two Means)

Hypothesis Testing for Two Means, Variance Known

Suppose a random sample of size n1 is to be taken from a population with mean µ1 and standard deviation σ1, and a random sample of size n2 is to be taken from a population with mean µ2 and standard deviation σ2. Further suppose the two samples are to be selected independently. Then the random variable $\bar{X}_1-\bar{X}_2$ is approximately normally distributed and has mean $\mu_{\bar{X}_1-\bar{X}_2} = \mu_1-\mu_2$ and standard deviation $\sigma_{\bar{X}_1-\bar{X}_2} = \sqrt{\sigma_1^2/n_1+\sigma_2^2/n_2}$. Thus the standardized random variable,

$$Z = \frac{(\bar{X}_1-\bar{X}_2)-(\mu_1-\mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1}+\dfrac{\sigma_2^2}{n_2}}}$$

has approximately the standard normal distribution.

Test statistic (variances known), with µ1-µ2 replaced by its hypothesized value d0:

$$z = \frac{(\bar{x}_1-\bar{x}_2)-(\mu_1-\mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1}+\dfrac{\sigma_2^2}{n_2}}}$$

Null Hypothesis                           Alternative Hypothesis   Decision Criteria
H0 : µ1-µ2 = d0                           H1 : µ1-µ2 ≠ d0          Reject H0 if z > zα/2 or z < -zα/2
H0 : µ1-µ2 = d0 or H0 : µ1-µ2 ≥ d0        H1 : µ1-µ2 < d0          Reject H0 if z < -zα
H0 : µ1-µ2 = d0 or H0 : µ1-µ2 ≤ d0        H1 : µ1-µ2 > d0          Reject H0 if z > zα

Example 1

Before a training session for call centre employees, a sample of 50 calls to the call centre had an average duration of 5 minutes, whereas after the training session a sample of 45 calls had an average duration of 4.5 minutes. The population variance is known to have been 1.5 minutes² before the course and 2 minutes² afterwards. Has the course been effective? Use α = 0.01.

Solution:

Let 1: The duration before training
2: The duration after training

$n_1 = 50$, $\bar{x}_1 = 5$, $\sigma_1^2 = 1.5$, $n_2 = 45$, $\bar{x}_2 = 4.5$, $\sigma_2^2 = 2$

Hypothesis:
H0: µ1 - µ2 = 0
H1: µ1 - µ2 > 0 (the course is effective if the mean call duration is shorter after training)

Significance Level: α = 0.01

Test Statistic:

$$z = \frac{(\bar{x}_1-\bar{x}_2)-(\mu_1-\mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1}+\dfrac{\sigma_2^2}{n_2}}}$$

Decision Criteria: z0.01 = 2.325
Reject H0 if z > 2.325

Calculate:

$$z = \frac{(5-4.5)-0}{\sqrt{\dfrac{1.5}{50}+\dfrac{2}{45}}} = 1.832$$

Conclusion:
Because z = 1.832 < 2.325, H0 is not rejected. We fail to reject the null hypothesis: at the 1% level there is insufficient evidence to conclude that the course has been effective.

Hypothesis Testing for Two Means, Variance Unknown, Large Sample

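If we have large samples (say n1 ≥ 30 and n2 ≥ 30), the sample variances $s_1^2$ and $s_2^2$ can be substituted for $\sigma_1^2$ and $\sigma_2^2$ respectively, so the random variable is

$$Z = \frac{(\bar{X}_1-\bar{X}_2)-(\mu_1-\mu_2)}{\sqrt{\dfrac{S_1^2}{n_1}+\dfrac{S_2^2}{n_2}}}$$

Test statistic (variances unknown, n1 ≥ 30 and n2 ≥ 30), with µ1-µ2 replaced by its hypothesized value d0:

$$z = \frac{(\bar{x}_1-\bar{x}_2)-(\mu_1-\mu_2)}{\sqrt{\dfrac{s_1^2}{n_1}+\dfrac{s_2^2}{n_2}}}$$

Null Hypothesis                           Alternative Hypothesis   Decision Criteria
H0 : µ1-µ2 = d0                           H1 : µ1-µ2 ≠ d0          Reject H0 if z > zα/2 or z < -zα/2
H0 : µ1-µ2 = d0 or H0 : µ1-µ2 ≥ d0        H1 : µ1-µ2 < d0          Reject H0 if z < -zα
H0 : µ1-µ2 = d0 or H0 : µ1-µ2 ≤ d0        H1 : µ1-µ2 > d0          Reject H0 if z > zα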

Example 2

Do employees perform better at work with music playing? The music was turned on during the working hours of a business with 45 employees. Their productivity level averaged 5.2 with a standard deviation of 2.4. On a different day the music was turned off and there were 40 workers. The workers' productivity level averaged 4.8 with a standard deviation of 1.2. What can we conclude at the 0.05 level?

Solution

Let 1: the music was turned on during the working hours of a business
2: the music was turned off during the working hours of a business

$n_1 = 45$, $\bar{x}_1 = 5.2$, $s_1 = 2.4$
$n_2 = 40$, $\bar{x}_2 = 4.8$, $s_2 = 1.2$

Hypothesis:
H0: µ1 - µ2 = 0
H1: µ1 - µ2 > 0

Significance Level : α = 0.05

Test Statistic:

$$z = \frac{(\bar{x}_1-\bar{x}_2)-(\mu_1-\mu_2)}{\sqrt{\dfrac{s_1^2}{n_1}+\dfrac{s_2^2}{n_2}}}$$

Decision Criteria: z0.05 = 1.645
Reject H0 if z > 1.645

Calculate:

$$z = \frac{(5.2-4.8)-0}{\sqrt{\dfrac{2.4^2}{45}+\dfrac{1.2^2}{40}}} = 0.988$$

Conclusion:
Because z = 0.988 < 1.645, H0 is not rejected. We fail to reject the null hypothesis: there is insufficient evidence to conclude that workers perform better at work when the music is on.
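The large-sample two-mean test in Example 2 can be reproduced with a few lines of code. This is a sketch in standard-library Python; the helper name `two_sample_z` is illustrative, and the hypothesized difference d0 defaults to zero as in the example.

```python
import math
from statistics import NormalDist


def two_sample_z(x1_bar, s1, n1, x2_bar, s2, n2, d0=0.0):
    """Large-sample z statistic for H0: mu1 - mu2 = d0."""
    se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    return ((x1_bar - x2_bar) - d0) / se


alpha = 0.05
z_music = two_sample_z(5.2, 2.4, 45, 4.8, 1.2, 40)
z_crit = NormalDist().inv_cdf(1 - alpha)  # z_{0.05}, upper-tailed test
reject_music = z_music > z_crit

print(f"z = {z_music:.3f}, critical value = {z_crit:.3f}")
print("Reject H0:", reject_music)
```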

Hypothesis Testing for Two Means, Variance Unknown and Assumed Equal

Suppose independent random samples of sizes n1 and n2 are to be taken from two normally distributed populations with means µ1 and µ2, respectively. Further suppose the standard deviations of the two populations are equal. Then the random variable

$$T = \frac{(\bar{X}_1-\bar{X}_2)-(\mu_1-\mu_2)}{S_p\sqrt{\dfrac{1}{n_1}+\dfrac{1}{n_2}}}$$

where

$$S_p = \sqrt{\frac{(n_1-1)S_1^2+(n_2-1)S_2^2}{n_1+n_2-2}}$$

has the t distribution with degrees of freedom $\nu = n_1+n_2-2$.

Test statistic (variances unknown, $\sigma_1^2 = \sigma_2^2$), with µ1-µ2 replaced by its hypothesized value d0:

$$t = \frac{(\bar{x}_1-\bar{x}_2)-(\mu_1-\mu_2)}{s_p\sqrt{\dfrac{1}{n_1}+\dfrac{1}{n_2}}}$$

Null Hypothesis                           Alternative Hypothesis   Decision Criteria
H0 : µ1-µ2 = d0                           H1 : µ1-µ2 ≠ d0          Reject H0 if t > tα/2;ν or t < -tα/2;ν
H0 : µ1-µ2 = d0 or H0 : µ1-µ2 ≥ d0        H1 : µ1-µ2 < d0          Reject H0 if t < -tα;ν
H0 : µ1-µ2 = d0 or H0 : µ1-µ2 ≤ d0        H1 : µ1-µ2 > d0          Reject H0 if t > tα;ν

Example 3

A company is interested in knowing if two branches have the same level of average transactions. The company samples a small number of transactions and calculates the following statistics:

Shop 1   $\bar{x}_1 = 130$   $s_1^2 = 700$   $n_1 = 12$
Shop 2   $\bar{x}_2 = 120$   $s_2^2 = 800$   $n_2 = 15$

Test whether or not the two branches have (on average) the same level of transactions. Use α = 0.05. Assume that $\sigma_1^2 = \sigma_2^2$.

Solution:

Let 1: shop 1
2: shop 2

Hypothesis:
H0: µ1 - µ2 = 0
H1: µ1 - µ2 ≠ 0

Significance Level : α = 0.05

Test Statistic:

$$t = \frac{(\bar{x}_1-\bar{x}_2)-(\mu_1-\mu_2)}{s_p\sqrt{\dfrac{1}{n_1}+\dfrac{1}{n_2}}}$$

Decision Criteria: ν = n1 + n2 – 2 = 12 + 15 – 2 = 25, t0.025(25) = 2.060
Reject H0 if t > 2.060 or t < -2.060

Calculate:

$$s_p = \sqrt{\frac{(12-1)\times 700+(15-1)\times 800}{12+15-2}} = 27.495$$

$$t = \frac{(130-120)-0}{27.495\sqrt{\dfrac{1}{12}+\dfrac{1}{15}}} = 0.939$$

Conclusion:
Because -2.060 < t = 0.939 < 2.060, H0 is not rejected. We fail to reject the null hypothesis: there is insufficient evidence that the two branches differ (on average) in their level of transactions.
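The pooled test can be carried out from the summary statistics alone. The following sketch uses standard-library Python with the critical value 2.060 read from the t table for 25 degrees of freedom; the function name `pooled_t` is illustrative.

```python
import math


def pooled_t(x1_bar, var1, n1, x2_bar, var2, n2, d0=0.0):
    """Pooled two-sample t statistic and degrees of freedom from summary statistics."""
    nu = n1 + n2 - 2
    sp = math.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / nu)
    t = ((x1_bar - x2_bar) - d0) / (sp * math.sqrt(1 / n1 + 1 / n2))
    return t, nu, sp


t_shops, nu_shops, sp_shops = pooled_t(130, 700, 12, 120, 800, 15)

t_crit = 2.060  # t_{0.025}(25), taken from the t table (assumption: not computed here)
reject_shops = abs(t_shops) > t_crit  # two-sided test

print(f"s_p = {sp_shops:.3f}, nu = {nu_shops}, t = {t_shops:.3f}")
print("Reject H0:", reject_shops)
```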

Hypothesis Testing for Two Means, Variances Unknown and not Assumed Equal

Suppose independent random samples of sizes n1 and n2 are to be taken from two normally distributed populations with means µ1 and µ2, respectively, and the variances are unknown and not assumed equal (σ1² ≠ σ2²). Then the random variable

$$T = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}}$$

has approximately the t distribution with degrees of freedom ν, where

$$\nu = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{\left(s_1^2 / n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2 / n_2\right)^2}{n_2 - 1}}$$



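For σ1² ≠ σ2², the test statistic is

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

evaluated at the hypothesized difference µ1 − µ2 = d0.

| Null Hypothesis | Alternative Hypothesis | Decision Criteria |
|---|---|---|
| H0: µ1 − µ2 = d0 | H1: µ1 − µ2 ≠ d0 | Reject H0 if t > tα/2;ν or t < −tα/2;ν |
| H0: µ1 − µ2 = d0 or H0: µ1 − µ2 ≥ d0 | H1: µ1 − µ2 < d0 | Reject H0 if t < −tα;ν |
| H0: µ1 − µ2 = d0 or H0: µ1 − µ2 ≤ d0 | H1: µ1 − µ2 > d0 | Reject H0 if t > tα;ν |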

Example 4

A chain of record shops believes that its Northumberland Street store (Shop 1) is more successful than its Metro Centre branch (Shop 2). The management takes a random sample of daily takings and obtains the following summary statistics:

Shop 1: x̄1 = $15000, s1² = 400, n1 = 10

Shop 2: x̄2 = $14250, s2² = 600, n2 = 20

Is the management’s belief correct? Use α = 0.05 and assume that σ1² ≠ σ2².

Solution:

Let 1: shop 1

2: shop 2

Hypothesis:

H0: µ1 − µ2 = 0

H1: µ1 − µ2 > 0

Significance Level: α = 0.05

Test Statistic:

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

Decision Criteria:

$$\nu = \frac{\left(\dfrac{400}{10} + \dfrac{600}{20}\right)^2}{\dfrac{(400/10)^2}{10 - 1} + \dfrac{(600/20)^2}{20 - 1}} = 21.764 \approx 22$$

t0.05(22) = 1.717

Reject H0 if t > 1.717

Calculate:

$$t = \frac{(15000 - 14250) - 0}{\sqrt{\dfrac{400}{10} + \dfrac{600}{20}}} = 89.642$$

Conclusion:

Because t = 89.642 > 1.717, H0 is rejected.

We can conclude that the management’s belief is correct: the Northumberland Street store (Shop 1) is more successful than the Metro Centre branch (Shop 2).
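The degrees-of-freedom formula for the unequal-variance case is easy to mistype by hand, so a small script helps to confirm Example 4. This is a sketch under stated assumptions: `welch_t_test` is a hypothetical helper name, and the critical value 1.717 is taken from the t table quoted in the example.

```python
import math


def welch_t_test(mean1, var1, n1, mean2, var2, n2, d0=0.0):
    """Unequal-variance two-sample t statistic and approximate degrees of freedom.

    Returns (t, nu). Name and signature are illustrative only.
    """
    a = var1 / n1  # s1^2 / n1
    b = var2 / n2  # s2^2 / n2
    t = ((mean1 - mean2) - d0) / math.sqrt(a + b)
    nu = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t, nu


# Example 4: daily takings of the two record shops
t, nu = welch_t_test(15000, 400, 10, 14250, 600, 20)
t_crit = 1.717  # t_{0.05}(22), read from a t table (assumed, not computed here)
reject_h0 = t > t_crit  # right-tailed decision rule
print(f"nu = {nu:.3f} (rounded to {round(nu)}), t = {t:.3f}, reject H0: {reject_h0}")
```

The script should give ν ≈ 21.764 and t ≈ 89.642, in agreement with the hand calculation.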

Hypothesis Testing for Two Population Means Using Paired Samples

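Suppose a random sample of n pairs is to be taken from two populations with means µ1 and µ2. Further suppose the population of all paired differences is normally distributed. Then the random variable

$$T = \frac{\bar{D} - d_0}{S_d / \sqrt{n}}$$

where D̄ and S_d are the mean and standard deviation of the sample differences and d0 is the hypothesized mean difference, has the t distribution with degrees of freedom ν = n − 1.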


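The test statistic is

$$t = \frac{\bar{d} - d_0}{s_d / \sqrt{n}}$$

| Null Hypothesis | Alternative Hypothesis | Decision Criteria |
|---|---|---|
| H0: µD = d0 | H1: µD ≠ d0 | Reject H0 if t > tα/2;ν or t < −tα/2;ν |
| H0: µD = d0 or H0: µD ≥ d0 | H1: µD < d0 | Reject H0 if t < −tα;ν |
| H0: µD = d0 or H0: µD ≤ d0 | H1: µD > d0 | Reject H0 if t > tα;ν |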

Example 5

A student of mine wanted to test whether husbands tend to be older than their wives on average. He went to the county courthouse and took a sample of 24 couples who had applied for marriage licenses, recording the ages of the man and woman in each case. The summary statistics, with differences calculated as husband’s age minus wife’s age, are:

| | Sample size | Sample mean | Sample standard deviation |
|---|---|---|---|
| Differences | 24 | 1.875 | 4.812 |

Can we conclude that these sample data provide evidence that the population mean age of husbands exceeds that of wives? Use α = 0.01.

Solution:

Let 1: husband, 2: wife, and µD = µ1 − µ2.

Hypothesis:

H0: µD = 0

H1: µD > 0

Significance Level: α = 0.01

Test Statistic:

$$t = \frac{\bar{d} - d_0}{s_d / \sqrt{n}}$$

Decision Criteria: t0.01(23) = 2.500

Reject H0 if t > 2.500

Calculate:

$$t = \frac{1.875 - 0}{4.812 / \sqrt{24}} = 1.909$$

Conclusion:

Because t = 1.909 < 2.500, H0 is not rejected.

We conclude that, at the 0.01 level, these sample data do not provide sufficient evidence that the population mean age of husbands exceeds that of wives. (Failing to reject H0 does not prove that the two mean ages are equal.)
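The paired test reduces to a one-sample t test on the differences, which the following sketch makes explicit for Example 5. The helper name `paired_t_statistic` is hypothetical, and the critical value 2.500 is copied from the t table used in the example.

```python
import math


def paired_t_statistic(mean_diff, sd_diff, n, d0=0.0):
    """t statistic for paired samples, computed from the summary of the differences."""
    standard_error = sd_diff / math.sqrt(n)
    return (mean_diff - d0) / standard_error


# Example 5: differences are husband's age minus wife's age, n = 24 couples
t = paired_t_statistic(1.875, 4.812, 24)
df = 24 - 1
t_crit = 2.500  # t_{0.01}(23), read from a t table (assumed, not computed here)
reject_h0 = t > t_crit  # right-tailed decision rule
print(f"t = {t:.3f}, df = {df}, reject H0: {reject_h0}")
```

The statistic comes out at about 1.909, below the critical value, so H0 is not rejected.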


Exercises

1. A study is conducted to assess the differences in performance during the first years of service between employees who stayed with a certain company for 15 years and those who left the company. Performance is measured by the company's annual performance appraisal, which produces ratings on a 5-point scale: 1 for low performance and 5 for high performance. The data are summarized in the table below.

| | Stayers | Leavers |
|---|---|---|
| Sample size | n1 = 174 | n2 = 355 |
| Sample mean | x̄1 = 3.51 | x̄2 = 3.24 |
| Sample standard deviation | s1 = 0.51 | s2 = 0.52 |

Can we conclude that there is any difference in performance during the first years of service between employees who stayed with the company for 15 years and those who left the company? Use α = 0.05.

2. Many people who own digital cameras prefer to have pictures printed. In a preliminary

study to determine spending patterns, a random sample of 8 digital camera owners and

8 standard camera owners were surveyed and asked how many pictures they printed in

the past month. The results are presented here. Can we infer that the two groups differ

in number of pictures that are printed? Use α = 0.01.

Digital 15 12 23 31 20 14 12 19

Standard 0 24 36 24 0 48 0 0

3. A social worker was interested in determining whether there is a significant difference

in the average daily cost per child for childcare outside the home between state

supported facilities and privately owned facilities. Two independent random samples

yielded the following information:

| | State Supported Facilities | Privately Owned Facilities |
|---|---|---|
| Sample Size | 50 | 30 |
| Sample Mean | 25 | 22 |
| Sample Standard Deviation | 6 | 5 |

Perform the appropriate test of hypothesis to determine whether there is a significant

difference in the average daily cost per child for childcare between the two types of

facilities. Use α = 0.10.

4. A researcher wants to see if birds that build larger nests lay larger eggs. She selects

two random samples of nests: one of small nests and the other of large nests. She

weighs one egg from each nest. The data are summarized below.

| | Small nests | Large nests |
|---|---|---|
| Sample Size | 60 | 159 |
| Sample Mean (g) | 37.2 | 35.6 |
| Sample Variance | 24.7 | 39.0 |

Can we conclude that the average mass of eggs in small nests is at least the same as the average mass of eggs in large nests?




Topic 28 : Hypothesis Testing (Hypothesis Testing for Proportion)

Hypothesis Testing for One Population Proportion

Definition

A sample proportion, p̂, is computed using the formula

$$\hat{p} = \frac{x}{n}$$

where x denotes the number of members sampled that have the specified attribute and n denotes the sample size.

Suppose a large random sample of size n is to be taken from a two-category population with population proportion p. The random variable P̂ is approximately normally distributed and has mean $\mu_{\hat{p}} = p$ and standard deviation $\sigma_{\hat{p}} = \sqrt{p(1 - p)/n}$. So the standardized random variable

$$Z = \frac{\hat{P} - p}{\sqrt{p(1 - p)/n}}$$

has approximately the standard normal distribution. Consequently, to perform a large-sample hypothesis test with the null hypothesis H0: p = p0, we can use the random variable

$$Z = \frac{\hat{P} - p_0}{\sqrt{p_0(1 - p_0)/n}}$$


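The test statistic is

$$z = \frac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}}$$

| Null Hypothesis | Alternative Hypothesis | Decision Criteria |
|---|---|---|
| H0: p = p0 | H1: p ≠ p0 | Reject H0 if z > zα/2 or z < −zα/2 |
| H0: p = p0 or H0: p ≥ p0 | H1: p < p0 | Reject H0 if z < −zα |
| H0: p = p0 or H0: p ≤ p0 | H1: p > p0 | Reject H0 if z > zα |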

Example

In a recent year, 73% of 1st-year college students responding to a national survey identified

“being very well-off financially” as an important personal goal. In a random sample of 200 of

its 1st-year students, a state university finds that 132 students say that this goal is important.

Test the hypothesis that the proportion of 1st-year students at this university who think this

goal is important differs from the national value of 73% at the 0.10 significance level.

Solution:

X is the number of students who say that being very well-off financially is an important goal.

p0 = 0.73, n = 200, x = 132, p̂ = 132/200 = 0.66

Hypothesis:

H0 : p = 0.73

H1 : p ≠ 0.73

Significance Level: α = 0.10

Test Statistic:

$$z = \frac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}}$$

Decision Criteria: z0.05 = 1.645

Reject H0 if z > 1.645 or z < −1.645

Calculate:

$$z = \frac{0.66 - 0.73}{\sqrt{(0.73)(0.27)/200}} = -2.23$$

Conclusion:

Because z = −2.23 < −1.645, H0 is rejected.

There is significant evidence to conclude that the proportion of students at this university who think being very well-off financially is important differs from the national value of 73%.
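The one-proportion z test above can be reproduced with Python's standard library, which also supplies the normal critical value so no table lookup is needed. This is a minimal sketch: `one_proportion_z` is a hypothetical helper name, and the large-sample normal approximation is assumed to hold as in the example.

```python
import math
from statistics import NormalDist


def one_proportion_z(x, n, p0):
    """Return (p_hat, z) for the large-sample test of H0: p = p0."""
    p_hat = x / n
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
    return p_hat, z


# Example: 132 of 200 sampled students, national value p0 = 0.73
alpha = 0.10
p_hat, z = one_proportion_z(132, 200, 0.73)
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 point, about 1.645
reject_h0 = abs(z) > z_crit  # two-sided decision rule
print(f"p_hat = {p_hat:.2f}, z = {z:.2f}, z_crit = {z_crit:.3f}, reject H0: {reject_h0}")
```

The statistic is about −2.23, which falls in the rejection region, as in the worked solution.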

Exercises

1. A study of 828 travelers showed that 567 of them purchased plane tickets on an airline website in the past 12 months. The major airlines believe that more than 65% of all travelers purchase their tickets on airline websites. Do the data support this? Test at α = 0.10.

2. An insurance company states that 90% of its claims are settled within 30 days. A

consumer group selected a simple random sample of 75 of the company’s claims to

test this statement. The consumer group found that 55 of the claims were settled

within 30 days. At the 0.05 significance level, test the company’s claim that 90% of its

claims are settled within 30 days.





Topic 29 : Hypothesis Testing (Hypothesis Testing for Two Proportions)

Let x1 and x2 denote the numbers of successes, that is, the numbers of members sampled from each population that have the specified attribute. The sample proportions are given by

$$\hat{p}_1 = \frac{x_1}{n_1} \quad \text{and} \quad \hat{p}_2 = \frac{x_2}{n_2}$$

Suppose a random sample of size n1 is to be taken from a two-category population with population proportion p1, and a random sample of size n2 is to be taken from a two-category population with population proportion p2. Further suppose the two samples are to be selected independently. Then for large samples, the random variable P̂1 − P̂2 is approximately normally distributed and has mean $\mu_{\hat{P}_1 - \hat{P}_2} = p_1 - p_2$ and standard deviation $\sigma_{\hat{P}_1 - \hat{P}_2} = \sqrt{p_1(1 - p_1)/n_1 + p_2(1 - p_2)/n_2}$. Thus the standardized random variable

$$Z = \frac{(\hat{P}_1 - \hat{P}_2) - (p_1 - p_2)}{\sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}}}$$

has approximately the standard normal distribution.

Large Sample Hypothesis Test for Two Population Proportions Using Independent Samples

The null hypothesis for a hypothesis test to compare the proportions of two two-category populations will be

H0: p1 = p2 (population proportions are equal)

If the null hypothesis is true, then p1 − p2 = 0 and the two populations share a common proportion p = p1 = p2, so the standardized random variable becomes

$$Z = \frac{\hat{P}_1 - \hat{P}_2}{\sqrt{p(1 - p)\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

We cannot use this random variable as the test statistic since p is unknown. Consequently, we must estimate p using sample information. The estimate of p is

$$\hat{p} = \frac{x_1 + x_2}{n_1 + n_2}$$

So we get the standardized random variable which can be used as the test statistic,

$$Z = \frac{\hat{P}_1 - \hat{P}_2}{\sqrt{\hat{p}(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

Example 1

In a public opinion survey, 60 out of 100 high-income voters and 40 out of 75 low-income voters supported a decrease in sales tax. Is the population proportion of high-income voters favoring a decrease in the sales tax different from the population proportion of low-income voters favoring a decrease in the sales tax? Use α = 0.01.

Solution:

X1 is the number of high-income voters who supported a decrease in sales tax

X2 is the number of low-income voters who supported a decrease in sales tax

n1 = 100, x1 = 60, p̂1 = 60/100 = 0.60, (1 − p̂1) = q̂1 = 0.40

n2 = 75, x2 = 40, p̂2 = 40/75 = 0.53, (1 − p̂2) = q̂2 = 0.47

Hypothesis:

H0 : p1 = p2

H1 : p1 ≠ p2

Significance Level: α = 0.01

Test Statistic:

$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

Decision Criteria: z0.005 = 2.575

Reject H0 if z > 2.575 or z < −2.575

Calculate:

$$\hat{p} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{60 + 40}{100 + 75} = 0.57$$

$$z = \frac{0.60 - 0.53}{\sqrt{0.57 \times 0.43 \left(\dfrac{1}{100} + \dfrac{1}{75}\right)}} = 0.926$$

Conclusion:

Because −2.575 < z = 0.926 < 2.575, we fail to reject H0.

There is no significant evidence that the population proportion of high-income voters favoring a decrease in the sales tax differs from the population proportion of low-income voters favoring a decrease in the sales tax.
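The pooled two-proportion test of Example 1 can be sketched in a few lines. The helper name `two_proportion_z` is hypothetical. Note that the hand calculation rounds p̂2 and p̂ to two decimals; with unrounded proportions the statistic is about 0.88 rather than 0.926, and the decision is the same either way.

```python
import math
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Return (z, p_pool) for the large-sample test of H0: p1 = p2."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)  # pooled estimate of the common proportion
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / se, p_pool


# Example 1: 60 of 100 high-income voters, 40 of 75 low-income voters
alpha = 0.01
z, p_pool = two_proportion_z(60, 100, 40, 75)
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 point, about 2.576
reject_h0 = abs(z) > z_crit  # two-sided decision rule
print(f"p_pool = {p_pool:.4f}, z = {z:.3f}, reject H0: {reject_h0}")
```

Keeping full precision until the final step avoids the small discrepancy introduced by rounding intermediate proportions.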

If the null hypothesis is p1 − p2 = d0 with d0 ≠ 0, the two proportions are not pooled and we use the test statistic

$$z = \frac{(\hat{p}_1 - \hat{p}_2) - d_0}{\sqrt{\dfrac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}}$$


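For tests of H0: p1 = p2, the test statistic is

$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

| Null Hypothesis | Alternative Hypothesis | Decision Criteria |
|---|---|---|
| H0: p1 = p2 | H1: p1 ≠ p2 | Reject H0 if z > zα/2 or z < −zα/2 |
| H0: p1 = p2 or H0: p1 ≥ p2 | H1: p1 < p2 | Reject H0 if z < −zα |
| H0: p1 = p2 or H0: p1 ≤ p2 | H1: p1 > p2 | Reject H0 if z > zα |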

Example 2

In a survey of 200 female students at The University, 88 say that they would rather have an early morning class than an evening class. Of 200 male students at The University, 80 say that they would rather have an early morning class than an evening class. Is the proportion of females who prefer an early morning class higher than the proportion of males by more than one half (0.50)? Use α = 0.05.

Solution:

X1 is the number of female students who would rather have an early morning class than an evening class

X2 is the number of male students who would rather have an early morning class than an evening class

n1 = 200, x1 = 88, p̂1 = 88/200 = 0.44, (1 − p̂1) = q̂1 = 0.56

n2 = 200, x2 = 80, p̂2 = 80/200 = 0.40, (1 − p̂2) = q̂2 = 0.60

Hypothesis:

H0 : p1 − p2 = 0.50

H1 : p1 − p2 > 0.50

Significance Level: α = 0.05

Test Statistic:

$$z = \frac{(\hat{p}_1 - \hat{p}_2) - d_0}{\sqrt{\dfrac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}}$$

Decision Criteria: z0.05 = 1.645

Reject H0 if z > 1.645

Calculate:

$$z = \frac{(0.44 - 0.40) - 0.50}{\sqrt{\dfrac{(0.44)(0.56)}{200} + \dfrac{(0.40)(0.60)}{200}}} = -9.328$$

Conclusion:

Because z = −9.328 < 1.645, we fail to reject H0.

There is no significant evidence to conclude that the proportion of females who prefer an early morning class exceeds the proportion of males by more than one half.
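When the hypothesized difference d0 is not zero, the sample proportions are not pooled. The sketch below applies that version of the statistic to Example 2; `two_proportion_z_d0` is a hypothetical helper name, and the normal critical value is computed with the standard library rather than read from a table.

```python
import math
from statistics import NormalDist


def two_proportion_z_d0(x1, n1, x2, n2, d0):
    """z statistic for H0: p1 - p2 = d0 (d0 != 0), using the unpooled standard error."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    se = math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    return ((p1_hat - p2_hat) - d0) / se


# Example 2: 88 of 200 female students, 80 of 200 male students, d0 = 0.50
alpha = 0.05
z = two_proportion_z_d0(88, 200, 80, 200, 0.50)
z_crit = NormalDist().inv_cdf(1 - alpha)  # upper alpha point, about 1.645
reject_h0 = z > z_crit  # right-tailed decision rule
print(f"z = {z:.3f}, z_crit = {z_crit:.3f}, reject H0: {reject_h0}")
```

The statistic is about −9.328, far below the critical value, so H0 is not rejected.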

Exercises

In a sample of 500 users of toothpaste A, 100 said that they would never switch to another

toothpaste. In another sample of 400 users of toothpaste B, 68 said that they would never

switch. At the 1% significance level, can you conclude that the proportion of users of

toothpaste A who would never switch to another toothpaste is higher than the proportion of

users of toothpaste B who would never switch?

For tests of H0: p1 − p2 = d0 with d0 ≠ 0, the test statistic is

$$z = \frac{(\hat{p}_1 - \hat{p}_2) - d_0}{\sqrt{\dfrac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}}$$

| Null Hypothesis | Alternative Hypothesis | Decision Criteria |
|---|---|---|
| H0: p1 − p2 = d0 | H1: p1 − p2 ≠ d0 | Reject H0 if z > zα/2 or z < −zα/2 |
| H0: p1 − p2 = d0 or H0: p1 − p2 ≥ d0 | H1: p1 − p2 < d0 | Reject H0 if z < −zα |
| H0: p1 − p2 = d0 or H0: p1 − p2 ≤ d0 | H1: p1 − p2 > d0 | Reject H0 if z > zα |




Topic 30 : Hypothesis Testing (Hypothesis Testing for One and Two Variances)

Hypothesis Testing For Variance

Suppose a random sample of size n is to be taken from a normally distributed population with variance σ². Then the random variable

$$X^2 = \frac{(n - 1) S^2}{\sigma^2}$$

has the chi-square distribution with n − 1 degrees of freedom.


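The test statistic is

$$\chi^2 = \frac{(n - 1) s^2}{\sigma_0^2}$$

| Null Hypothesis | Alternative Hypothesis | Decision Criteria |
|---|---|---|
| H0: σ² = σ0² | H1: σ² ≠ σ0² | Reject H0 if χ² < χ²(1−α/2) or χ² > χ²(α/2) |
| H0: σ² = σ0² or H0: σ² ≥ σ0² | H1: σ² < σ0² | Reject H0 if χ² < χ²(1−α) |
| H0: σ² = σ0² or H0: σ² ≤ σ0² | H1: σ² > σ0² | Reject H0 if χ² > χ²(α) |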

Example 1

Tests in Mr. X's past statistics classes have scores with a standard deviation equal to 14.1. One of his current classes now has 27 test scores with a standard deviation of 9.3. Use a 0.01 level of significance to test the claim that this current class has less variation than past classes.

Solution:

Hypothesis:

H0 : σ² ≥ 14.1²

H1 : σ² < 14.1²

Significance Level: α = 0.01

Test Statistic:

$$\chi^2 = \frac{(n - 1) s^2}{\sigma_0^2}$$

Decision Criteria: χ²0.99(26) = 12.198

Reject H0 if χ² < 12.198

Calculate:

$$\chi^2 = \frac{(27 - 1)(9.3)^2}{(14.1)^2} = 11.311$$

Conclusion:

Because χ² = 11.311 < 12.198, H0 is rejected.

We can conclude that the sample data support the claim that the standard deviation of the current class is less than 14.1, that is, the current class has less variation than past classes.
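The chi-square statistic in Example 1 is a one-line computation, shown here as a small sketch. The helper name `chi_square_variance_statistic` is hypothetical, and the critical value 12.198 is copied from the chi-square table used in the example because the standard library has no chi-square quantile function.

```python
def chi_square_variance_statistic(n, s, sigma0):
    """Chi-square statistic (n - 1) s^2 / sigma0^2 for a test about one variance."""
    return (n - 1) * s ** 2 / sigma0 ** 2


# Example 1: n = 27 scores, sample standard deviation 9.3, hypothesized value 14.1
chi2 = chi_square_variance_statistic(27, 9.3, 14.1)
df = 27 - 1
chi2_crit = 12.198  # lower 0.01 point of chi-square(26), from the table (assumed)
reject_h0 = chi2 < chi2_crit  # left-tailed decision rule
print(f"chi2 = {chi2:.3f}, df = {df}, reject H0: {reject_h0}")
```

The statistic is about 11.311, which is below the critical value, so H0 is rejected.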


Hypothesis Testing For Two Variances

An F-test (Snedecor and Cochran, 1983) is used to test whether the standard deviations of two populations are equal. This test can be a two-tailed test or a one-tailed test. The two-tailed version tests against the alternative that the standard deviations are not equal. The one-tailed version tests in one direction only, that is, whether the standard deviation of the first population is either greater than or less than (but not both) the standard deviation of the second population. The random variable

$$F = \frac{\sigma_2^2 S_1^2}{\sigma_1^2 S_2^2}$$

has the F distribution with ν1 = n1 − 1 and ν2 = n2 − 1 degrees of freedom.


The test statistic is

$$f = \frac{s_1^2}{s_2^2}$$

| Null Hypothesis | Alternative Hypothesis | Decision Criteria |
|---|---|---|
| H0: σ1² = σ2² | H1: σ1² ≠ σ2² | Reject H0 if f < f1−α/2(ν1, ν2) or f > fα/2(ν1, ν2) |
| H0: σ1² = σ2² or H0: σ1² ≥ σ2² | H1: σ1² < σ2² | Reject H0 if f < f1−α(ν1, ν2) |
| H0: σ1² = σ2² or H0: σ1² ≤ σ2² | H1: σ1² > σ2² | Reject H0 if f > fα(ν1, ν2) |

Example 2

A medical researcher wishes to see whether the variance of the heart rates (in beats per minute) of smokers is different from the variance of the heart rates of people who do not smoke. Two samples are selected, and the data are shown below.

| Smokers | Non-smokers |
|---|---|
| n1 = 26 | n2 = 18 |
| s1 = 6 | s2 = 3.16 |

Using α = 0.10, is there enough evidence to support the claim?

Solution:

Hypothesis:

H0 : σ1² = σ2²

H1 : σ1² ≠ σ2²

Significance Level: α = 0.10

Test Statistic:

$$f = \frac{s_1^2}{s_2^2}$$

Decision Criteria: ν1 = 25, ν2 = 17, f0.05(25,17) = 2.18, f0.95(25,17) = 1/f0.05(17,25) = 1/2.06 = 0.49

Reject H0 if f > 2.18 or f < 0.49

Calculate:

$$f = \frac{6^2}{3.16^2} = 3.605$$

Conclusion:

Because f = 3.605 > 2.18, H0 is rejected.

There is enough evidence to support the claim that the variance of the heart rates (in beats per minute) of smokers is different from the variance of the heart rates of people who do not smoke.
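The F statistic and the two-sided decision rule of Example 2 can be sketched as follows. The helper name `f_statistic` is hypothetical, and both critical values are taken from the F table values quoted in the example (2.18 and 1/2.06) rather than computed.

```python
def f_statistic(s1, s2):
    """Ratio of sample variances s1^2 / s2^2 from two sample standard deviations."""
    return s1 ** 2 / s2 ** 2


# Example 2: smokers (n1 = 26, s1 = 6) and non-smokers (n2 = 18, s2 = 3.16)
f = f_statistic(6, 3.16)
nu1, nu2 = 26 - 1, 18 - 1
f_upper = 2.18      # f_{0.05}(25, 17), from the F table (assumed)
f_lower = 1 / 2.06  # f_{0.95}(25, 17) = 1 / f_{0.05}(17, 25), about 0.49
reject_h0 = f > f_upper or f < f_lower  # two-sided decision rule
print(f"f = {f:.3f}, df = ({nu1}, {nu2}), reject H0: {reject_h0}")
```

The ratio is about 3.605, above the upper critical value, so H0 is rejected.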

Exercises

1. For randomly selected adults, IQ scores are normally distributed with a mean of 100 and a standard deviation of 15. A sample of 24 randomly selected college professors resulted in IQ scores having a standard deviation of 10. Test the claim that the standard deviation of IQ scores for college professors is the same as that of the general population, that is, 15. Use a 0.05 level of significance.

2. Two college instructors are interested in whether or not there is any variation in the

way they grade math exams. They each grade the same set of 30 exams. The first

instructor's grades have a variance of 52.3. The second instructor's grades have a

variance of 89.9. Test the claim that the first instructor's variance is smaller. (In most

colleges, it is desirable for the variances of exam grades to be nearly the same among

instructors). The level of significance is 1%.
