QMS 6351 Statistics and Research Methods Probability and Probability distributions Chapter 4, page 161 Chapter 5 (5.1) Chapter 6 (6.2) Prof. Vera Adamchik



Probability

Probability is a numerical measure of the likelihood that a specific event will occur.

Properties of Probability:

• The probability of an event always lies in the range 0 to 1, inclusive: 0 for an impossible event and 1 for a sure event.

• The sum of the probabilities of all the outcomes of an experiment is always 1.


Assigning Probabilities to Experimental Outcomes

• Classical Method: assigning probabilities based on the assumption of equally likely outcomes.

• Relative Frequency Method: assigning probabilities based on experimentation or historical data.

• Subjective Method: assigning probabilities based on the assignor's judgment.

Classical Method

If an experiment has n possible outcomes, this method would assign a probability of 1/n to each outcome.

• Example

Experiment: Rolling a die

Sample Space: S = {1, 2, 3, 4, 5, 6}

Probabilities: Each sample point has a 1/6 chance of occurring.
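The die example can be sketched in a few lines of Python. This is only an illustration: the function name `classical_probabilities` is made up here, and exact fractions are used so that 1/6 is not rounded.

```python
from fractions import Fraction

def classical_probabilities(sample_space):
    """Assign probability 1/n to each of the n equally likely outcomes."""
    n = len(sample_space)
    return {outcome: Fraction(1, n) for outcome in sample_space}

# Sample space for rolling a die, as in the example above.
die = classical_probabilities([1, 2, 3, 4, 5, 6])
print(die[3])             # 1/6
print(sum(die.values()))  # 1
```

The second printed value confirms that the probabilities of all sample points add up to 1.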


Relative Frequency as an Approximation of Probability

• If an experiment is repeated n times and an event A is observed f times, then, according to the relative frequency concept of probability,

$$P(A) \approx \frac{f}{n}$$


Law of Large Numbers

Relative frequencies are not probabilities but approximate probabilities. However, if the experiment is repeated a very large number of times, the approximate probability of an outcome obtained from the relative frequency will approach the actual probability of that outcome. This is called the Law of Large Numbers.
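A small simulation can illustrate the idea. The sketch below assumes a fair die and the event "a 6 is rolled" (both chosen here for illustration), and uses a fixed, arbitrary seed so that a run can be repeated. The relative frequency f/n should drift toward 1/6 as n grows, although any single run will wobble.

```python
import random

def relative_frequency(event, n, rng):
    """Roll a fair die n times and return f/n, where f counts rolls in `event`."""
    f = sum(1 for _ in range(n) if rng.randint(1, 6) in event)
    return f / n

rng = random.Random(6351)  # arbitrary fixed seed (an assumption of this sketch)
for n in (10, 100, 10_000, 100_000):
    print(n, relative_frequency({6}, n, rng))

print("classical value:", 1 / 6)
```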


Subjective Method

• Subjective probability is the probability assigned to an event based on subjective judgment, experience, information, and belief.

• The best probability estimates often are obtained by combining the estimates from the classical or relative frequency approach with the subjective estimates.


Objective and Subjective Probability

When probabilities are assessed in ways that are consistent with the classical or relative frequency determination of probability, we call them objective probabilities. Objective and subjective probabilities are fundamentally different.

• If a number of people assign the probability of an event objectively, each individual will arrive at the same answer, provided they did the calculations properly.

• If a number of people assign the probability of an event subjectively, each individual will arrive at his or her own answer.

• As a consequence, not all probability theory and methods that can be applied to objective probabilities can be applied to subjective ones.


Random variables

• A random variable is a numerical description of the outcome of an experiment.


Discrete random variable

• A random variable is discrete if the set of outcomes is either finite in number (e.g., heads or tails of a coin; the face values 1, 2, 3, 4, 5, 6 of a die) or countably infinite (e.g., the number of children in a family: 0, 1, 2, 3, …).


Continuous random variable

• A random variable is continuous if the set of outcomes is infinitely divisible and, hence, not countable. A continuous random variable may assume any numerical value in an interval. For example, temperature may assume any value between 42°F and 56°F, or between 42°F and 45°F, or between 42°F and 43°F, or between 42°F and 42.5°F, and so on.

Question                      Random Variable x                                   Type
Family size                   x = Number of dependents reported on tax return     Discrete
Distance from home to store   x = Distance in miles from home to the store site   Continuous
Own dog or cat                x = 1 if own no pet;                                Discrete
                                = 2 if own dog(s) only;
                                = 3 if own cat(s) only;
                                = 4 if own dog(s) and cat(s)

Discrete probability function

• The probability function (denoted by f(x)) for a discrete random variable lists all the possible values that the random variable can assume and their corresponding probabilities. For example, f(head) = 0.5, f(tail) = 0.5.

• We can describe a discrete probability distribution with a table, graph, or equation.
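As an illustration of the table form, a discrete probability function can be stored as a mapping from each value to its probability. The helper name `is_valid_probability_function` is hypothetical; it simply checks the two properties of probability stated earlier (each value between 0 and 1, and a total of 1).

```python
import math

def is_valid_probability_function(f):
    """Check that every f(x) lies in [0, 1] and that the values sum to 1."""
    in_range = all(0 <= p <= 1 for p in f.values())
    return in_range and math.isclose(sum(f.values()), 1.0)

coin = {"head": 0.5, "tail": 0.5}  # the coin example from the slide
print(is_valid_probability_function(coin))                        # True
print(is_valid_probability_function({"head": 0.5, "tail": 0.6}))  # False
```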


Continuous probability function

• With continuous random variables, the counterpart of the probability function is the probability density function, also denoted by f(x).

• However, important differences exist between probability distributions for discrete and continuous variables:

Difference (1)

• In the continuous case, f(x) is the counterpart of the probability function f(x), but it is called the probability density function (p.d.f.);

Difference (2)

• The p.d.f. provides the value of the function at any particular value of x; it does not directly provide the probability of the random variable assuming some specific value;

Difference (3)

• Probability is represented by the area under the graph. Because the area under the curve (line) above any single point is 0, P(x = value) = 0.

In the continuous case,

P(a ≤ x ≤ b) = P(a < x ≤ b) = P(a ≤ x < b) = P(a < x < b).

The probability of the random variable assuming a value within some given interval from x1 to x2 is defined to be the area under the graph of the probability density function between x1 and x2.

[Figure: normal probability density f(x) plotted against x, with the interval from x1 to x2 marked]

Continuous Probability Distributions

• It is not possible to talk about the probability of the random variable assuming a particular value.

• Instead, we talk about the probability of the random variable assuming a value within a given interval.

• The probability of the random variable assuming a value within some given interval from x1 to x2 is defined to be the area under the graph of the probability density function between x1 and x2.
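The "area under the graph" idea can be checked numerically. The sketch below is an illustration under stated assumptions: it uses a standard normal density from Python's `statistics.NormalDist` as the example p.d.f., and a simple midpoint rule (the helper `area_under_pdf` is made up here) to approximate the area between x1 and x2.

```python
from statistics import NormalDist

def area_under_pdf(dist, x1, x2, steps=10_000):
    """Approximate the area under dist.pdf between x1 and x2 (midpoint rule)."""
    width = (x2 - x1) / steps
    return sum(dist.pdf(x1 + (i + 0.5) * width) for i in range(steps)) * width

dist = NormalDist(mu=0, sigma=1)  # illustrative choice of distribution
print(area_under_pdf(dist, -1, 1))   # approximate area between -1 and 1
print(dist.cdf(1) - dist.cdf(-1))    # same probability from the cumulative function
print(area_under_pdf(dist, 1, 1))    # 0.0: no area above a single point
```

The first two values should agree closely, and the last line shows why the probability of any single value is 0 for a continuous random variable.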

[Figure: three graphs of f(x) plotted against x, one each for the Uniform, Normal, and Exponential distributions]

The Normal Probability Distribution

• The normal probability distribution is the most important distribution for describing a continuous random variable.

• It is widely used in statistical inference.

• It has been used in a wide variety of applications, including:

– Heights of people

– Test scores

– Rainfall amounts

The Normal Probability Distribution

• The Normal Curve

– The shape of the normal curve is often illustrated as a bell-shaped curve.

– The highest point on the normal curve is at the mean, which is also the median and mode of the distribution.

– The normal curve is symmetric about the mean.

– The tails of a normal distribution curve extend indefinitely in both directions without touching or crossing the horizontal axis.

The Normal Probability Distribution

• Graph of the Normal Probability Density Function

[Figure: bell-shaped curve of f(x) plotted against x]

The Normal Probability Distribution

• Each distribution in the family of normal probability distributions is determined by its mean μ and its standard deviation σ.

The Normal Probability Distribution

• Normal Probability Density Function

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2/(2\sigma^2)}$$

where

μ = mean

σ = standard deviation

π = 3.14159

e = 2.71828
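The density function translates directly into code. The sketch below writes it out term by term as f(x) = 1/(σ√(2π)) · exp(−(x − μ)²/(2σ²)) and compares the result with the standard-library `statistics.NormalDist.pdf`. The parameter values (mean 15, standard deviation 6, evaluated at x = 20) are chosen only for illustration.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Normal density: 1/(sigma*sqrt(2*pi)) * exp(-(x - mu)^2 / (2*sigma^2))."""
    coefficient = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coefficient * math.exp(exponent)

# Illustrative parameters: mean 15, standard deviation 6, evaluated at x = 20.
print(normal_pdf(20, 15, 6))
print(NormalDist(mu=15, sigma=6).pdf(20))  # library value for comparison
```

Note that the value returned is a height of the curve, not a probability; probabilities come from areas under the curve.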

[Figure: normal curves plotted against x, with -10, 0, and 25 marked on the axis]

The mean determines the position of the curve relative to other normal curves. The mean can be any numerical value: negative, zero, or positive.

[Figure: normal curves with σ = 15 and σ = 25, plotted against x]

The standard deviation determines the width of the curve: larger values result in wider, flatter curves.
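A quick numerical check makes the "wider, flatter" statement concrete. This sketch assumes a mean of 0 and a window of 10 units on either side of the mean, both arbitrary choices; the helper `peak_and_central_area` is hypothetical. A larger σ should give a lower peak and less area close to the mean.

```python
from statistics import NormalDist

def peak_and_central_area(sigma, mu=0.0, half_width=10.0):
    """Return the curve's height at the mean and the area within half_width of it."""
    dist = NormalDist(mu=mu, sigma=sigma)
    peak = dist.pdf(mu)
    central_area = dist.cdf(mu + half_width) - dist.cdf(mu - half_width)
    return peak, central_area

for sigma in (15, 25):
    peak, central_area = peak_and_central_area(sigma)
    print(sigma, round(peak, 4), round(central_area, 4))
```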

[Figure: normal curve plotted against x, with an area of .5 on each side of the mean]

Probabilities for the normal random variable are given by areas under the curve. The total area under the curve is 1 (0.5 to the left of the mean and 0.5 to the right).


Standard Normal Probability Distribution

• A random variable that has a normal distribution with a mean of zero and a standard deviation of one is said to have a standard normal probability distribution.

• The letter z is commonly used to designate this normal random variable.


Converting to the Standard Normal Distribution

• For a normal variable x, a particular value can be converted to a z value:

$$z = \frac{x - \mu}{\sigma}$$

• We can think of z as a measure of the number of standard deviations x is from μ.
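The conversion z = (x − μ)/σ is a one-line function. In the sketch below the mean of 15 and standard deviation of 6 are illustrative values, and `z_score` is a name chosen here.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    return (x - mu) / sigma

print(z_score(21, 15, 6))   # 1.0: one standard deviation above the mean
print(z_score(9, 15, 6))    # -1.0: one standard deviation below the mean
print(z_score(15, 15, 6))   # 0.0: exactly at the mean
```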

Standard Normal Probability Distribution

[Figure: standard normal curve plotted against z, centered at 0]

Example 1: P/E Ratio

The price-earnings (P/E) ratio for a company is an indication of whether the stock of that company is undervalued (P/E is low) or overvalued (P/E is high). Suppose the P/E ratios of all companies have a normal distribution with a mean of 15 and a standard deviation of 6.

If a P/E ratio of more than 20 is considered to be a relatively high ratio, what percentage of all companies have high P/E ratios?

P(x > 20) = ?

Example 1: Solution steps

• Step 1: Convert x to the standard normal distribution.

z = (x - μ)/σ

= (20 - 15)/6

= 0.83

• Step 2: Find the area under the standard normal curve to the left of z = 0.83.

Cumulative probability table for the standard normal distribution

 z     .00    .01    .02    .03    .04    .05    .06    .07    .08    .09
 .      .      .      .      .      .      .      .      .      .      .
 .5   .6915  .6950  .6985  .7019  .7054  .7088  .7123  .7157  .7190  .7224
 .6   .7257  .7291  .7324  .7357  .7389  .7422  .7454  .7486  .7517  .7549
 .7   .7580  .7611  .7642  .7673  .7704  .7734  .7764  .7794  .7823  .7852
 .8   .7881  .7910  .7939  .7967  .7995  .8023  .8051  .8078  .8106  .8133
 .9   .8159  .8186  .8212  .8238  .8264  .8289  .8315  .8340  .8365  .8389
 .      .      .      .      .      .      .      .      .      .      .

P(z < .83) = .7967 (row .8, column .03)

• Step 3: Compute the area under the standard normal curve to the right of z = 0.83.

P(z > .83) = 1 - P(z < .83) = 1 - .7967 = .2033

P(x > 20) = .2033, so about 20.33% of all companies have high P/E ratios.
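The three solution steps can be cross-checked in code. This sketch assumes Python's standard-library `statistics.NormalDist`, which the slides do not use; it computes the answer once with z rounded to two decimals, as a table lookup requires, and once without rounding.

```python
from statistics import NormalDist

mu, sigma = 15, 6                    # P/E ratios: mean and standard deviation

# Step 1: convert x = 20 to z, rounded to two decimals as for a table lookup.
z = round((20 - mu) / sigma, 2)

# Steps 2 and 3: area to the left of z, then its complement.
table_answer = 1 - NormalDist().cdf(z)

# Same probability without rounding z first.
direct_answer = 1 - NormalDist(mu, sigma).cdf(20)

print(z)                        # 0.83
print(round(table_answer, 4))   # 0.2033
print(round(direct_answer, 4))
```

The unrounded calculation gives a slightly smaller value (roughly 0.202) because the exact z is 0.8333 rather than 0.83; the difference comes only from rounding in the table method.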

[Figure: standard normal curve plotted against z; the area to the left of z = .83 is .7967 and the area to the right is 1 - .7967 = .2033]

Example 2: GMAT score

Most business schools require that every applicant for admission to a degree program take the GMAT. Suppose the GMAT scores of all students have a normal distribution with a mean of 550 and a standard deviation of 90. What should your score be so that only 5% of all the examinees score higher than you do?

x0.05 = ?

[Figure: standard normal curve plotted against z; the area to the left of z.05 is .9500 and the area in the right tail is .0500]

Example 2: Solution steps

• Step 1: Find the z-value that cuts off an area of .05 in the right tail of the standard normal distribution.

 z     .00    .01    .02    .03    .04    .05    .06    .07    .08    .09
 .      .      .      .      .      .      .      .      .      .      .
1.5   .9332  .9345  .9357  .9370  .9382  .9394  .9406  .9418  .9429  .9441
1.6   .9452  .9463  .9474  .9484  .9495  .9505  .9515  .9525  .9535  .9545
1.7   .9554  .9564  .9573  .9582  .9591  .9599  .9608  .9616  .9625  .9633
1.8   .9641  .9649  .9656  .9664  .9671  .9678  .9686  .9693  .9699  .9706
1.9   .9713  .9719  .9726  .9732  .9738  .9744  .9750  .9756  .9761  .9767
 .      .      .      .      .      .      .      .      .      .      .

We look up the complement of the tail area (1 - .05 = .95). The area .9500 lies midway between .9495 (z = 1.64) and .9505 (z = 1.65), so z.05 = 1.645.

• Step 2: Convert z.05 to the corresponding value of x:

x = μ + z.05 σ

= 550 + 1.645*90

= 698.05

Your score should be 698 so that only 5% of all the examinees score higher than you do.
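The same result can be reproduced without a table. This sketch assumes Python's standard-library `statistics.NormalDist` and uses the mean of 550 and standard deviation of 90 from the solution step; `inv_cdf(0.95)` returns the z value with an area of .95 to its left.

```python
from statistics import NormalDist

mu, sigma = 550, 90                  # values used in the solution step
z_05 = NormalDist().inv_cdf(0.95)    # z with area .95 to its left, .05 to its right
x_05 = mu + z_05 * sigma             # convert back to the GMAT scale

print(round(z_05, 3))   # 1.645
print(round(x_05))      # 698
```

The unrounded z value differs from 1.645 only in the fourth decimal place, so the score still rounds to 698.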