QMS 6351 Statistics and Research Methods Probability and Probability distributions Chapter 4, page 161 Chapter 5 (5.1) Chapter 6 (6.2) Prof. Vera Adamchik


Probability

Probability is a numerical measure of the likelihood that a specific event will occur.

Properties of Probability:

• The probability of an event always lies in the range zero to 1, including zero (an impossible event) and 1 (a sure event);

• The sum of all probabilities for an experiment is always 1.

Assigning Probabilities to Experimental Outcomes

• Classical Method: assigning probabilities based on the assumption of equally likely outcomes.

• Relative Frequency Method: assigning probabilities based on experimentation or historical data.

• Subjective Method: assigning probabilities based on the assignor’s judgment.

Classical Method

• If an experiment has n possible outcomes, this method would assign a probability of 1/n to each outcome.

• Example

Experiment: Rolling a die

Sample Space: S = {1, 2, 3, 4, 5, 6}

Probabilities: Each sample point has a 1/6 chance of occurring.

Relative Frequency as an Approximation of Probability

• If an experiment is repeated n times and an event A is observed f times, then, according to the relative frequency concept of probability,

$P(A) \approx \frac{f}{n}$

Law of Large Numbers

Relative frequencies are not probabilities but approximate probabilities. However, if the experiment is repeated a very large number of times, the approximate probability of an outcome obtained from the relative frequency will approach the actual probability of that outcome. This is called the Law of Large Numbers.
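The Law of Large Numbers can be illustrated with a small simulation. The sketch below is only an illustration under assumed settings: it simulates rolls of a fair die with Python's pseudo-random generator and reports the relative frequency f/n of one face for increasing n. The function name `relative_frequency`, the chosen face, and the seed are hypothetical choices, not part of the course material.

```python
import random


def relative_frequency(n_rolls, face=6, seed=0):
    """Estimate P(face) as f / n from n_rolls simulated rolls of a fair die."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    f = sum(1 for _ in range(n_rolls) if rng.randint(1, 6) == face)
    return f / n_rolls


# The classical method gives 1/6 for each face; the relative frequency
# should move closer to that value as the number of rolls grows.
for n in (10, 100, 10_000, 100_000):
    print(n, relative_frequency(n))
print("classical probability:", 1 / 6)
```

With only 10 rolls the estimate can be far from 1/6; with 100,000 rolls it should agree with 1/6 to about two decimal places.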

Subjective Method

• Subjective probability is the probability assigned to an event based on subjective judgement, experience, information, and belief.

• The best probability estimates often are obtained by combining the estimates from the classical or relative frequency approach with the subjective estimates.

Objective and Subjective Probability

When probabilities are assessed in ways that are consistent with the classical or relative frequency determination of probability, we call them objective probabilities. Objective and subjective probabilities are fundamentally different.

• If a number of people assign the probability of an event objectively, each individual will arrive at the same answer, provided they did the calculations properly.

• If a number of people assign the probability of an event subjectively, each individual will arrive at his or her own answer.

• As a consequence, not all probability theory and methods that can be applied to objective probabilities can be applied to subjective ones.

Random variables

• A random variable is a numerical description of the outcome of an experiment.

Discrete random variable

• A random variable is discrete if the set of outcomes is either finite in number (e.g., tail & head; 1,2,3,4,5,6 face value of a die) or countably infinite (e.g., the number of children in a family 0,1,2,3,…).

Continuous random variable

• A random variable is continuous if the set of outcomes is infinitely divisible and, hence, not countable. A continuous random variable may assume any numerical value in an interval. For example, temperature may assume any value between 42F and 56F, or between 42F and 45F, or between 42F and 43F, or between 42F and 42.5F, etc.

Examples of random variables (question; random variable x; type):

• Family size; x = number of dependents reported on tax return; discrete.

• Distance from home to store; x = distance in miles from home to the store site; continuous.

• Own dog or cat; x = 1 if own no pet, = 2 if own dog(s) only, = 3 if own cat(s) only, = 4 if own dog(s) and cat(s); discrete.

Discrete probability function

• The probability function (denoted by f(x)) for a discrete random variable lists all the possible values that the random variable can assume and their corresponding probabilities. For example, f(head) = 0.5, f(tail) = 0.5.

• We can describe a discrete probability distribution with a table, graph, or equation.
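A discrete probability function written as a table maps each value to its probability. The sketch below is one possible way to represent it, assuming the fair die from the classical method example; the helper name `is_valid_probability_function` is hypothetical. It checks the two properties of probability stated earlier: every probability lies between 0 and 1, and the probabilities sum to 1.

```python
from fractions import Fraction

# Probability function f(x) for one roll of a fair die (classical method: 1/n each).
f = {x: Fraction(1, 6) for x in range(1, 7)}


def is_valid_probability_function(f):
    """Return True if every f(x) lies in [0, 1] and all f(x) sum to 1."""
    return all(0 <= p <= 1 for p in f.values()) and sum(f.values()) == 1


print(f[3])                                # probability that the die shows 3
print(is_valid_probability_function(f))    # checks both properties
print(is_valid_probability_function({"head": 0.5, "tail": 0.5}))
```

Exact fractions are used for the die so that the sum-to-1 check is not affected by floating-point rounding.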

Continuous probability function

• With continuous random variables, the counterpart of the probability function is the probability density function, also denoted by f(x).

• However, important differences exist between probability distributions for discrete and continuous variables:

Difference (1)

• In the continuous case, f(x) is the counterpart of the discrete probability function, but it is called the probability density function (p.d.f.).

Difference (2)

• The p.d.f. provides the value of the function at any particular value of x; it does not directly provide the probability of the random variable assuming some specific value.

Difference (3)

• Probability is represented by the area under the graph. Because the area under the curve (line) above any single point is 0, P(x = value) = 0.

In the continuous case,

$P(a \le x \le b) = P(a < x \le b) = P(a \le x < b) = P(a < x < b)$.

The probability of the random variable assuming a value within some given interval from x1 to x2 is defined to be the area under the graph of the probability density function between x1 and x2.

[Figure: normal probability density function f(x) plotted against x, with x1 and x2 marked on the horizontal axis]
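The idea of probability as an area can be checked numerically. The sketch below is an illustration only: it assumes a standard normal density and the interval from x1 = -1 to x2 = 1 (both chosen for the example), approximates the area with the trapezoidal rule, and compares it with the difference of cumulative probabilities from Python's `statistics.NormalDist`. The helper name `area_under_pdf` is hypothetical.

```python
from statistics import NormalDist


def area_under_pdf(dist, x1, x2, steps=10_000):
    """Approximate the area under dist.pdf between x1 and x2 (trapezoidal rule)."""
    h = (x2 - x1) / steps
    total = 0.5 * (dist.pdf(x1) + dist.pdf(x2))
    total += sum(dist.pdf(x1 + i * h) for i in range(1, steps))
    return total * h


z = NormalDist(mu=0, sigma=1)      # assumed example distribution
print(area_under_pdf(z, -1, 1))    # area between x1 = -1 and x2 = 1
print(z.cdf(1) - z.cdf(-1))        # the same area from cumulative probabilities
print(area_under_pdf(z, 1, 1))     # zero-width interval: the area above a single point
```

The last line shows why P(x = value) = 0 for a continuous random variable: an interval of zero width has no area, whatever the height of the density at that point.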

Continuous Probability Distributions

• It is not possible to talk about the probability of the random variable assuming a particular value.

• Instead, we talk about the probability of the random variable assuming a value within a given interval.


[Figure: three probability density functions f(x) plotted against x: Uniform, Normal, Exponential]

The Normal Probability Distribution

• The normal probability distribution is the most important distribution for describing a continuous random variable.

• It is widely used in statistical inference.

• It has been used in a wide variety of applications including:

– Heights of people

– Test scores

– Rainfall amounts


• The Normal Curve

– The shape of the normal curve is often illustrated as a bell-shaped curve.

– The highest point on the normal curve is at the mean, which is also the median and mode of the distribution.

– The normal curve is symmetric about the mean.

– The tails of a normal distribution curve extend indefinitely in both directions without touching or crossing the horizontal axis.


[Figure: graph of the normal probability density function f(x) against x]


• Each member of the family of normal probability distributions is defined by its mean μ and its standard deviation σ.

The Normal Probability Distribution

• Normal Probability Density Function

$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-(x-\mu)^2 / (2\sigma^2)}$

where

μ = mean

σ = standard deviation

π = 3.14159

e = 2.71828
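The density formula can be evaluated directly. The sketch below is a minimal illustration: the function name `normal_pdf` and the values μ = 15 and σ = 6 are assumptions chosen for the example. It compares the result with `statistics.NormalDist.pdf` from the Python standard library as a cross-check.

```python
import math
from statistics import NormalDist


def normal_pdf(x, mu, sigma):
    """Normal density: 1 / (sigma * sqrt(2*pi)) * exp(-(x - mu)^2 / (2 * sigma^2))."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))


mu, sigma = 15, 6                        # illustrative values (an assumption)
print(normal_pdf(15, mu, sigma))         # height of the curve at the mean
print(NormalDist(mu, sigma).pdf(15))     # library value for comparison
print(normal_pdf(10, mu, sigma), normal_pdf(20, mu, sigma))  # equal distances from the mean
```

The last line evaluates the density at two points equally far from the mean, which reflects the symmetry of the normal curve. Note that these are heights of the curve, not probabilities.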

[Figure: normal curves along the x axis, with -10, 0, and 25 marked]

The mean determines the position on the curve with respect to other normal curves. The mean can be any numerical value: negative, zero, or positive.

[Figure: two normal curves plotted against x, one with σ = 15 and one with σ = 25]

The standard deviation determines the width of the curve: larger values result in wider, flatter curves.

[Figure: normal curve plotted against x, with area .5 on each side of the mean]

Probabilities for the normal random variable are given by areas under the curve. The total area under the curve is 1 (0.5 to the left of the mean and 0.5 to the right).

Standard Normal Probability Distribution

• A random variable that has a normal distribution with a mean of zero and a standard deviation of one is said to have a standard normal probability distribution.

• The letter z is commonly used to designate this normal random variable.

Converting to the Standard Normal Distribution

• For a normal variable x, a particular value can be converted to a z value:

$z = \frac{x - \mu}{\sigma}$

• We can think of z as a measure of the number of standard deviations x is from μ.

[Figure: standard normal curve centered at 0 on the z axis]

Standard Normal Probability Distribution

Example 1: P/E Ratio

The price-earnings (P/E) ratio for a company is an indication of whether the stock of that company is undervalued (P/E is low) or overvalued (P/E is high). Suppose the P/E ratios of all companies have a normal distribution with a mean of 15 and a standard deviation of 6.

If a P/E ratio of more than 20 is considered to be a relatively high ratio, what percentage of all companies have high P/E ratios?

P(x > 20) = ?

Example 1: Solution steps

• Step 1: Convert x to the standard normal distribution.

z = (x - μ)/σ = (20 - 15)/6 = 0.83

• Step 2: Find the area under the standard normal curve to the left of z = 0.83.

z .00 .01 .02 .03 .04 .05 .06 .07 .08 .09

. . . . . . . . . . .

.5 .6915 .6950 .6985 .7019 .7054 .7088 .7123 .7157 .7190 .7224

.6 .7257 .7291 .7324 .7357 .7389 .7422 .7454 .7486 .7517 .7549

.7 .7580 .7611 .7642 .7673 .7704 .7734 .7764 .7794 .7823 .7852

.8 .7881 .7910 .7939 .7967 .7995 .8023 .8051 .8078 .8106 .8133

.9 .8159 .8186 .8212 .8238 .8264 .8289 .8315 .8340 .8365 .8389

. . . . . . . . . . .

Cumulative probability table for the standard normal distribution

P(z < .83) = .7967

• Step 3: Compute the area under the standard normal curve to the right of z = 0.83.

P(x > 20) = P(z > .83) = 1 - P(z < .83) = 1 - .7967 = .2033

[Figure: standard normal curve with area .7967 to the left of z = .83 and area 1 - .7967 = .2033 to the right]

About 20.33% of all companies have high P/E ratios.
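The solution steps of Example 1 can be reproduced in a few lines. This is a sketch, not part of the original slides: it uses `statistics.NormalDist` from the Python standard library in place of the printed table, and the variable names are illustrative. Because the table requires z rounded to two decimals, the sketch computes the answer both with z = 0.83 and without rounding.

```python
from statistics import NormalDist

mu, sigma = 15, 6          # mean and standard deviation of P/E ratios (from the example)
x = 20                     # threshold for a "high" P/E ratio

# Step 1: convert x to a z value.
z = (x - mu) / sigma
z_rounded = round(z, 2)    # two decimals, as needed for the printed table

# Steps 2 and 3: area to the left of z, then its complement (area to the right).
standard_normal = NormalDist()          # mean 0, standard deviation 1
p_table = 1 - standard_normal.cdf(z_rounded)

# The same probability without rounding z, directly on the original scale.
p_exact = 1 - NormalDist(mu, sigma).cdf(x)

print(f"z = {z:.4f}, rounded to {z_rounded}")
print(f"P(x > 20) using z = 0.83: {p_table:.4f}")
print(f"P(x > 20) without rounding z: {p_exact:.4f}")
```

The unrounded result (about .202) is slightly smaller than the table-based .2033, because the table lookup uses z = 0.83 rather than 0.8333.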

Example 2: GMAT score

Most business schools require that every applicant for admission to a degree program take the GMAT. Suppose the GMAT scores of all students have a normal distribution with a mean of 550 and a standard deviation of 90. What should your score be so that only 5% of all the examinees score higher than you do?

x0.05 = ?

[Figure: standard normal curve with area .9500 to the left of z.05 and area .0500 in the right tail]

Example 2: Solution steps

• Step 1: Find the z-value that cuts off an area of .05 in the right tail of the standard normal distribution.

z .00 .01 .02 .03 .04 .05 .06 .07 .08 .09

. . . . . . . . . . .

1.5 .9332 .9345 .9357 .9370 .9382 .9394 .9406 .9418 .9429 .9441

1.6 .9452 .9463 .9474 .9484 .9495 .9505 .9515 .9525 .9535 .9545

1.7 .9554 .9564 .9573 .9582 .9591 .9599 .9608 .9616 .9625 .9633

1.8 .9641 .9649 .9656 .9664 .9671 .9678 .9686 .9693 .9699 .9706

1.9 .9713 .9719 .9726 .9732 .9738 .9744 .9750 .9756 .9761 .9767

. . . . . . . . . . .

We look up the complement of the tail area (1 - .05 = .95) in the body of the table. It falls midway between .9495 (z = 1.64) and .9505 (z = 1.65), so z.05 = 1.645.

• Step 2: Convert z.05 to the corresponding value of x:

x = μ + z.05 σ

= 550 + 1.645*90

= 698.05

Your score should be 698 so that only 5% of all the examinees score higher than you do.
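The two steps of Example 2 can also be checked in code. The sketch below is an illustration under the values used in Step 2 (mean 550, standard deviation 90): it uses `NormalDist.inv_cdf` from the Python standard library, which returns the value with a given cumulative probability to its left, instead of the reverse table lookup. The variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 550, 90        # values used in Step 2 of the example
tail_area = 0.05           # share of examinees who should score higher

# Step 1: z value with area 1 - .05 = .95 to its left.
z_05 = NormalDist().inv_cdf(1 - tail_area)

# Step 2: convert z back to the original scale, x = mu + z * sigma.
x_05 = mu + z_05 * sigma

# Cross-check: the same percentile taken directly from the N(550, 90) distribution.
x_direct = NormalDist(mu, sigma).inv_cdf(1 - tail_area)

print(f"z.05 = {z_05:.4f}")
print(f"x.05 = {x_05:.2f}")
print(f"direct percentile = {x_direct:.2f}")
```

The computed z value differs from the table value 1.645 only in the fourth decimal place, so the resulting score agrees with 698.05 to within a few hundredths of a point.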
