Discrete Random Variables


Probability Mass Function

Probability Mass Function Pr(RI)

Parameters of a distribution

Cumulative Distribution Function

Cumulative Distribution Function CDF(RI)

CDF (cont’d)

Expectation Value

• The expectation value is the probability-weighted average of the outcomes: $E(X) = \sum_x x \Pr(X = x)$

Discrete outcomes, like rolling dice
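As a small illustration of an expectation over discrete outcomes, the sketch below computes the mean and variance of one roll of a die. It assumes the die is fair (each face has probability 1/6), which the slide does not state explicitly, and it uses exact fractions to avoid rounding.

```python
from fractions import Fraction

# Assumption: a fair six-sided die, so each face 1..6 has probability 1/6.
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

# Expectation: the probability-weighted sum of the outcomes.
expected_value = sum(face * prob for face, prob in pmf.items())

# Variance: the probability-weighted squared distance from the expectation.
variance = sum((face - expected_value) ** 2 * prob for face, prob in pmf.items())

print(expected_value, variance)
```

Note that the expectation (7/2) is not itself a possible outcome of a single roll.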

Chebychev’s Inequality

In words: if X is any random variable with mean μ = E(X) and standard deviation σ, then the probability that it falls k or more standard deviations away from the mean is less than or equal to 1/k², that is, $\Pr(|X - \mu| \ge k\sigma) \le 1/k^2$.

Chebychev’s Inequality

Random sample of 1000 measurements, with μ = 0, σ = 2.7

k = 1, 2 or 3

1/k²

Chebychev held up under this simulation
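A simulation of this kind can be sketched as follows. The slide does not say which distribution the 1000 measurements came from, so the normal distribution used here is an assumption, as is the fixed seed.

```python
import random

random.seed(0)  # fixed seed (an arbitrary choice) so the run is repeatable
mu, sigma, n = 0.0, 2.7, 1000

# Assumption: the measurements are normally distributed with the stated mean and sd.
sample = [random.gauss(mu, sigma) for _ in range(n)]

results = {}
for k in (1, 2, 3):
    # Fraction of measurements farther than k standard deviations from the mean.
    observed = sum(abs(x - mu) > k * sigma for x in sample) / n
    bound = 1 / k**2
    results[k] = (observed, bound)
    print(f"k={k}: observed fraction {observed:.3f}, Chebychev bound {bound:.3f}")
```

For well-behaved distributions the observed fractions are usually far below the bound, because Chebychev's inequality has to hold for every distribution with a finite variance.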

Binomial Probability Distribution

• A binomial random variable X results from the following type of experiment.

1. There is a fixed number, n, of trials.

2. Each trial results in one of two possible outcomes, usually called ‘success’ or ‘failure’.

3. The trials are independent of one another.

4. The probability, p, of success is the same for each trial.

(This means we have a sequence of n Bernoulli trials.)

• Then X is the number of successes. X has probability mass function

$\Pr(X = x) = \binom{n}{x} p^x (1 - p)^{n - x}$ for x = 0, 1, 2, …, n

• The binomial distribution is implemented in R as:
  • dbinom(x,n,p)

• A binomially distributed random variable X, with parameters n (number of trials) and p (probability of a “success”), has
  • Expected value: E(X) = np
  • Variance: Var(X) = np(1 – p)

Binomial Probability Distribution

• For example, a binomial with n = 15 and p = 0.3 has an expected value of 15*0.3 = 4.5 “successes” and a standard deviation of sqrt(15*0.3*0.7) ≈ 1.77 “successes”.
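The closed-form mean and standard deviation can be checked against the probability mass function directly. The sketch below does this in Python; `binom_pmf` is a hypothetical helper written for this illustration and plays the role of R's dbinom.

```python
from math import comb, sqrt

def binom_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials (hypothetical helper)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 15, 0.3

# Mean and variance computed from the definition, summing over all outcomes 0..n.
mean = sum(x * binom_pmf(x, n, p) for x in range(n + 1))
var = sum((x - mean) ** 2 * binom_pmf(x, n, p) for x in range(n + 1))
sd = sqrt(var)

print(f"mean = {mean:.2f}, sd = {sd:.2f}")
print(f"np = {n * p:.2f}, sqrt(np(1-p)) = {sqrt(n * p * (1 - p)):.2f}")
```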

Binomial distribution in R

A particular tennis racquet comes in a midsize version and an oversize version. 60% of customers at Shenkin Sports want the oversize version. Among 10 randomly selected customers who want this racquet what is the probability that at least 6 want the oversize version?
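One way to work this problem is to treat the 10 customers as independent trials with p = 0.6 and add up the binomial probabilities for 6 through 10. The sketch below does that in Python; `binom_pmf` is a hypothetical helper standing in for R's dbinom, and the independence of customers is an assumption implied by "randomly selected".

```python
from math import comb

def binom_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials (hypothetical helper)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 10, 0.6  # 10 customers, each wanting the oversize version with probability 0.6

# "At least 6" means 6, 7, 8, 9 or 10 customers.
p_at_least_6 = sum(binom_pmf(x, n, p) for x in range(6, n + 1))

print(f"Pr(X >= 6) = {p_at_least_6:.4f}")
```

The same value can be obtained as 1 minus the cumulative probability of 5 or fewer.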

Example

• Suppose an army sniper hits his target 20% of the time at a distance of 1000 yards. Due to variations in wind speed and direction, time of day, and cloud cover, successive shots may be considered to be independent of one another. Assuming the sniper takes 15 shots on a certain day, how likely is it that he hits his target 0, 1, 2, 3, …, 15 times?

• Binomial problem. n = 15 trials, probability of success on each trial p = 0.2, independence between trials, interest is in total number of hits.

• X = number of hits in 15 shots.

Example

Calculations with the binomial distribution using R

Note that dbinom() is the probability mass function of the binomial. All code from file binomialExampleB(15,.2).R

Calculations with the binomial distribution using R-Continued

Note that pbinom() is the cumulative distribution function of the binomial
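The relationship between the two functions can be sketched for the sniper example (n = 15, p = 0.2): the cumulative distribution function is the running sum of the probability mass function. The Python below is an illustration only and is not the code from the file named above; `binom_pmf` is a hypothetical helper playing the role of dbinom, and the running sum plays the role of pbinom.

```python
from math import comb

def binom_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials (hypothetical helper)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 15, 0.2  # 15 shots, 20% chance of a hit on each

# Probability mass function for 0..15 hits.
pmf = [binom_pmf(x, n, p) for x in range(n + 1)]

# Cumulative distribution function: cdf[x] = Pr(X <= x).
cdf = []
running = 0.0
for prob in pmf:
    running += prob
    cdf.append(running)

for x in range(n + 1):
    print(f"{x:2d} hits: Pr(X = x) = {pmf[x]:.5f}   Pr(X <= x) = {cdf[x]:.5f}")
```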

Calculations with the binomial distribution using R-Continued

Hypergeometric distribution

• The binomial probability distribution is not technically applicable in a situation of sampling without replacement.
• The Hypergeometric distribution counts “successes” like the binomial distribution, but allows sampling without replacement.
• Consider a bucket of N objects.
• There are M “successes” and N – M “failures” in the bucket.
• Select n objects from the bucket without replacement.
• The probability of drawing x “successes” in n tries is:

$\Pr(X = x) = \dfrac{\binom{M}{x}\binom{N - M}{n - x}}{\binom{N}{n}}$ for integers x with max(0, n – (N – M)) ≤ x ≤ min(n, M)

• A hypergeometrically distributed random variable X, with parameters N (bucket size), M (number of “successes” in the bucket), and n (number of trials), has
  • Expected value: E(X) = np, where p = M/N
  • Variance: Var(X) = np(1 – p)(N – n)/(N – 1)

Hypergeometric Distribution

Hypergeometric distribution 2

• R provides the following built-in functions to calculate hypergeometric probabilities:
  • dhyper is the density function defined on the previous slide.
  • phyper is the CDF.
  • qhyper is the quantile function.
  • rhyper generates random hypergeometric observations.

• Their arguments are:
  x, q  the number of “successes” drawn
  m  the total number of “successes” in the bucket (M on the previous slide)
  n  the total number of “failures” in the bucket (N – M on the previous slide)
  k  the number of objects drawn from the bucket (n on the previous slide)
  p  a probability between 0 and 1 (the cumulative probability passed to qhyper)

Hypergeometric Properties

Each of 12 refrigerators of a certain type has been returned to the distributor because of a loud noise. Suppose that 7 of these have a defective compressor and 5 have a broken ice cube tray. The refrigerators will be examined one by one in random order. Let X be the number among the first 6 examined that have a defective compressor. Calculate:

• Pr(X = 5)
• Pr(X <= 4)
• The expected value of X


Pr(X =5)


Pr(X <= 4)

E(X) = np = n(M/N) = 6 (7/12) = 3.5 units on average
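The three quantities for the refrigerator problem can be checked with a short calculation. In the sketch below N = 12, M = 7 and n = 6 follow the bucket notation used earlier; `hyper_pmf` is a hypothetical helper written for this illustration, and it takes its arguments in a different order from R's dhyper.

```python
from math import comb

def hyper_pmf(x, N, M, n):
    """Probability of x successes when drawing n objects without replacement
    from N objects of which M are successes (hypothetical helper)."""
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

N, M, n = 12, 7, 6  # 12 refrigerators, 7 with a defective compressor, 6 examined

p_exactly_5 = hyper_pmf(5, N, M, n)
p_at_most_4 = sum(hyper_pmf(x, N, M, n) for x in range(0, 5))
mean = sum(x * hyper_pmf(x, N, M, n) for x in range(0, n + 1))

print(f"Pr(X = 5)  = {p_exactly_5:.4f}")
print(f"Pr(X <= 4) = {p_at_most_4:.4f}")
print(f"E(X)       = {mean:.2f}")
```

Terms outside the support contribute zero because `math.comb` returns 0 when asked to choose more items than are available.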

Hypergeometric Example

A college professor is teaching two sections of Math 301. Section 1 has 24 students registered and section 2 has 19 students registered. Each student hands in a term project. The professor grades these in a random order. What is the probability that exactly 4 of the first 10 graded papers are from section 1? What is the probability that 4 or fewer of the first 10 papers are from section 1?

In the notation of the R functions, m = 24, n = 19 and k = 10.

Exactly 4

4 or fewer
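Both probabilities can be sketched as follows, treating a section 1 paper as a “success”. The function `hyper_pmf` is a hypothetical helper whose arguments follow R's dhyper ordering (x, m, n, k).

```python
from math import comb

def hyper_pmf(x, m, n, k):
    """Probability of x successes in k draws without replacement from a bucket
    holding m successes and n failures (hypothetical helper, dhyper-style arguments)."""
    return comb(m, x) * comb(n, k - x) / comb(m + n, k)

m, n, k = 24, 19, 10  # section 1 papers, section 2 papers, papers graded

p_exactly_4 = hyper_pmf(4, m, n, k)
p_4_or_fewer = sum(hyper_pmf(x, m, n, k) for x in range(0, 5))

print(f"Pr(X = 4)  = {p_exactly_4:.4f}")
print(f"Pr(X <= 4) = {p_4_or_fewer:.4f}")
```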

Geometric Distribution

• Suppose we have a sequence of independent Bernoulli trials with probability of success p.
• Let X be the random variable representing the number of failures until the 1st success.
• Then X is a geometric random variable and the probability mass function of X is:

$\Pr(X = x) = (1 - p)^x \, p$ for x = 0, 1, 2, 3, …

Note that the R function dgeom(x,p) gives the probability of x failures before the 1st success. This count is always one less than the number of trials needed to get the 1st success.

Mr. S has 14 keys on his keychain. He is looking for the one key that opens his office. He randomly picks a key, tries it and replaces it until he gets his door open.

• What is the probability that he picks the correct key on the 6th try?
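Because each key is replaced after a try, every try succeeds with probability 1/14 independently of the others. Picking the correct key on the 6th try therefore means 5 failures followed by 1 success. A minimal sketch, with `geom_pmf` as a hypothetical helper that counts failures the same way dgeom does:

```python
def geom_pmf(failures, p):
    """Probability of exactly `failures` failures before the first success
    (hypothetical helper)."""
    return (1 - p) ** failures * p

p = 1 / 14  # one correct key among 14, chosen at random with replacement

# Success on the 6th try = 5 failures before the first success.
p_sixth_try = geom_pmf(5, p)

print(f"Pr(correct key on 6th try) = {p_sixth_try:.4f}")
```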

Example

Geometric Distribution Example

A fair die is tossed until the first ‘3’ appears. What is the probability that the first ‘3’ appears on the 5th toss? On the rth toss?

Solution: The first ‘3’ appears on the 5th toss only if the first 4 tosses are failures (not a ‘3’) and the 5th toss is a success. This probability is:

$(5/6)^4 (1/6) = 625/7776 \approx 0.0804$

In R we have dgeom(4, 1/6).

Note the 4 representing the failure count before the 1st success. In general, if the success probability is p, then the probability of the 1st success on the rth trial is:

$(1 - p)^{r - 1} \, p$
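The general rule, r – 1 failures followed by one success, can be checked exactly with fractions. In this sketch `first_success_on_trial` is a hypothetical helper indexed by the trial number r rather than by the failure count.

```python
from fractions import Fraction

def first_success_on_trial(r, p):
    """Probability that the first success occurs on trial r:
    r - 1 failures followed by one success (hypothetical helper)."""
    return (1 - p) ** (r - 1) * p

p = Fraction(1, 6)  # probability of a '3' on one toss of a fair die

p_fifth = first_success_on_trial(5, p)

# The probabilities over r = 1, 2, 3, ... approach 1 as more trials are included.
partial_sum = sum(first_success_on_trial(r, p) for r in range(1, 201))

print(p_fifth, float(p_fifth))
print(float(partial_sum))
```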

Negative Binomial Distribution

• A negative binomial random variable X is the number of “failures” x that precede the rth “success”. The probability of a “success” is p.
• In contrast to the binomial random variable, the number of successes is fixed and the number of trials is random.
• Probability mass function: $\Pr(X = x) = \binom{x + r - 1}{x} p^r (1 - p)^x$ for x = 0, 1, 2, 3, …
• Expectation: E(X) = r(1 – p)/p
• Variance: Var(X) = r(1 – p)/p²

Negative Binomial (cont’d)

If the probability of getting a head when flipping a coin is p = 0.3, what is the probability of getting 7 tails before the 3rd head?
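Here a head is a “success” (p = 0.3), r = 3 successes are required, and x = 7 failures (tails) come first. A minimal sketch, with `nbinom_pmf` as a hypothetical helper that counts failures before the rth success:

```python
from math import comb

def nbinom_pmf(x, r, p):
    """Probability of exactly x failures before the r-th success (hypothetical helper)."""
    return comb(x + r - 1, x) * p**r * (1 - p) ** x

# 7 tails (failures) before the 3rd head (success), with Pr(head) = 0.3.
p_7_tails = nbinom_pmf(7, 3, 0.3)

print(f"Pr(7 tails before 3rd head) = {p_7_tails:.4f}")
```

The binomial coefficient counts the arrangements of the 7 tails among the first 9 flips; the 10th flip must be the 3rd head.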

Example

The probability of successfully hailing a cab in the rain is p = 0.2. What is the probability that 15 hails must be attempted before 5 hails result in a driver pulling over?

Example

What is the probability that at most 10 hails must be attempted before 5 hails result in a driver pulling over?
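The wording of the two cab questions can be read in more than one way. The sketch below assumes that "15 hails must be attempted" means the 5th success happens on the 15th attempt (10 failures), and that "at most 10 hails" means the 5th success happens on or before the 10th attempt (at most 5 failures). `nbinom_pmf` is a hypothetical helper that counts failures before the rth success.

```python
from math import comb

def nbinom_pmf(x, r, p):
    """Probability of exactly x failures before the r-th success (hypothetical helper)."""
    return comb(x + r - 1, x) * p**r * (1 - p) ** x

r, p = 5, 0.2  # 5 successful hails needed, each attempt succeeds with probability 0.2

# Assumed reading: 5th success on the 15th attempt, i.e. 15 - 5 = 10 failures.
p_exactly_15_attempts = nbinom_pmf(15 - r, r, p)

# Assumed reading: 5th success within 10 attempts, i.e. 0 to 5 failures.
p_at_most_10_attempts = sum(nbinom_pmf(x, r, p) for x in range(0, 10 - r + 1))

print(f"Pr(5th success on 15th attempt)        = {p_exactly_15_attempts:.4f}")
print(f"Pr(5th success within 10 attempts)     = {p_at_most_10_attempts:.4f}")
```

Under the other reading (15 failed hails before the 5th success), the first call would be `nbinom_pmf(15, r, p)` instead.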

Poisson process

Poisson distribution

Example

During the Sandy disaster, NYC 9-1-1 received about 20,000 calls/hr at its peak.

• What is the probability of exactly 3 calls in 1 hour?
• What is the probability of 3 or fewer calls in one hour?
• What is the probability of more than 3 calls in one hour?
• What is the probability of 7 calls in 2 hours?


What is the probability of exactly 20,165 calls in 1 hour?


What is the probability of 20081 or fewer calls in one hour?


What is the probability of more than 20081 calls in one hour?


What is the probability of 40031 calls in 2 hours?
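The four numerical questions can be sketched as follows, assuming calls arrive as a Poisson process with a constant rate of 20,000 per hour, so that the count over 2 hours is Poisson with mean 40,000. With rates this large, evaluating λ^x and x! directly would overflow, so the hypothetical helper `pois_pmf` works with logarithms.

```python
from math import exp, fsum, lgamma, log

def pois_pmf(x, lam):
    """Poisson probability of exactly x events when the mean is lam
    (hypothetical helper; computed in log space to avoid overflow)."""
    return exp(x * log(lam) - lam - lgamma(x + 1))

def pois_cdf(q, lam):
    """Poisson probability of q or fewer events (hypothetical helper)."""
    return fsum(pois_pmf(x, lam) for x in range(q + 1))

rate_per_hour = 20000  # assumed constant peak rate

p_exactly_20165 = pois_pmf(20165, rate_per_hour)
p_at_most_20081 = pois_cdf(20081, rate_per_hour)
p_more_than_20081 = 1 - p_at_most_20081

# Over 2 hours the mean doubles (assuming the rate stays constant).
p_40031_in_2_hours = pois_pmf(40031, 2 * rate_per_hour)

print(f"Pr(X = 20165)            = {p_exactly_20165:.6f}")
print(f"Pr(X <= 20081)           = {p_at_most_20081:.4f}")
print(f"Pr(X > 20081)            = {p_more_than_20081:.4f}")
print(f"Pr(40031 calls in 2 hrs) = {p_40031_in_2_hours:.6f}")
```

Any single exact count has a small probability here because the distribution is spread over several hundred plausible values around the mean.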