Probability Distributions Dr. Catherine Haase, Montana State University

Probability Distributionscatherineghaase.weebly.com/.../8/9/37898363/probability_distributio… · Continuous Distributions Distribution Probability Value E(X) σ2(X) Comments Example



Probability Distributions

Dr. Catherine Haase, Montana State University


Measurement Scales


Nominal Data

• Nominal scales are used for labeling variables, without any quantitative value

• Notice that all of these scales are mutually exclusive (no overlap) and none of them have any numerical significance

• A good way to remember all of this is that “nominal” sounds a lot like “name” and nominal scales are kind of like “names”


Ordinal Data

• With ordinal scales, the order of the values is what is important and significant, but the differences between the values are not really known

• Ordinal scales are typically measures of non-numeric concepts like satisfaction, happiness, discomfort, etc.

• “Ordinal” is easy to remember because it sounds like “order,” and that is the key to remember with “ordinal scales”


Interval Data

• Interval scales are numeric scales in which we know not only the order, but also the exact differences between the values

• Central tendency can be measured by mode, median, or mean; standard deviation can also be calculated

• They don’t have a “true zero.” For example, there is no such thing as “no temperature.” Without a true zero, it is impossible to compute ratios


Ratio Data

• Ratio scales tell us about the order, they tell us the exact value between units, AND they also have an absolute zero

• Ratio scales provide a wealth of possibilities when it comes to statistical analysis: they can be meaningfully added, subtracted, multiplied, and divided

• Central tendency can be measured by mode, median, or mean; measures of dispersion, such as standard deviation and coefficient of variation, can also be calculated


Measurement Scales


Types of Data


Random Variables

The mathematical function that assigns a given numerical value to each possible outcome of an experiment in the sample space is called a random variable.

Random variables come in three forms:

• Categorical: represent types of data which may be divided into groups

  • Age, sex, gender

• Discrete: finite or countable values; ex: integers

  • Presence/absence, number of offspring

• Continuous: take on any value within a smooth interval

  • Leaf area consumed by an herbivore


Variance

• Measures data spread: the average of the squared differences from the mean

• A value of zero means that there is no variability; all the numbers in the data set are the same

• Example:

  • The data set 12, 12, 12, 12, 12, 13 has a sample variance of 0.167, a very small variance

  • The data set 12, 12, 12, 12, 12, 13013 has a sample variance of 28,171,000; a single large change in the numbers produces a very large variance.
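The variance example can be checked in R. This is a minimal sketch that assumes each data set contains five 12s followed by one differing value, since those are the values that reproduce the stated variances with R's sample variance `var()` (which divides by n - 1). The variable names are illustrative only.

```r
# Assumption: five 12s plus one differing value in each data set.
small_spread <- c(12, 12, 12, 12, 12, 13)
large_spread <- c(12, 12, 12, 12, 12, 13013)

# var() computes the sample variance (sum of squared deviations / (n - 1)).
var_small <- var(small_spread)
var_large <- var(large_spread)

# The standard deviation is the square root of the variance.
sd_small <- sqrt(var_small)

print(round(var_small, 3))
print(round(var_large))
```

The same square-root relationship is what `sd()` returns directly, so `sd(small_spread)` and `sd_small` should agree.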


Standard Deviation

• Tells you how tightly your data are clustered around the mean (SD or σ)

• Calculated as the square root of the variance

• When the bell curve is flattened, you have a large standard deviation

• When the bell curve is very steep, your data has a small standard deviation


Standard Error

• The standard error (SE) is very similar to the standard deviation

• While the standard error uses statistics (sample data), the standard deviation uses parameters (population data)

• The SE tells you how far the sample statistic is likely to deviate from the actual population mean

• The larger the sample size, the smaller the SE. In other words, the larger the sample size, the closer the sample mean tends to be to the actual population mean
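The effect of sample size on the SE can be illustrated with a short simulation. This is a sketch under stated assumptions: the SE of the mean is estimated as the sample standard deviation divided by the square root of the sample size, and the population (mean 50, SD 10), the seed, and the helper name `se_of_mean` are all hypothetical choices made for illustration.

```r
set.seed(1)  # hypothetical seed so the draws are repeatable

# Hypothetical helper: estimated standard error of the sample mean.
se_of_mean <- function(x) sd(x) / sqrt(length(x))

# Two samples from the same assumed population, differing only in size.
small_sample <- rnorm(25, mean = 50, sd = 10)
large_sample <- rnorm(2500, mean = 50, sd = 10)

se_small <- se_of_mean(small_sample)
se_large <- se_of_mean(large_sample)

print(c(n25 = se_small, n2500 = se_large))
```

Because the sample size grows by a factor of 100, the SE should shrink by roughly a factor of 10.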


Discrete Distributions

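Distribution: Bernoulli

• Probability value: $P(X) = p$

• E(X): $p$

• σ²(X): $p(1-p)$

• Comments: Dichotomous outcomes

• Example: To reproduce or not

Distribution: Binomial

• Probability value: $P(X) = \binom{n}{X} p^{X} (1-p)^{n-X}$

• E(X): $np$

• σ²(X): $np(1-p)$

• Comments: Number of successes in independent trials

• Example: Presence or absence

Distribution: Poisson

• Probability value: $P(x) = \frac{\lambda^{x}}{x!} e^{-\lambda}$

• E(X): $\lambda$

• σ²(X): $\lambda$

• Comments: Independent rare events, where $\lambda$ is the rate of events

• Example: Distribution of rare species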


Bernoulli Trial

Simplest experiment with only two outcomes

Notation: $X \sim \mathrm{Bernoulli}(p)$

Probability of a single outcome (X) is:

$P(X) = p$

Examples:• Presence or absence of hippos

• Coin flips heads or tails

• Did the bushbuck reproduce or not


Probability Density Function (PDF) vs. Cumulative Distribution Function (CDF)
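The relationship between the two functions can be sketched in R, where the `d` prefix (for example `dbinom`) gives the probability mass or density at a value and the `p` prefix (for example `pbinom`) gives the cumulative probability up to that value. The binomial parameters below are hypothetical and chosen only for illustration.

```r
# Hypothetical parameters for illustration: 10 trials, success probability 0.3.
n_trials <- 10
p_success <- 0.3
k <- 0:n_trials

# d-functions: probability of exactly k successes.
pmf <- dbinom(k, size = n_trials, prob = p_success)

# p-functions: probability of k or fewer successes (lower tail).
cdf <- pbinom(k, size = n_trials, prob = p_success)

# For a discrete variable, the CDF is the running sum of the mass function.
print(cbind(k, pmf = round(pmf, 4), cdf = round(cdf, 4)))
```

For a continuous distribution the same idea holds with an integral in place of the running sum.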


Binomial Distribution

The number of successful results (X) in n independent Bernoulli trials, where the probability of success in any given trial is p

Notation: $X \sim \mathrm{Bin}(n, p)$

Probability of X successful outcomes:

$P(X) = \frac{n!}{X!\,(n-X)!} \, p^{X} (1-p)^{n-X}$

Example in R: Flip a coin 25 times. What is the probability of getting 0 heads, 10 heads, 25 heads?
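A possible way to work the coin-flip example uses `dbinom`, which evaluates the binomial probability for an exact number of successes. The sketch assumes a fair coin (prob = 0.5), which the slide does not state explicitly.

```r
# Assumption: the coin is fair, so the probability of heads is 0.5.
heads <- c(0, 10, 25)

# dbinom is vectorized, so all three probabilities come from one call.
p_heads <- dbinom(heads, size = 25, prob = 0.5)

names(p_heads) <- paste(heads, "heads")
print(p_heads)
```

With a fair coin, 0 heads and 25 heads are equally unlikely, since each corresponds to a single sequence out of 2^25.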


Example Problem

Suppose there are 17 study sites that you survey for the dibatag.

From a previous study, each location has a 20% chance of having dibatags.

Find the probability of four or fewer sites having dibatags if you survey them at random.


Example Problem: Solution

We can find the probability of exactly 4 sites being occupied by dibatags as follows:

> dbinom(4, size=17, prob=0.2)

To find the probability of four or fewer sites being occupied, we apply the function dbinom with x = 0,…,4.

> dbinom(0, size=17, prob=0.2) +
+ dbinom(1, size=17, prob=0.2) +
+ dbinom(2, size=17, prob=0.2) +
+ dbinom(3, size=17, prob=0.2) +
+ dbinom(4, size=17, prob=0.2)

Alternatively, we can use the cumulative probability function for the binomial distribution, pbinom.

> pbinom(4, size=17, prob=0.2)

Answer: The probability of four or fewer sites being occupied out of 17 sites is 75.82%.


Example Problem

Suppose there are 53 study sites that you survey for the dibatag.

From a previous study, each location has an 80% chance of having dibatags.

Find the probability of 21 or fewer sites having dibatags if you survey them at random.
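One way to approach this problem follows the same pattern as the 17-site example: treat the number of occupied sites as binomial and read the lower tail from `pbinom`. This is a sketch rather than the slide's own solution, and the variable names are illustrative.

```r
# Binomial model: 53 independent sites, each occupied with probability 0.8.
n_sites <- 53
p_occupied <- 0.8

# Lower tail: probability that 21 or fewer sites are occupied.
p_21_or_fewer <- pbinom(21, size = n_sites, prob = p_occupied)

# Cross-check by summing the individual probabilities for 0, 1, ..., 21 sites.
p_summed <- sum(dbinom(0:21, size = n_sites, prob = p_occupied))

print(p_21_or_fewer)
```

Since the expected number of occupied sites is 53 × 0.8 = 42.4, the result should be a very small probability.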


Poisson Distribution

The number of occurrences (X) of an event recorded in a sample of a fixed area or during a fixed interval of time. Used when occurrences are rare, that is, when the most common count in a sample is 0.

Notation: $X \sim \mathrm{Poisson}(\lambda)$

Probability of x occurrences:

$P(x) = \frac{\lambda^{x}}{x!} e^{-\lambda}$

Example in R: What is the probability of observing 2 lesser kudus at a site when the mean number of individuals per site (λ) is 0.5? 0.1? 4.2?
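The lesser kudu example can be sketched with `dpois`, which evaluates the Poisson probability of an exact count. The comparison against the formula written out by hand is included only as a sanity check, and the variable names are illustrative.

```r
# Mean number of individuals per site for the three scenarios.
lambdas <- c(0.5, 0.1, 4.2)

# Probability of observing exactly 2 individuals under each mean.
p_two <- dpois(2, lambda = lambdas)

# The same quantity from the Poisson formula: lambda^x / x! * exp(-lambda).
p_two_by_hand <- lambdas^2 / factorial(2) * exp(-lambdas)

names(p_two) <- paste("lambda =", lambdas)
print(p_two)
```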


Example Problem

If the Ethiopian wolf has been observed to cross a particular trail at a rate of 4 individuals per hour on average, find the probability of having at least two wolves cross a trail in front of you in a particular hour.


Example Problem: Solution

"At least two" wolves means X ≥ 2, which is the complement of X ≤ 1. The probability of having one or fewer wolves crossing the trail in a particular hour is given by the function ppois.

> ppois(1, lambda=4) # lower tail, P(X <= 1)

Hence the probability of having 2 or more wolves crossing the trail in an hour is in the upper tail of the cumulative distribution function.

> ppois(1, lambda=4, lower.tail=FALSE) # upper tail, P(X > 1)

Answer: If there are 4 wolves crossing the trail per hour, on average, the probability of having 2 or more wolves cross the trail in a particular hour is 90.84%.


Example Problem

The average number of homes sold by a local realty company is 2 homes per day. What is the probability that exactly 3 homes will be sold tomorrow?


Example Problem: Solution

The probability of having exactly 3 homes sold tomorrow is given by the probability mass function dpois.

> dpois(3, lambda=2)

Answer: If there are 2 homes sold per day, on average, the probability of having exactly 3 homes sold tomorrow is 18.0%.


Continuous Distributions

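Distribution: Uniform

• Probability value: $P(a < X < b) = 1.0$

• E(X): $\frac{b + a}{2}$

• σ²(X): $\frac{(b - a)^{2}}{12}$

• Comments: Equiprobable outcomes

• Example: Even distribution of resources

Distribution: Normal

• Probability value: $\frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{X-\mu}{\sigma}\right)^{2}}$

• E(X): $\mu$

• σ²(X): $\sigma^{2}$

• Comments: Generates “bell curve” for continuous data

• Example: Distribution of cattle heights

Distribution: Exponential

• Probability value: $P(X) = \beta e^{-\beta X}$

• E(X): $1/\beta$

• σ²(X): $1/\beta^{2}$

• Comments: Continuous counterpart of the Poisson

• Example: Seed dispersal distance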


Uniform Distribution

The probability distribution of random number selection from the continuous interval between a and b.

Notation: $X \sim \mathrm{Uniform}(a, b)$

Probability that X falls within the interval is:

$P(a < X < b) = 1.0$

Example in R: Select 10 random numbers between the interval of 50 and 75
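A minimal sketch of this example uses `runif`, which draws random values from a continuous uniform distribution. The seed is a hypothetical choice added only so that the draws can be reproduced.

```r
set.seed(42)  # hypothetical seed; any value works

# Draw 10 random numbers from Uniform(50, 75).
draws <- runif(10, min = 50, max = 75)

print(draws)
```

Every value should fall strictly between 50 and 75, with no part of the interval favored over another.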


Normal or Gaussian Distribution

The mean of the distribution determines the location of the center of the graph, and the standard deviation determines the height and width. The total area under the curve is 1.0.

Notation: 𝑋 ~ 𝑁(𝜇, 𝜎)

Probability density of X is:

$P(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{X-\mu}{\sigma}\right)^{2}}$


Example Problem

Assume that the test scores of a college entrance exam fit a normal distribution.

The mean test score is 72, and the standard deviation is 15.2.

What is the percentage of students scoring 84 or more in the exam?


Example Problem: Solution

We apply the function pnorm of the normal distribution with mean 72 and standard deviation 15.2. Since we are looking for the percentage of students scoring higher than 84, we are interested in the upper tail of the normal distribution.

> pnorm(84, mean=72, sd=15.2, lower.tail=FALSE)

Answer: The percentage of students scoring 84 or more in the college entrance exam is 21.5%.


Example Problem

An average light bulb manufactured lasts 300 days with a standard deviation of 50 days. Assuming that bulb life is normally distributed, what is the probability that the light bulb will last at most 365 days?


Example Problem: Solution

We apply the function pnorm of the normal distribution with mean 300 and standard deviation 50. Since we are looking for the probability of lasting at most 365 days, we use the lower tail of the normal distribution.

> pnorm(365, mean=300, sd=50)

Answer: The probability that a light bulb will burn out within 365 days is about 90%.


Standard Normal Random Variables

• Type of normal distribution where 𝜇 = 0 and 𝜎 = 1

• Used for standardizing data, i.e. “centering” and “scaling”

  • Center the mean: subtract the mean of all data points from each data point

  • Scale the standard deviation: divide the centered points by the standard deviation

• Example in R: Center and scale the generated dataset, then plot the histogram. How are the “normal” and “standard normal” data different from each other?
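The centering and scaling steps can be sketched as follows. The generated dataset here is an assumption: 1000 hypothetical cattle heights drawn from a normal distribution with mean 140 and standard deviation 12, with a seed chosen only for repeatability.

```r
set.seed(7)  # hypothetical seed

# Assumed generated dataset: 1000 hypothetical cattle heights.
heights <- rnorm(1000, mean = 140, sd = 12)

# Center (subtract the mean), then scale (divide by the standard deviation).
standardized <- (heights - mean(heights)) / sd(heights)

# Plot the two histograms side by side for comparison.
par(mfrow = c(1, 2))
hist(heights, main = "Normal", xlab = "Value")
hist(standardized, main = "Standard normal", xlab = "Standardized value")
```

The two histograms should have the same shape; only the axis changes, with the standardized data centered on 0 and measured in units of standard deviations. Base R's `scale()` performs the same two steps in one call.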


Exponential Distribution

The exponential distribution describes the arrival time of a randomly recurring independent event sequence.

Notation: $X \sim \mathrm{Exponential}(\lambda)$

Probability density of X is:

$P(X) = \lambda e^{-\lambda X}$


Example Problem

Suppose the mean time for a coffee ceremony is three hours.

Find the probability of a coffee ceremony being completed in less than two hours.


Example Problem: Solution

The rate is equal to one divided by the mean completion time.

Hence the rate is 1/3 completions per hour.

We then apply the function pexp of the exponential distribution with rate=1/3.

> pexp(2, rate=1/3)

Answer: The probability of finishing a coffee ceremony in under two hours is 48.7%.


Example Problem

The average number of days ahead travelers purchase their airline tickets is 15 days.

Find the probability that a traveler will purchase a ticket ten days in advance.


Example Problem: Solution

The rate is equal to one divided by the mean purchase time.

Hence the rate is 1/15 per day.

We then apply the function dexp of the exponential distribution with rate=1/15. Note that dexp returns the value of the density at 10 days rather than a probability over an interval.

> dexp(10, rate=1/15)

Answer: The density of purchasing a plane ticket 10 days in advance is about 0.034 per day (3.4%).


Other Distributions

• Chi-squared

• t distribution

• F distribution

• Log-normal

• Negative binomial

• Gamma


Example Problems: Distributions

• Determine the distribution that best describes the data:

1. The body mass measurements of a herd of cattle

2. The gender composition of the professors in the university

3. The number of seeds counted as a function of distance from a plant

4. The rate at which a cashier checks out a customer

• Would you use a probability density function (PDF) or cumulative distribution function (CDF) to determine the probabilities below? Lower or upper tail?

1. Probability of getting a boy for your fourth child

2. Probability of obtaining at least a 75 on your statistics exam

3. Probability of counting no more than 25 sheep in the field


Example Problem

Given the mean number of births per day is 353,000, what is the probability that at least 300,000 babies will be born tomorrow?


Example Problem

You arrive at a building and are about to take an elevator to your floor. Once you call the elevator, it will take between 0 and 40 seconds to arrive. What is the probability that it will take less than 15 seconds to arrive?
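One possible approach to the elevator problem, assuming the waiting time is uniformly distributed between 0 and 40 seconds (the problem does not name the distribution), is to read the lower tail from `punif`. The variable name is illustrative.

```r
# Assumption: waiting time is Uniform(0, 40) seconds.
# Lower tail: probability that the wait is less than 15 seconds.
p_under_15 <- punif(15, min = 0, max = 40)

print(p_under_15)
```

Under this assumption the answer is simply the fraction of the interval below 15 seconds, 15/40.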


Example Problem

A radar unit is used to measure speeds of cars on a motorway. The mean speed is 90 km/hr and the standard deviation is 10 km/hr. What is the probability that a car picked at random is travelling 100 km/hr?