30
Common Continuous Distributions Statistics 104 Autumn 2004 Taken from Statistics 110 Lecture Notes Copyright c 2004 by Mark E. Irwin

Common Continuous Distributions - Mark Irwin · Common Continuous Distributions ... Beta distributions are useful for data that occur in flxed, flnite intervals The PDF for the

  • Upload
    trandan

  • View
    228

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Common Continuous Distributions - Mark Irwin · Common Continuous Distributions ... Beta distributions are useful for data that occur in flxed, flnite intervals The PDF for the

Common Continuous Distributions

Statistics 104

Autumn 2004

Taken from Statistics 110 Lecture Notes

Copyright c©2004 by Mark E. Irwin

Page 2: Common Continuous Distributions - Mark Irwin · Common Continuous Distributions ... Beta distributions are useful for data that occur in flxed, flnite intervals The PDF for the

Common Continuous Distributions

• Uniform

• Exponential

• Normal

• Gamma

• Cauchy

Continuous Distributions 1

Page 3: Common Continuous Distributions - Mark Irwin · Common Continuous Distributions ... Beta distributions are useful for data that occur in flxed, flnite intervals The PDF for the

Uniform Distributions

This distribution describes events that are equally likely in a range (a, b).As mentioned before, it is what people often consider as a random number.

The PDF for the for the uniform distribution (U(a, b)) is

f(x) =1

b− a; a ≤ x ≤ b

The CDF is

F (x) =

0 x < ax−ab−a a ≤ x ≤ b

1 x > b

The first two moments of the uniform are

E[X] =a + b

2; Var(X) =

(b− a)2

12; SD(X) =

√(b− a)2

12

Continuous Distributions 2

Page 4: Common Continuous Distributions - Mark Irwin · Common Continuous Distributions ... Beta distributions are useful for data that occur in flxed, flnite intervals The PDF for the

−4 −2 0 2 4

0.0

0.2

0.4

0.6

0.8

1.0

Uniform Densities

x

f(x)

U(0,1)U(2,3)

−4 −2 0 2 4

0.0

0.5

1.0

1.5

2.0

U(−k,k) Densities

x

f(x)

k=3k=2k=1k=0.25

−4 −2 0 2 4

0.0

0.2

0.4

0.6

0.8

1.0

Uniform CDFs

x

F(x

)

U(0,1)U(2,3)

−4 −2 0 2 4

0.0

0.2

0.4

0.6

0.8

1.0

U(−k,k) CDFs

x

f(x)

k=3k=2k=1k=0.25

Continuous Distributions 3

Page 5: Common Continuous Distributions - Mark Irwin · Common Continuous Distributions ... Beta distributions are useful for data that occur in flxed, flnite intervals The PDF for the

Exponential Distribution

The exponential distribution is often used to describe the time to an event.

The PDF for the for the exponential distribution (Exp(λ)) is

f(x) = λe−λx; x ≥ 0

The CDF is

F (x) ={

1− e−λx x ≥ 00 x < 0

The first two moments of the exponential are

E[X] =1λ; Var(X) =

1λ2

; SD(X) =1λ

Continuous Distributions 4

Page 6: Common Continuous Distributions - Mark Irwin · Common Continuous Distributions ... Beta distributions are useful for data that occur in flxed, flnite intervals The PDF for the

0 2 4 6 8 10

0.0

0.5

1.0

1.5

2.0

Exponential Densities

x

f(x)

lambda=0.5lambda=1lambda=2

0 2 4 6 8 10

0.0

0.4

0.8

Exponential CDFs

x

f(x)

lambda=0.5lambda=1lambda=2

The exponential distribution has an interesting property. It is said to bememoryless. That is, it satisfies

P [T > t + s|T > s] =P [T > t + s and T > s]

P [T > s]

=P [T > t + s]

P [T > s]=

e−λt+s

e−λs= e−λt

Continuous Distributions 5

Page 7: Common Continuous Distributions - Mark Irwin · Common Continuous Distributions ... Beta distributions are useful for data that occur in flxed, flnite intervals The PDF for the

So the chance that what we are observing survives another t units of timedoesn’t depend on how long we have observed it so far.

This property limits where the exponential distribution can be used. Forexample, we wouldn’t want to use it to model human lifetimes.

Another potential drawback is that the parameter λ describes thedistribution completely. Once you know this, you also know the standarddeviation, skewness, kurtosis, etc.

Even with these drawbacks, the exponential distribution is widely used.Examples where it may be appropriate are

• Queuing theory - times between customer arrivals

• Times to relapse in leukemia patients

• Times to equipment failures

• Distances between crossovers during meiosis

Continuous Distributions 6

Page 8: Common Continuous Distributions - Mark Irwin · Common Continuous Distributions ... Beta distributions are useful for data that occur in flxed, flnite intervals The PDF for the

The parameter λ can often be thought of as a rate or a speed parameter.The exponential distribution can be parameterized in terms of the meantime to the event, µ = 1

λ. With this parameterization the PDF and CDFare

f(x) = e−x/µ

µ ; x ≥ 0 F (x) ={

1− e−x/µ x ≥ 00 x < 0

Continuous Distributions 7

Page 9: Common Continuous Distributions - Mark Irwin · Common Continuous Distributions ... Beta distributions are useful for data that occur in flxed, flnite intervals The PDF for the

Normal Distributions

The normal distribution is almost surelythe most common distribution used inprobability and statistics. It is alsoreferred to as the Gaussian distribution,as Gauss was an early promoter of itsuse (though not the first, which wasprobably De Moivre). It is also whatmost people mean when they talk about bell curve. It is used to describeobserved data, measurement errors, an approximation distribution (CentralLimit Theorem).

The PDF for the for the normal distribution (N(µ, σ2)) is

f(x) =1

σ√

2πe−(x−µ)2/2σ2

Continuous Distributions 8

Page 10: Common Continuous Distributions - Mark Irwin · Common Continuous Distributions ... Beta distributions are useful for data that occur in flxed, flnite intervals The PDF for the

The two parameters of the distribution are the mean (µ) and the variance(σ2). A special case is the standard normal density which has µ = 0and σ2 = 1 and its PDF is often denoted by φ(x). As we shall see, oncewe understand the standard normal (N(0, 1)), we understand all normaldistributions.

The CDF for the normal distribution doesn’t have a nice form. The CDFfor the standard normal is often denoted by Φ(x) which is of the form

Φ(x) =∫ x

−∞

1√2π

e−u2/2du

The CDF for any other normal distribution is based on Φ(x).

Continuous Distributions 9

Page 11: Common Continuous Distributions - Mark Irwin · Common Continuous Distributions ... Beta distributions are useful for data that occur in flxed, flnite intervals The PDF for the

−4 −2 0 2 4

0.0

0.1

0.2

0.3

0.4

N(mu,1) Densities

x

f(x)

0−11

−4 −2 0 2 4

0.0

0.2

0.4

0.6

0.8

N(0,sigma^2) Densities

x

f(x)

sigma = 1sigma = 0.5sigma = 2

−4 −2 0 2 4

0.0

0.2

0.4

0.6

0.8

1.0

N(mu,1) CDFs

x

f(x)

mu = 0mu = −1mu = 1

−4 −2 0 2 4

0.0

0.2

0.4

0.6

0.8

1.0

N(0,sigma^2) CDFs

x

f(x)

sigma = 1sigma = 0.5sigma = 2

Continuous Distributions 10

Page 12: Common Continuous Distributions - Mark Irwin · Common Continuous Distributions ... Beta distributions are useful for data that occur in flxed, flnite intervals The PDF for the

The normal distribution is an example of a symmetric distribution, with thepoint of symmetry being the mean µ. For a symmetric distribution withmean 0

P [X ≤ −x] = P [X ≥ x]

−4 −2 0 2 4−x x

Theorem. Let X ∼ N(µ, σ2) and let Y = aX + b. Then Y ∼ N(aµ +b, (aσ)2)

Continuous Distributions 11

Page 13: Common Continuous Distributions - Mark Irwin · Common Continuous Distributions ... Beta distributions are useful for data that occur in flxed, flnite intervals The PDF for the

Proof.

FY (y) = P [Y ≤ y]

= P [aX + b ≤ y]

= P

[X ≤ y − b

a

]

= FX

(y − b

a

)

Therefore

fY (y) =d

dyFX

(y − b

a

)

=1|a|fX

(y − b

a

)

Note that to this point, we haven’t made any assumptions about thedistribution of X. Any linear transformation of a RV gives this relationship.

Continuous Distributions 12

Page 14: Common Continuous Distributions - Mark Irwin · Common Continuous Distributions ... Beta distributions are useful for data that occur in flxed, flnite intervals The PDF for the

Now if X ∼ N(µ, σ2), then

fY (y) =1

|a|σ√2πexp

{−1

2

(y − b− aµ

)}

which is the density for a N(aµ + b, (aσ)2). 2

From this all normal density curves musthave the same basic shape and if X ∼N(µ, σ2)

f(x) =1σφ

(x− µ

σ

)

and

F (x) = Φ(

x− µ

σ

)

One consequence of this, is that to get probabilities involving normaldistributions we only need a single function, or table.

Continuous Distributions 13

Page 15: Common Continuous Distributions - Mark Irwin · Common Continuous Distributions ... Beta distributions are useful for data that occur in flxed, flnite intervals The PDF for the

−4 −2 0 2 4

X ~ N(mu,sigma^2)

Z ~ N(0,1)

x = mu + z sigma

z

If X ∼ N(µ, σ2), then

Z =X − µ

σ∼ N(0, 1)

The values Z are sometimesreferred to as the standard scores.

Suppose for example that bloodpressure (X) can be modelled(approximately) by a normaldistribution with µ = 120 andσ = 20.

If we are interested in theP [X ≤ 140], this is the same as the P [Z ≤ 1] since

z =140− 120

20= 1

Table 2 of Rice gives the CDF of the standard normal.

Continuous Distributions 14

Page 16: Common Continuous Distributions - Mark Irwin · Common Continuous Distributions ... Beta distributions are useful for data that occur in flxed, flnite intervals The PDF for the

Note that this tableis actually taken fromMoore and McCabe’sIntroduction to thePractice of Statistics.However it is the samea Table 2 in Rice.

z is the standardnormal variable. Thevalue of P for −zequals 1 minus thevalue of P for +z; forexample P for −1.62equals 1 − 0.9474 =0.0526.

Continuous Distributions 15

Page 17: Common Continuous Distributions - Mark Irwin · Common Continuous Distributions ... Beta distributions are useful for data that occur in flxed, flnite intervals The PDF for the

60 80 100 120 140 160 180

0.00

00.

010

0.02

0

Blood Pressure Density

Blood Pressure

f(x)

95

We can use the table to get thefollowing probabilities

• P [X ≤ 140]

P [X ≤ 140] = P [Z ≤ 1] = 0.8413

• P [X ≥ 95]

P [X ≥ 95] = P

[Z ≥ 95− 120

20

]

= P [Z ≥ −1.25]

= 1− P [Z ≤ −1.25]

Continuous Distributions 16

Page 18: Common Continuous Distributions - Mark Irwin · Common Continuous Distributions ... Beta distributions are useful for data that occur in flxed, flnite intervals The PDF for the

Using the fact that P [Z ≤ −1.25] = 1− P [Z ≤ 1.25] (table flip),P [Z ≤ −1.25] = 1− 0.8944 = 0.1056. Therefore

P [X ≥ 95] = 1− 0.1056 = 0.8944

60 80 100 120 140 160 180

0.00

00.

010

0.02

0

Blood Pressure Density

Blood Pressure

f(x)

95

• P [95 ≤ X ≤ 140]

P [95 ≤ X ≤ 140] = P [X ≤ 140]− P [X ≤ 95]

= P [Z ≤ 1]− P [Z ≤ −1.25]

= 0.8413− 0.1056 = 0.7357

Continuous Distributions 17

Page 19: Common Continuous Distributions - Mark Irwin · Common Continuous Distributions ... Beta distributions are useful for data that occur in flxed, flnite intervals The PDF for the

Gamma Distributions

The gamma distribution can be used model a wide range of non-negativeRVs. It has been used to model times between earthquakes, the size ofautomobile insurance claims, rainfall amounts, plant yields.

The PDF for the for the gamma distribution (G(α, λ)) is

f(x) =λα

Γ(α)xα−1e−λx; x ≥ 0

The parameter α is the shape parameter of the gamma distribution and 1λ

is the scale parameter.

The gamma distribution is a generalization of exponential distribution asExp(λ) = G(1, λ).

Continuous Distributions 18

Page 20: Common Continuous Distributions - Mark Irwin · Common Continuous Distributions ... Beta distributions are useful for data that occur in flxed, flnite intervals The PDF for the

0 2 4 6 8

0.0

0.4

0.8

1.2

G(alpha,1) Densities

x

f(x)

alpha=1alpha=0.75alpha=2alpha=4

0 2 4 6 8

0.0

0.4

0.8

1.2

G(2,lamdba) Densities

x

f(x)

alpha=1alpha=0.75alpha=2alpha=4

0 2 4 6 8

0.0

0.2

0.4

0.6

0.8

1.0

G(alpha,1) CDFs

x

f(x)

alpha=1alpha=0.75alpha=2alpha=4

0 2 4 6 8

0.0

0.2

0.4

0.6

0.8

1.0

G(2,lamdba) CDF

x

f(x)

alpha=1alpha=0.75alpha=2alpha=4

Continuous Distributions 19

Page 21: Common Continuous Distributions - Mark Irwin · Common Continuous Distributions ... Beta distributions are useful for data that occur in flxed, flnite intervals The PDF for the

The “normalization constant”

Γ(α) =∫ ∞

0

xα−1e−xdx

is the Gamma function evaluated at α.

Some useful properties of the Gamma function are

1. Γ(1) =∫∞0

e−xdx = 1

2. Γ(α) = (α− 1)Γ(α− 1) (Prove by integration by parts).

3. If n is a positive integer, then Γ(n) = (n − 1)! (Direct consequence ofthe first two facts).

4. Γ(0.5) =√

π (Useful in showing that the variance for a normal is σ2).

The CDF of the gamma doesn’t have a nice closed form so you need tablesor a computer to find probabilities involving the gamma.

Continuous Distributions 20

Page 22: Common Continuous Distributions - Mark Irwin · Common Continuous Distributions ... Beta distributions are useful for data that occur in flxed, flnite intervals The PDF for the

The first two moments of the gamma are

E[X] =α

λ; Var(X) =

α

λ2; SD(X) =

√α

λ

Proof. Let X ∼ G(α, 1). Then

E[Xn] =∫ ∞

0

xnxα−1e−x

Γ(α)dx

=∫ ∞

0

xα+n−1e−x

Γ(α)dx

=Γ(α + n)

Γ(α)

So

E[X] =Γ(α + 1)

Γ(α)=

αΓ(α)Γ(α)

= α

Continuous Distributions 21

Page 23: Common Continuous Distributions - Mark Irwin · Common Continuous Distributions ... Beta distributions are useful for data that occur in flxed, flnite intervals The PDF for the

and

E[X] =Γ(α + 2)

Γ(α)=

(α + 1)αΓ(α)Γ(α)

= (α + 1)α

This implies that

Var(X) = E[X2]− (E[X])2 = α

Let Y = Xλ . Then Y ∼ G(α, λ) as

fY (y) =λ(λy)α−1e−λy

Γ(α)

Thus

E[Y ] =E[X]

λ=

α

λ; Var(Y ) =

Var(X)λ2

λ2;

2

Continuous Distributions 22

Page 24: Common Continuous Distributions - Mark Irwin · Common Continuous Distributions ... Beta distributions are useful for data that occur in flxed, flnite intervals The PDF for the

Beta Distributions

Beta distributions are useful for data that occur in fixed, finite intervals

The PDF for the for the beta distribution (Beta(α, β)) is

f(x) =xa−1(1− x)b−1

β(a, b); 0 ≤ x ≤ 1

The function β(a, b) is known as the Beta function and is

β(a, b) =∫ 1

0

xa−1(1− x)b−1dx

=Γ(a)Γ(b)Γ(a + b)

Continuous Distributions 23

Page 25: Common Continuous Distributions - Mark Irwin · Common Continuous Distributions ... Beta distributions are useful for data that occur in flxed, flnite intervals The PDF for the

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.5

1.0

1.5

2.0

2.5

Beta(a,a) Densities

x

f(x)

0.81246

0.0 0.2 0.4 0.6 0.8 1.0

01

23

45

6

Beta(a,6)Densities

x

f(x)

12348

Like many continuous distributions, the CDF for the beta does not have anice form and must be determined through tables or software.

Continuous Distributions 24

Page 26: Common Continuous Distributions - Mark Irwin · Common Continuous Distributions ... Beta distributions are useful for data that occur in flxed, flnite intervals The PDF for the

The first two moments of the uniform are

E[X] =a

a + b; Var(X) =

ab

(a + b + 1)(a + b)2

Proof.

E[Xn] =1

β(a, b)

∫ 1

0

xnxa−1(1− x)b−1

=β(a + n, b)

β(a, b)

So

E[X] =β(a + 1, b)

β(a, b)=

Γ(a + 1)Γ(b)Γ(a + b + 1)

Γ(a + b)Γ(a)Γ(b)

=a

a + b

and

E[X2] =β(a + 1, b)

β(a, b)=

(a + 1)a(a + b + 1)(a + b)

Continuous Distributions 25

Page 27: Common Continuous Distributions - Mark Irwin · Common Continuous Distributions ... Beta distributions are useful for data that occur in flxed, flnite intervals The PDF for the

which implies

Var(X) =a

a + b

(a + 1

a + b + 1− a

a + b

)=

a

a + b

b

(a + b + 1)(a + b)

2

Continuous Distributions 26

Page 28: Common Continuous Distributions - Mark Irwin · Common Continuous Distributions ... Beta distributions are useful for data that occur in flxed, flnite intervals The PDF for the

Cauchy Distributions

The Cauchy distribution (also known as the Lorentzian distribution), isoften used for describing resonance behavior. It can also be used to describeoutliers in data sets. However it is more commonly used in probability andstatistics as a distribution that can be used for counter examples.

The PDF for the for the Cauchy distribution (C(µ, σ)) is

f(x) =1

πσ(1 + (x−µ

σ )2)

The CDF is

F (x) =12

+1πArctan

(x− µ

σ

)

Continuous Distributions 27

Page 29: Common Continuous Distributions - Mark Irwin · Common Continuous Distributions ... Beta distributions are useful for data that occur in flxed, flnite intervals The PDF for the

−4 −2 0 2 4

0.05

0.15

0.25

C(mu,1) Densities

x

f(x)

0−11

−4 −2 0 2 4

0.0

0.1

0.2

0.3

0.4

0.5

0.6

C(0,sigma) Densities

x

f(x)

10.52

The Cauchy is another example of a location-scale distribution (the normalis the first we’ve discussed). If X ∼ C(0, 1), then Y = µ + σX is C(µ, σ).

Continuous Distributions 28

Page 30: Common Continuous Distributions - Mark Irwin · Common Continuous Distributions ... Beta distributions are useful for data that occur in flxed, flnite intervals The PDF for the

The Cauchy distribution is known as a heavy tailed distribution. Its tailsdecay to 0 very slowly.

−4 −2 0 2 4

0.0

0.1

0.2

0.3

0.4

Normal vs Cauchy

x

f(x)

N(0,1)C(0,1)

So slowly in fact, that the Cauchy has no moments. For all n,

∫ ∞

−∞|xn|f(x)dx = ∞

Continuous Distributions 29