Continuous Probability, RVs, Distributions EECS 126 Fall 2019 September 17, 2019


  • Continuous Probability, RVs, Distributions

    EECS 126 Fall 2019

    September 17, 2019

  • Agenda

    Announcements

    Review
      Continuous Probability Definitions
      Cumulative Distribution Functions

    Distributions
      Uniform
      Exponential
      Gaussian

    Analogs to Discrete Probability / RVs

    Derived Distributions

  • Announcements

    - HW3 AND Lab2 are due Friday (9/20).

    - Feel free to come to Lab Party with HW questions on Thursday!

    - HW4 will be optional to give you more time to study. We still recommend reading and attempting the problems.

    - Midterm 1 is coming up quick on 9/26! You can find past exams on the Exams page of the website.

  • Probability Densities

    In a continuous space, we describe distributions with probability density functions (PDFs) rather than assigned probability values.

    A valid probability density $f_X(x)$ of a continuous random variable $X$ taking values in $\mathbb{R}$ requires

    - Non-negativity: $f_X(x) \ge 0$ for all $x \in \mathbb{R}$

    - Normalization: $\int_{\mathbb{R}} f_X(x)\,dx = 1$

  • Continuous Probability Definitions

    Getting probabilities from densities:

    - $P(X \in B) = \int_B f_X(x)\,dx$

    - $P(X \in [a, b]) = P(a \le X \le b) = \int_a^b f_X(x)\,dx$

    (Note: $P(X = a) = 0$, so it does not matter here whether the interval is open or closed.)

    Figure: Geometric interpretation of the PDF
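
    The two integrals above can be checked numerically. The sketch below assumes a made-up density $f_X(x) = 2x$ on $[0, 1]$ (not from the slides) and a hypothetical helper `integrate` that applies the midpoint rule; it is only meant to illustrate "probability = area under the density".

```python
def integrate(f, a, b, n=100_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def pdf(x):
    # Hypothetical example density: f_X(x) = 2x on [0, 1], zero elsewhere.
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0


# Normalization check: the density should integrate to (approximately) 1.
total = integrate(pdf, 0.0, 1.0)

# P(0.25 <= X <= 0.5) is the area under the density between 0.25 and 0.5.
# Analytically this is 0.5^2 - 0.25^2 = 0.1875.
p_interval = integrate(pdf, 0.25, 0.5)

print(total, p_interval)
```

    Shrinking the interval toward a single point drives the area to 0, which matches $P(X = a) = 0$.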

  • Questions

    Suppose we uniformly sample a point in a ball of radius 1. What is the

    - Probability of picking the origin?

    - Probability density of picking the origin?

    - Probability of picking a point on the surface?

    - Probability of picking a point within a radius of $\frac{1}{2}$?

  • Answers

    - Probability of picking the origin? 0.

    - Probability density of picking the origin? The volume of the ball is $\frac{4}{3}\pi r^3 = \frac{4}{3}\pi$, so the density is $\frac{3}{4\pi}$.

    - Probability of picking a point on the surface? 0. A 2D surface has 0 volume in a 3D object.

    - Probability of picking a point within a radius of $\frac{1}{2}$? Since we're uniformly picking a point in the ball, we can just look at the ratio of the volumes:

      $$\frac{\frac{4\pi}{3}\left(\frac{1}{2}\right)^3}{\frac{4\pi}{3}} = \frac{1}{8}$$
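
    The volume-ratio answer can be sanity-checked by simulation. This is a rough sketch, assuming rejection sampling from the enclosing cube as the way to draw uniform points in the ball; the function names are illustrative only.

```python
import random


def sample_unit_ball(rng):
    """Rejection-sample a point uniformly from the unit ball in 3D."""
    while True:
        x, y, z = (rng.uniform(-1.0, 1.0) for _ in range(3))
        if x * x + y * y + z * z <= 1.0:
            return x, y, z


def fraction_within(radius, n=100_000, seed=0):
    """Estimate P(distance from origin <= radius) for a uniform point in the ball."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        x, y, z = sample_unit_ball(rng)
        if x * x + y * y + z * z <= radius * radius:
            hits += 1
    return hits / n


estimate = fraction_within(0.5)  # should land near 1/8 = 0.125
print(estimate)
```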

  • Cumulative Distribution Functions (CDFs)

    In both discrete and continuous distributions, the cumulative distribution function is defined as $F_X(x) := P(X \le x)$. However, it is computed slightly differently in each case. For a continuous random variable,

    $$F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt$$

    Consequently (by the Fundamental Theorem of Calculus),

    $$f_X(x) = \frac{d}{dx} F_X(x)$$

  • More familiar definitions

    Expectation:

    - $E[X] := \int_{\mathbb{R}} x f_X(x)\,dx$

    - $E[g(X)] := \int_{\mathbb{R}} g(x) f_X(x)\,dx$

    - Linearity of expectation holds due to the linearity of integrals: $E[X + Y] = E[X] + E[Y]$

    The definition of variance stays the same:

    $$\mathrm{Var}(X) = E[(X - E[X])^2] = E[X^2] - E[X]^2$$

  • Questions

    Let $R$ be the distance from the origin of a point sampled uniformly at random in the unit ball. What is the

    - CDF of $R$?

    - PDF of $R$?

    - Expectation of $R$?

  • Answers

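    Let $R$ be the distance from the origin of a point sampled uniformly at random in the unit ball. What is the

    - CDF of $R$? $F_R(r) = \frac{3}{4\pi} \cdot \frac{4}{3}\pi r^3 = r^3$ for $0 \le r \le 1$.

    - PDF of $R$? $f_R(r) = \frac{d}{dr} r^3 = 3r^2$ for $0 \le r \le 1$.

    - Expectation of $R$? $E[R] = \int_0^1 r \cdot 3r^2\,dr = \frac{3}{4}$.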
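
    The CDF-to-PDF-to-expectation chain for $R$ can be verified numerically. The sketch below assumes a central finite difference for the derivative and a midpoint rule for the integral; both are illustrative choices, not part of the slides.

```python
def cdf(r):
    # F_R(r) = r^3 on [0, 1], clamped outside that range.
    return min(max(r, 0.0), 1.0) ** 3


def pdf_numeric(r, h=1e-6):
    # Central finite difference approximating d/dr F_R(r).
    return (cdf(r + h) - cdf(r - h)) / (2 * h)


def expectation(n=100_000):
    # Midpoint-rule approximation of the integral of r * 3r^2 over [0, 1].
    h = 1.0 / n
    return h * sum(3.0 * ((i + 0.5) * h) ** 3 for i in range(n))


slope_at_half = pdf_numeric(0.5)  # analytic value: 3 * 0.5^2 = 0.75
mean_r = expectation()            # analytic value: 3/4

print(slope_at_half, mean_r)
```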

  • Uniform Distribution

    The density is uniform across a bounded interval $(a, b)$. For $X \sim \mathrm{Unif}(a, b)$,

    $$f_X(x) = \frac{1}{b - a}, \quad a < x < b$$

    $$E[X] = \frac{a + b}{2}, \qquad \mathrm{Var}(X) = \frac{(b - a)^2}{12}$$

    The uniform is an easy distribution to work with. Many problems can be reduced to a uniform distribution!

  • Uniform Variance Proof

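    $$\mathrm{Var}(X) = E[X^2] - E[X]^2$$

    $$E[X] = \int_a^b x \cdot \frac{1}{b - a}\,dx = \left.\frac{x^2}{2(b - a)}\right|_a^b = \frac{a + b}{2}$$

    $$E[X^2] = \int_a^b x^2 \cdot \frac{1}{b - a}\,dx = \left.\frac{x^3}{3(b - a)}\right|_a^b = \frac{b^3 - a^3}{3(b - a)}$$

    $$\mathrm{Var}(X) = \frac{b^3 - a^3}{3(b - a)} - \frac{(a + b)^2}{4} = \frac{(b - a)^2}{12}$$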
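
    The final equality skips some algebra. One way to fill it in, using the factorization $b^3 - a^3 = (b - a)(a^2 + ab + b^2)$:

```latex
\begin{align*}
\mathrm{Var}(X)
  &= \frac{(b - a)(a^2 + ab + b^2)}{3(b - a)} - \frac{(a + b)^2}{4} \\
  &= \frac{4(a^2 + ab + b^2) - 3(a^2 + 2ab + b^2)}{12} \\
  &= \frac{a^2 - 2ab + b^2}{12}
   = \frac{(b - a)^2}{12}
\end{align*}
```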

  • Exponential Distribution

    The exponential distribution PDF:

    $$f_X(x) = \lambda e^{-\lambda x}, \quad x > 0$$

    The exponential distribution CDF:

    $$F_X(x) = 1 - e^{-\lambda x}, \quad x > 0$$

    $$E[X] = \frac{1}{\lambda}, \qquad \mathrm{Var}(X) = \frac{1}{\lambda^2}$$

    Figure: Exponential distribution for varying λ
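
    Because the exponential CDF has a closed-form inverse, samples can be drawn by plugging a uniform random number into $F_X^{-1}$. The sketch below assumes this inverse-CDF approach (the slides do not discuss sampling) with an arbitrary rate $\lambda = 2$, and compares the sample mean and variance against $1/\lambda$ and $1/\lambda^2$.

```python
import math
import random


def sample_exponential(lam, rng):
    """Draw one Exponential(lam) sample by inverting F(x) = 1 - exp(-lam * x)."""
    u = rng.random()                 # uniform on [0, 1)
    return -math.log(1.0 - u) / lam  # 1 - u lies in (0, 1], so the log is finite


rng = random.Random(0)
lam = 2.0  # arbitrary rate chosen for illustration
samples = [sample_exponential(lam, rng) for _ in range(200_000)]

mean = sum(samples) / len(samples)                          # compare with 1 / lam
var = sum((s - mean) ** 2 for s in samples) / len(samples)  # compare with 1 / lam**2

print(mean, var)
```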

  • Memoryless Property

    The defining characteristic of the exponential is the memoryless property. Recall that the memoryless property is:

    $$P(X > x + a \mid X > x) = P(X > a)$$
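
    For the exponential, this follows directly from the tail probability $P(X > t) = 1 - F_X(t) = e^{-\lambda t}$. A short derivation, assuming $x, a \ge 0$ so that the event $\{X > x + a\}$ is contained in $\{X > x\}$:

```latex
\begin{align*}
P(X > x + a \mid X > x)
  &= \frac{P(X > x + a)}{P(X > x)} \\
  &= \frac{e^{-\lambda (x + a)}}{e^{-\lambda x}} \\
  &= e^{-\lambda a} = P(X > a)
\end{align*}
```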

    Think about banging your head on the wall.

    What distribution does this remind you of?

  • Connection to Geometric

    One can think of the exponential distribution as the continuous analog of the geometric distribution.

    Remark: These are the only distributions in discrete and continuous spaces, respectively, with the memoryless property!

    Figure: Relating the Exponential dist. to the Geometric dist.

  • Connection to Geometric cont.

    Intuition: the geometric distribution approaches the exponential distribution as the number of trials per second approaches infinity.

    Let $X \sim \mathrm{Geo}(p)$ and $Y \sim \mathrm{Expo}(\lambda)$. Recall the CDF of the geometric distribution:

    $$F_X(n) = 1 - (1 - p)^n$$

    If we let $\delta = \frac{-\ln(1 - p)}{\lambda}$, we have $e^{-\lambda\delta} = 1 - p$. Thus, $F_X(n) = 1 - e^{-\lambda n\delta} = F_Y(n\delta)$. Think of $\delta$ as the length of one trial. If we drive $\delta$ down (shrinking $p$ along with it so that $\lambda$ stays fixed), we can interpret this as a geometric r.v. running more and more trials per second while the expected time until the first success stays roughly the same. As $\delta \to 0$, we approach a continuous exponential distribution.
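
    The identity $F_X(n) = F_Y(n\delta)$ is easy to confirm numerically. The sketch below picks an arbitrary rate $\lambda = 1.5$ and a few values of $p$ (both are illustrative assumptions) and tracks the largest gap between the two CDFs.

```python
import math


def geometric_cdf(n, p):
    # F_X(n) = 1 - (1 - p)^n for n = 1, 2, ...
    return 1.0 - (1.0 - p) ** n


def exponential_cdf(y, lam):
    # F_Y(y) = 1 - exp(-lam * y) for y > 0
    return 1.0 - math.exp(-lam * y)


lam = 1.5  # arbitrary rate chosen for illustration
max_gap = 0.0
for p in (0.5, 0.1, 0.01):
    # Trial length chosen so that exp(-lam * delta) = 1 - p.
    delta = -math.log(1.0 - p) / lam
    for n in range(1, 50):
        gap = abs(geometric_cdf(n, p) - exponential_cdf(n * delta, lam))
        max_gap = max(max_gap, gap)

print(max_gap)  # only floating-point rounding error should remain
```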

  • Normal / Gaussian Distribution

    The Gaussian is seen abundantly in nature (e.g. exam scores). This can be explained by the Central Limit Theorem (CLT), which we will go over later in the course.

    Gaussian PDF and CDF for mean $\mu$ and variance $\sigma^2$:

    $$f_X(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x - \mu)^2 / (2\sigma^2)}$$

    $$F_X(x) = \Phi\left(\frac{x - \mu}{\sigma}\right)$$

    where $\Phi$ is the standard Gaussian CDF (it cannot be expressed in elementary functions).

  • Properties of the Gaussian

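    - The sum of two independent Gaussians is Gaussian. If $X \sim N(\mu_1, \sigma_1^2)$ and $Y \sim N(\mu_2, \sigma_2^2)$ are independent and $Z = X + Y$, then

      $$Z \sim N(\mu_1 + \mu_2, \sigma_1^2 + \sigma_2^2)$$

    - The sum of two dependent Gaussians isn't always Gaussian. Consider the following example:

      $$X \sim N(0, 1), \qquad Y = \begin{cases} X & \text{w.p. } \frac{1}{2} \\ -X & \text{w.p. } \frac{1}{2} \end{cases}$$

      They are both Gaussian, but $X + Y$ is not Gaussian: it equals 0 with probability $\frac{1}{2}$.

    - A Gaussian multiplied by a constant is Gaussian. If $X \sim N(\mu, \sigma^2)$ and $Y = aX$, then

      $$Y \sim N(a\mu, a^2\sigma^2)$$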
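
    The dependent-Gaussian counterexample above can be seen in a quick simulation. This sketch assumes the sign of $Y$ is decided by a fair coin tossed independently of $X$ (the slide leaves this implicit). A Gaussian puts probability 0 on any single point, so a large fraction of exact zeros in $X + Y$ rules it out.

```python
import random

rng = random.Random(0)
n = 100_000
zero_count = 0
y_second_moment = 0.0

for _ in range(n):
    x = rng.gauss(0.0, 1.0)
    # Assumption: the sign flip is a fair coin tossed independently of x.
    y = x if rng.random() < 0.5 else -x
    y_second_moment += y * y
    if x + y == 0.0:  # x + (-x) is exactly 0.0 in floating point
        zero_count += 1

frac_zero = zero_count / n   # close to 1/2, impossible for a Gaussian
var_y = y_second_moment / n  # Y alone still has variance near 1, like N(0, 1)

print(frac_zero, var_y)
```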

  • Scaling to the Standard Gaussian

    - The properties on the previous slide allow us to convert any Gaussian into the standard Gaussian.

    - If $X \sim N(\mu, \sigma^2)$, then

      $$Z = \frac{X - \mu}{\sigma}$$

      is distributed as $Z \sim N(0, 1)$.

    - Intuition: "I scored 1 SD above the mean on midterm 1" means $Z = 1$.
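
    Standardizing is also how Gaussian probabilities get computed in practice. The sketch below relies on the identity $\Phi(z) = \frac{1}{2}\left(1 + \operatorname{erf}(z/\sqrt{2})\right)$, a standard fact that is not stated on the slides, and on a made-up exam with mean 70 and standard deviation 10.

```python
import math


def standard_normal_cdf(z):
    """Phi(z), written in terms of the error function from the standard library."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_cdf(x, mu, sigma):
    """CDF of N(mu, sigma^2), obtained by standardizing and applying Phi."""
    return standard_normal_cdf((x - mu) / sigma)


# Hypothetical exam: mean 70, standard deviation 10.
at_mean = normal_cdf(70.0, mu=70.0, sigma=10.0)            # half the mass lies below the mean
percentile_one_sd = normal_cdf(80.0, mu=70.0, sigma=10.0)  # score 1 SD above the mean, Z = 1

print(at_mean, percentile_one_sd)
```

    Under these assumptions, a score 1 SD above the mean sits at roughly the 84th percentile.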

  • Joint PDFs

    Just as multiple discrete RVs have a joint PMF, multiple continuous RVs have a joint PDF.

    - Discrete: $p_{X,Y}(x, y)$

    - Continuous: $f_{X,Y}(x, y)$

    - Still needs to be non-negative.

    - Still needs to integrate to 1.

  • Joint CDFs

    - Single RV

      $$F_X(x) = P(X \le x)$$

    - Multiple RVs

      $$F_{X,Y}(x, y) = P(X \le x, Y \le y)$$

    - Single RV

      $$\frac{d}{dx} F_X(x) = f_X(x)$$

    - Multiple RVs

      $$\frac{\partial^2}{\partial x\,\partial y} F_{X,Y}(x, y) = f_{X,Y}(x, y)$$

  • Marginal Probability Density

    - Discrete

      $$p_X(x) = \sum_{y \in \mathcal{Y}} p_{X,Y}(x, y)$$

    - Continuous

      $$f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\,dy$$

    - $f_X(x)$ is still a density, not a probability.

  • Conditional Probability Density

    - Discrete

      $$p_{X|Y}(x \mid y) = \frac{p_{X,Y}(x, y)}{p_Y(y)}$$

    - Continuous

      $$f_{X|Y}(x \mid y) = \frac{f_{X,Y}(x, y)}{f_Y(y)}$$

    - By definition, the multiplication rule still holds: $f_{X,Y}(x, y) = f_{X|Y}(x \mid y)\, f_Y(y)$.

  • Independence

    As in the discrete case, there are 3 equivalent definitions.

    - For all $x$ and $y$,

      $$f_{X,Y}(x, y) = f_X(x) f_Y(y)$$

    - For all $x$ and $y$,

      $$f_{X|Y}(x \mid y) = f_X(x)$$

    - For all $x$ and $y$,

      $$f_{Y|X}(y \mid x) = f_Y(y)$$

  • Bayes Rule

    - Discrete (simple form)

      $$p_{X|Y}(x \mid y) = \frac{p_{Y|X}(y \mid x)\, p_X(x)}{p_Y(y)}$$

    - Discrete (extended form)

      $$p_{X|Y}(x \mid y) = \frac{p_{Y|X}(y \mid x)\, p_X(x)}{\sum_{x' \in \mathcal{X}} p_{Y|X}(y \mid x')\, p_X(x')}$$

    - Continuous (simple form)

      $$f_{X|Y}(x \mid y) = \frac{f_{Y|X}(y \mid x)\, f_X(x)}{f_Y(y)}$$

    - Continuous (extended form)

      $$f_{X|Y}(x \mid y) = \frac{f_{Y|X}(y \mid x)\, f_X(x)}{\int_{-\infty}^{\infty} f_{Y|X}(y \mid t)\, f_X(t)\,dt}$$

  • Conditional Expectation

    - Discrete

      $$E[Y \mid X = x] = \sum_{y \in \mathcal{Y}} y \cdot p_{Y|X}(y \mid x)$$

    - Continuous

      $$E[Y \mid X = x] = \int_{-\infty}^{\infty} y \cdot f_{Y|X}(y \mid x)\,dy$$
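
    The continuous definitions of marginal density, conditional density, and conditional expectation chain together naturally. The sketch below works through them for a made-up joint density $f_{X,Y}(x, y) = x + y$ on the unit square (an assumption for illustration, not from the slides), using a hypothetical midpoint-rule helper `integrate`.

```python
def integrate(f, a, b, n=20_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def joint_pdf(x, y):
    # Hypothetical joint density: f_{X,Y}(x, y) = x + y on the unit square.
    return x + y if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0 else 0.0


def marginal_x(x):
    # f_X(x): integrate the joint density over y.
    return integrate(lambda y: joint_pdf(x, y), 0.0, 1.0)


def conditional_y_given_x(y, x):
    # f_{Y|X}(y | x) = f_{X,Y}(x, y) / f_X(x)
    return joint_pdf(x, y) / marginal_x(x)


def conditional_expectation_y(x):
    # E[Y | X = x]: integrate y * f_{X,Y}(x, y) over y, then divide by f_X(x).
    fx = marginal_x(x)
    return integrate(lambda y: y * joint_pdf(x, y), 0.0, 1.0) / fx


fx_half = marginal_x(0.5)                       # analytic value: 0.5 + 1/2 = 1
cond_value = conditional_y_given_x(0.25, 0.5)   # analytic value: 0.75 / 1 = 0.75
ey_half = conditional_expectation_y(0.5)        # analytic value: (1/4 + 1/3) / 1 = 7/12

print(fx_half, cond_value, ey_half)
```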

  • Combining Discrete and Continuous RVs

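    - You can also have discrete and continuous RVs defined jointly.

    - Ex. let $X$ be the outcome of a die roll and, given $X = x$, let $Y$ be exponential with rate $x$.

      $$p_X(x) = \frac{1}{6}, \quad x \in \{1, \dots, 6\}, \qquad f_{Y|X}(y \mid x) = x e^{-xy}, \quad y > 0$$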
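
    In this mixed setting, one can still ask what observing $Y = y$ says about the roll. The sketch below assumes the natural mixed form of Bayes' rule, $p_{X|Y}(x \mid y) = f_{Y|X}(y \mid x)\, p_X(x) / f_Y(y)$ with $f_Y(y) = \sum_x f_{Y|X}(y \mid x)\, p_X(x)$; this combination is not written out on the slides.

```python
import math

FACES = range(1, 7)


def likelihood(y, x):
    # f_{Y|X}(y | x) = x * exp(-x * y): exponential density with rate x.
    return x * math.exp(-x * y)


def marginal_y(y):
    # f_Y(y): sum over faces of f_{Y|X}(y | x) * p_X(x), with p_X(x) = 1/6.
    return sum(likelihood(y, x) / 6.0 for x in FACES)


def posterior(y):
    # p_{X|Y}(x | y) via the mixed Bayes' rule (discrete X, continuous Y).
    fy = marginal_y(y)
    return {x: likelihood(y, x) / 6.0 / fy for x in FACES}


post_small = posterior(0.1)  # a small y favors large rates (high rolls)
post_large = posterior(2.0)  # a large y favors small rates (low rolls)

print(post_small)
print(post_large)
```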

  • Change of Variables / Derived Distributions

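    - Let $X \sim U[0, 1]$ and $Y = 2X$. Is it true that

      $$f_Y(y) = P(Y = y) = P(2X = y) = P\left(X = \frac{y}{2}\right) = f_X\left(\frac{y}{2}\right)?$$

    - No, this won't integrate to 1 (it integrates to 2 over $[0, 2]$). A density is not a probability, so the first equality already fails.

    - You have to use the CDF.

      $$F_Y(y) = P(Y \le y) = P(2X \le y) = P\left(X \le \frac{y}{2}\right) = F_X\left(\frac{y}{2}\right)$$

    - Differentiating,

      $$f_Y(y) = \frac{d}{dy} F_X\left(\frac{y}{2}\right) = f_X\left(\frac{y}{2}\right) \cdot \frac{1}{2}$$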
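
    The factor of $\frac{1}{2}$ shows up directly in simulation. The sketch below draws $Y = 2X$ and estimates the density of $Y$ from the fraction of samples in a narrow bin; the bin edges and the helper name `empirical_density` are arbitrary illustrative choices.

```python
import random

rng = random.Random(0)
n = 200_000
ys = [2.0 * rng.random() for _ in range(n)]  # Y = 2X with X uniform on [0, 1)


def empirical_density(samples, lo, hi):
    """Fraction of samples in [lo, hi), divided by the bin width."""
    count = sum(1 for s in samples if lo <= s < hi)
    return count / (len(samples) * (hi - lo))


density_near_half = empirical_density(ys, 0.4, 0.6)  # should be near 1/2, not 1
density_near_1p5 = empirical_density(ys, 1.4, 1.6)   # should also be near 1/2

print(density_near_half, density_near_1p5)
```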

  • References

    Introduction to probability. DP Bertsekas, JN Tsitsiklis - 2002
