Probability Theory Presentation 07

  • 8/8/2019 Probability Theory Presentation 07

    BST 401 Probability Theory

    Xing Qiu, Ha Youn Lee

    Department of Biostatistics and Computational Biology, University of Rochester

    September 22, 2009

    Qiu, Lee BST 401

    Review of last lecture

    The distribution function of a measure $\mu$ on $\mathbb{R}$:

    $F(x) = \mu((-\infty, x])$.

    By definition, $\mu$ determines $F$.

    On the other hand, $\mu((a, b]) = F(b) - F(a)$, which then determines the value of this measure on all Borel sets through countably infinite set operations (Carathéodory extension theorem).

    Lebesgue-Stieltjes measures. Their distribution functions are a) increasing; b) right-continuous. (Most literature requires $F$ to be non-negative as well.)

    Review of last lecture (II)

    Compare to the Lebesgue measure: both are defined on $\mathcal{B}$; both are finite on bounded intervals.

    An L-S measure does not have to be uniform, i.e., $\mu((a, b]) \neq b - a$ in general. An L-S measure may contain discrete measures, or jump points of $F$, i.e., for a certain single-point set $\{a\}$, $\mu(\{a\}) > 0$.

    Restriction of a measure.

    $\mathbb{R}^n$ generalizations of distribution functions and L-S measures.

    Basic definition of measurable functions. Sometimes called coding functions.
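    The relation $\mu((a, b]) = F(b) - F(a)$ and the role of jump points can be checked numerically. The sketch below is only an illustration: the distribution function `F` (a jump of 1/2 at 0 followed by a uniform part on (0, 1]) and the helper names `mu_interval` and `mu_point` are assumptions made for this example, not part of the lecture.

```python
from fractions import Fraction

def F(x):
    """Hypothetical distribution function: jump of 1/2 at 0, then linear on [0, 1)."""
    if x < 0:
        return Fraction(0)
    if x < 1:
        return Fraction(1, 2) + Fraction(x) / 2
    return Fraction(1)

def mu_interval(a, b):
    """L-S measure of the half-open interval (a, b], computed as F(b) - F(a)."""
    return F(b) - F(a)

def mu_point(a, eps=Fraction(1, 10**9)):
    """Approximate mu({a}) = F(a) - F(a-) by evaluating F slightly to the left of a."""
    return F(a) - F(a - eps)

# The interval (0, 1/2] has length 1/2 but measure 1/4: the measure is not uniform.
print(mu_interval(Fraction(0), Fraction(1, 2)))
# The single point 0 carries positive mass because F jumps there.
print(mu_point(Fraction(0)))
```

    With this choice of `F`, the point 0 carries half of the total mass, while every other single point has (up to the small offset `eps`) negligible mass.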

    General measurable functions

    We need two measure spaces $(\Omega_1, \mathcal{F}_1, \mu_1)$ and $(\Omega_2, \mathcal{F}_2, \mu_2)$.

    For random variables, the first one is an arbitrary probability space, the second one is a good measure space, e.g., the Lebesgue measure space of real numbers.

    A function $h : \Omega_1 \to \Omega_2$ is called measurable relative to $\mathcal{F}_1, \mathcal{F}_2$ if for every $A$ in $\mathcal{F}_2$, its inverse image $h^{-1}(A)$ is measurable. In mathematical notation:

    $h^{-1}(A) \in \mathcal{F}_1, \quad \forall A \in \mathcal{F}_2. \qquad (1)$

    Borel measurable functions are real functions (i.e., $f : \mathbb{R} \to \mathbb{R}$) which are measurable relative to the Borel sets of the two copies of $\mathbb{R}$.

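    $\sigma$-algebra generated by a measurable function

    A measurable function (random variable) $h$ can generate a $\sigma$-algebra of subsets of $\Omega$ in this way:

    $\sigma(h) := \{ h^{-1}(B) : B \in \mathcal{B} \}$,

    where $\mathcal{B}$ denotes the Borel $\sigma$-algebra.

    It is not hard to show that $\sigma(h) \subseteq \mathcal{F}$.

    In general, $\sigma(h) \subsetneq \mathcal{F}$, and some information is lost in this process.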

    A binomial example

    Toss a coin three times. $\Omega = \{H, T\}^3$. Define $X(\omega) : \Omega \to \mathbb{R}$ to be the number of heads. What is the $\sigma$-algebra generated by $X$? Is the set $A = \{(H, H, H), (H, T, H)\}$ a member of this $\sigma$-algebra?

    So $\sigma(X)$ is coarser than $2^{\Omega}$. From this aspect, some information is lost.

    Another interpretation: if we know a particular observation $\omega$, we can compute $X$. But if we only know $X$, we can't be sure what $\omega$ might be.

    However, $X$ is the minimal sufficient statistic of a Binomial model. In other words, it captures all the information which is relevant to the unknown parameter.

    I will send you some slides on this topic later.
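    The question about the set $A$ can be answered by brute force, since the sample space has only 8 points. The following sketch enumerates the $\sigma$-algebra generated by the number of heads; it relies on the fact that, on a finite space, this $\sigma$-algebra consists of all unions of the level sets $\{X = k\}$. The variable names are chosen for this illustration only.

```python
from itertools import product, combinations

# Sample space for three coin tosses: {H, T}^3, 8 outcomes.
omega = list(product("HT", repeat=3))

def X(w):
    """Number of heads in the outcome w."""
    return w.count("H")

# Atoms of sigma(X): the level sets {X = k} for k = 0, 1, 2, 3.
atoms = {}
for w in omega:
    atoms.setdefault(X(w), set()).add(w)

# On a finite space, sigma(X) is the collection of all unions of atoms.
atom_list = list(atoms.values())
sigma_X = set()
for r in range(len(atom_list) + 1):
    for combo in combinations(atom_list, r):
        sigma_X.add(frozenset().union(*combo))

A = frozenset({("H", "H", "H"), ("H", "T", "H")})

print(len(sigma_X), "events in sigma(X), versus", 2 ** len(omega), "subsets of Omega")
print("A in sigma(X):", A in sigma_X)
```

    The set $A$ contains only one of the three outcomes with two heads, so it cannot be written as a union of level sets of $X$ and is therefore not in $\sigma(X)$.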

    Indicators and simple functions (I)

    We would like to study the properties of all random variables/Borel measurable functions. Again, let us start with the most trivial building blocks.

    Indicator function:

    $1_A(\omega) = \begin{cases} 1 & \text{for } \omega \in A, \\ 0 & \text{for } \omega \notin A \end{cases}$

    is called the indicator function of $A$.

    Indicators and simple functions (II)

    If $A$ is a Borel set, $1_A$ is a Borel-measurable function, and vice versa.

    $1_A$ takes only two values, 0 and 1.

    If $h$ a) is Borel-measurable; b) takes only finitely many values, $h$ is called a simple function.

    Equivalently, $h$ is a finite weighted sum of indicator functions: $h(\omega) = \sum_{i=1}^{r} x_i 1_{A_i}(\omega)$, where the $A_i$ are disjoint Borel sets.

    $+, -, \times, /$ of simple functions are simple functions (for $/$, the divisor must be non-zero everywhere).

    Remark: step functions are simple functions, but simple functions may not be step functions!

    Borel-measurable functions are limits of simple functions

    Point-wise convergence: $h_n(\omega) \to h(\omega)$ for all $\omega \in \Omega$. We can simply say $h$ is the limit of $h_n$.

    Ash's book, Thm 1.5.5: All Borel-measurable functions are limits of simple functions. Use a figure to illustrate this point.

    Ash's book, Thm 1.5.4: limits of sequences of Borel-measurable functions are again Borel-measurable functions. Analogy: set limits of Borel sets are Borel sets.
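    One common way to build such an approximating sequence for a non-negative function is the dyadic construction $h_n = \min(\lfloor 2^n h \rfloor / 2^n, n)$. The sketch below assumes this particular construction (it may differ in detail from the one in Ash's proof) and uses the hypothetical test function $h(x) = x^2$ on $[0, 2]$.

```python
import math

def simple_approx(h, n):
    """Return the n-th dyadic approximation of a non-negative function h."""
    def h_n(x):
        # Round h(x) down to a multiple of 2^-n and cap it at n,
        # so h_n takes only finitely many values.
        return min(math.floor(h(x) * 2**n) / 2**n, n)
    return h_n

def h(x):
    """Hypothetical non-negative function used for the illustration."""
    return x * x

xs = [i / 10 for i in range(21)]  # grid on [0, 2]

for n in (1, 2, 4, 8):
    h_n = simple_approx(h, n)
    worst = max(h(x) - h_n(x) for x in xs)
    print(f"n = {n}: largest gap h - h_n on the grid = {worst}")
```

    Each $h_n$ lies below $h$, the sequence is non-decreasing in $n$, and once $n$ exceeds the largest value of $h$ on the grid the gap is at most $2^{-n}$.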

    Properties of Borel-measurable functions

    The set of Borel-measurable functions is a) closed under the point-wise limit operation; b) generated by just the simple functions through the (pointwise) limit operation.

    $\mathbb{R}$, $\mathbb{Q}$ analogy again. $\mathbb{R}$ is closed under the limit operation. $\mathbb{R}$ can be generated by its dense subset $\mathbb{Q}$.

    The set of Borel-measurable functions is closed under $+, -, \times, /$, and function composition because simple functions are closed under these operations.

    Definition of the Integral

    Now we are going to define the abstract Lebesgue integral.

    For indicators: $\int 1_A \, d\mu = \mu(A)$.

    For simple functions: $\int h \, d\mu = \sum_{i=1}^{r} x_i \mu(A_i)$. (Other notations: $\int h(\omega) \, d\mu(\omega)$, $\int h(\omega) \, \mu(d\omega)$.)

    Motivation 1. Mathematical expectation of a discrete random variable.

    Motivation 2. Riemann integral. Velocity and distance. Classical rectangle representation.

    Roughly speaking, an integral w.r.t. $\mu$ is just a weighted Riemann integral/summation.
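    As an illustration of Motivation 1, the formula for simple functions reproduces the expectation of a discrete random variable. The sketch below reuses the three-toss sample space and assumes a fair coin, so every outcome has probability 1/8; the function names `measure` and `integrate_simple` are hypothetical helpers for this example.

```python
from fractions import Fraction
from itertools import product

# Three tosses of a fair coin (the fairness is an assumption of this example).
omega = list(product("HT", repeat=3))
mu = {w: Fraction(1, 8) for w in omega}

def measure(event):
    """mu(A) for an event A given as a collection of outcomes."""
    return sum(mu[w] for w in event)

def integrate_simple(terms):
    """Integral of sum_i x_i * 1_{A_i}, given as (x_i, A_i) pairs with disjoint A_i."""
    return sum(x * measure(A) for x, A in terms)

# X = number of heads, written as sum_k k * 1_{A_k} with A_k = {X = k}.
terms = [(k, [w for w in omega if w.count("H") == k]) for k in range(4)]

expectation = integrate_simple(terms)
print(expectation)  # 3/2
```

    For an indicator, `integrate_simple([(1, A)])` reduces to `measure(A)`, matching the first formula.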

    Definition of the integral (II)

    For arbitrary Borel-measurable functions (random variables), we use limit/approximation to define the integral. Technically, it involves four steps. Please go through the textbook, pages 452-458.

    An analogy: in measure extension, we first have a very simple algebra $\mathcal{F}_0$ and a simple $\mu_0$. Then we extend $\mu_0$ to $\mu_1$ defined on $\mathcal{G}$, which includes all limiting sets of $\mathcal{F}_0$. $\mu_1$ is defined by taking limits.

    Integrability

    A Borel-measurable function $h$ is called integrable if $\int |h| \, d\mu$ is finite.

    It is equivalent to say that the positive and negative branches of $h$ both have finite integrals. Show a picture of these two branches.

    For a random variable $X$, this means $EX$ exists and is finite.

    Infinity is a nuisance in math. Almost all theorems about the integral (expectation) of random variables/measurable functions need the integrability condition.

    The notion of almost everywhere

    A mathematical condition (such as two functions being equal, a sequence of functions converging, etc.) is said to hold almost everywhere w.r.t. $\mu$ (simply denoted as either a.e. or a.s.) if it holds outside a set of $\mu$-measure zero.

    For example, $1_{\mathbb{Q}} = 0$ is true almost everywhere w.r.t. the Lebesgue measure. But it is not true w.r.t. many discrete probabilities.

    From the measure/probability theory point of view, almost everywhere/almost sure conclusions are good enough.

    Almost all the theorems in probability theory (and almost all other branches of statistics) are just as good if we replace pointwise properties by almost everywhere properties.
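    A small finite example can make the idea concrete. In the sketch below, the three-point space, the measure `mu`, and the functions `f` and `g` are all made up for illustration: the two functions differ at one point, but that point has measure zero, so they are equal almost everywhere and have the same integral.

```python
from fractions import Fraction

# Hypothetical discrete measure on {"a", "b", "c"}; the point "c" has measure zero.
mu = {"a": Fraction(1, 2), "b": Fraction(1, 2), "c": Fraction(0)}

f = {"a": 1, "b": 2, "c": 3}
g = {"a": 1, "b": 2, "c": -7}  # differs from f only at "c"

def measure(event):
    """mu(A) for an event A given as a collection of points."""
    return sum(mu[w] for w in event)

def integral(h):
    """Integral of h w.r.t. mu on the finite space: sum of h(w) * mu({w})."""
    return sum(h[w] * mu[w] for w in mu)

# f = g almost everywhere iff the set where they differ has measure zero.
disagreement = {w for w in mu if f[w] != g[w]}
equal_ae = measure(disagreement) == 0

print("equal pointwise:", f == g)
print("equal almost everywhere:", equal_ae)
print("integrals:", integral(f), integral(g))
```

    If the mass were moved onto the point "c" instead, the same two functions would no longer be equal almost everywhere, which mirrors the remark that the statement depends on the measure.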

    The four-step approach

    1. Define the integral for simple functions.

    2. Define the integral for measurable functions that are a) non-zero only on a set $E$ with finite measure; b) bounded.

    3. Extend the above definition to positive functions without the two restrictions. The positiveness is important because it ensures that the limiting process in the definition (first equation, page 456) is a monotonic process.

    4. Extend the above definition to arbitrary integrable functions by breaking the function into two branches, $f = f^+ - f^-$, and taking integrals separately for these two branches.
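    The last step can be sketched on a finite measure space, where every function is simple and the first step already gives the integral of a non-negative function. The space, the weights `mu`, and the function `f` below are hypothetical; the point is only to show the split $f = f^+ - f^-$ and how the integrals of $f$ and $|f|$ are assembled from the two branches.

```python
from fractions import Fraction

# Hypothetical finite measure space: points 0..4 with the given weights.
mu = {0: Fraction(1, 4), 1: Fraction(1, 4), 2: Fraction(1, 4),
      3: Fraction(1, 8), 4: Fraction(1, 8)}
f = {0: -2, 1: 3, 2: 0, 3: -4, 4: 8}

def integral_nonneg(h):
    """Integral of a non-negative function on the finite space."""
    assert all(v >= 0 for v in h.values())
    return sum(h[w] * mu[w] for w in mu)

# Positive and negative branches: f = f_plus - f_minus, both non-negative.
f_plus = {w: max(v, 0) for w, v in f.items()}
f_minus = {w: max(-v, 0) for w, v in f.items()}

int_plus = integral_nonneg(f_plus)
int_minus = integral_nonneg(f_minus)

integral_f = int_plus - int_minus    # integral of f
integral_abs = int_plus + int_minus  # integral of |f|; finite means f is integrable

print(int_plus, int_minus, integral_f, integral_abs)
```

    Here both branches have finite integrals, so `f` is integrable; in general the difference is only defined when at least one of the two branch integrals is finite.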

    Homework

    Page 43, number 2, 3.
