David Shiung

Chapter 3 Estimation and Decision Theory

Page 1: Chapter 3 Estimation and Decision Theory

David Shiung


Page 2: Chapter 3 Estimation and Decision Theory

Introduction to Estimation and Decision (1/1)

Suppose we want to estimate a parameter vector θ = (θ1, …, θn)^T, and suppose further that our measurements are corrupted by random measurement errors that we call noise. What is a good estimator for θ that is based only on the acquired data?

This class of problems falls within the realm of estimation theory. A second, closely related class of problems involves making decisions in a random environment.

Page 3: Chapter 3 Estimation and Decision Theory

Parameter Estimation (1/3)

The problem of estimating the parameters μ and σ² is a problem of parameter estimation.

1. θ: an unknown scalar that we wish to estimate.
2. x_1, …, x_n: n measurements (observations),

   x_i = \theta + \varepsilon_i, \quad i = 1, \ldots, n

   where ε_i is the value of the measurement noise on the ith observation.
3. A reasonable estimate of θ is the sample mean of the measurements:

   \hat{\theta} = \frac{1}{n} \sum_{i=1}^{n} x_i
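The measurement model above can be sketched in Python; the Gaussian noise model, the function name, and the specific values of θ and n below are illustrative choices, not taken from the slides:

```python
import random

def sample_mean_estimate(theta, n, noise_sigma=1.0, seed=0):
    """Simulate x_i = theta + eps_i with zero-mean Gaussian noise,
    then return the sample-mean estimate of theta."""
    rng = random.Random(seed)
    xs = [theta + rng.gauss(0.0, noise_sigma) for _ in range(n)]
    return sum(xs) / n

est = sample_mean_estimate(theta=5.0, n=1000)
```

With n = 1000 and unit-variance noise, the estimate typically lands within a few hundredths of the true θ.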

Page 4: Chapter 3 Estimation and Decision Theory

Parameter Estimation (2/3)

The estimate of the mean μ:

1. x_1^{(1)}, …, x_n^{(1)}: n sample values of a normal r.v. taken in n trials.
2. An estimate of μ (the mean of the pdf) is

   \hat{\mu}^{(1)} = \frac{1}{n} \sum_{i=1}^{n} x_i^{(1)}

3. The estimate of μ based on a second set of observations x_1^{(2)}, …, x_n^{(2)} is

   \hat{\mu}^{(2)} = \frac{1}{n} \sum_{i=1}^{n} x_i^{(2)}

4. \hat{\mu}^{(1)} and \hat{\mu}^{(2)} are probably different.

Page 5: Chapter 3 Estimation and Decision Theory

Parameter Estimation (3/3)

We call the rule that produces an estimate from the n sample values x_1, …, x_n an estimator; the estimate is a particular value taken by the estimator, viewed as a random variable.

1. Each measurement can be viewed as an observation on a generic r.v. X with pdf f_X(x).
2. X_i, i = 1, …, n, are n i.i.d. observations on X, and each has pdf f_{X_i}(x_i) = f_X(x_i).
3. The estimate \hat{\mu} is a particular value of the estimator

   \hat{\mu} = \frac{1}{n} \sum_{i=1}^{n} X_i

4. The estimator in the above equation is often used to estimate E[X].

Page 6: Chapter 3 Estimation and Decision Theory

Definition (1/3)

Definition: An estimator \hat{\theta} is a function of the observation vector X = (X_1, \ldots, X_n)^T that estimates θ.

Definition: An estimator \hat{\theta} for θ is said to be unbiased if and only if E[\hat{\theta}] = \theta. The bias in estimating θ with \hat{\theta} is b = E[\hat{\theta}] - \theta.

Definition: Let \hat{\theta}_n be an estimator computed from the n samples X_1, …, X_n for every n ≥ 1. Then \hat{\theta}_n is said to be consistent if, for every ε > 0,

   \lim_{n \to \infty} \Pr[\,|\hat{\theta}_n - \theta| \ge \varepsilon\,] = 0

Page 7: Chapter 3 Estimation and Decision Theory

Definition (2/3)

Definition: An estimator \hat{\theta} is called a minimum mean-square error estimator if

   E[(\hat{\theta} - \theta)^2] \le E[(\hat{\theta}' - \theta)^2]

where \hat{\theta}' is any other estimator.

Page 8: Chapter 3 Estimation and Decision Theory

Definition (3/3)

What is the relationship between unbiasedness and consistency?

Definition of unbiasedness: E[\hat{\theta}] - \theta = 0.

Definition of consistency: \lim_{n \to \infty} \Pr[\,|\hat{\theta}_n - \theta| \ge \varepsilon\,] = 0 for every ε > 0.

Denote the set of all consistent estimators by A and the set of all unbiased estimators by B. There are four possible relations between the set A and the set B. The relation between A and B belongs to the third type: the two sets overlap, but neither contains the other. An unbiased estimator need not be consistent (e.g., \hat{\theta} = X_1 ignores all but one sample), and a consistent estimator need not be unbiased.
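A numerical illustration of one direction of this relation (the simulation below is my sketch, not from the slides): the single-observation estimator \hat{\theta} = X_1 is unbiased for μ, yet its squared error stays near Var[X] no matter how much data is available, so it is not consistent.

```python
import random

rng = random.Random(1)
mu, trials = 2.0, 5000

# Each trial keeps only the first observation X1 as the "estimate" of mu.
first_obs = [rng.gauss(mu, 1.0) for _ in range(trials)]

# Unbiasedness: the average of X1 over many trials approximates mu.
avg = sum(first_obs) / trials

# Inconsistency: the mean-square error stays near Var[X] = 1, it never shrinks.
mse = sum((x - mu) ** 2 for x in first_obs) / trials
```

Collecting more observations per trial would not change `mse`, because the estimator discards everything after the first sample.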

Page 9: Chapter 3 Estimation and Decision Theory

Estimation of E[X] (1/1)

Estimate of E[X]:

1. X is a r.v. with pdf f_X(x) and finite variance σ².
2. Repeat the experiment n times, with x_i denoting the ith outcome.
3. The x_i are drawn independently from f_X(x), and f_{X_i}(x) = f_X(x), i = 1, …, n.
4. The sample mean estimator is

   \hat{\mu} = \frac{1}{n} \sum_{i=1}^{n} X_i

Page 10: Chapter 3 Estimation and Decision Theory

Unbiasedness (1/1)

Since E[X_i] = μ, i = 1, …, n, we have

   E[\hat{\mu}] = E\Big[\frac{1}{n}\sum_{i=1}^{n} X_i\Big] = \frac{1}{n}\sum_{i=1}^{n} E[X_i] = \frac{1}{n}(n\mu) = \mu

so the sample mean estimator is unbiased.

Page 11: Chapter 3 Estimation and Decision Theory

Consistency (1/2)

The Chebyshev inequality:

   \Pr[\,|\hat{\mu} - \mu| \ge \varepsilon\,] \le \frac{\mathrm{Var}[\hat{\mu}]}{\varepsilon^2}    (A)

The variance of \hat{\mu} is obtained as

   \mathrm{Var}[\hat{\mu}] = E\Big[\Big(\frac{1}{n}\sum_{i=1}^{n} X_i - \mu\Big)^2\Big]
      = \frac{1}{n^2}\Big[\sum_{i=1}^{n} E[(X_i-\mu)^2] + \sum_{i \ne j} E[(X_i-\mu)(X_j-\mu)]\Big]
      = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n}    (B)

where the cross terms vanish because the X_i are independent.

Page 12: Chapter 3 Estimation and Decision Theory

Consistency (2/2)

With (A) and (B), we obtain

   \Pr[\,|\hat{\mu} - \mu| \ge \varepsilon\,] \le \frac{\sigma^2}{n\varepsilon^2}

   \lim_{n \to \infty} \Pr[\,|\hat{\mu} - \mu| \ge \varepsilon\,] = 0

so the sample mean estimator is consistent.
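Consistency can be seen empirically: the frequency of the event |\hat{\mu} - \mu| ≥ ε drops as n grows. The sketch below (my illustration; ε, the trial count, and the Gaussian model are assumptions) estimates that frequency by Monte Carlo:

```python
import random

def miss_rate(n, eps=0.5, trials=300, mu=0.0, sigma=1.0, seed=2):
    """Empirical Pr[|sample mean - mu| >= eps] over repeated experiments,
    each using n i.i.d. Gaussian observations."""
    rng = random.Random(seed)
    misses = 0
    for _ in range(trials):
        m = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if abs(m - mu) >= eps:
            misses += 1
    return misses / trials

p10, p1000 = miss_rate(10), miss_rate(1000)  # p1000 should be near zero
```

For n = 10 the bound (B) gives Var[\hat{\mu}] = 0.1, so misses are common; for n = 1000 they essentially never occur.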

Page 13: Chapter 3 Estimation and Decision Theory

Estimation of Var[X] (1/1)

X: a random variable with pdf f_X(x), mean μ, and variance σ².

X_1, …, X_n: n i.i.d. observations.

Estimator:

   \hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^{n} (X_i - \hat{\mu})^2, \qquad \hat{\mu} = \frac{1}{n} \sum_{i=1}^{n} X_i
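In Python's standard library, `statistics.pvariance` computes exactly this (1/n)-normalized estimator, plugging in the sample mean when no mean is supplied (the data below are my illustrative draws, not from the slides):

```python
import random
import statistics

rng = random.Random(3)
xs = [rng.gauss(0.0, 2.0) for _ in range(500)]  # true sigma^2 = 4

# Direct implementation of the slide's estimator
mu_hat = sum(xs) / len(xs)
sigma2_hat = sum((x - mu_hat) ** 2 for x in xs) / len(xs)

# statistics.pvariance uses the same (1/n)-normalized formula
assert abs(sigma2_hat - statistics.pvariance(xs)) < 1e-9
```

Note that `statistics.variance` (without the `p`) divides by n − 1 instead, which matters for the bias discussed next.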

Page 14: Chapter 3 Estimation and Decision Theory

Unbiasedness (1/1)

Bias of \hat{\sigma}^2: expanding the square and using E[X_i] = μ, E[(X_i - \mu)^2] = σ², and the independence of the X_i, one finds

   n\,E[\hat{\sigma}^2] = E\Big[\sum_{i=1}^{n} (X_i - \hat{\mu})^2\Big] = (n-1)\,\sigma^2

so that

   E[\hat{\sigma}^2] = \frac{n-1}{n}\,\sigma^2

The estimator is therefore biased, but the bias vanishes as n → ∞: it is asymptotically unbiased.
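For the (1/n)-normalized estimator, E[\hat{\sigma}^2] = ((n-1)/n)\,\sigma^2 is the standard result; the sketch below (my choice of n = 5 and trial count) checks it numerically, expecting an average near 0.8 when σ² = 1:

```python
import random
import statistics

rng = random.Random(4)
n, trials = 5, 20000

# Average the (1/n)-normalized variance estimate over many experiments.
avg = sum(
    statistics.pvariance([rng.gauss(0.0, 1.0) for _ in range(n)])
    for _ in range(trials)
) / trials
# Expect roughly (n - 1)/n * sigma^2 = 4/5 = 0.8
```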

Page 15: Chapter 3 Estimation and Decision Theory

Consistency (1/2)

Consistency of \hat{\sigma}^2:

1. The variance of the estimator is

   \mathrm{Var}[\hat{\sigma}^2] = E[(\hat{\sigma}^2)^2] - (E[\hat{\sigma}^2])^2

Expanding \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(X_i - \hat{\mu})^2, collecting terms, and using the independence of the X_i, the leading behavior is

   \mathrm{Var}[\hat{\sigma}^2] \approx \frac{m_4 - \sigma^4}{n}

where m_4 = E[(X - \mu)^4] is the fourth central moment. Hence \mathrm{Var}[\hat{\sigma}^2] \to 0 as n \to \infty.

Page 16: Chapter 3 Estimation and Decision Theory

Consistency (2/2)

Consistency of \hat{\sigma}^2:

2. With the Chebyshev inequality,

   \Pr[\,|\hat{\sigma}^2 - \sigma^2| \ge \varepsilon\,] \le \frac{\mathrm{Var}[\hat{\sigma}^2]}{\varepsilon^2} \approx \frac{m_4 - \sigma^4}{n\varepsilon^2} \to 0

assuming that m_4 exists. Hence \hat{\sigma}^2 is consistent.

Page 17: Chapter 3 Estimation and Decision Theory

Exercise 1: Consider the estimation of the mean and variance of a random noise source X. Assume X is Gaussian with mean 0 and variance 1. Design estimators based on n observations of X. What are your estimates of the mean and variance if you choose n = 1, 10, 100, 1000, and 10000? Is your estimator unbiased? Consistent?
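One way to run the experiment in Exercise 1, sketched in Python (this is an illustration of the setup, not the assigned solution; the seed and random draws are mine):

```python
import random
import statistics

rng = random.Random(5)

for n in (1, 10, 100, 1000, 10000):
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    mu_hat = sum(xs) / n                     # sample mean estimator
    var_hat = statistics.pvariance(xs)       # (1/n)-normalized variance estimator
    print(f"n={n:6d}  mean={mu_hat:+.4f}  variance={var_hat:.4f}")
```

As n grows, the printed estimates settle toward the true values 0 and 1, which is the consistency behavior the chapter establishes.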

Page 18: Chapter 3 Estimation and Decision Theory

Confidence Intervals (1/13)

Make n observations on a Gaussian r.v. X with mean μ and variance σ². These observations are drawn from n i.i.d. r.v.'s.

\hat{\mu} is an unbiased estimator of the true mean μ. Since it involves the sum of independent Gaussian r.v.'s, it also obeys the Gaussian probability law, with mean μ and variance σ²/n. (Why?)

Normalizing \hat{\mu} into an N(0, 1) r.v. Y (why?):

   Y = \frac{\hat{\mu} - \mu}{\sigma/\sqrt{n}}

Page 19: Chapter 3 Estimation and Decision Theory

Confidence Intervals (2/13)

Compute the probability of events of the form {a ≤ Y ≤ b} whose probability is 95 percent:

   \Pr[a \le Y \le b] = 0.95

Set a = −b; then there exists a number b_{0.95} such that

   \Pr[-b_{0.95} \le Y \le b_{0.95}] = 0.95

From a table of Q(x) and using linear interpolation, we find that b_{0.95} = 1.96.

Page 20: Chapter 3 Estimation and Decision Theory

Confidence Intervals (3/13)

   \Pr[-1.96 \le Y \le 1.96] = \Pr\Big[\hat{\mu} - 1.96\frac{\sigma}{\sqrt{n}} \le \mu \le \hat{\mu} + 1.96\frac{\sigma}{\sqrt{n}}\Big] = 0.95

   \Big[\hat{\mu} - 1.96\frac{\sigma}{\sqrt{n}},\ \hat{\mu} + 1.96\frac{\sigma}{\sqrt{n}}\Big]    (D)

The interval in (D) is called the 95 percent confidence interval for the mean. For every value of \hat{\mu} that we observe, we shall usually obtain a different interval. However, in 95 out of 100 cases, the interval so generated will cover the true mean.

Exercise 2: Use Matlab to find b_{0.95}. If you estimate the mean μ 100 times, how many of the estimates lie in the interval of (D)?
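The coverage claim can be checked by simulation. The sketch below does this in Python rather than Matlab (my choice of μ, σ, n, and trial count; σ is assumed known, as in the derivation of (D)):

```python
import random

rng = random.Random(6)
mu, sigma, n, trials = 0.0, 1.0, 25, 500
half = 1.96 * sigma / n ** 0.5   # half-width of the interval in (D)

covered = 0
for _ in range(trials):
    m = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
    if m - half <= mu <= m + half:
        covered += 1

coverage = covered / trials   # should come out close to 0.95
```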

Page 21: Chapter 3 Estimation and Decision Theory

Confidence Intervals (4/13)

Example: Let X be a Gaussian r.v. with σ² = 9. The mean is estimated from 10 independent observations on X and found to be \hat{\mu} = 3.5. Find the 95 percent confidence interval for μ.

Solution: A 95 percent confidence interval is given by

   \Big[\hat{\mu} - 1.96\frac{\sigma}{\sqrt{n}},\ \hat{\mu} + 1.96\frac{\sigma}{\sqrt{n}}\Big]

With σ = 3, n = 10, and \hat{\mu} = 3.5, we find that the 95 percent confidence interval is [1.64, 5.36]; i.e., the probability of the event {1.64 ≤ μ ≤ 5.36} is 0.95.
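A quick arithmetic check of this example in Python:

```python
import math

mu_hat, sigma, n = 3.5, 3.0, 10
half = 1.96 * sigma / math.sqrt(n)
lo, hi = mu_hat - half, mu_hat + half
# lo and hi round to 1.64 and 5.36, matching the interval in the example
```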

Page 22: Chapter 3 Estimation and Decision Theory

Confidence Intervals (5/13)

The previous discussion assumed that Var(X) = σ² was known. If σ is unknown, we can't use equation (D) to obtain a confidence interval. Is there a way to obtain a confidence interval even if σ is unknown? The answer is yes, and the solution is as follows.

We first define a r.v. Z and say that it has a Student t(n) distribution with n degrees of freedom. The pdf of t(n) is given by

   f_Z(z) = f_t(z; n) = k(n)\,(1 + z^2/n)^{-(n+1)/2}

where

   k(n) = \frac{\Gamma[(n+1)/2]}{\sqrt{\pi n}\,\Gamma(n/2)}
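The pdf formula above translates directly into code. As a sanity check (my sketch, using a simple Riemann sum rather than any library routine), the pdf should integrate to approximately 1:

```python
import math

def student_t_pdf(z, n):
    """Student t(n) pdf: k(n) * (1 + z^2/n) ** (-(n + 1)/2)."""
    k = math.gamma((n + 1) / 2) / (math.sqrt(math.pi * n) * math.gamma(n / 2))
    return k * (1.0 + z * z / n) ** (-(n + 1) / 2)

# Numerical check for n = 20: integrate over [-10, 10], where almost all mass lies.
step = 0.01
area = sum(student_t_pdf(-10 + i * step, 20) * step for i in range(2001))
```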

Page 23: Chapter 3 Estimation and Decision Theory

Confidence Intervals (6/13)

The gamma function is defined as

   \Gamma(\alpha) = \int_0^{\infty} y^{\alpha-1} e^{-y}\,dy, \qquad \Gamma(\alpha+1) = \alpha\,\Gamma(\alpha)

If α is a positive integer, then Γ(α + 1) = α!.
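Both properties are easy to verify with the standard library's `math.gamma` (the value α = 4.3 below is an arbitrary test point of mine):

```python
import math

# Recurrence: Gamma(a + 1) = a * Gamma(a)
a = 4.3
lhs = math.gamma(a + 1)
rhs = a * math.gamma(a)

# Integer case: Gamma(5) = 4! = 24
factorial_check = math.gamma(5)
```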

Page 24: Chapter 3 Estimation and Decision Theory

Confidence Intervals (7/13)

The Student t r.v.:

   t(n-1) = \frac{\hat{\mu} - \mu}{\hat{A}(n)}, \qquad \hat{A}(n) = \Big[\frac{1}{n(n-1)} \sum_{i=1}^{n} (X_i - \hat{\mu})^2\Big]^{1/2}

We can find numbers -t_\zeta and t_\zeta such that

   \Pr[-t_\zeta \le t \le t_\zeta] = \int_{-t_\zeta}^{t_\zeta} f_t(z; n-1)\,dz = 1 - 2\zeta

Page 25: Chapter 3 Estimation and Decision Theory

Confidence Intervals (8/13)

[Figure: the pdf f_t(z; n) with the cut-off points −t_ζ and t_ζ marked on the z axis.]

Figure 6.2-1 The number t_ζ. The area between −t_ζ and t_ζ is 1 − 2ζ.

Page 26: Chapter 3 Estimation and Decision Theory

Confidence Intervals (9/13)

The numbers ±t_ζ are called the ζ percent levels of t; they cut off ζ percent of the area under f_t(z; n-1) in each tail. The event {−t_ζ ≤ t ≤ t_ζ} is the same as the event

   \{\hat{\mu} - t_\zeta \hat{A}(n) \le \mu \le \hat{\mu} + t_\zeta \hat{A}(n)\}

With the Student t CDF

   F_t(t_\zeta; n) = \int_{-\infty}^{t_\zeta} f_t(z; n)\,dz

we have, by symmetry,

   \Pr[-t_\zeta \le t \le t_\zeta] = 2\,F_t(t_\zeta; n-1) - 1

so that

   F_t(t_\zeta; n-1) = \tfrac{1}{2}\big(1 + \Pr[-t_\zeta \le t \le t_\zeta]\big)    (E)

Page 27: Chapter 3 Estimation and Decision Theory

Confidence Intervals (10/13)

We can find t_ζ and, from it, the confidence interval

   [\hat{\mu} - t_\zeta \hat{A}(n),\ \hat{\mu} + t_\zeta \hat{A}(n)]

Page 28: Chapter 3 Estimation and Decision Theory

Confidence Intervals (11/13)

Example: Twenty-one independent observations are made on a Gaussian r.v. X. Call these observations X_1, …, X_21. Based on the data, the realization of \hat{\mu} is 3.5 and the realization of

   \hat{A}(n) = \Big[\frac{1}{n(n-1)} \sum_{i=1}^{n} (X_i - \hat{\mu})^2\Big]^{1/2}

is 0.45. A 90 percent confidence interval on μ is desired.

Page 29: Chapter 3 Estimation and Decision Theory

Confidence Intervals (12/13)

Solution: Since Pr[−t_{0.05} ≤ t ≤ t_{0.05}] = 0.9, we obtain from equation (E) that F_t(t_{0.05}; 20) = 0.95. From the table of the Student t distribution, for n = 20, we obtain t_{0.05} = 1.725. The corresponding confidence interval is

   [3.5 - 1.725(0.45),\ 3.5 + 1.725(0.45)] = [2.72, 4.28]
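A quick arithmetic check of this t-based interval in Python:

```python
mu_hat, a_hat, t_05 = 3.5, 0.45, 1.725  # values from the example

lo = mu_hat - t_05 * a_hat
hi = mu_hat + t_05 * a_hat
# lo and hi round to 2.72 and 4.28, matching the stated interval
```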

Page 30: Chapter 3 Estimation and Decision Theory

Confidence Intervals (13/13)

Table 6.2-1 Student t distribution: values of t_ζ at which F_t reaches the given level, where

   F_t(t_\zeta; n) = \int_{-\infty}^{t_\zeta} f_t(z; n)\,dz

 n  | 0.6   | 0.75  | 0.9   | 0.95  | 0.975  | 0.99   | 0.995  | 0.9995
----|-------|-------|-------|-------|--------|--------|--------|--------
  1 | 0.325 | 1.000 | 3.078 | 6.314 | 12.706 | 31.821 | 63.657 | 636.619
  2 | 0.289 | 0.816 | 1.886 | 2.920 | 4.303  | 6.965  | 9.925  | 31.598
  3 | 0.277 | 0.765 | 1.638 | 2.353 | 3.182  | 4.541  | 5.841  | 12.924
  4 | 0.271 | 0.741 | 1.533 | 2.132 | 2.776  | 3.747  | 4.604  | 8.610
  5 | 0.267 | 0.727 | 1.476 | 2.015 | 2.571  | 3.365  | 4.032  | 6.569
  6 | 0.265 | 0.718 | 1.440 | 1.943 | 2.447  | 3.143  | 3.707  | 5.959
  7 | 0.263 | 0.711 | 1.415 | 1.895 | 2.365  | 2.998  | 3.499  | 5.408
  8 | 0.262 | 0.706 | 1.397 | 1.860 | 2.306  | 2.896  | 3.355  | 5.041
  9 | 0.261 | 0.703 | 1.372 | 1.833 | 2.262  | 2.821  | 3.250  | 4.781
 10 | 0.260 | 0.700 | 1.363 | 1.812 | 2.228  | 2.764  | 3.169  | 4.587
 11 | 0.259 | 0.697 | 1.356 | 1.796 | 2.201  | 2.718  | 3.106  | 4.437
 12 | 0.259 | 0.695 | 1.350 | 1.782 | 2.179  | 2.681  | 3.055  | 4.318
 13 | 0.258 | 0.694 | 1.345 | 1.771 | 2.160  | 2.650  | 3.012  | 4.221
 14 | 0.258 | 0.692 | 1.341 | 1.761 | 2.145  | 2.624  | 2.977  | 4.140
 15 | 0.258 | 0.691 | 1.337 | 1.753 | 2.131  | 2.602  | 2.947  | 4.073
 16 | 0.257 | 0.690 | 1.333 | 1.746 | 2.120  | 2.583  | 2.921  | 4.015
 17 | 0.257 | 0.689 | 1.330 | 1.740 | 2.110  | 2.567  | 2.898  | 3.965
 18 | 0.257 | 0.688 | 1.328 | 1.734 | 2.101  | 2.552  | 2.878  | 3.922
 19 | 0.257 | 0.688 | 1.325 | 1.729 | 2.093  | 2.539  | 2.861  | 3.883
 20 | 0.257 | 0.687 | 1.325 | 1.725 | 2.086  | 2.528  | 2.845  | 3.850