Elements of Estimation Theory
Jan Sykora
Czech Technical University in Prague
Czech Republic
Synchronization and Equalization in Digital Communications, course A-SEK
Jan Sykora (CTU in Prague) ElemEstTh(EET) A-SEK, A3.0.0 (10.10.2010) 1 / 49
© Jan Sykora, 2010
E-mail: [email protected], http://radio.feld.cvut.cz/~sykora/
This text can be used exclusively by students of the Synchronization and Equalization in Digital Communications course at the Czech Technical University in Prague, Faculty of Electrical Engineering, as support material for their preparation for this course. These students are allowed to make an electronic or printed copy solely for their personal use. Redistribution of the document in any electronic or printed form is prohibited. The only distribution source is the one maintained by the author. Any other utilization of the document requires permission from the author.
Outline
1 Parameter model
2 Estimator criterion
Goal & metric
ML (Maximum Likelihood)
Bayesian estimators
LS (Least Squares)
Method of Moments
3 Performance parameters of the estimator
4 Performance limits
5 Sufficient statistics
6 Estimator equation and solver
Parameter model
Vector parameter representation
All parameters can be represented as vector parameters — θ
constant parameter (scalar, vector)
continuous time-variant parameters
signal expansion (Signal Space, Sampling)
discrete time parameters
Parameter value domain model: a priori known properties
Parameter value domain model
stochastic parameter
Random Parameter with Known PDF (RPKP)
Random Parameter with Unknown PDF
parameterized PDF from a given class
parameter range
deterministic parameter
parameter range
Dynamic model of the parameter: a priori known properties
Dynamic model
Linear models
Auto-Regressive Moving Average (ARMA(p, q))
$$\underbrace{\sum_{i=0}^{p} a_i\,\theta[k-i]}_{\text{AR}} = \underbrace{\sum_{i=0}^{q} b_i\,u[k-i]}_{\text{MA}}$$
excitation $u[k]$ is a zero-mean uncorrelated sequence, usually $a_0 = 1$
State Space Description (excitation $u$)
continuous time: $\frac{\partial\theta}{\partial t} = X(\theta(t), u(t), t)$
discrete time: $\theta[k+1] = X(\theta[k], u[k], k)$
Power Spectrum Density (PSD)
Correlation properties
Parameter subspace specification
Nonlinear dynamic models
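A minimal sketch of the linear ARMA model above, written as a recursion for θ[k] with a_0 = 1. The function name `simulate_arma`, the coefficient values, and the Gaussian choice for the excitation u[k] are illustrative assumptions, not part of the course material.

```python
import random

def simulate_arma(a, b, n, seed=0):
    """Generate theta[0..n-1] from sum_i a[i]*theta[k-i] = sum_i b[i]*u[k-i].

    Assumes a[0] == 1. The excitation u[k] is a zero-mean, unit-variance
    white Gaussian sequence (illustrative choice). Samples with a negative
    time index are taken as zero.
    """
    if a[0] != 1:
        raise ValueError("this sketch assumes a[0] == 1")
    rng = random.Random(seed)
    u = [rng.gauss(0.0, 1.0) for _ in range(n)]
    theta = []
    for k in range(n):
        # MA part: weighted sum of current and past excitation samples.
        ma = sum(b[i] * u[k - i] for i in range(len(b)) if k - i >= 0)
        # AR part: weighted sum of past parameter values (i >= 1).
        ar = sum(a[i] * theta[k - i] for i in range(1, len(a)) if k - i >= 0)
        theta.append(ma - ar)
    return theta, u

# ARMA(1, 0): a slowly varying first-order AR parameter trajectory.
theta, u = simulate_arma(a=[1.0, -0.95], b=[1.0], n=200)
```

With a_1 close to -1 the trajectory is strongly correlated in time, which is the typical situation for slowly drifting synchronization parameters.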
Estimator criterion Goal & metric
Estimator criterion
Estimator criterion
associates estimator with given performance goal/metric
e.g. Maximum A posteriori Probability
each performance goal requires specific parameter model
e.g. criteria for random/deterministic parameters
Estimator equation
typically maximizes (minimizes) given performance goal/metric
$\hat\theta = \arg\max_\theta \rho(\theta)$ or $\hat\theta = \arg\min_\theta \rho(\theta)$
Estimator solver
solver finds the solution of the criterion
typically as a solution of µ(θ) = 0
solver’s own approximations and imperfections
create additional “layer” of performance degradation
Criterion goal/metric domains
Criterion goal/metric domains
stochastic domain
goal metric = mean value of some performance metric
achieves the goal when repeated many times
e.g. Mean Square Error (Bayesian estimator class)
goal metric = other stochastic based value
e.g. Likelihood (conditional probabilistic outcome)
time (series) domain
goal metric = parametric approximation of the function
achieves the goal on a single realization
e.g. Least Squares
ad-hoc
e.g. Method of Moments
Estimator criterion ML (Maximum Likelihood)
ML (Maximum Likelihood)
ML goal metric
conditional probabilistic outcome
ML estimator
$\hat\theta = \arg\max_\theta p(x|\theta)$
it can always be constructed
usually good performance
asymptotically unbiased and efficient (attains the CRLB)
$\hat\theta \overset{\text{asympt}}{\sim} \mathcal{N}(\theta, J^{-1}(\theta))$
very popular choice
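A small sketch of the ML criterion for an assumed toy model x[k] = a + w[k] with real white Gaussian noise of known variance. The model, the grid limits, and the function names are illustrative assumptions. The brute-force argmax of ln p(x|a) is compared with the closed-form ML solution of this model, the sample mean.

```python
import math
import random

def log_likelihood(x, a, sigma2):
    """ln p(x|a) for x[k] = a + w[k], w[k] real white Gaussian, variance sigma2."""
    n = len(x)
    return (-0.5 * n * math.log(2.0 * math.pi * sigma2)
            - sum((xk - a) ** 2 for xk in x) / (2.0 * sigma2))

def ml_grid(x, sigma2, lo=-5.0, hi=5.0, steps=2001):
    """Brute-force ML estimate: argmax of ln p(x|a) over a uniform grid."""
    grid = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    return max(grid, key=lambda a: log_likelihood(x, a, sigma2))

rng = random.Random(1)
a_true, sigma2 = 1.3, 0.5
x = [a_true + rng.gauss(0.0, math.sqrt(sigma2)) for _ in range(100)]

a_ml = ml_grid(x, sigma2)        # numerical solution of the criterion
a_closed = sum(x) / len(x)       # closed-form ML solution for this model
```

The two values agree up to the grid resolution, which illustrates the split between the estimator criterion and the solver used to evaluate it.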
Estimator criterion Bayesian estimators
Bayesian estimators
Bayesian goal metric
Goal metric
mean value of some performance metric (Loss function)
Loss function $L(\hat\theta, \theta)$
performance metric
Bayesian risk $R(\hat\theta) = E_{x,\theta}\left[L(\hat\theta, \theta)\right]$
mean value of the loss function
mean over all random influences affecting the observation
including the parameter itself, which requires knowledge of its PDF
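MAP (Maximum A posteriori Probability): Derivation
Loss function
$$L(\hat\theta, \theta) = \begin{cases} 0, & \|\hat\theta - \theta\| < \Delta\theta \\ 1, & \|\hat\theta - \theta\| \ge \Delta\theta \end{cases}, \qquad \Delta\theta \to 0^+$$
Bayes risk
$$R(\hat\theta) = E\left[L(\hat\theta, \theta)\right] = \int_{\{x\}} \int_{\{\theta\}} L(\hat\theta, \theta)\, p(\theta|x)\, p(x)\, d\theta\, dx = \int_{\{x\}} \left(1 - \int_{\{\Delta\theta(\hat\theta)\}} p(\theta|x)\, d\theta\right) p(x)\, dx$$
$$\lim_{\Delta\theta \to 0^+} \int_{\{\Delta\theta(\hat\theta)\}} p(\theta|x)\, d\theta = \lim_{\Delta\theta \to 0^+} \underbrace{\int_{\{\Delta\theta(\hat\theta)\}} d\theta}_{a_1 > 0}\; p(\hat\theta|x) = a_1\, p(\hat\theta|x)$$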
MAP (Maximum A posteriori Probability)
MAP estimator
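Since $p(x) \ge 0$, minimizing the Bayes risk is equivalent to maximizing the a posteriori PDF
$$\hat\theta = \arg\min_{\hat\theta} R(\hat\theta) = \arg\max_{\theta} p(\theta|x)$$
Special case of uniform $p_\theta(\theta)$ PDF
MAP ≡ ML (constrained to the permissible range of the parameter)
$$\hat\theta = \arg\max_\theta p(\theta|x) = \arg\max_\theta \frac{p(x|\theta)\, p_\theta(\theta)}{p_x(x)} = \arg\max_\theta p(x|\theta)$$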
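A minimal sketch of the uniform-prior special case, assuming the toy model x[k] = a + w[k] with real white Gaussian noise and a uniform on (-1, 1). The model, the data values, and the function names are illustrative assumptions. Inside the permissible range the MAP estimate coincides with the ML estimate (the sample mean); outside it, the MAP estimate is pinned to the range boundary.

```python
def map_uniform_prior(x, sigma2, lo=-1.0, hi=1.0, steps=2001):
    """Grid-search MAP estimate of a for x[k] = a + w[k], a uniform on (lo, hi).

    The prior is constant on the range and zero elsewhere, so on the grid
    the log-posterior equals the log-likelihood up to an additive constant.
    """
    grid = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]

    def log_posterior(a):
        return -sum((xk - a) ** 2 for xk in x) / (2.0 * sigma2)

    return max(grid, key=log_posterior)

def ml_unconstrained(x):
    """ML estimate for the same model without any range constraint."""
    return sum(x) / len(x)

x_inside = [0.1, 0.3, 0.2]           # sample mean lies inside (-1, 1)
x_outside = [1.4, 0.9, 1.6, 1.1]     # sample mean lies outside (-1, 1)

map_inside = map_uniform_prior(x_inside, sigma2=0.5)
map_outside = map_uniform_prior(x_outside, sigma2=0.5)
```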
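MSE (Mean Square Error): Derivation
Loss function
$$L(\hat\theta, \theta) = \|\theta - \hat\theta\|^2$$
Bayes risk (special case: real scalar parameter)
$$R(\hat\theta) = E\left[L(\hat\theta, \theta)\right] = \int_{\{x\}} \int_{\{\theta\}} (\theta - \hat\theta)^2\, p(\theta|x)\, d\theta\; p(x)\, dx$$
$$\arg\min_{\hat\theta} R(\hat\theta) = \arg\min_{\hat\theta} \underbrace{\int_{\{\theta\}} (\theta - \hat\theta)^2\, p(\theta|x)\, d\theta}_{R(\hat\theta|x)}$$
$$\frac{\partial}{\partial\hat\theta} R(\hat\theta|x) = \frac{\partial}{\partial\hat\theta} \int_{\{\theta\}} (\theta - \hat\theta)^2\, p(\theta|x)\, d\theta = -\int_{\{\theta\}} 2(\theta - \hat\theta)\, p(\theta|x)\, d\theta$$
Setting the derivative to zero at the minimizer,
$$\frac{\partial}{\partial\hat\theta} R(\hat\theta|x) = 0,$$
and using $\int_{\{\theta\}} p(\theta|x)\, d\theta = 1$ gives
$$\hat\theta = \int_{\{\theta\}} \theta\, p(\theta|x)\, d\theta$$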
MSE (Mean Square Error)
MSE estimator
general case: complex vector parameter
$\hat\theta = E[\theta|x]$
Note
do not confuse the MSE estimator with MSE used as a performance metric for an arbitrary other estimator
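A sketch of the MSE estimator as a posterior mean, assuming the toy model x[k] = a + w[k] with a zero-mean Gaussian prior on a and real white Gaussian noise. The model, the variances, and the function names are illustrative assumptions. The posterior mean is evaluated once by numerical integration of a p(a|x) and once with the closed form known for this Gaussian case.

```python
import math

def mmse_numeric(x, sigma2_a, sigma2_w, lo=-10.0, hi=10.0, steps=4001):
    """E[a|x] by numerical integration of a * p(a|x) on a uniform grid."""
    grid = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]

    def log_joint(a):
        # ln p(x|a) + ln p(a) with all a-independent constants dropped.
        return (-sum((xk - a) ** 2 for xk in x) / (2.0 * sigma2_w)
                - a * a / (2.0 * sigma2_a))

    lj = [log_joint(a) for a in grid]
    peak = max(lj)
    weights = [math.exp(v - peak) for v in lj]   # unnormalized posterior
    return sum(a * w for a, w in zip(grid, weights)) / sum(weights)

def mmse_closed_form(x, sigma2_a, sigma2_w):
    """Posterior mean for a ~ N(0, sigma2_a), w[k] ~ N(0, sigma2_w)."""
    n = len(x)
    return (sigma2_a / (sigma2_a + sigma2_w / n)) * (sum(x) / n)

x = [0.8, 1.1, 0.5, 1.4]
a_numeric = mmse_numeric(x, sigma2_a=1.0, sigma2_w=0.5)
a_closed = mmse_closed_form(x, sigma2_a=1.0, sigma2_w=0.5)
```

The closed form shows the sample mean being shrunk towards the prior mean; the shrinkage vanishes as the number of samples grows.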
Estimator criterion LS (Least Squares)
LS (Least Squares)
LS goal metric
minimizes difference between measurement and assumed signal model
no statistical description of measurement needed
observation with symmetric fluctuations E[x|θ] = s(θ)
Model
Signal model $s = s(\theta)$
Observation $x = x(s(\theta))$
Least Squares estimator
$$\hat\theta = \arg\min_\theta \|x - s(\theta)\|^2$$
Special case: Linear LS
Linear signal model $s = H\theta$, $H$ observation matrix
Solution $\hat\theta = (H^H H)^{-1} H^H x$
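A minimal sketch of the linear LS solution for a real two-parameter case, assuming an observation matrix H with columns [1, t] (a straight-line signal model). The model and the function name are illustrative assumptions. For real data H^H reduces to H^T, and the 2x2 normal equations are solved explicitly.

```python
def linear_ls_line(t, x):
    """Solve theta = (H^T H)^{-1} H^T x for H with columns [1, t].

    Returns (offset, slope) of the least-squares line x ~ offset + slope * t.
    """
    n = len(t)
    # Entries of H^T H and H^T x.
    s_t = sum(t)
    s_tt = sum(tk * tk for tk in t)
    s_x = sum(x)
    s_tx = sum(tk * xk for tk, xk in zip(t, x))
    det = n * s_tt - s_t * s_t
    if det == 0:
        raise ValueError("H^T H is singular")
    # Explicit inverse of the 2x2 matrix [[n, s_t], [s_t, s_tt]].
    offset = (s_tt * s_x - s_t * s_tx) / det
    slope = (n * s_tx - s_t * s_x) / det
    return offset, slope

offset, slope = linear_ls_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```

No noise statistics enter the computation, in line with the remark that LS needs no statistical description of the measurement.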
Estimator criterion Method of Moments
Method of Moments (MM)
MM goal metric
None (ad-hoc)
No general optimality
Simple to derive and to implement
Assume that some moment of the measurement depends on the parameter in a known way
$\mu = h(\theta)$
$\theta = h^{-1}(\mu)$
The estimator is obtained by replacing the moment with its estimate, $\mu \to \hat\mu$
MM estimator
$\hat\theta = h^{-1}(\hat\mu)$
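A short sketch of the method of moments, assuming exponentially distributed measurements with rate λ, for which the first moment is μ = h(λ) = 1/λ. The distribution and the function name are illustrative assumptions; any model with an invertible moment relation would serve equally well.

```python
import random

def mm_exponential_rate(x):
    """Method-of-moments estimate of the rate from mu = E[x] = 1/lambda."""
    mu_hat = sum(x) / len(x)    # estimate of the first moment
    return 1.0 / mu_hat         # theta_hat = h^{-1}(mu_hat)

rng = random.Random(3)
lam_true = 2.0
x = [rng.expovariate(lam_true) for _ in range(5000)]
lam_hat = mm_exponential_rate(x)
```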
Performance parameters of the estimator
Deterministic Parameter — Bias
Bias
$b = E[\hat\theta - \theta] = E[\hat\theta] - \theta$
$\hat\theta$ is unbiased if $b = 0$
Deterministic Parameter — Estimator variance
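Parameter vector $\theta = [\ldots, \theta_i, \ldots]^T$, $b = [\ldots, b_i, \ldots]^T$
Theorem (MSE as a performance criterion is related to variance)
If the estimator of a DP is unbiased then
$$\underbrace{E\left[|\hat\theta_i - \theta_i|^2\right]}_{\text{MSE}} = \mathrm{var}[\hat\theta_i]$$
Proof
$$E\left[|\hat\theta_i - \theta_i|^2\right] = E\left[|\hat\theta_i - E[\hat\theta_i] + E[\hat\theta_i] - \theta_i|^2\right]$$
$$= E\left[|\hat\theta_i - E[\hat\theta_i]|^2\right] + E\left[|E[\hat\theta_i] - \theta_i|^2\right] + 2\Re\left[E\left[(\hat\theta_i - E[\hat\theta_i])(E[\hat\theta_i] - \theta_i)^*\right]\right]$$
$$= \mathrm{var}[\hat\theta_i] + |b_i|^2 + 2\Re\left[|E[\hat\theta_i]|^2 - E[\hat\theta_i]\theta_i^* - |E[\hat\theta_i]|^2 + E[\hat\theta_i]\theta_i^*\right]$$
$$= \mathrm{var}[\hat\theta_i] + |b_i|^2$$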
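A quick numerical check of the decomposition MSE = variance + |bias|^2, assuming a deliberately biased estimator (a sample mean shrunk by the factor 0.8) of a real deterministic parameter. The estimator, the noise model, and the function name are illustrative assumptions. With empirical moments normalized by the number of trials the identity holds up to rounding error.

```python
import random

def empirical_mse_var_bias(estimates, theta):
    """Empirical MSE, variance and bias of a list of estimates of theta."""
    m = len(estimates)
    mean_est = sum(estimates) / m
    bias = mean_est - theta
    var = sum((e - mean_est) ** 2 for e in estimates) / m
    mse = sum((e - theta) ** 2 for e in estimates) / m
    return mse, var, bias

rng = random.Random(7)
theta, n, trials = 2.0, 10, 2000
# Biased estimator: sample mean of n noisy measurements scaled by 0.8.
estimates = [0.8 * sum(theta + rng.gauss(0.0, 1.0) for _ in range(n)) / n
             for _ in range(trials)]
mse, var, bias = empirical_mse_var_bias(estimates, theta)
```

Here the bias is close to -0.2 θ, so it depends on the unknown parameter itself.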
Minimum Variance Unbiased Estimator (MVU)
Minimum Variance Unbiased Estimator (MVU)
Variance as the estimator design criterion
MSE describes quality of estimator
b depends on θ
if $b \ne 0$, then an estimator $\hat\theta$ designed by minimizing the MSE depends on $\theta$ and is generally unrealizable
useful only for $b = 0$, where MSE = variance
Random Parameter — Mean Square Error
MSE
Estimator design and performance metric criterion
$E\left[|\hat\theta_i - \theta_i|^2\right]$
The estimate is close to the real value on average for any parameter realization.
Well defined interpretation as a quality of performance
No problems with evaluation (the a priori PDF is known)
No simple relation between MSE and $\mathrm{var}[\hat\theta_i]$ for RP
Random Parameter: Mean Estimator Bias and Variance
Mean Estimator Bias and Variance for RP
$E[\hat\theta - \theta]$, $\mathrm{var}[\hat\theta]$
mean estimator bias and variance are meaningless as performance criteria for RP
Example
Let $x[k] = a + w[k]$ be the measurement, where $w$ is WGN and $a \in (-1, 1)$ is a uniformly distributed RP
Let the estimator be $\hat a = 0$
Then $E[\hat a] = E[a] = 0$ and the variance $\mathrm{var}[\hat a] = 0$, which should indicate a "perfect" estimator, but in fact it is useless.
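For comparison, the MSE of the same constant estimator can be worked out directly from the uniform prior on (−1, 1):

$$E\left[|\hat a - a|^2\right] = E[a^2] = \int_{-1}^{1} a^2\, \frac{1}{2}\, da = \frac{1}{3}$$

This equals the a priori variance of $a$, so the MSE correctly reports that the estimator extracts nothing from the measurement, while the mean bias and variance do not.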
Conditional MSE, Bias, Variance for RP
Conditional characteristics
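properties for any particular parameter realization
better performance criterion for RP
Conditional Bias $b(\theta) = E[\hat\theta - \theta \,|\, \theta] = E[\hat\theta \,|\, \theta] - \theta$
Conditional MSE $\mathrm{MSE}(\theta_i) = E\left[|\hat\theta_i - \theta_i|^2 \,\big|\, \theta_i\right]$
Similarly as for DP it can be shown that
$$\mathrm{MSE}(\theta_i) = E\left[|\hat\theta_i - \theta_i|^2 \,\big|\, \theta_i\right] = \mathrm{var}[\hat\theta_i \,|\, \theta_i] + |b_i(\theta_i)|^2$$
If the estimator is conditionally unbiased, $b(\theta) = 0$, then
$$\mathrm{MSE}(\theta_i) = \mathrm{var}[\hat\theta_i \,|\, \theta_i]$$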
Acquisition time
Acquisition time
time in the continuous domain, or number of iterations, necessary to reach the "locked" state of the synchronizer/equalizer with a given probability
$T_a(P_a): \Pr\{(\hat\theta(t) - \theta(t)) \in A, \forall t > T_a\} = P_a$
Used for iterative (feed-back, sequential) estimators
Locked state region $A$ defined in various ways
it should suitably reflect "close vicinity" of the estimate to the real value
usually such that, when reached, the synchronizer can switch from acquisition mode to tracking mode
Initial conditions at $t = 0$ ($k = 0$) are random
Synchronization failure rate
Synchronization failure rate
Mean time between synchronizer locked state exits
Tracking mode usually exited — “locked” state lost
New acquisition necessary
Detector operation interrupted
PDF of estimation error
PDF of estimation error
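$p(\theta_\epsilon)$, where $\theta_\epsilon = \hat\theta(x) - \theta$
Influence on BER
Let the conditional probability of data detection error (message, symbol, bit) be
$$P_e(\theta_\epsilon) = \Pr\left\{\hat d(x) \ne d \,\big|\, \theta_\epsilon\right\}$$
Average probability of data detection error
$$P_e = \int_{\{\theta_\epsilon\}} P_e(\theta_\epsilon)\, p(\theta_\epsilon)\, d\theta_\epsilon$$
$P_e$ does not have to depend only on $\theta_\epsilon$
it often depends also on $\theta$, in which case the average $P_e$ will be parameterized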
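A sketch of the averaging integral above for one assumed case: coherent BPSK with a residual carrier phase error, for which the conditional error probability is commonly modeled as Q(sqrt(2·SNR)·cos θ_ε), and a zero-mean Gaussian error PDF truncated to [-π, π]. Both models, the SNR value, and the function names are assumptions chosen only for illustration; they are not taken from the slides.

```python
import math

def q_function(z):
    """Gaussian tail probability Q(z)."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def pe_conditional(theta_err, snr):
    """Assumed BPSK model: Pe(theta_err) = Q(sqrt(2*snr) * cos(theta_err))."""
    return q_function(math.sqrt(2.0 * snr) * math.cos(theta_err))

def pe_average(snr, sigma, steps=2001):
    """Average Pe(theta_err) over an assumed Gaussian error PDF.

    The PDF is zero-mean with standard deviation sigma, truncated to
    [-pi, pi] and renormalized on the integration grid.
    """
    grid = [-math.pi + 2.0 * math.pi * i / (steps - 1) for i in range(steps)]
    pdf = [math.exp(-0.5 * (t / sigma) ** 2) for t in grid]
    norm = sum(pdf)
    return sum(pe_conditional(t, snr) * p for t, p in zip(grid, pdf)) / norm

pe_ideal = pe_conditional(0.0, snr=4.0)     # perfect synchronization
pe_jitter = pe_average(snr=4.0, sigma=0.3)  # with estimation error
```

The averaged value exceeds the ideal one, which quantifies the detection loss caused by the estimation error.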
Other performance criteria
Amount and form of a priori known information needed in the signal
Preamble/Training sequence length
Robustness
Level of performance degradation when the channel model and other assumptions deviate from the assumed ones
Range of nuisance parameter values
Range of nuisance parameter values the synchronizer can cope with
Performance limits
Cramer-Rao Lower Bound (CRLB)
Applicable to DP estimation, θ ∈ R
Applicable to any unbiased estimator
Theorem (Cramer-Rao Lower Bound (CRLB))
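If the regularity condition holds
$$E\left[\frac{\partial \ln p(x|\theta)}{\partial\theta}\right] = 0$$
then the variance of any unbiased estimator satisfies
$$\mathrm{var}[\hat\theta_i] \ge \left[J^{-1}(\theta)\right]_{i,i}$$
where $J$ is the Fisher information matrix
$$J_{k,i}(\theta) = -E\left[\frac{\partial^2 \ln p(x|\theta)}{\partial\theta_k\, \partial\theta_i}\right]$$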
Conditional Cramer-Rao Lower Bound
Applicable to RP estimation, θ ∈ R
Applicable to any conditionally unbiased estimator
All expectations replaced by conditional expectations
performance bound for any particular parameter realization
Theorem (Conditional Cramer-Rao Lower Bound (Cond-CRLB))
If the regularity condition holds
$$E\left[\frac{\partial \ln p(x|\theta)}{\partial\theta} \,\bigg|\, \theta\right] = 0$$
then the variance of any conditionally unbiased estimator satisfies
$$\mathrm{var}[\hat\theta_i \,|\, \theta_i] \ge \left[J^{-1}(\theta)\right]_{i,i}$$
where $J$ is the conditional Fisher information matrix
$$J_{k,i}(\theta) = -E\left[\frac{\partial^2 \ln p(x|\theta)}{\partial\theta_k\, \partial\theta_i} \,\bigg|\, \theta\right]$$
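Phase estimation in discrete AWGN channel: Example
System model
$$x[k] = e^{j\varphi} + w[k], \qquad k = 0, \ldots, N-1$$
where $w[k]$ is RPKP: discrete CWGN with variance $\sigma_w^2$
Likelihood function
RPKP $w$ eliminated
$$p(x|\varphi) = \frac{1}{\pi^N \sigma_w^{2N}} \exp\left\{-\frac{1}{\sigma_w^2} \|x - i\, e^{j\varphi}\|^2\right\} = \frac{1}{\pi^N \sigma_w^{2N}} \exp\left\{-\frac{1}{\sigma_w^2} \left(\|x\|^2 + \|i\|^2 - 2\Re\left[(x \cdot i)\, e^{-j\varphi}\right]\right)\right\}$$
$i = [1, 1, \ldots, 1]$, $\|i\|^2 = N$
$$\Lambda(d, \varphi) = a_1\, p(x|d, \varphi) = \exp\left\{\frac{2}{\sigma_w^2} \Re\left[(x \cdot i)\, e^{-j\varphi}\right]\right\}$$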
Fisher Information matrix
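dimension $1 \times 1$, $w$ eliminated
$$\frac{\partial \ln p(x|\varphi)}{\partial\varphi} = \frac{\partial}{\partial\varphi} \frac{1}{\sigma_w^2} \left((x \cdot i)\, e^{-j\varphi} + (x \cdot i)^* e^{j\varphi}\right) = \frac{1}{\sigma_w^2} \left(-j (x \cdot i)\, e^{-j\varphi} + j (x \cdot i)^* e^{j\varphi}\right)$$
$$\frac{\partial^2 \ln p(x|\varphi)}{\partial\varphi^2} = \frac{\partial}{\partial\varphi} \frac{1}{\sigma_w^2} \left(-j (x \cdot i)\, e^{-j\varphi} + j (x \cdot i)^* e^{j\varphi}\right) = \frac{1}{\sigma_w^2} \left(-(x \cdot i)\, e^{-j\varphi} - (x \cdot i)^* e^{j\varphi}\right)$$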
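Expectation
$x = e^{j\varphi} i + w$
$$\frac{\partial^2 \ln p(x|\varphi)}{\partial\varphi^2}\bigg|_{x = e^{j\varphi} i + w} = \frac{1}{\sigma_w^2} \left(-\left((e^{j\varphi} i + w) \cdot i\right) e^{-j\varphi} - \left((e^{j\varphi} i + w) \cdot i\right)^* e^{j\varphi}\right)$$
$$-E_w\left[\frac{\partial^2 \ln p(x|\varphi)}{\partial\varphi^2}\bigg|_{x = e^{j\varphi} i + w}\right] = \frac{2}{\sigma_w^2} \|i\|^2$$
$$J_{\varphi,\varphi}(\varphi) = -E\left[\frac{\partial^2 \ln p(x|\varphi)}{\partial\varphi^2}\right] = \frac{2N}{\sigma_w^2}$$
$$\left[J^{-1}(\varphi)\right]_{\varphi,\varphi} = \frac{\sigma_w^2}{2N}$$
$$\mathrm{var}[\hat\varphi] \ge \frac{\sigma_w^2}{2N}$$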
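A Monte Carlo sketch comparing the bound σ_w²/(2N) with the mean square error of the phase estimator that maximizes ℜ[(x · i) e^{-jφ}], that is φ̂ = arg(Σ x[k]). The parameter values, the trial count, and the function names are illustrative assumptions.

```python
import cmath
import math
import random

def ml_phase(x):
    """Phase estimate maximizing Re[(x . i) * exp(-j*phi)]: arg of the sum."""
    return cmath.phase(sum(x))

def simulate_phase_mse(phi, n, sigma2_w, trials, seed=0):
    """Empirical mean square phase error for x[k] = exp(j*phi) + w[k]."""
    rng = random.Random(seed)
    s = math.sqrt(sigma2_w / 2.0)   # std of the real and imaginary noise parts
    carrier = cmath.exp(1j * phi)
    sq_err = 0.0
    for _ in range(trials):
        x = [carrier + complex(rng.gauss(0.0, s), rng.gauss(0.0, s))
             for _ in range(n)]
        # Wrap the error to (-pi, pi] before squaring.
        err = cmath.phase(cmath.exp(1j * (ml_phase(x) - phi)))
        sq_err += err * err
    return sq_err / trials

n, sigma2_w = 50, 0.5
crlb = sigma2_w / (2.0 * n)
mse = simulate_phase_mse(phi=0.7, n=n, sigma2_w=sigma2_w, trials=2000)
```

At this signal-to-noise ratio the empirical value lies close to the bound; at low SNR or small N a larger gap should be expected.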
Sufficient statistics
Sufficient statistics
Sufficient statistics
function $T(x)$ of the measurement that contains all available information contained in the original measurement $x$ about $\theta$ necessary for the parameter estimator $\hat\theta$, i.e.
$\hat\theta(x) = \hat\theta(T(x))$
$p(x|T(x), \theta)$ must not depend on $\theta$
$p(x|T(x), \theta) = p(x|T(x))$
Theorem (Neyman-Fisher Factorization theorem)
$T(x)$ is a sufficient statistic $\Leftrightarrow \exists g, h: p(x|\theta) = g(T(x), \theta)\, h(x)$
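As a worked instance of the factorization, consider the likelihood of the phase estimation example from the Performance limits section, with $x[k] = e^{j\varphi} + w[k]$ and $\|i\|^2 = N$. One possible split (the choice of $g$ and $h$ is not unique) is

$$p(x|\varphi) = \underbrace{\exp\left\{-\frac{1}{\sigma_w^2}\left(N - 2\Re\left[T(x)\, e^{-j\varphi}\right]\right)\right\}}_{g(T(x),\, \varphi)}\; \underbrace{\frac{1}{\pi^N \sigma_w^{2N}} \exp\left\{-\frac{\|x\|^2}{\sigma_w^2}\right\}}_{h(x)}, \qquad T(x) = x \cdot i = \sum_{k=0}^{N-1} x[k]$$

so the single complex number $T(x)$ carries everything the $N$ samples say about $\varphi$.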
Estimator equation and solver
Estimator equation
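The criterion usually has the form
$$\hat\theta = \arg\min_\theta \rho(\theta)$$
or
$$\hat\theta = \arg\max_\theta \rho(\theta)$$
Estimator equation
Objective function extreme search task (scalar case)
$$\mu(\hat\theta) = \dot\rho(\hat\theta) = \frac{\partial\rho(\theta)}{\partial\theta}\bigg|_{\theta = \hat\theta} = 0$$
and (min, max)
$$\frac{\partial^2\rho(\theta)}{\partial\theta^2}\bigg|_{\theta = \hat\theta} > 0 \quad (\text{or} < 0)$$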
Direct solution (Feed-Forward (FF), single-shot)
Direct solver
Closed-form expression $\hat\theta = \hat\theta(x)$
theoretically straightforward
it can be found only in special cases
usually a complicated expression
Iterative solution (Feed-Back (FB), recursive)
Iterative solver
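First derivative $\dot\rho(\theta')$
used as an error (update, correction) signal for the feed-back solver
"Hopefully" $\theta' \xrightarrow{\text{asympt}} \hat\theta$
[Figure: objective function $\rho(\theta')$ versus $\theta'$ with the loop error signal $\nabla_{\theta'}\rho(\theta')$, which corrects the next iteration towards higher or towards lower values of $\theta'$ on either side of $\hat\theta$]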
Discrete time ($G[\cdot]$ is an operator)
$$\theta'[k+1] = \theta'[k] + G\left[\dot\rho(\theta'[k])\right]$$
Continuous time ($G[\cdot]$ is an operator)
$$\frac{\partial\theta'(t)}{\partial t} = G\left[\dot\rho(\theta'(t))\right]$$
Properties
it does not have to converge
local/global extreme
initial guess problem
it can always be constructed
usually relatively simple implementation
performance: Theory of Feed-Back systems
performance is additionally influenced by convergence/tracking properties of the sequential solver
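A minimal sketch of the discrete-time feed-back solver, applied to the assumed objective ρ(φ) = ℜ[z e^{-jφ}] with a fixed complex number z (for instance z = x · i from the phase estimation example). The error signal is the first derivative ρ̇(φ) = ℑ[z e^{-jφ}], and the operator G[.] is taken to be a constant gain normalized by |z|. The objective, the gain, and the function name are illustrative assumptions.

```python
import cmath

def iterative_phase_solver(z, phi0, gain=0.5, iters=50):
    """Iterate phi[k+1] = phi[k] + G[rho_dot(phi[k])] for rho = Re[z*exp(-j*phi)].

    G[.] is a constant gain divided by |z| (illustrative choice), which keeps
    the loop stable near the maximum for 0 < gain < 2.
    """
    phi = phi0
    step = gain / abs(z)
    for _ in range(iters):
        error = (z * cmath.exp(-1j * phi)).imag   # first derivative of rho
        phi += step * error
    return phi

z = 10.0 * cmath.exp(1j * 0.7)
phi_hat = iterative_phase_solver(z, phi0=0.0)
```

The initial guess matters: starting exactly opposite to the true phase puts the loop on a stationary point of the objective (its minimum), where the error signal is zero and the iteration makes essentially no progress.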
Expectation-Maximization ML iterative solver
Application of the EM algorithm to the ML channel parameter estimation
unavailable observation . . . data d
estimator based on complete observation x,d
data aided (DA) estimator
EM iterative ML solver
Approximation (replacement) marginalizing the unavailable observation $d$
$$\ln p(x, d|\theta) \approx E_{d|x,\theta_k}\left[\ln p(x, d|\theta)\right]$$
Expectation + Maximization step
EM iterator (arbitrary encoding stage)
$$\theta_{k+1} = \arg\max_\theta \sum_{q:\, d \mapsto q} \ln p(x|q, \theta)\, p(q|x, \theta_k)$$
averaging uses the a posteriori PDF
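A sketch of the EM iterator for one assumed special case: phase estimation with unknown, uncoded, equiprobable BPSK data, x[k] = d[k] e^{jφ} + w[k] with d[k] ∈ {+1, -1} and CWGN of variance σ_w². Under these assumptions the a posteriori averaging factorizes per symbol and reduces to the soft symbol tanh(2ℜ[x[k] e^{-jφ_k}]/σ_w²), and the maximization has the closed form φ_{k+1} = arg(Σ x[k] · soft[k]). The signal model, parameter values, and function name are illustrative assumptions, not the general encoded case of the slide.

```python
import cmath
import math
import random

def em_phase_bpsk(x, sigma2_w, phi0, iters=20):
    """EM iterations for x[k] = d[k]*exp(j*phi) + w[k], d[k] in {+1, -1}."""
    phi = phi0
    for _ in range(iters):
        # E-step: posterior mean of each symbol given x[k] and current phi.
        soft = [math.tanh(2.0 * (xk * cmath.exp(-1j * phi)).real / sigma2_w)
                for xk in x]
        # M-step: maximize sum_k soft[k] * Re[x[k] * exp(-j*phi)] over phi.
        phi = cmath.phase(sum(xk * dk for xk, dk in zip(x, soft)))
    return phi

rng = random.Random(5)
phi_true, sigma2_w, n = 0.5, 0.2, 200
s = math.sqrt(sigma2_w / 2.0)   # std of the real and imaginary noise parts
x = [rng.choice((-1.0, 1.0)) * cmath.exp(1j * phi_true)
     + complex(rng.gauss(0.0, s), rng.gauss(0.0, s)) for _ in range(n)]
phi_em = em_phase_bpsk(x, sigma2_w, phi0=0.2)
```

BPSK leaves a phase ambiguity of π, so the iteration settles on whichever of the two equivalent solutions is closer to the initial guess.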
The End