The Scalar Kalman Filter


  • 7/27/2019 The Scalar Kalman Filter


    The Kalman Filter

    The Scalar Kalman Filter

    This document gives a brief introduction to the derivation of a Kalman filter when the input is a

    scalar quantity. It is split into several sections:

    Defining the Problem

    Finding K, the Kalman Filter Gain

    Finding the a priori covariance

    Finding the a posteriori covariance

    Review of Pertinent Results

    Alternate, More Common, Notation

    Examples

    Going further

    References

    Defining the Problem

    Discrete time linear systems are often represented in a state variable format given by the

    equation:

    $$x_j = a x_{j-1} + b u_j$$

    Equation 1

    where the state, xj, is a scalar, a and b are constants, and the input uj is a scalar; j represents the

    time variable. Note that many texts don't include the input term (it may be set to zero), and most

    texts use the variable k to represent time. I have chosen to use j to represent the time variable

    because we use the variable k for the Kalman filter gain later in the text. Equation 1 can be

    represented pictorially as shown below, where the block with T in it represents a time delay.


    Figure 1

    Now imagine some noise is added to the process such that:

    $$x_j = a x_{j-1} + b u_j + w_j$$

    Equation 2

    The noise, wj, is a white noise source with zero mean and covariance Q, and is uncorrelated with the

    input. The process can now be represented as shown:

    Figure 2

    Given a situation like the one shown above, a typical question might be: Can we filter the signal x

    so that the effects of the noise w are minimized? The answer, it turns out, is yes. However, with

    Kalman filters we can go one step further.

    Let us assume that the signal x is not directly measured, but instead we measure z.

    $$z_j = h x_j + v_j$$

    Equation 3

    The measured value z depends on the current value of x, as determined by the gain

    h. Additionally, the measurement has its own noise, v, associated with it. The noise, v, is a white

    noise source with zero mean and covariance R that is uncorrelated with the input or with the noise

    w. The two noise sources are independent of each other and independent of the input.
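
    As a rough illustration of Equations 2 and 3, the following Python sketch generates a noisy state

    sequence together with its measurements. The function name simulate, the parameter values, and the

    use of Gaussian noise are assumptions made for this example; the text only requires white,

    zero-mean noise with covariances Q and R.

```python
import random


def simulate(a, b, h, Q, R, u, x0, seed=0):
    """Generate states x_j and measurements z_j for the inputs in u.

    Q and R are the variances of the process noise w and the
    measurement noise v. Gaussian noise is an assumption of this sketch.
    """
    rng = random.Random(seed)
    xs, zs = [], []
    x_prev = x0
    for u_j in u:
        w_j = rng.gauss(0.0, Q ** 0.5)    # process noise with variance Q
        v_j = rng.gauss(0.0, R ** 0.5)    # measurement noise with variance R
        x_j = a * x_prev + b * u_j + w_j  # Equation 2
        z_j = h * x_j + v_j               # Equation 3
        xs.append(x_j)
        zs.append(z_j)
        x_prev = x_j
    return xs, zs


# Hypothetical parameter values, chosen only for illustration.
xs, zs = simulate(a=0.9, b=1.0, h=2.0, Q=0.01, R=0.25, u=[1.0] * 50, x0=0.0)
```

    Setting Q and R to zero recovers the noise-free system of Equation 1, which is a convenient way to

    check the recursion by hand.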


    Figure 3

    The task of the Kalman filter can now be stated as: Given a system such as the one shown above,

    how can we filter z so as to estimate the variable x while minimizing the effects of w and v?

    It seems reasonable to achieve an estimate of the state (and the output) by simply reproducing

    the system architecture. This simple (and ultimately useless) way to get an estimate of xj (which

    we will call x^j), is diagrammed below.

    Figure 4

    This approach has two glaring weaknesses. The first is that there is no correction. If we don't know

    the quantities a, b or h exactly (or the initial value x0), the estimate will not track the exact value of

    x. Secondly, we don't compensate for the addition of the noise sources (w and v). An improved

    setup which takes care of both of these problems is shown below.


    Figure 5

    This figure is much like the previous one. The first difference noted is that the original estimate of

    xj is now called $\hat{x}_j^-$; we will refer to this as the a priori estimate.

    $$\hat{x}_j^- = a \hat{x}_{j-1} + b u_j$$

    Equation 4

    We use this a priori estimate to predict an estimate for the output, $\hat{z}_j$. The difference between

    this estimated output and the actual output is called the residual, or innovation.

    $$\text{Residual} = z_j - \hat{z}_j = z_j - h \hat{x}_j^-$$

    Equation 5

    If the residual is small, it generally means we have a good estimate; if it is large the estimate is not

    so good. We can use this information to refine our estimate of xj; we call this new estimate the a

    posteriori estimate, $\hat{x}_j$. If the residual is small, so is the correction to the estimate. As the

    residual grows, so does the correction. The pertinent equation is (from the block diagram):

    $$\hat{x}_j = \hat{x}_j^- + k(\text{Residual}) = \hat{x}_j^- + k(z_j - h \hat{x}_j^-)$$

    Equation 6

    The only task now is to find the quantity k that is used to refine our estimate, and it is this process

    that is at the heart of Kalman filtering.
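
    To make the predict-then-correct structure concrete, here is a minimal Python sketch of one

    iteration for a gain k that is simply given. The function names predict and correct and the

    numerical values are hypothetical; choosing k well is the subject of the next section.

```python
def predict(x_post_prev, u_j, a, b):
    """A priori estimate from the previous a posteriori estimate (Equation 4)."""
    return a * x_post_prev + b * u_j


def correct(x_prior, z_j, h, k):
    """A posteriori estimate: a priori estimate plus k times the residual (Equation 6)."""
    residual = z_j - h * x_prior  # Equation 5
    return x_prior + k * residual


# Hypothetical numbers, for illustration only.
x_prior = predict(x_post_prev=1.0, u_j=0.5, a=0.9, b=1.0)
x_post = correct(x_prior, z_j=3.0, h=2.0, k=0.25)
print(x_prior, x_post)
```

    With k = 0 the measurement is ignored and the a priori estimate is returned unchanged; with k = 1/h

    the a priori estimate is discarded and the result is z/h.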

    Note: We are trying to find an optimal estimator, and thus far we are only optimizing the value for

    the gain, k. We have assumed that a copy of the original system (i.e., the gains a, b, and h

    arranged as shown) should be used to form the estimator. This raises the question: "Is the

    estimator as developed above optimal?" In other words, should we simply copy the original

    system in order to estimate the state, or is there perhaps a better way? The answer turns out, is


    To find the value of k that minimizes the variance we differentiate this expression with respect to k

    and set the derivative to zero. Be patient here, the expression gets much messier before it

    becomes simple.

    Equation 10

    We take this last expression and use it to solve for k.

    Equation 11

    This expression is still quite complicated. To simplify it we will consider the numerator and the

    denominator separately.

    We start with the numerator, and substitute in equation 3 for zj.


    The measurement noise, v, is uncorrelated with either the input or the a priori estimate of x, so:

    Equation 12

    This simplifies the expression for the numerator.

    Equation 13

    Now, in the same way, consider the denominator.

    Equation 14

    Again, we can use the orthogonality condition from equation 12 to set the last term to zero, so:


    Equation 15

    where we used the simplification from equation 13 for the first term in the expression, and

    the definition of the measurement noise for the second term.

    Using the expressions for the numerator and denominator, we finally get a simple expression for k:

    $$k = \frac{h p_j^-}{h^2 p_j^- + R}$$

    Equation 16

    However, there is still a problem because this expression needs a value for the a priori covariance

    which in turn requires a knowledge of the system variable xj. Therefore our next task will be to

    come up with an estimate for the a priori covariance.
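
    A more compact route to the same gain is sketched below. It takes the expectation before

    differentiating, so it is not the step-by-step expansion followed in the text. The symbols

    $e_j^-$ and $e_j$ for the a priori and a posteriori errors are notation assumed for this sketch, with

    $p_j^-$ and $p_j$ the corresponding covariances.

```latex
\begin{aligned}
e_j^- &= x_j - \hat{x}_j^-, \qquad p_j^- = E\!\left[(e_j^-)^2\right] \\
e_j &= x_j - \hat{x}_j
     = x_j - \hat{x}_j^- - k\,(h x_j + v_j - h \hat{x}_j^-)
     = (1 - h k)\, e_j^- - k\, v_j \\
p_j &= E\!\left[e_j^2\right]
     = (1 - h k)^2\, p_j^- + k^2 R
     \qquad \text{(the cross term vanishes because } E[e_j^- v_j] = 0\text{)} \\
\frac{\partial p_j}{\partial k} &= -2h\,(1 - h k)\, p_j^- + 2 k R = 0
\;\;\Longrightarrow\;\;
k = \frac{h\, p_j^-}{h^2 p_j^- + R}
\end{aligned}
```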

    Before we move on, let's look at this equation in detail. First note that the "constant", k, changes

    at every iteration. For this reason it should really be written with a subscript (i.e., kj). We'll be

    more careful about this later.

    Next, and more significantly, we can examine what happens as each of the three terms in

    equation 16 is varied.

    If the a priori error is very small, k is correspondingly very small, so our correction is also very

    small. In other words we will ignore the current measurement and simply use past estimates to

    form the new estimate. This is as expected -- if our first estimate (the a priori estimate) is good

    (i.e., with small error) there is very little need to correct it.

    If the a priori error is very large (so that the measurement noise term, R, in the denominator is

    unimportant) then k=1/h. This, in effect, tells us to throw away the a priori estimate and use the

    current (measured) value of the output to estimate the state. This is made clear by substitution


    into equation 6. Again, this is as expected -- if the a priori error is large then we should disregard

    the a priori estimate, and instead use the current measurement of the output to form our

    estimate of the state.

    If the measurement noise, R, is very large, k is again very small, so we disregard the current

    measurement in forming the new estimate. This is as expected -- if the measurement noise is

    large, then we have low confidence in the measurement and our estimate will depend more upon

    the previous estimates.
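
    These three limiting cases are easy to check numerically. The sketch below assumes that Equation 16

    has the form k = h p / (h^2 p + R), with p the a priori covariance, which is the form the three cases

    imply; the function name kalman_gain and the test values are hypothetical.

```python
def kalman_gain(p_prior, h, R):
    """Gain computed from the a priori covariance, the gain h, and the noise covariance R."""
    return h * p_prior / (h * h * p_prior + R)


h = 2.0
k_small_p = kalman_gain(p_prior=1e-9, h=h, R=1.0)  # very small a priori error
k_large_p = kalman_gain(p_prior=1e9, h=h, R=1.0)   # very large a priori error
k_large_R = kalman_gain(p_prior=1.0, h=h, R=1e9)   # very noisy measurement

# The first and third gains are close to zero; the second is close to 1/h.
print(k_small_p, k_large_p, k_large_R)
```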

    Finding the a priori covariance

    Finding the a priori covariance is straightforward starting with its definition.

    The middle term drops out as before because the process noise is uncorrelated with previous

    values of either the state or its a priori estimate.

    Equation 17


    so

    $$p_j^- = a^2 p_{j-1} + Q$$

    Equation 18
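
    The steps leading to Equation 18 can be summarized as follows. This is a sketch under the

    definitions used so far, writing $e_{j-1} = x_{j-1} - \hat{x}_{j-1}$ for the previous a posteriori

    error (a symbol assumed here) and $p_{j-1}$ for its covariance.

```latex
\begin{aligned}
e_j^- &= x_j - \hat{x}_j^-
       = (a x_{j-1} + b u_j + w_j) - (a \hat{x}_{j-1} + b u_j)
       = a\, e_{j-1} + w_j \\
p_j^- &= E\!\left[(e_j^-)^2\right]
       = a^2 E\!\left[e_{j-1}^2\right] + 2a\, E\!\left[e_{j-1} w_j\right] + E\!\left[w_j^2\right]
       = a^2 p_{j-1} + Q
\end{aligned}
```

    The input term cancels exactly, and the middle term is zero because $w_j$ is uncorrelated with

    everything that happened before time j.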

    We are still not finished, however, because we need an expression for pj, the a posteriori covariance.

    Finding the a posteriori covariance

    As with the a priori covariance, we find the a posteriori covariance by starting with its definition.

    Equation 19

    The middle term drops out as before because the measurement noise is uncorrelated with the

    current values of either the state or its a priori estimate.

    Equation 20


    So

    Equation 21

    We can simplify this by using our previous definition for k (Equation 16 rearranged)

    Equation 22

    Substituting Equation 22 into Equation 21 yields

    $$p_j = p_j^- (1 - h k)$$

    Equation 23
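
    For reference, the simplification can be sketched in a few lines. It assumes the a posteriori

    error is written as $e_j = (1 - hk)\,e_j^- - k\,v_j$, which follows from substituting Equation 3 into

    Equation 6, and it uses the rearranged gain expression $k\,(h^2 p_j^- + R) = h\,p_j^-$.

```latex
\begin{aligned}
p_j &= E\!\left[e_j^2\right] = (1 - h k)^2\, p_j^- + k^2 R \\
k R &= h\, p_j^- - k\, h^2 p_j^- = h\, p_j^- (1 - h k) \\
p_j &= (1 - h k)^2\, p_j^- + k\, h\, p_j^- (1 - h k)
     = (1 - h k)\, p_j^- \left[(1 - h k) + h k\right]
     = p_j^- (1 - h k)
\end{aligned}
```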

    Review of Pertinent Results

    Any Kalman filter operation begins with a system description consisting of gains a, b and h. The

    state is x, the input to the system is u, and the output is z. The time index is given by j.

    The process has two steps, a predictor step (which calculates the next estimate of the state based

    only on past measurements of the output), and a corrector step (which uses the current value of

    the output to refine the result given by the predictor step).


    Predictor Step

    We form the a priori state estimate based on the previous estimate of the state and the current

    value of the input.

    $$\hat{x}_j^- = a \hat{x}_{j-1} + b u_j$$

    We can now calculate the a priori covariance

    $$p_j^- = a^2 p_{j-1} + Q$$

    Note that these two equations use previous values of the a posteriori state estimate and

    covariance. Therefore the first iteration of a Kalman filter requires estimates (which are often just

    guesses) of these two variables. The exact estimate is often not important as the values

    converge towards the correct value over time; a bad initial estimate just takes more iterations to

    converge.

    Corrector Step

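    To correct the a priori estimate, we need the Kalman filter gain, k.

    $$k_j = \frac{h p_j^-}{h^2 p_j^- + R}$$

    This gain is used to refine (correct) the a priori estimate to give us the a posteriori estimate.

    $$\hat{x}_j = \hat{x}_j^- + k_j (z_j - h \hat{x}_j^-)$$

    We can now calculate the a posteriori covariance

    $$p_j = p_j^- (1 - h k_j)$$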
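
    The predictor and corrector steps can be collected into a short loop. The following Python sketch

    is one possible arrangement, not the code that accompanies the original examples; the function

    name scalar_kalman, its argument names, and the sample data are assumptions made for illustration.

```python
def scalar_kalman(zs, us, a, b, h, Q, R, x0, p0):
    """Run the predictor/corrector recursion over measurements zs and inputs us.

    x0 and p0 are the initial guesses for the a posteriori state estimate
    and covariance. Returns the a posteriori estimates, covariances, and gains.
    """
    x_post, p_post = x0, p0
    estimates, covariances, gains = [], [], []
    for z_j, u_j in zip(zs, us):
        # Predictor step: a priori estimate and covariance.
        x_prior = a * x_post + b * u_j
        p_prior = a * a * p_post + Q
        # Corrector step: gain, a posteriori estimate and covariance.
        k_j = h * p_prior / (h * h * p_prior + R)
        x_post = x_prior + k_j * (z_j - h * x_prior)
        p_post = p_prior * (1.0 - h * k_j)
        estimates.append(x_post)
        covariances.append(p_post)
        gains.append(k_j)
    return estimates, covariances, gains


# Hypothetical measurements and parameters, for illustration only.
zs = [1.9, 2.6, 3.1, 3.3, 3.6]
us = [1.0] * len(zs)
estimates, covariances, gains = scalar_kalman(
    zs, us, a=0.8, b=0.5, h=1.5, Q=0.01, R=0.1, x0=0.0, p0=1.0
)
```

    Note that the gains and covariances depend only on a, h, Q, R and p0, not on the measurements, so

    they could be computed ahead of time.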

    Notes about the Kalman filter gain, kj.

    If the a priori error is very small, k is correspondingly very small, so our correction is also very

    small. In other words we will ignore the current measurement and simply use past estimates to


    form the new estimate. This is as expected -- if our first estimate (the a priori estimate) is good

    (i.e., with small error) there is very little need to correct it.

    If the a priori error is very large (so that the measurement noise term, R, in the denominator is

    unimportant) then k=1/h. This, in effect, tells us to throw away the a priori estimate and use the

    current (measured) value of the output to estimate the state. This is made clear by substitution

    into equation 6. Again, this is as expected -- if the a priori error is large then we should disregard

    the a priori estimate, and instead use the current measurement of the output to form our

    estimate of the state.

    If the measurement noise, R, is very large, k is again very small, so we disregard the current

    measurement in forming the new estimate. This is as expected -- if the measurement noise is

    large, then we have low confidence in the measurement and our estimate will depend more upon

    the previous estimates.

    Alternate, More Common, Notation

    The notation used in this document was taken from [1]. More common notation is given below.

    Variable             Notation in this Document    More Common Notation
    time variable        j                            k
    state                xj                           x(k)
    system gains         a, b, h                      a, b, h (note: b is often 0)
    input                uj                           u(k) (note: often there is no input)
    output               zj                           z(k)
    gain                 kj                           Kk
    a priori estimate    $\hat{x}_j^-$                $\hat{x}(k|k-1)$

    The notation

    $$\hat{x}(k|k-1)$$

    can be read as "The estimate of x at time k, based on the information from time k-1"; in other

    words, the estimate based only upon the past values of the output, or the a priori estimate. The

    notation

    $$\hat{x}(k|k)$$

    can be read as "The estimate of x at time k, based on the information from time k"; in other words,

    the estimate based on past and current values of the output, or the a posteriori estimate.

    Examples

    Example of estimating a constant (along with Matlab code).

    Example of estimating a first order process (along with Matlab code).
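
    As a stand-in for the first example, here is a short Python sketch of estimating a constant. It is

    not the Matlab code referred to above. It assumes a = 1, b = 0, h = 1 and Q = 0, Gaussian measurement

    noise, and hypothetical values for the constant, the noise level, and the initial guesses.

```python
import random

rng = random.Random(1)
true_value = 3.0   # hypothetical constant to be estimated
R = 0.5 ** 2       # measurement noise variance (standard deviation 0.5)
zs = [true_value + rng.gauss(0.0, 0.5) for _ in range(200)]

# Constant state: a = 1, b = 0, h = 1, Q = 0.
x_hat, p = 0.0, 1e6  # vague initial guess with a very large initial covariance
for z in zs:
    p_prior = p                      # a^2 * p + Q with a = 1 and Q = 0
    k = p_prior / (p_prior + R)      # gain with h = 1
    x_hat = x_hat + k * (z - x_hat)  # the a priori estimate equals the previous estimate
    p = p_prior * (1.0 - k)

sample_mean = sum(zs) / len(zs)
print(x_hat, sample_mean, p)
```

    With a very large initial covariance the filter output is essentially the running average of the

    measurements, and the covariance falls off as R divided by the number of measurements.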

    Going further

    A matrix based (higher order system) Kalman filter is a simple extension of the scalar case

    presented here. The results are given here; a full description of the mathematics can be found in

    reference [3].
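
    For orientation, the standard matrix form of the recursion is sketched below. The symbols A, B and H

    (matrix counterparts of a, b and h), the covariance matrices Q and R, and the gain matrix K are the

    usual textbook choices and are assumed here; they are not necessarily the notation of reference [3].

```latex
\begin{aligned}
\hat{x}_j^- &= A\, \hat{x}_{j-1} + B\, u_j \\
P_j^- &= A\, P_{j-1} A^{T} + Q \\
K_j &= P_j^- H^{T} \left( H P_j^- H^{T} + R \right)^{-1} \\
\hat{x}_j &= \hat{x}_j^- + K_j \left( z_j - H \hat{x}_j^- \right) \\
P_j &= \left( I - K_j H \right) P_j^-
\end{aligned}
```

    Setting every matrix to a 1-by-1 scalar recovers the scalar equations of the review section.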


    Is the Kalman Filter Optimal?

    This document is split into several sections:

    Defining the Problem

    Finding the Constants

    Reconciling with Previous Document

    Defining the Problem

    In the previous document we assumed that the best linear estimate for the state, xj, was given by

    where

    The question to be answered is: Can we prove that the second statement is true?

    If we want to estimate the state we can use only the three quantities that we know, the previous

    estimate, the current input and the current measured output. We use these three variables to

    form a linear estimate of the state:

    where aj and bj are two unknowns to be chosen to minimize the error between the value of the

    state and its estimate. In other words we want to minimize the expected value of the error ej with

    respect to the variables aj and bj.


    Finding the Constants

    To do the minimization with respect to each variable we simply differentiate and set the result to

    zero. At this point we will only try to fin

    which can be rewritten

    These last two expressions are often referred to as the orthogonality conditions; i.e., the error is

    orthogonal to the previous estimated state, the current input and the current value of the

    measured output.

    Let's use the first condition to find an expression for aj that minimizes the expected value of the

    error. If we add and subtract ajxj-1 from the equation (why we do this will become clear shortly),

    we get:

    Now we can use the facts that


    to write

    Note that because of the orthogonality relationships the first term on the right can be rewritten as

    We also know that the previous estimate is uncorrelated with the current value of the

    measurement noise:

    So we can simplify the equation to the following

    This is a complicated expression that we can use a bit later, but first we need to derive one more

    expression. By following the same sequence of steps as is done above (but starting with the

    second equation in which we set the derivative to zero), it is easily shown that


    We can rewrite the last two equations

    or, in matrix form

    For a matrix equation

    we know that either

    So for the matrix equation above, either


    Reconciling with Previous Document

    Now recall Equation 6 from the previous document

    $$\hat{x}_j = \hat{x}_j^- + k_j (z_j - h \hat{x}_j^-)$$

    Equation 6

    From the previous document we know that the a priori estimate of the state is given by

    and if we let

    we can rewrite our last equation (at the end of the previous section)

    which matches Equation 6.
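
    The reconciliation can be sketched in a few lines. This is an illustration under stated assumptions,

    not a reproduction of the equations above: it assumes the general linear estimate has the form

    shown in the first line, with the two unknown coefficients written as $\alpha_j$ and $\beta_j$, and

    that the measurement enters with gain $k_j$.

```latex
\begin{aligned}
\hat{x}_j &= \alpha_j\, \hat{x}_{j-1} + \beta_j\, u_j + k_j\, z_j \\
e_j = x_j - \hat{x}_j
  &= (1 - h k_j)\, a\, (x_{j-1} - \hat{x}_{j-1})
   + \left[(1 - h k_j)\, a - \alpha_j\right] \hat{x}_{j-1}
   + \left[(1 - h k_j)\, b - \beta_j\right] u_j
   + (1 - h k_j)\, w_j - k_j\, v_j \\
\alpha_j &= a\,(1 - h k_j), \qquad \beta_j = b\,(1 - h k_j) \\
\hat{x}_j &= (1 - h k_j)\,(a\, \hat{x}_{j-1} + b\, u_j) + k_j\, z_j
           = \hat{x}_j^- + k_j\,(z_j - h\, \hat{x}_j^-)
\end{aligned}
```

    The third line is the choice that removes the dependence of the error on the previous estimate and

    on the input; with it, the general linear estimate collapses to the form of Equation 6.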

    We have shown that the Kalman filter represents the optimal linear filter. The other document

    goes on to derive the optimal value for kj.