The General Data Assimilation Problem

Ronald Errico

Goddard Earth Sciences Technology and Research Center at Morgan State University

and

Global Modeling and Assimilation Office at NASA Goddard Space Flight Center

Outline

1. Definition of the general DA problem
2. DA as an application of Bayes’s Theorem
3. Interpretation of the background
4. The “observation operator” or “forward model”
5. The dynamical balance problem
6. Recommendations

The General Problem

Produce an estimate of the state of some system (e.g., the ocean or atmosphere) consistent with:

1. observations,
2. known physical relationships,
3. prior information,

accounting for statistics of possible errors in each.

The Character of Information

A. Observations are:
   1. always imperfect, sometimes with gross errors
   2. often indirect, or highly processed
   3. usually inadequate to completely define what we want

B. Physics is useful to:
   1. provide constraints
   2. relate what is observed to what is analyzed
   3. interpolate or extrapolate information

C. Background (prior) information is required to:
   1. account for previously analyzed observations
   2. provide additional “knowns” to determine all “unknowns”
   3. fill observational voids

“The most general way of describing information:”

As a probability density function (PDF or pdf)

Bayes’s Theorem (1763)
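
The slide's equation did not survive extraction. For reference, the standard statement of the theorem is given below, in notation assumed here rather than taken from the slide: x is the state to be analyzed and y is the set of observations.

```latex
p(\mathbf{x} \mid \mathbf{y})
  = \frac{p(\mathbf{y} \mid \mathbf{x})\, p(\mathbf{x})}{p(\mathbf{y})},
\qquad
p(\mathbf{y}) = \int p(\mathbf{y} \mid \mathbf{x})\, p(\mathbf{x})\, d\mathbf{x}
```

Read for DA: p(x) is the prior (background) pdf, p(y|x) is the pdf of the observations given the state, and p(x|y) is the analysis pdf. The denominator only normalizes the result.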

PDFs of the Information in our DA problem

Interpretation of the background

1. The background is an estimate of the state to be analyzed prior to consideration of any new observations.
2. It is defined in the same state space as the analysis and thus presents a complete description of the state to be analyzed.
3. The best background is generally provided by a temporal extrapolation of the most recent past analysis to the new analysis time using a good, physically based forecast (e.g., NWP) model.
4. This background can be considered an estimate valid at the analysis time based on all observations considered during the recent past and based on our physical understanding of the system analyzed.
5. For the above reason, the background is generally a better estimate of the state to be analyzed than is provided by many new observations.
6. Mathematically, the background is the same as an observation, but with an observation operator that is the identity operator; i.e., it is just another piece of information to be considered.
7. Due to the dynamics of the forecast model, background errors are generally correlated in space and time.

The observation operator H

1. Generally, we do not observe where or what we want to analyze.
2. We thus need to relate the observation to what we want to analyze, using some quantitative relationship that may be either statistically or physically formulated.
3. This relationship is called an “observation operator” or “forward model”, y=H(x).
4. Two examples of H are spatial interpolation and radiative transfer algorithms.
5. In general, these operators are imperfect: even if x=truth, H(x) would not yield the true y.
6. The difference y(true)-H(x(true)) is termed the representativeness error.
7. It is best considered as the error in the formulation of the observation operator, e.g., in the interpolation and radiative transfer algorithms.
8. Analysis at lower resolutions will tend to yield greater spatial interpolation errors and thus greater representativeness errors.
9. The representativeness error is often larger than the instrument error.
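
The points about spatial interpolation and resolution can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration: the sinusoidal `truth` field, the observation location 0.23, the two grid sizes, and the helper name `interpolation_operator`. H is taken to be simple linear interpolation from the analysis grid to the observation location.

```python
import numpy as np

def truth(s):
    """Hypothetical continuous true field, used only for illustration."""
    return np.sin(2.0 * np.pi * s)

def interpolation_operator(grid, obs_location):
    """Return H, where H(x) linearly interpolates gridded state x to obs_location."""
    def H(x):
        return np.interp(obs_location, grid, x)
    return H

obs_location = 0.23               # where the instrument sits (assumed)
y_true = truth(obs_location)      # error-free value of the observed quantity

errors = {}
for n_points in (11, 41):         # a coarse and a fine analysis grid
    grid = np.linspace(0.0, 1.0, n_points)
    x_true = truth(grid)          # true state sampled on the analysis grid
    H = interpolation_operator(grid, obs_location)
    errors[n_points] = y_true - H(x_true)   # y(true) - H(x(true))

for n_points, err in errors.items():
    print(f"{n_points:3d} grid points: representativeness error = {err:.5f}")
```

Even with a perfect state on the grid and a perfect instrument, H(x) misses the true y, and in this sketch the miss is larger on the coarser grid.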

A Bayesian Example

[Figure: a Bayesian example plotted in the (T1, T2) plane. Panels: prior pdf; observation and model pdfs; analysis pdf (shown twice). Other labels: Model, R, q, T. Source: Errico et al., QJRMS, 2000.]

Gaussian PDFs

Solution to the analysis problem for Gaussian errors
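
The equations for this slide were lost in extraction. A standard form of the result is sketched below under stated assumptions: background and observation errors are unbiased, Gaussian, and mutually uncorrelated, with covariance matrices written here as B and R, and with x_b denoting the background. These symbols are the conventional ones, not confirmed from the slide.

```latex
p(\mathbf{x} \mid \mathbf{y}) \propto \exp\!\left[-J(\mathbf{x})\right],
\qquad
J(\mathbf{x}) =
  \tfrac{1}{2}\,(\mathbf{x}-\mathbf{x}_b)^{\mathrm{T}} \mathbf{B}^{-1} (\mathbf{x}-\mathbf{x}_b)
  + \tfrac{1}{2}\,\bigl(\mathbf{y}-H(\mathbf{x})\bigr)^{\mathrm{T}} \mathbf{R}^{-1} \bigl(\mathbf{y}-H(\mathbf{x})\bigr)
```

The most probable state is the x that minimizes J, which is the starting point for variational schemes.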

Solution for linear H
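
When H is linear and the errors are Gaussian, the analysis has a closed form. The sketch below uses conventional notation that is assumed here rather than taken from the slide: x_a = x_b + K (y - H x_b) with K = B Hᵀ (H B Hᵀ + R)⁻¹, and analysis error covariance A = (I - K H) B. The function name and the two-variable numbers are hypothetical.

```python
import numpy as np

def linear_gaussian_analysis(x_b, B, y, H, R):
    """Analysis mean and error covariance for linear H and Gaussian errors."""
    innovation = y - H @ x_b                  # observation minus background
    S = H @ B @ H.T + R                       # innovation covariance
    K = np.linalg.solve(S, H @ B).T           # gain B H^T S^-1 (B and S symmetric)
    x_a = x_b + K @ innovation
    A = (np.eye(len(x_b)) - K @ H) @ B        # analysis error covariance
    return x_a, A

# Two state variables, only the first of which is observed (illustrative numbers).
x_b = np.array([10.0, 5.0])                   # background
B = np.array([[4.0, 2.0],
              [2.0, 4.0]])                    # correlated background errors
H = np.array([[1.0, 0.0]])                    # observe the first variable only
R = np.array([[1.0]])                         # observation error variance
y = np.array([12.0])

x_a, A = linear_gaussian_analysis(x_b, B, y, H, R)
print("analysis:", x_a)
print("analysis error covariance:\n", A)
```

The unobserved second variable is also corrected, purely through the off-diagonal term of B, and the analysis variance of the observed variable falls below both the background and observation variances.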

Reasonableness of the unbiased Gaussian assumption

1. Real errors are likely biased and non-Gaussian to some degree.

2. If observations having gross errors can be identified and eliminated through quality control, errors for the remaining observations may be more approximately Gaussian.

3. If observational error biases can be estimated well, they can be eliminated by adding a correction to all observation values.

4. Many observations are a result of an averaging process. If this effectively averages errors, the net error will tend to be Gaussian according to the central limit theorem of statistics.
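
The central limit argument in point 4 can be checked numerically. The sketch below assumes, for illustration only, that individual errors are uniformly distributed (clearly non-Gaussian) and that each reported observation averages 16 independent samples. Excess kurtosis, which is zero for a Gaussian, serves as a rough measure of non-Gaussianity.

```python
import numpy as np

def excess_kurtosis(sample):
    """Fourth standardized moment minus 3; zero for a Gaussian."""
    centered = sample - sample.mean()
    return np.mean(centered**4) / np.mean(centered**2) ** 2 - 3.0

rng = np.random.default_rng(0)
n_obs, n_avg = 200_000, 16                    # illustrative sample sizes

raw = rng.uniform(-1.0, 1.0, size=(n_obs, n_avg))   # non-Gaussian raw errors
single = raw[:, 0]                            # error of one unaveraged sample
averaged = raw.mean(axis=1)                   # error after averaging 16 samples

k_single = excess_kurtosis(single)
k_avg = excess_kurtosis(averaged)
print(f"excess kurtosis, single sample: {k_single:.3f}")
print(f"excess kurtosis, 16-sample mean: {k_avg:.3f}")
```

The averaged errors sit much closer to zero excess kurtosis than the raw ones. The argument relies on the individual errors being independent; correlated errors average out far less effectively.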

Implications of the Bayesian Approach

1. Unless the underlying distributions are simple, the problem is computationally intractable for large problems.
2. We see how the different information should be optimally combined.
3. We see what statistical knowledge is required as input.
4. Results may depend on shapes of distributions, not only their means and variances.
5. We see that selection of a “best” analysis can be somewhat ambiguous.
6. Multi-modality of the PDF can occur, particularly due to model non-linearity.
7. Any analysis has associated error statistics.
8. While an explicit Bayesian approach may be impractical, the Bayesian implications of other techniques should be considered.
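
Points 5 and 6 can be seen in a one-variable sketch that applies Bayes's theorem on a grid. The setup is hypothetical: a Gaussian prior centered on zero and a single observation of x², a nonlinear operator chosen only because it cannot distinguish the sign of x.

```python
import numpy as np

x = np.linspace(-3.0, 3.0, 601)               # grid of candidate states
dx = x[1] - x[0]

prior = np.exp(-0.5 * (x / 2.0) ** 2)         # Gaussian prior, std dev 2 (unnormalized)

y_obs, sigma_o = 1.0, 0.2                     # observed value and its error std dev
likelihood = np.exp(-0.5 * ((y_obs - x**2) / sigma_o) ** 2)   # H(x) = x**2

posterior = prior * likelihood                # Bayes's theorem, up to normalization
posterior /= posterior.sum() * dx

# Locate the strict local maxima (modes) of the posterior.
inner = posterior[1:-1]
is_peak = (inner > posterior[:-2]) & (inner > posterior[2:])
modes = x[1:-1][is_peak]
post_mean = np.sum(x * posterior) * dx

print("posterior modes:", modes)
print("posterior mean:", post_mean)
```

The posterior has two equally likely modes, and its mean falls between them where the probability density is nearly zero, so the mean and the mode give very different "best" analyses.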

Daley 1992

Consideration of Balance: Example of Geostrophic Adjustment

0 hour

Consideration of primitive equations

dg/dt (t=0) = 0

Nonlinear Normal Mode Initialization

Why does balance matter in data assimilation?

1. Large initial imbalances will tend to create less accurate backgrounds

2. Balance can be exploited to relate u, v, T, ps (esp. in extra-tropics)

3. Errors in balanced initial conditions will tend to create balanced background errors, so the error statistics should reflect that; i.e., background errors of u, v, T, ps tend to be correlated, esp. in the extra-tropics.
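
As one concrete instance of using balance to relate mass and wind, the sketch below computes the geostrophic wind from a geopotential height field using the standard relations u_g = -(g/f) ∂Z/∂y and v_g = (g/f) ∂Z/∂x. The constant Coriolis parameter, the grid spacing, the height field, and the function name are all assumptions made for illustration; geostrophy is only the simplest linear balance, not the full nonlinear one.

```python
import numpy as np

G = 9.81        # gravitational acceleration (m s^-2)
F0 = 1.0e-4     # constant mid-latitude Coriolis parameter (s^-1), an f-plane assumption

def geostrophic_wind(Z, dx, dy, f=F0, g=G):
    """Geostrophic wind from height Z[j, i] (m); y runs along axis 0, x along axis 1."""
    dZdy, dZdx = np.gradient(Z, dy, dx)       # spacings given per axis
    u_g = -(g / f) * dZdy
    v_g = (g / f) * dZdx
    return u_g, v_g

dx = dy = 100.0e3                             # 100 km grid spacing
x = np.arange(10) * dx
y = np.arange(8) * dy
X, Y = np.meshgrid(x, y)                      # shape (8, 10)

Z = 5600.0 - 1.0e-4 * Y                       # height falls 100 m per 1000 km northward
u_g, v_g = geostrophic_wind(Z, dx, dy)
print("u_g (m/s):", u_g[0, 0], " v_g (m/s):", v_g[0, 0])
```

A height error with this structure implies a westerly wind error of the corresponding size, which is why background errors in mass and wind fields tend to be correlated.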

Implications of geostrophic adjustment

1. If wind, temperature, and pressure are not considered properly together, unrealistic gravity waves will be propagated and information will not be retained.

2. The problem is generally aggravated at high altitudes where the mass density is low. It is also very scale-dependent, especially vertically.

3. The fundamental balance is nonlinear, which is not straightforward to implement in a linear analysis scheme.

4. Many ways to mitigate this problem have been developed, each with its own advantages and disadvantages and computational issues.

5. Theoretical aspects of the problem are best described in the context of the balancing technique called “nonlinear normal mode initialization.”

6. Beware of claims that balancing is a solved problem!

Character of the DA Problem

1. A well-developed body of theory exists.
   Control theory, inverse modeling, Bayesian analysis.
   It is fundamental and foundational.

2. This theory is currently insufficient.
   The computational demand can be overwhelming.
   The required input statistics are not well known.

3. Gross approximations or unsupported assumptions may be required.
   Although “wrong,” they can be useful.
   Sometimes they create confusion.

4. Many techniques are available.
   Most are similar in a very general sense.
   Results are affected by details.

Basics

1. Fundamentals are foundational.
2. Statistical theory is critical.
3. Quality control is critical.
4. Consideration of covariances is critical.
5. Consideration of dynamic balance is critical.
6. Model error is not negligible.
7. Much model physics is not linear.
8. Model error is probably not white noise.
9. Experience counts!
10. Data assimilation is as much art as science.

Some References

Tarantola, A., 1987: Inverse problem theory: Methods for data fitting and model parameter estimation. Elsevier Science B. V. (See chapter 1 in the 1st edition, which is now out of print.)

Baker, N., 2000: Observation adjoint sensitivity and the adaptive targeting problem. Thesis, Naval Postgraduate School. (Very good explanation of how data assimilation utilizes observations.)

Daley, R., 1991: Atmospheric Data Analysis. Cambridge University Press, 420 pp. (Somewhat dated, but good and accurate presentation of many basics.)

Ghil, M., K. Ide, A. Bennett, P. Courtier, M. Kimoto, M. Nagata, M. Saiki, M. Sato, Eds., 1997: Data assimilation in meteorology and oceanography: Theory and practice. Meteorological Society of Japan, 386 pp. (Several short tutorial-like papers on various aspects of data assimilation.)

Linear Normal-Mode Initialization

Temperton and Williamson 1979

Daley 1991

Structures of two normal modes

g(t=0) = 0

NNMI

Errico 1997

Harmonic Dial for External m=4 Mode, Period = 3.7 h. Panels: Without NNMI; With NNMI

Errico 1997
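
The contrast between the two initialization conditions quoted on these slides, g(t=0) = 0 and dg/dt(t=0) = 0, can be illustrated with a toy single-mode model. This is a deliberately simplified sketch and not the scheme used to produce the figures: one fast mode obeys dg/dt = -iωg + N, where the forcing N from the slow part of the flow is assumed constant and its value is hypothetical. Only the 3.7 h period is taken from the slide.

```python
import numpy as np

period_hours = 3.7                            # period quoted for the external m=4 mode
omega = 2.0 * np.pi / period_hours            # mode frequency (rad per hour)
N = 0.5 + 0.2j                                # hypothetical constant slow forcing

g_balanced = N / (1j * omega)                 # amplitude at which dg/dt = 0

def fast_mode(g0, t):
    """Exact solution of dg/dt = -i*omega*g + N for constant N."""
    return g_balanced + (g0 - g_balanced) * np.exp(-1j * omega * t)

t = np.linspace(0.0, 12.0, 241)               # 12 hours of forecast time

g_linear = fast_mode(0.0, t)                  # linear condition: g(t=0) = 0
g_nnmi = fast_mode(g_balanced, t)             # nonlinear condition: dg/dt(t=0) = 0

# On a harmonic dial (complex plane), the first traces a circle about the
# balanced value, and the second stays at a single point.
print("radius of oscillation, g(0)=0:    ", np.abs(g_linear - g_balanced).max())
print("radius of oscillation, dg/dt(0)=0:", np.abs(g_nnmi - g_balanced).max())
```

In this toy, zeroing the fast mode leaves a persistent gravity-wave oscillation whose amplitude equals the balanced amplitude, whereas starting from the balanced value removes it. In a real model N varies in time and depends on g, so the condition has to be solved iteratively.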
