Notes on Power Spectral Density (PSD) Estimation Using Matlab

8/19/2019 Notes on Power Spectral Density (PSD) Estimation Using Matlab



I applied three different methods to estimate the power spectral density of the acquired hot-wire signal:

1. Periodogram estimate

2. Welch’s power spectral density estimate

3. Yule-Walker method – Autoregressive power spectral density estimate

These methods can be further classified into two groups: nonparametric methods and parametric methods.

Nonparametric methods

Periodogram

The periodogram is the most basic and complete nonparametric method of transforming the signal from time space to frequency space: it is the direct conversion of the whole record. Although the periodogram is considered an estimation method, it uses every sample in a single transform, so apart from the phase (which is discarded when the magnitude is squared) its output loses no information about the original signal.

Basically, one takes the Fourier transform (discrete in time) of the signal in time space, then takes the square of its magnitude (or multiplies it by its complex conjugate), scales the power properly (folding the spectrum at the Nyquist frequency so that energy is conserved) and normalizes the power by the number of data points (the length of the signal in seconds multiplied by the sampling frequency) and by the sampling frequency. This gives units of (m²/s²)/Hz if the raw data is velocity (m/s).
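As a rough sketch of the recipe above (not the internals of Matlab's own periodogram function), the one-sided estimate can be built directly from fft. The 1 kHz sine is a synthetic stand-in for the hot-wire velocity, and all variable names are illustrative assumptions.

```matlab
% Hand-built one-sided periodogram; x is a synthetic stand-in signal (assumption)
fs = 25000;                      % sampling rate, Hz
N  = 25000;                      % number of samples (1 s record, N even)
t  = (0:N-1)'/fs;                % time vector, s
x  = sin(2*pi*1000*t);           % placeholder "velocity" in m/s

X   = fft(x);                    % discrete Fourier transform
P2  = abs(X).^2 / (N*fs);        % |X|^2 = X.*conj(X); two-sided PSD in (m^2/s^2)/Hz
pxx = P2(1:N/2+1);               % keep DC ... Nyquist
pxx(2:end-1) = 2*pxx(2:end-1);   % fold negative frequencies (DC and Nyquist are not doubled)
f   = (0:N/2)'*fs/N;             % frequency axis, Hz

% Energy conservation: the integral of the PSD equals the mean square of x
meanSquareFromPsd = sum(pxx)*(fs/N);
meanSquareDirect  = mean(x.^2);
```

Comparing meanSquareFromPsd with meanSquareDirect is a quick way to confirm that the scaling by N and fs has been applied correctly.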

Welch’s method

Welch’s method adds a series of preprocessing techniques: segmentation, windowing, and weighting (averaging).

The method does the following:

a. Separate the acquired signal of length N into K segments, each of length L (adjacent segments may overlap).

b. Multiply each segment by a window function (Hamming, for instance).

c. Perform a Fourier transform on “each” windowed segment and take the squared magnitude to form its periodogram.

d. Take the arithmetic mean of these K periodograms.

Something noteworthy is that these actions are performed on the data points alone, with no information involving “time”. In other words, these actions assume that the sampling frequency is 1 Hz. Therefore, it’s necessary to scale the processed data with respect to the actual sampling rate.
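The steps above can be sketched by hand as follows. This is a minimal illustration under stated assumptions, not the pwelch source: the segment length of 4096, the white-noise placeholder signal and all variable names are my own choices.

```matlab
% Welch's method by hand (sketch); white noise stands in for the measured signal
rng(0);                                    % repeatable placeholder data
fs = 25000;  N = 25000;
x  = randn(N,1);

L  = 4096;                                 % segment length (illustrative choice)
D  = L/2;                                  % hop between segments -> 50% overlap
K  = floor((N - L)/D) + 1;                 % number of complete segments
w  = 0.54 - 0.46*cos(2*pi*(0:L-1)'/(L-1)); % Hamming window
U  = sum(w.^2);                            % window energy, used for normalization

P = zeros(L,1);
for k = 1:K
    idx = (k-1)*D + (1:L)';                % samples belonging to segment k
    Xk  = fft(x(idx) .* w);                % steps b and c: window, then transform
    P   = P + abs(Xk).^2 / (U*fs);         % periodogram of this segment, scaled by fs
end
P = P / K;                                 % step d: average over the segments

pxx = P(1:L/2+1);                          % one-sided spectrum, DC ... Nyquist
pxx(2:end-1) = 2*pxx(2:end-1);
f   = (0:L/2)'*fs/L;                       % frequency axis in Hz
```

Dividing by fs inside the loop is the scaling with respect to the actual sampling rate; dropping it would leave the estimate in per-sample units.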

Hamming window algorithm:

$$ w(n) = 0.54 - 0.46\cos\left(\frac{2\pi n}{N}\right), \qquad 0 \le n \le N $$

where the window contains N + 1 points.

To show this, I tested data acquired with the hot-wire film:

Sampling rate: 25000/s

Sampling time: 1 s

Nyquist frequency: 12500 Hz

Maximum data points: 25000

Matlab function:

[pxx,f] = pwelch(x,window,noverlap,f,fs)

Output: pxx is the power spectral density of x, and f is the frequency vector returned by pwelch.

Input: x is the raw signal; window is the number of samples each segment contains (or a vector of window weights); noverlap is the number of samples shared by two adjacent segments; the fourth argument is either the number of points used by the fast Fourier transform buried within pwelch (the common name for this variable is nfft) or a vector f of cyclical frequencies at which the estimate is evaluated; and fs is the sampling rate in Hz.

The fourth argument is the tricky part, because in the documentation both forms are applicable:

[pxx,f] = pwelch(x,window,noverlap,f,fs)

[Pxx,f] = pwelch(x,window,noverlap,nfft,fs)

This is indeed irritating and must be investigated with a few tests.

By default, pwelch separates the data into eight segments with 50% overlap between adjacent segments (more precisely, the longest segments that give as close to eight as possible without exceeding it). This is very important information, and it took me a long time to find it on the Matlab website.

To see how this black box, pwelch, works in practice, let’s start from the simplest form:

[pxx,f] = pwelch(x)



The highest frequency in this Welch estimate is 3.1415…, that is, π! Recall our previous understanding that if no sampling rate is provided, the algorithm assumes it is 1 Hz. In other words, the algorithm would assume it is “1”. If we recover the full frequency range from the Nyquist criterion (twice the highest frequency), it is 2π, not 1. So why is that? Well, the output is actually in units of radians per sample. The algorithm treats the sampling rate as equivalent to 2π radians and proceeds with this information. Therefore, it’s inappropriate to plot this result and a result in Hz on the same figure, because the units of the two are essentially different.
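Converting between the two axes is a pure rescaling, which can be sketched as follows. The flat placeholder spectrum and the variable names are assumptions for illustration; they are not output from pwelch.

```matlab
% Convert a normalized-frequency result (rad/sample) to physical units (sketch)
fs   = 25000;                        % actual sampling rate, Hz
w    = linspace(0, pi, 4097)';       % frequency axis when no fs is given, rad/sample
pxxW = ones(size(w));                % placeholder PSD per rad/sample (not real data)

fHz   = w * fs/(2*pi);               % 2*pi rad/sample corresponds to fs
pxxHz = pxxW * (2*pi/fs);            % power per Hz = power per rad/sample * (2*pi/fs)

% The integrated power is unchanged by the change of units
powerW  = trapz(w,   pxxW);
powerHz = trapz(fHz, pxxHz);
```

Both the frequency axis and the density have to be rescaled; stretching only the axis would change the integrated power.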

The frequency output has 4097 elements; excluding the first point at zero frequency (DC), we have 4096. Recall that if no information regarding segmentation is provided, the algorithm separates the signal into 8 segments with 50% overlap. Therefore, 25000 samples give 5555.555… samples per segment, which is truncated to 5555. For the fast Fourier transform to operate with optimal performance, the number of points processed must be a power of two, 2^P with P an integer, and 2^P must be at least the segment length. It’s quite evident that 8192 (P = 13) is the number we are looking for in this case. Because the signal is real, its spectrum is symmetric about the Nyquist frequency, so only the 8192/2 = 4096 points above DC, up to and including the Nyquist frequency, are kept. Together with the DC point, this explains the mystery of where the 4097 elements came from.

The formula relating the number of segments and the length of each segment (for 50% overlap) is as follows:

$$ L = \frac{(N+1)\,l}{2} $$

where N is the number of segments, l is the length of a segment and L is the length of the raw signal.
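For the record used here, the arithmetic works out as sketched below. The floor of 256 on the FFT length is, as far as I can tell from the Matlab documentation, the documented default for nfft; the variable names are illustrative.

```matlab
% Reproduce the default segmentation arithmetic for the 25000-sample record (sketch)
Nsamples  = 25000;                        % length of the raw signal
K         = 8;                            % target number of segments
segLen    = floor(2*Nsamples/(K + 1));    % 50% overlap: Nsamples = segLen*(K+1)/2
nfft      = max(256, 2^nextpow2(segLen)); % next power of two, at least 256
nOneSided = nfft/2 + 1;                   % points from DC to Nyquist
fprintf('%d %d %d\n', segLen, nfft, nOneSided);   % prints 5555 8192 4097
```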

As you probably noticed, this algorithm separates the entire data into 8 segments, each 5555 elements long in time space, and then performs an 8192-point discrete Fourier transform (DFT) on each so that the fast Fourier transform can be used. The result has 8192 points in frequency space, of which the 4097 from DC to the Nyquist frequency are returned. If this statement confuses you, please recall the definition of the discrete Fourier transform:

$$ X_k = \sum_{n=0}^{l-1} x_n \, e^{-2\pi i k n / N_{\mathrm{FFT}}}, \qquad k = 0, 1, \ldots, N_{\mathrm{FFT}} - 1 $$

where l is 5555 and N_FFT is 8192 in this case.

Here is the irritating fact about the pwelch method:

pwelch “scales” the frequency range of 2π with 8192 points. This implies that the windowed, averaged 5555-point segments in time space are used to represent an 8192-point power spectral density (including DC) in frequency space over the entire sampling frequency range. This is a problem because we are interested in the whole range of frequency. The fewer elements each segment has, the more information is lost in the low-frequency region, since a segment cannot resolve periods longer than itself (although we do reduce the variance). Resolution in the frequency domain is another cost we pay.

Now let’s introduce more input variables:

[Pxx,f] = pwelch(x,[],0,[],fs)

where fs (sampling rate) in our case is 25000/s.

What this does is perform the Welch method with 8 segments (by default), no overlap, and the sampling rate provided.



This example has 2049 elements with a resolution of 25000/4096 = 6.1035 Hz, which is exactly the value of the second element in the frequency output (the first is zero). Each segment has 25000/8 = 3125 samples, and thus 4096 is the right number of points for the fast Fourier transform. The last element of the frequency output is also correct: it’s 12500 Hz. Averaging non-overlapping segments in this way is called Bartlett’s method (strictly speaking, Bartlett’s method uses a rectangular window, whereas pwelch still applies its default Hamming window here).

Next is an example with default overlapping setting:

[Pxx,f] = pwelch(x,[],[],[],fs)

This is the first example “stretched” (scaled) with respect to the sampling rate of 25000/s. It has exactly the same shape as the first example, and it’s physically reasonable because information regarding the sampling rate has been introduced! This is the simplest form of the Welch method and the easiest to understand. It has 4097 elements and the resolution is 25000/8192 = 3.0518 Hz, which again is correct.

To prove this is how pwelch works, I tested the following two calls:

[Pxx,f] = pwelch(x,[],[],[],fs)        % default

[Pxx,f] = pwelch(x,5555,2777,[],fs)    % manual

where 5555 is the number of elements per segment (8 segments in total) and 2777 is 50% of 5555, rounded down.



They match each other perfectly (output file checked!).
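The same check can be scripted rather than read off the output file. This sketch requires the Signal Processing Toolbox and uses white noise as a placeholder for the hot-wire record; the variable names are illustrative.

```matlab
% Default versus manual segmentation (requires the Signal Processing Toolbox)
rng(0);
fs = 25000;
x  = randn(25000,1);                              % placeholder for the hot-wire record

[pDefault,fDefault] = pwelch(x,[],[],[],fs);      % default window and overlap
[pManual, fManual ] = pwelch(x,5555,2777,[],fs);  % 8 segments of 5555, 50% overlap

maxRelDiff = max(abs(pDefault - pManual)) / max(pDefault);
fprintf('points: %d, max relative difference: %g\n', numel(fDefault), maxRelDiff);
```

A relative difference at the level of round-off would indicate that the two calls segment the data identically.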

We can further compare how the number of segments affects the estimate:

As can be seen from the plot, the PSD estimate is significantly smoothed as the number of segments increases, of course at the cost of resolution. Please note that it’s not uncommon to have N (the number of segments) greater than 100.



This leaves us the last variable, f. Based on my understanding of fft in Matlab, a scalar in this position acts as NFFT, which determines the frequency spacing “mathematically”: each segment is zero-padded to NFFT points before the transform. This is the variable that I feel uncomfortable using. One can certainly choose a very large value to extend the low-frequency section, but zero-padding only interpolates between the frequencies the segment can actually resolve, and without the source code it’s hard to justify the validity of the result.
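The effect of a larger NFFT can be checked with fft alone. In this sketch (illustrative segment length, random placeholder data) the zero-padded transform reproduces the unpadded one at every second bin, so the extra points are interpolated values rather than new information.

```matlab
% Zero-padding interpolates the spectrum; it does not add information (sketch)
rng(0);
l    = 512;                        % segment length (illustrative)
seg  = randn(l,1);                 % placeholder segment
X    = fft(seg);                   % l-point DFT
Xpad = fft(seg, 2*l);              % same data zero-padded to 2*l points

% Every second bin of the padded transform coincides with the unpadded one
maxDiff = max(abs(Xpad(1:2:end) - X));
```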

The extended curve at low frequency is the main difference between these two curves.

The best way to remedy this and improve the quality of the Welch estimate is to acquire a much longer record, so that the low-frequency information is retained.

Parametric method

Yule-Walker method

Yule-Walker is a totally different approach to estimating the power spectral density. Unlike the previous two methods, which convert the acquired signal directly, Yule-Walker is an autoregressive method, sometimes called the autocorrelation method. The idea of an autoregressive model is to predict the evolution of a function based on the time history of the function itself.

The autoregressive model is as follows:

$$ x_t = \sum_{k=1}^{p} a_k \, x_{t-k} + \varepsilon_t $$

where x is the function or variable of interest, a_k are the AR coefficients, p is the order of the model and ε is the error, with a mean of zero.



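The algorithm for the Yule-Walker model is as follows:

$$ \begin{bmatrix} R_0 & R_1 & \cdots & R_{p-1} \\ R_1 & R_0 & \cdots & R_{p-2} \\ \vdots & \vdots & \ddots & \vdots \\ R_{p-1} & R_{p-2} & \cdots & R_0 \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_p \end{bmatrix} = \begin{bmatrix} R_1 \\ R_2 \\ \vdots \\ R_p \end{bmatrix} $$

where R is the autocovariance function:

$$ R_k \equiv \frac{1}{N} \sum_{t=k+1}^{N} x_t \, x_{t-k} $$

This system of linear equations is solved for a_1 to a_p, using the fact that the error ε has zero mean and variance σ².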

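The goal of this algorithm is to minimize the error ε by fitting the coefficients a. This is achieved by solving the system of linear equations, in which the autocovariance R is known beforehand from the data. Once a and σ² are established, the power spectral density can be calculated directly:

$$ P(f) = \frac{\sigma^2 / f_s}{\left| 1 - \sum_{k=1}^{p} a_k \exp\left(-2\pi i k f / f_s\right) \right|^2} $$

where f is the frequency in Hz and f_s is the sampling rate.

The derivation of this equation is very tedious, so please refer to a textbook. Please note that this equation applies to any autoregressive model, however its coefficients are estimated.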
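The whole procedure can be sketched by hand as below. It assumes the sign convention x_t = sum_k a_k x_{t-k} + e_t (Matlab's own aryule and pyulear, as I understand them, store the coefficients with the opposite sign, as the polynomial [1, -a_1, ..., -a_p]). The synthetic AR(1) signal and all variable names are illustrative, and this is not the pyulear source.

```matlab
% Yule-Walker AR estimate by hand (sketch); convention x_t = sum_k a_k*x_{t-k} + e_t
rng(0);
fs = 25000;  N = 25000;  p = 2;           % p is the model order (illustrative)
x  = filter(1, [1 -0.5], randn(N,1));     % synthetic AR(1) test signal with a_1 = 0.5
x  = x - mean(x);

R = zeros(p+1,1);                         % biased autocovariance R_0 ... R_p
for k = 0:p
    R(k+1) = sum(x(k+1:N) .* x(1:N-k)) / N;
end

a      = toeplitz(R(1:p)) \ R(2:p+1);     % solve the Yule-Walker system for a_1 ... a_p
sigma2 = R(1) - a' * R(2:p+1);            % variance of the error term

nfft = 256;
f    = (0:nfft/2)' * fs/nfft;             % frequency axis, Hz
A    = 1 - exp(-2i*pi*f*(1:p)/fs) * a;    % 1 - sum_k a_k*exp(-2*pi*i*k*f/fs)
pxx  = (sigma2/fs) ./ abs(A).^2;          % two-sided density evaluated on 0 ... fs/2
pxx(2:end-1) = 2*pxx(2:end-1);            % fold to a one-sided PSD
```

For this test signal the fitted a(1) should come out close to 0.5 and a(2) close to zero, which gives a simple sanity check before applying the same steps to measured data.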

The following is the syntax used in Matlab:

[Pxx,f] = pyulear(x,p,nfft,fs)

As with pwelch, nfft and fs together determine the frequency axis: nfft sets the number of frequency points and fs sets the range.



This figure is for the case of fixed NFFT = 256 (the default) and varying model order. Many papers I have read argue that the Yule-Walker method requires a higher order p to get a better result (including one of the papers cited by Bruno). I cannot distinguish any significant difference with the current data set; perhaps the number of samples is insufficient to see this difference.

Summary

Based on my experience with data acquisition and signal processing so far, I would stick to nonparametric methods instead of entering the world of autoregressive models, because I lack training in statistics.

It’s very hard to say which of the methods is superior. Nevertheless, the Welch method is widely accepted and understood in the fluid mechanics community, while autoregressive modelling is being adopted by the Poinsot/Veynante group and others.
