FFT’s, Ensembles and Correlations


B. A. Grierson∗

Department of Applied Physics and Applied Mathematics, Columbia University, New York, NY 10027

(Dated: August 30, 2006)

Statistics using the FFT and correlation analysis. The FFT (in IDL) takes a time series and attempts to decompose it into a sum of sine and cosine functions. This monograph investigates the FFT as implemented in the programming language IDL. The simple FFT is then extended to ensemble spectra, cross coherence, etc. The correlation function is also examined in relation to the time series created by spatially separated diagnostics.

I. INTRODUCTION

Fourier analysis is the most standard signal processing technique. It represents a function as an integral (or sum) of sine and cosine functions, which form an orthogonal basis. A periodic function with period λ and fundamental wavenumber k = 2π/λ can be represented by

$$f(x) = \frac{A_0}{2} + \sum_{m=1}^{\infty} A_m \cos(mkx) + \sum_{m=1}^{\infty} B_m \sin(mkx) \qquad (1)$$

The coefficients are computed using

$$A_m = \frac{2}{\lambda} \int_0^{\lambda} f(x) \cos(mkx)\, dx \qquad (2)$$

$$B_m = \frac{2}{\lambda} \int_0^{\lambda} f(x) \sin(mkx)\, dx \qquad (3)$$

The Discrete Fourier Transform of an N-element, one-dimensional function is given by

$$F(u) = \frac{1}{N} \sum_{x=0}^{N-1} f(x) \exp[-j 2\pi u x / N] \qquad (4)$$

and is the method most often used numerically.
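As a quick check of Eq. (4), the sum can be evaluated directly. The sketch below is written in Python rather than IDL so that it can be run without an IDL license; the helper name dft and the test signal are illustrative assumptions, and the direct sum is far slower than a real FFT.

```python
import cmath
import math

def dft(f):
    """Direct evaluation of Eq. (4), including the 1/N normalization."""
    n = len(f)
    return [sum(f[x] * cmath.exp(-2j * math.pi * u * x / n) for x in range(n)) / n
            for u in range(n)]

# Hypothetical test signal: 4 full cycles of a unit sine over 32 samples.
N, cycles = 32, 4
signal = [math.sin(2 * math.pi * cycles * x / N) for x in range(N)]
F = dft(signal)

# The amplitude is split between +f0 (index 4) and -f0 (index N-4).
print(abs(F[cycles]), abs(F[N - cycles]))
```

Both printed magnitudes should be close to 0.5, which is the split between ±f0 discussed in the next section.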

II. THE FFT

The FFT is one of the crowning achievements of computational mathematics. It provides a fast, accurate method of extracting the dominant frequencies of a time series. Classically, the Fourier transform of a signal signal=SIN(2*!PI*f0*time) gives delta functions in frequency space. They sit at ±f0 with an amplitude of 0.5 at each peak, and the spectrum is symmetric. The FFT method will be implemented as follows in the IDL language.

• The time series will be represented by the variable signal with corresponding time vector time

• The number of points in the time series will be denoted by nt

• The time step will be defined as dt=time(1)-time(0)

• The Digitizing Frequency is df=1.0/dt

• The Nyquist Frequency is Nyquist=df/2.0

• The signal must have a zero mean, accomplished by signal=signal-MEAN(signal)

• The signal will be windowed using a simple triangular window, which is like [0,. . . ,1,. . . ,0], to eliminate edge effects: window=[FINDGEN(nt/2)/(nt/2),REVERSE(FINDGEN(nt-nt/2))/(nt/2)], which has nt points to match the signal.

Then signal=signal*window


Now the FFT is taken, and this is where it gets tricky. The function result=FFT(X) in IDL returns a complex-valued array, with a real and an imaginary part. For a real signal the real part is symmetric and the imaginary part antisymmetric about zero frequency, so the amplitude is split equally between +f0 and −f0. After the triangular window (whose mean value is about 0.5) has been applied, the magnitude of the peak at each of ±f0 is about 0.25 for a unit-amplitude sinusoid. Because the magnitude is symmetric we only take the positive half. Then we take the absolute value ABS(Z)≡SQRT(Z*CONJ(Z)). This function is multiplied by 4.0 to achieve a power of 1.0 at the peak.

It is of note that we are defining the amount of power at one frequency for a pure, infinite sinusoid with amplitude on [-1,1] to be 1.0. If the sinusoid amplitude is different, then the power will reflect that.

Also, because we are not working with an infinite time series, the FFT cannot resolve frequencies lower than 1/∆t, where ∆t is the length of the time sample, and these values are thrown out (set to zero in the code).
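The steps above can be sketched end to end. The following is a Python stand-in for the IDL procedure, not the author's code: the function name abs_spectrum, the particular triangular window formula, and the test signal (125 Hz sampled at 1 kHz) are assumptions made for illustration, and a slow direct sum replaces the FFT.

```python
import cmath
import math

def dft(f):
    """Direct DFT with 1/N normalization (slow stand-in for an FFT)."""
    n = len(f)
    return [sum(f[x] * cmath.exp(-2j * math.pi * u * x / n) for x in range(n)) / n
            for u in range(n)]

def abs_spectrum(signal, time):
    """Return (frequencies, amplitudes) for the positive-frequency half."""
    nt = len(time)
    dt = time[1] - time[0]
    mean = sum(signal) / nt
    # Triangular window 0 -> 1 -> 0 over nt points (one possible form).
    window = [1.0 - abs(2.0 * i / (nt - 1) - 1.0) for i in range(nt)]
    windowed = [(s - mean) * w for s, w in zip(signal, window)]
    spectrum = dft(windowed)
    freqs = [u / (dt * nt) for u in range(nt // 2)]
    # Factor 4: 2 for the +/- f0 split, 2 for the window mean of about 0.5.
    amps = [4.0 * abs(spectrum[u]) for u in range(nt // 2)]
    # Frequencies at or below 1/(record length) are not resolved; zero them.
    cutoff = 1.0 / (time[-1] - time[0])
    amps = [0.0 if f <= cutoff else a for f, a in zip(freqs, amps)]
    return freqs, amps

# Hypothetical example: a 125 Hz unit sine sampled at 1 kHz for 64 points.
nt, dt, f0 = 64, 1.0e-3, 125.0
time = [i * dt for i in range(nt)]
signal = [math.sin(2 * math.pi * f0 * t) for t in time]
freqs, amps = abs_spectrum(signal, time)
peak = max(range(len(amps)), key=lambda u: amps[u])
print(freqs[peak], amps[peak])
```

The peak should land at 125 Hz with an amplitude slightly below 1.0, since the mean of a finite triangular window is a little less than 0.5.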

III. ABSOLUTE VALUE FFT CODE

The code for this function is included:

FUNCTION MY_ABS_FFT,signal,time,FVALS=fvals,FATMAXP=fatmaxp
; Returns the absolute value of the FFT; frequencies are in kHz
; called by
; fspec=MY_ABS_FFT(sig,time,FVALS=fvals,FATMAXP=fmax)
nt=N_ELEMENTS(time)
delT=time(nt-1)-time(0) ;length of the time sample
dt=time(1)-time(0) ;time step
lowFreqCutoff=1.0/delT/1000.0 ;in kHz
window=[FINDGEN(nt/2)/(nt/2),REVERSE(FINDGEN(nt-nt/2))/(nt/2)] ;triangular, nt points
fvalues=FINDGEN(nt/2)/(dt*nt)/1000.0 ;in kHz
fvals=fvalues ;pass the frequency axis back through FVALS
fsignal=signal-MEAN(signal) ;remove the mean
fsignal=fsignal*window ;window the zero-mean signal
ff=FFT(fsignal,-1)
ff=ABS(ff) ;Comment out this line for MY_FULL_FFT
ff2=4.0*ff(0:nt/2-1) ;positive half, scaled to a peak of 1.0
ff2(WHERE(fvalues LE lowFreqCutoff))=0.0*ff2(WHERE(fvalues LE lowFreqCutoff))
fatmaxp=fvalues(WHERE(ff2 eq MAX(ff2))) ;in kHz
return,ff2
END

IV. ENSEMBLE AVERAGES

When one is interested in the statistics of a fluctuating quantity over a long period of time, it can be advantageous to examine ‘ensemble averages’ of functions like the FFT. An ensemble average of a quantity A (usually denoted ⟨A⟩) can be expressed as

$$\langle A \rangle = \frac{1}{M} \sum_{i=1}^{M} A_i \qquad (5)$$

where A_i is the i-th of M realizations of A, which can be a function of one or more variables. An ensemble can be calculated ‘in place’ or ‘after the fact’. I prefer the latter, because it allows contour plots of the function in time.

For example, the ensemble spectrum of a time series can be computed in two equally valid ways. Say we have a time series on [0,1] seconds. We could perform 100 FFTs, add them up, and divide by 100. This would be accomplished by the following code.

delT=0.01
tStart=0.0
tStop=delT
FOR i=0,99 DO BEGIN
  frameTime=time(WHERE(time GE tStart AND time LE tStop))
  frameSignal=signal(WHERE(time GE tStart AND time LE tStop))
  frameFFT=MY_ABS_FFT(frameSignal,frameTime,FVALS=fvals)
  IF i EQ 0 THEN ensembleFFT=frameFFT/100.0 ELSE ensembleFFT+=frameFFT/100.0
  tStart=tStart+delT
  tStop=tStop+delT
ENDFOR
END

This is a completely valid method, but I prefer to store each FFT in a rectangular grid rather than adding to a vector. Storing each FFT and contour plotting creates a spectrogram, or TFD (time-frequency domain plot), where the power is the z-axis of the contour. The spectrogram (Fig.1) will reveal any frequency sweeping, which is a time evolution of the frequency where the power of the signal resides. One example of such a phenomenon is a Hot Electron Interchange burst in a dipole-confined plasma. The mode creates quickly rising tones, which the ensemble spectrum would ‘smear out’ when adding.

Once the TFD grid has been formed, the ensemble spectrum can be formed by taking the mean of the FFTs at each frequency, i.e., integrating in time. Furthermore, the grid can be refined by taking steps in time smaller than the time window delT, which creates quite beautiful plots.
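The grid-then-average approach can be sketched as follows, again in Python as a stand-in for IDL. Everything specific here is an assumption for illustration: the helper amplitude_spectrum, the frame length, and a synthetic signal whose tone jumps from one frequency bin to another halfway through the record.

```python
import cmath
import math

def amplitude_spectrum(frame):
    """4*|DFT| of a zero-mean, triangularly windowed frame (positive half)."""
    n = len(frame)
    mean = sum(frame) / n
    w = [1.0 - abs(2.0 * i / (n - 1) - 1.0) for i in range(n)]
    x = [(s - mean) * wi for s, wi in zip(frame, w)]
    return [4.0 * abs(sum(x[k] * cmath.exp(-2j * math.pi * u * k / n)
                          for k in range(n)) / n)
            for u in range(n // 2)]

# Hypothetical signal: the tone jumps from bin 4 to bin 10 halfway through.
frame_len, n_frames = 32, 8
signal = []
for i in range(frame_len * n_frames):
    cycles = 4 if i < frame_len * n_frames // 2 else 10
    signal.append(math.sin(2 * math.pi * cycles * i / frame_len))

# TFD grid: one row per time frame, one column per frequency bin.
tfd = [amplitude_spectrum(signal[j * frame_len:(j + 1) * frame_len])
       for j in range(n_frames)]

# Ensemble spectrum: mean over the time axis of the grid.
ensemble = [sum(row[u] for row in tfd) / n_frames for u in range(frame_len // 2)]

# Frequency bin holding the peak in each frame.
peaks = [max(range(len(row)), key=lambda u: row[u]) for row in tfd]
print(peaks)
print(round(ensemble[4], 2), round(ensemble[10], 2))
```

The per-frame peaks show when the tone moves, while the ensemble spectrum only shows two peaks of roughly half height, which is the ‘smearing’ described above.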

[Figure 1: contour plot; axes Time (s) and Freq (kHz); color scale Fourier Amplitude.]

FIG. 1: A time-frequency domain (TFD) contour plot, consisting of individual FFTs placed in a rectangular array.

V. PHASE INFORMATION

When one wishes to extract phase information when comparing two signals, the full FFT is needed. The phase between two signals is defined as

$$\alpha = \tan^{-1}\left(\frac{\Im(Z)}{\Re(Z)}\right) \qquad (6)$$

Numerically, Z is a function of frequency, and not a simple ordered pair of values. Therefore α is a function in frequency space. In order to extract the phase difference between two signals, we need a way to pick out which values of α(f) correspond to a high correlation in the frequency domain. This is accomplished as follows.

Two signals, signal1 and signal2, are needed to extract phase information. We also need the full FFT spectra for each signal, which means taking out the ABS( ) in the FFT code. This retains the real and imaginary part of each spectrum.

Next the cross power frequency spectrum (Z) is formed by specCorr=spectrum1*CONJ(spectrum2). The phase function is formed by phaseFun=ATAN(IMAGINARY(specCorr),REAL_PART(specCorr)); the code below applies an overall minus sign to this expression, which only sets the sign convention of the phase. The code for extracting the phase information from two signals is given below.


VI. PHASE CODE

FUNCTION GET_PHASE,signal1,signal2,time,$
FVALS=fvals,$
ABSSPECCORR=AbsPowerCorr,$
PHVAL=phaseValue,$
PHASEVALX=phaseValueX,$
PHASEVALAMP=phaseValueAmp

dt=time(1)-time(0)
delT=time(N_ELEMENTS(time)-1)-time(0)
lowFreqCutoff=1.0/delT/1000.0 ;In kHz

; MY_FULL_FFT is MY_ABS_FFT with the ABS( ) line commented out
spec1=MY_FULL_FFT(signal1,time,FVALS=fvals)
spec2=MY_FULL_FFT(signal2,time,FVALS=fvals)

spec1(WHERE(fvals LE lowFreqCutoff))=0.0*spec1(WHERE(fvals LE lowFreqCutoff))
spec2(WHERE(fvals LE lowFreqCutoff))=0.0*spec2(WHERE(fvals LE lowFreqCutoff))

powerCorr=spec1*CONJ(spec2) ;cross power spectrum
AbsPowerCorr=ABS(powerCorr)
phaseFun=-ATAN(IMAGINARY(powerCorr),REAL_PART(powerCorr)) ;radians
gdIndex=WHERE( AbsPowerCorr EQ MAX(AbsPowerCorr) ) ;strongest cross power
phaseValue=phaseFun(gdIndex)
phaseValueX=fvals(gdIndex)
phaseValueAmp=MAX(AbsPowerCorr) ;cross power amplitude at the peak

RETURN,phaseFun
END

The code will return the value of the phase between the signals, in radians, to the variable phaseValue, occurring at frequency phaseValueX (named so for plotting purposes), with cross power amplitude phaseValueAmp. These are the three variables used to track phase shifts in time and record their relative strengths.

In practice, when signals are not pure sinusoids, an average needs to be taken of the phase values near where the power correlation is a maximum. Therefore, one could grab the phase values where the magnitude of powerCorr is greater than 90% of its maximum value, and use an average of those values to define the phase.
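A minimal Python sketch of this thresholded average is given below, as a stand-in for the IDL routine. The signals, the 0.6 rad shift, and all names are hypothetical. No window is applied, and the phase is taken as atan2(imaginary, real) without the extra minus sign used in GET_PHASE, so the IDL routine would report the opposite sign.

```python
import cmath
import math

def dft(f):
    """Direct DFT with 1/N normalization (slow stand-in for an FFT)."""
    n = len(f)
    return [sum(f[x] * cmath.exp(-2j * math.pi * u * x / n) for x in range(n)) / n
            for u in range(n)]

# Hypothetical signals: signal2 lags signal1 by 0.6 rad at the same frequency.
n, cycles, shift = 64, 5, 0.6
signal1 = [math.sin(2 * math.pi * cycles * i / n) for i in range(n)]
signal2 = [math.sin(2 * math.pi * cycles * i / n - shift) for i in range(n)]

half = n // 2
spec1, spec2 = dft(signal1)[:half], dft(signal2)[:half]
cross = [a * b.conjugate() for a, b in zip(spec1, spec2)]   # cross power spectrum
power = [abs(c) for c in cross]
phase = [cmath.phase(c) for c in cross]   # atan2(imag, real), no extra sign

# Average the phase over bins holding at least 90% of the peak cross power.
threshold = 0.9 * max(power)
selected = [u for u in range(half) if power[u] >= threshold]
mean_phase = sum(phase[u] for u in selected) / len(selected)
print(selected, mean_phase)
```

A plain arithmetic mean of phases is only safe when the selected values do not straddle ±π; otherwise the complex cross power itself should be averaged before taking the angle.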

If the process being analyzed is quasi-stationary, then new, more sophisticated tools need to be used to extract the phase shift and other relevant statistics in time. We define the following notation and statistical quantities:

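Fast Fourier Transform:

$$S(t) \xrightarrow{\mathrm{FFT}} S(\omega) \qquad (7)$$

Power Correlation:

$$C_{1,2}(\omega) \equiv S_1 S_2^{*} \qquad (8)$$

Cross Phase:

$$\alpha_{1,2} \equiv \tan^{-1}\left(\frac{\Im[C_{1,2}(\omega)]}{\Re[C_{1,2}(\omega)]}\right) \qquad (9)$$

Power Weighted Ensemble Cross Phase:

$$\langle \alpha_{1,2} \rangle \equiv \frac{\int \alpha_{1,2}(\omega)\, |C_{1,2}(\omega)|\, d\omega}{\int |C_{1,2}(\omega)|\, d\omega} \qquad (10)$$

Power Weighted Ensemble Cross Coherence:

$$\langle \gamma_{1,2}^{2} \rangle \equiv \frac{\langle |C_{1,2}^{2}| \rangle}{\langle |S_1| \rangle \langle |S_2| \rangle} \qquad (11)$$

Bispectrum:

$$B(\omega_1, \omega_2) = S(\omega_1)\, S(\omega_2)\, S^{*}(\omega_1 + \omega_2) \qquad (12)$$

Bicoherence:

$$b^{2}(\omega_1, \omega_2) = \frac{\langle |B(\omega_1, \omega_2)|^{2} \rangle}{|\langle S(\omega_1) S^{*}(\omega_1) \rangle|^{2}\, |\langle S(\omega_1, \omega_2) \rangle|^{2}} \qquad (13)$$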
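As an illustration of Eq. (10), the integrals can be replaced by sums over the discrete frequency bins (the uniform dω cancels between numerator and denominator). The Python sketch below is a stand-in for IDL; the two-tone signals, their phase lags, and the helper names are hypothetical.

```python
import cmath
import math

def dft(f):
    """Direct DFT with 1/N normalization (slow stand-in for an FFT)."""
    n = len(f)
    return [sum(f[x] * cmath.exp(-2j * math.pi * u * x / n) for x in range(n)) / n
            for u in range(n)]

def tones(n, parts):
    """Sum of sines; parts is a list of (amplitude, cycles, phase_lag)."""
    return [sum(a * math.sin(2 * math.pi * c * i / n - p) for a, c, p in parts)
            for i in range(n)]

# Hypothetical pair: a strong tone lagging by 0.4 rad, a weak one by 1.0 rad.
n = 64
s1 = tones(n, [(1.0, 3, 0.0), (0.5, 7, 0.0)])
s2 = tones(n, [(1.0, 3, 0.4), (0.5, 7, 1.0)])

half = n // 2
cross = [a * b.conjugate() for a, b in zip(dft(s1)[:half], dft(s2)[:half])]
weights = [abs(c) for c in cross]                 # |C_{1,2}(omega)|

# Discrete form of Eq. (10): phases weighted by cross power.
weighted_phase = (sum(cmath.phase(c) * w for c, w in zip(cross, weights))
                  / sum(weights))
print(weighted_phase)
```

The result sits between the two phase lags and closer to that of the stronger tone, because the weights are the cross power magnitudes.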


VII. CORRELATION ANALYSIS

The correlation function is a function of one signal (auto) or between two signals (cross) which allows extraction of the lag time and the correlation time. The method integrates the product of one signal (the reference signal) and another signal shifted in time. The ‘shift’ parameter is defined as τ, and is the variable which the function C1,2 is plotted against. The equation is

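$$C_{1,2}(\tau) = \int S_1(t)\, S_2(t+\tau)\, dt \qquad (14)$$

and can be normalized, bounding C1,2(τ) ∈ [−1, 1], via

$$C_{1,2}(\tau) = \frac{\int S_1(t)\, S_2(t+\tau)\, dt}{\sqrt{\int S_1^{2}(t)\, dt \int S_2^{2}(t)\, dt}} \qquad (15)$$

The shifted signal is written as S2(t+τ) so that a signal S2 which lags S1 produces a peak at positive τ, the sign convention used in the examples that follow.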

Here we have defined the ‘Cross Correlation Function’, where the ‘Auto Correlation Function’ is defined as C1,1(τ).

Consider the following illustrative example: If one wished to measure the velocity of a light source moving in a straight line, and one only had two photodiodes, how would one accomplish the task?

Placing the photodiodes in the path of the light source and recording the light intensity in time, together with correlation analysis, could yield the velocity. Two photodiodes, placed ∆x apart, and two light intensity traces with lag time τLag from cross correlation, give the velocity of the source as ∆x/τLag. The same principle can be applied to any sensors separated in space.

The correlation time τCorr defines the time over which the correlation function decays in amplitude by half, and can determine the ‘lifetime’ of an ‘event’ in an oscillating time series.

The method for creating the correlation function is defined as follows:

• Two signals are read into the routine, using signal1 as the ‘reference’ signal.

• The middle half (from nt/2 to 3*nt/2) of signal1 times the middle half of signal2 is integrated over the time window in which signal1 exists. This is the correlation coefficient.

• Now, signal2 is shifted forward and backward in time, by the digitizing time step, and integrated at each shift.

• It is of note that this is not a circular shifting routine, because we cannot assume that the function is circular or strictly periodic. We are ‘padding’ the function with more real data, hence the oversampling by ×2.

• Because only the middle half of the signal sample is used, the time width must be much longer than the characteristic period of the fluctuations (delT > a few × 1/characteristic frequency). If the fluctuations reside near 1 kHz, then a time window of 5-10 ms would be sufficient not to ‘miss’ the event.

As an example of auto and cross correlation, consider the oscillatory, enveloped functions

$$S_1(t) = \cos[2\pi f_1 t]\, \exp[-50(t-0.5)^2], \qquad S_2(t) = \cos[2\pi f_1 (t-0.05)]\, \exp[-50((t-0.05)-0.5)^2] \qquad (16)$$

where S2(t) has been shifted forward in time by 0.05 time units, i.e. there is a lag time between S1 and S2. If S1 is taken as reference, then S2 lags S1, and τLag > 0. This is a very important definition, because it can determine the direction of propagation. The signals (Fig.2) and correlation functions (Fig.3) are plotted below.
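The lag time for the signals of Eq. (16) can be recovered with a short, non-circular correlation routine. The Python sketch below is a stand-in for the IDL method and makes several assumptions: f1 = 10 (the text does not give a value), a 1000-point record on [0,1), the middle half taken as indices nt/4 to 3nt/4 of that single record, a normalization over the overlapping segments only, and a hypothetical sensor separation dx for the velocity estimate. The shifted signal is evaluated at t+τ so that a lagging S2 gives a positive τLag.

```python
import math

def envelope_signal(t, f1, delay=0.0):
    """Gaussian-enveloped cosine of Eq. (16), optionally delayed."""
    u = t - delay
    return math.cos(2 * math.pi * f1 * u) * math.exp(-50.0 * (u - 0.5) ** 2)

dt, n, f1 = 0.001, 1000, 10.0          # f1 is an assumed value
time = [i * dt for i in range(n)]
s1 = [envelope_signal(t, f1) for t in time]
s2 = [envelope_signal(t, f1, delay=0.05) for t in time]

lo, hi = n // 4, 3 * n // 4            # middle half of the reference signal
max_shift = n // 4                     # shifts stay inside the record

def correlation(a, b, k):
    """Normalized sum of a[i]*b[i+k] over the middle half of a (no wraparound)."""
    num = sum(a[i] * b[i + k] for i in range(lo, hi))
    ea = sum(a[i] ** 2 for i in range(lo, hi))
    eb = sum(b[i + k] ** 2 for i in range(lo, hi))
    return num / math.sqrt(ea * eb)

shifts = range(-max_shift, max_shift + 1)
cross = [correlation(s1, s2, k) for k in shifts]
best = max(range(len(cross)), key=lambda j: cross[j])
tau_lag = shifts[best] * dt            # shift at the correlation maximum

dx = 0.1                               # hypothetical sensor separation
print(tau_lag, dx / tau_lag)
```

With these assumptions the maximum of the cross correlation falls at a shift of 50 samples, i.e. τLag ≈ +0.05, and the velocity estimate is dx/τLag.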

The correlation analysis, just as the Fourier spectrum, can be extended to examine fluctuations in time. Included in Fig.4 are plots of the auto and cross correlation for two spatially separated probes measuring floating potential in a plasma device. The correlation functions are for the same instability bursts seen in the above TFD (Fig.1).

∗ Electronic mail: [email protected]; http://www.ap.columbia.edu/ctx/ctx.html; We gratefully acknowledge the support of your mom.



[Figure 2: two panels, ‘Signal 1’ and ‘Signal 2’; axes Time (s), 0.0 to 1.0, and Signal, -1.0 to 1.0.]

FIG. 2: S1(t) and S2(t): S1 is overplotted against S2 in red. The green bars determine the middle half of the time window. It can be seen from the plot that S2 lags S1, i.e. occurs at a later time.

[Figure 3: two panels, ‘Auto Correlation’ (τCorr=0.140140) and ‘Cross Correlation’ (τCorr=0.141141, τLag=0.0510511); axes τ (s), -0.2 to 0.2, and Correlation Amplitude, -1.0 to 1.0.]

FIG. 3: Correlation Functions: The autocorrelation lag time $\tau^{A}_{Lag} \equiv 0$, whereas the cross correlation lag time $\tau^{C}_{Lag} \approx +0.05$, as expected. Furthermore, $\tau^{C}_{Lag} < \tau^{A}_{Corr}, \tau^{C}_{Corr}$, which guarantees we are looking at the same ‘event’.

[Figure 4: two panels, ‘Auto Correlation C1,1(t,τ)’ and ‘Cross Correlation C1,2(t,τ)’; axes Time (s), 0.168049 to 0.171029, and τ (ms), -0.020 to 0.020.]

FIG. 4: Auto (left) and Cross (right) Correlation functions in time. It can be seen in the second figure that there exists a peak (red) which evolves in time, but maintains a positive $\tau^{C}_{Corr}$ for the duration of the instability burst. The second probe was located ‘downstream’ of the first probe, and the positive lag time confirms the direction of ‘propagation’.
